Re: [HTCondor-users] Jobs restarting
On Wed, Nov 18, 2015 at 6:28 AM, Peter Ellevseth
<Peter.Ellevseth@xxxxxxxxxx> wrote:
Peter,
> 1. How can I change the time it takes before the head node orders a
> restart of a job?
>
I know I've answered this question before, but I can't find the answer
(or the source of the answer) right now. Sorry.
> 2. Is it possible to change what is done when a restart is issued? Could
> I, instead of condor sending a SIGKILL to the job, tell it to run a script
> that shuts the job down safely? It would be preferable to have condor shut
> the job down quietly instead of restarting it.
>
On Linux, you can use the kill_sig command in the submit file to tell
HTCondor which signal to send to the job. Your code (or a wrapper around
it) would need to trap whatever signal you set and take the appropriate
action. If it's a vanilla universe job, you can also use something like
DMTCP to do checkpointing.
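
As a rough illustration of the wrapper idea, here is a minimal Python
sketch. It assumes the submit file sets kill_sig to the signal the
wrapper listens for (SIGUSR1 here, chosen only for the example; check
the condor_submit manual for the value format your version accepts).
The function name and the 30-second grace period are hypothetical, and
the "safe shutdown" is simply a SIGTERM forwarded to the real job, so
replace that step with whatever your application needs.

```python
import signal
import subprocess
import sys
import time


def run_with_graceful_shutdown(cmd, shutdown_signal=signal.SIGUSR1, grace=30.0):
    """Run cmd as a child process and shut it down cleanly on shutdown_signal.

    Assumption: the scheduler delivers shutdown_signal to this wrapper
    (e.g. via kill_sig in the submit file). Returns the child's exit code.
    """
    stop_requested = False

    def on_shutdown(signum, frame):
        # Only record the request; the main loop does the actual work.
        nonlocal stop_requested
        stop_requested = True

    # Install the handler before starting the child so an early signal
    # does not kill the wrapper with the default action.
    previous = signal.signal(shutdown_signal, on_shutdown)
    try:
        child = subprocess.Popen(cmd)
        # Wait until the job finishes or a shutdown is requested.
        while child.poll() is None and not stop_requested:
            time.sleep(0.1)
        if child.poll() is None:
            # Ask the job to stop; swap in your own safe-shutdown step here.
            child.terminate()
            try:
                child.wait(timeout=grace)
            except subprocess.TimeoutExpired:
                # The job ignored the request; force it to exit.
                child.kill()
                child.wait()
        return child.returncode
    finally:
        signal.signal(shutdown_signal, previous)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: wrapper.py <real job command> [args...]
    sys.exit(run_with_graceful_shutdown(sys.argv[1:]))
```

The job's executable in the submit file would then be the wrapper, with
the real command passed as its arguments.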
Thanks,
BC
--
Ben Cotton
Cycle Computing
Better Answers. Faster.
http://www.cyclecomputing.com
twitter: @cyclecomputing