
Re: [HTCondor-users] signals when a job is killed



On Thu, Jan 15, 2015 at 7:10 AM, Krieger, Donald N. <kriegerd@xxxxxxxx> wrote:

> Is there a signal sent?
> If so, is it sent to the mother process of my job and which signal is used?
>
By default, SIGTERM is used (except in the standard universe, where
SIGTSTP is the default).

> Is there a way to control which signal is used - it would be simplest to catch SIGINT, signal 2.
>
Yes, the kill_sig command in your submit file can specify the
(integer) signal used when a job is getting the boot.
See: http://research.cs.wisc.edu/htcondor/manual/v8.2/condor_submit.html
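As a rough sketch of the job side, assuming the submit file carries
kill_sig = 2 and the job is a Python script: the handler below only
records that a stop was requested, and the main loop checks the flag
between steps so the job can finish the current step before exiting.
StopFlag, install_stop_flag and run_until_stopped are made-up names for
illustration, not part of HTCondor.

```python
import signal


class StopFlag:
    """Callable signal handler that just remembers a stop was requested."""

    def __init__(self):
        self.requested = False

    def __call__(self, signum, frame):
        # Keep the handler tiny; real cleanup happens in the main loop.
        self.requested = True


def install_stop_flag():
    """Register one flag for SIGINT (kill_sig = 2) and for SIGTERM (the default)."""
    flag = StopFlag()
    signal.signal(signal.SIGINT, flag)
    signal.signal(signal.SIGTERM, flag)
    return flag


def run_until_stopped(steps, do_step, flag):
    """Call do_step(i) for each step until done or a stop is requested.

    Returns the number of steps that completed.
    """
    done = 0
    for i in range(steps):
        if flag.requested:
            break
        do_step(i)
        done += 1
    return done
```

Checking a flag between steps, rather than doing the cleanup inside the
handler, keeps the handler safe to run at any point in the job.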

> And finally, is the function which returns results of the jobs disabled when a job is killed?
>
The answer to this is fuzzier. By default, jobs will get some time to
clean up after themselves (I believe this is 30 seconds), but that's
configurable by site administrators, so it may be longer or shorter
than the default. You'll want to specify the following in your submit
file, though:
when_to_transfer_output = ON_EXIT_OR_EVICT

From the manual:
The ON_EXIT_OR_EVICT option is intended for fault tolerant jobs which
periodically save their own state and can restart where they left off.
In this case, files are spooled to the submit machine any time the job
leaves a remote site, either because it exited on its own, or was
evicted by the HTCondor system for any reason prior to job completion.
The files spooled back are placed in a directory defined by the value
of the SPOOL configuration variable. Any output files transferred back
to the submit machine are automatically sent back out again as input
files if the job restarts.
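To make the "save state and restart where it left off" idea concrete,
here is a minimal sketch in Python. It assumes the state file is among
the files HTCondor transfers back on eviction and sends out again on
restart; the file name state.json, the shape of the state, and the
stop_after parameter (only there to simulate an eviction) are all made
up for illustration.

```python
import json
import os

CHECKPOINT = "state.json"  # hypothetical checkpoint file name


def load_state(path=CHECKPOINT):
    """Return the saved state, or a fresh one if no checkpoint exists yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"next_step": 0, "total": 0}


def save_state(state, path=CHECKPOINT):
    """Write to a temp file, then rename, so a kill mid-write cannot
    leave a half-written checkpoint behind."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run(total_steps, path=CHECKPOINT, stop_after=None):
    """Sum the step numbers 0..total_steps-1, checkpointing after each step.

    stop_after simulates an eviction after that many steps in this run.
    """
    state = load_state(path)
    executed = 0
    while state["next_step"] < total_steps:
        if stop_after is not None and executed >= stop_after:
            break
        state["total"] += state["next_step"]  # the "work" for this step
        state["next_step"] += 1
        save_state(state, path)
        executed += 1
    return state
```

Running it once with stop_after set and then again without it picks up
at the saved step instead of starting from zero.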



Thanks,
BC

-- 
Ben Cotton
main: 888.292.5320

Cycle Computing
Better Answers. Faster.

http://www.cyclecomputing.com
twitter: @cyclecomputing