[HTCondor-users] signals when a job is killed



Dear List:

I am running jobs on the Open Science Grid through the XSEDE submit host.

When one of my jobs is killed or bumped, I would like, if possible, to catch the signal that is sent to kill it and exit gracefully.

If I can do so, I will have my command script kill any running child processes and then exit, so that the results saved up to that point are returned.

This will markedly improve the efficiency of my jobs under circumstances where many are being bumped and will sidestep the awkwardness and piecemeal approach of checkpointing.
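
For concreteness, here is a minimal sketch of the wrapper I have in mind, written in Python. It assumes that the signal is one a process can catch, such as SIGTERM or SIGINT; which signal is actually sent is exactly what I am asking below. The names request_stop, install_handlers, terminate_children and supervise are my own placeholders, not part of HTCondor.

```python
import signal
import subprocess
import time

# Shared flag. The handler only flips it, so the real cleanup happens
# in the main loop rather than inside the signal handler.
state = {"stop": False}


def request_stop(signum, frame):
    """Signal handler: record that the job has been asked to stop."""
    state["stop"] = True


def install_handlers(signums=(signal.SIGTERM, signal.SIGINT)):
    """Register request_stop for each signal (an assumed set; adjust as needed)."""
    for signum in signums:
        signal.signal(signum, request_stop)


def terminate_children(procs, grace=10.0):
    """Terminate children still running; force-kill any that outlast `grace` seconds."""
    for p in procs:
        if p.poll() is None:
            p.terminate()
    deadline = time.monotonic() + grace
    for p in procs:
        try:
            p.wait(timeout=max(0.0, deadline - time.monotonic()))
        except subprocess.TimeoutExpired:
            p.kill()
            p.wait()


def supervise(procs, poll_interval=1.0):
    """Wait until all children finish or a stop is requested, then clean up."""
    while not state["stop"] and any(p.poll() is None for p in procs):
        time.sleep(poll_interval)
    terminate_children(procs)
    return [p.returncode for p in procs]
```

The wrapper would call install_handlers() once, start its children with subprocess.Popen, and pass them to supervise(). After supervise() returns, the wrapper exits normally so that whatever has been written to disk can be returned.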

Is there a signal sent?

If so, is it sent to the parent process of my job, and which signal is used?

Is there a way to control which signal is used? It would be simplest to catch SIGINT (signal 2).

And finally, is the mechanism that returns a job's results disabled when the job is killed?

Regards,

Don

Don Krieger, Ph.D.

Department of Neurological Surgery

University of Pittsburgh