2. Is the reason for exec()-ing the user job, rather than fork()-ing it, the
following?
- So that Condor 'knows' which process (PID) to send the Unix
control signals to, in order to make the job suspend, checkpoint or
vacate as necessary?
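
To make the PID point concrete, here is a minimal Python sketch of plain
Unix behaviour (not Condor code; pid_after_exec is a name made up for this
example). It forks a child, exec()s a new program in it, and checks that the
program reports the same PID that fork() returned, which is what lets a
parent signal the job directly.

```python
import os
import sys


def pid_after_exec():
    """Fork, exec a new program in the child, and return
    (pid returned by fork, pid reported by the exec'd program)."""
    read_fd, write_fd = os.pipe()
    child_pid = os.fork()
    if child_pid == 0:
        # Child: point stdout at the pipe, then replace the process image.
        try:
            os.close(read_fd)
            os.dup2(write_fd, 1)
            os.execv(sys.executable,
                     [sys.executable, "-c", "import os; print(os.getpid())"])
        finally:
            # Only reached if execv() failed.
            os._exit(127)
    # Parent: read what the exec'd program printed, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        reported_pid = int(pipe.read().strip())
    os.waitpid(child_pid, 0)
    return child_pid, reported_pid
```

Calling pid_after_exec() should return two equal numbers, since exec()
replaces the program but keeps the process (and so its PID).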
...and so I need to know what signals Condor will send to the user job -
trawling the manual seems to reveal the following:
- SIGUSR2:
cause a job in the Standard universe to checkpoint and then continue
executing.
- SIGTSTP (or the value of the KillSig ClassAd attribute):
cause a job in the Standard universe to try to shut down gracefully
(i.e. checkpoint).
- SIGTERM (or the value of the KillSig ClassAd attribute):
cause a job in the Vanilla universe to try to shut down gracefully,
i.e. normal Unix termination (noting that the program may catch
SIGTERM and try to clean up). Is this also true for jobs in the other
non-Standard (Java, MPI, PVM and Scheduler) universes?
- SIGKILL:
kill (i.e. send the hard-kill signal to) the job, if the job takes too
long to shut down gracefully or doesn't respond to the appropriate
signal.
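
For reference, this is roughly what "catch SIGTERM and try to clean up"
looks like from the job's side. It is only a sketch of ordinary Unix signal
handling, assuming the soft-kill signal really is SIGTERM as in my reading
above; request_shutdown and run_job are hypothetical names, and nothing here
is Condor-specific.

```python
import os
import signal
import time

shutdown_requested = False


def request_shutdown(signum, frame):
    """Signal handler: only record the request; the main loop acts on it."""
    global shutdown_requested
    shutdown_requested = True


def run_job(max_steps=50, step_seconds=0.01):
    """Work in small steps; stop early if a soft-kill signal has arrived.

    Returns the number of steps completed before stopping."""
    steps_done = 0
    while steps_done < max_steps and not shutdown_requested:
        time.sleep(step_seconds)  # stand-in for one unit of real work
        steps_done += 1
    # Clean-up (flush files, remove temporaries, ...) would go here.
    return steps_done


# Assumption: SIGTERM is the soft-kill signal. SIGKILL cannot be caught,
# so a job that ignores this request would still be hard-killed.
signal.signal(signal.SIGTERM, request_shutdown)
```

If the process receives SIGTERM (for example via "kill -TERM <pid>"),
run_job() stops at the next step boundary instead of dying mid-step.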
...but what about when it suspends a user job? Does it send it a SIGSTOP?
Does it do anything else (as well/instead of)?
...and similarly, when it unsuspends a user job, does it send a SIGCONT?
Does it do anything else (as well/instead of)?
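
Part of why I ask: on Unix a job cannot catch SIGSTOP, though it can catch
SIGCONT, so if those are the signals used the job would get no chance to
react to being suspended. The Python sketch below only demonstrates that
general Unix behaviour; can_catch and stop_and_resume_child are names made
up for this example, and it says nothing about what Condor actually sends.

```python
import os
import signal
import subprocess
import sys


def can_catch(signum):
    """Return True if this process may install a handler for signum."""
    try:
        previous = signal.signal(signum, lambda s, f: None)
    except (OSError, ValueError):
        return False
    if previous is not None:
        signal.signal(signum, previous)  # restore the earlier handler
    return True


def stop_and_resume_child():
    """Suspend a sleeping child with SIGSTOP, confirm it stopped,
    resume it with SIGCONT, then clean it up."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"])
    os.kill(child.pid, signal.SIGSTOP)
    # WUNTRACED makes waitpid() report stopped (not just exited) children.
    _, status = os.waitpid(child.pid, os.WUNTRACED)
    was_stopped = os.WIFSTOPPED(status)
    os.kill(child.pid, signal.SIGCONT)
    child.kill()
    child.wait()
    return was_stopped
```

Here can_catch(signal.SIGSTOP) should be False and
can_catch(signal.SIGCONT) should be True, and stop_and_resume_child()
should report that the child was stopped.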