HTCondor Project List Archives




[Condor-devel] KILLING_TIMEOUT in 7.6.0



Hi all,

I'm wondering whether the change in preemption behavior in 7.5 was really intended to be as broad as it is. The KILL expression is now useless by default: when a job is preempted, it is hard-killed as soon as KILLING_TIMEOUT expires, and the default KILLING_TIMEOUT is just 30 seconds. We've received complaints in CHTC from users whose self-checkpointing jobs cannot save their state in that amount of time. I will crank the knob higher, but I consider that a workaround, not a resolution.
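
For anyone hitting the same problem, the workaround amounts to something like the following in the configuration of the execute machines. This is only a sketch: the value 600 is an arbitrary illustration, not the setting used in CHTC, and the right number depends on how long your jobs need to write their checkpoints.

```
# Workaround sketch: give preempted jobs more time before the hard kill.
# KILLING_TIMEOUT is in seconds; the default is 30.
# 600 is a made-up example value, pick one that fits your jobs.
KILLING_TIMEOUT = 600
```

The startd would need to be reconfigured (for example with condor_reconfig) to pick up the new value.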

This new behavior came from the following:

https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=1198

Prior to this, KILLING_TIMEOUT was the timeout applied to the Preempting/Killing activity. Now, it effectively applies to Preempting/Vacating as well.

I see no release notes warning admins of this important change.

--Dan