HTCondor Project List Archives




Re: [Condor-devel] KILLING_TIMEOUT in 7.6.0




There is now a ticket for this issue:

https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=2142

On 5/6/11 8:31 PM, Dan Bradley wrote:
Hi all,

I'm wondering if the change in behavior of preemption in 7.5 was really intended to be as broad as it is. Now, the KILL expression is useless by default, because when a job is preempted, it is hard-killed as soon as KILLING_TIMEOUT expires. The default KILLING_TIMEOUT is just 30s. We've received complaints in CHTC from users who have self-checkpointing jobs that cannot save state in this amount of time. I will crank the knob higher, but I consider this a workaround, not a resolution.
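For reference, the workaround described above might look roughly like the following in a startd's local configuration. This is only a sketch: KILLING_TIMEOUT is the knob named in this message, but the value of 600 seconds is an arbitrary illustration and not a figure from this thread. A site would pick something long enough for its slowest self-checkpointing job to save state.

```
# Assumed example value; the default discussed in this thread is 30 seconds.
# Gives a preempted job up to 10 minutes before it is hard-killed.
KILLING_TIMEOUT = 600
```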

This new behavior came from the following:

https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=1198

Prior to this, KILLING_TIMEOUT was the timeout applied to the Preempting/Killing activity. Now, it effectively applies to Preempting/Vacating as well.

I see no release notes warning admins of this important change.

--Dan
