HTCondor Project List Archives




Re: [Condor-devel] Job Suspension



On Tuesday, April 26, 2011 at 9:46 AM, Brian Bockelman wrote:

Hi folks,

I have a few observations about job suspension:
1) Whether or not the job is currently suspended is not recorded in the ClassAd.
2) When a job is transitioned from running to suspended, LastSuspensionTime is updated. However, due to (1), you don't know whether the job has since been un-suspended.
3) The starter updates the remote wall time, but not the suspended time.

I would like to preempt jobs based upon the non-suspended running time. However, it doesn't appear that this is possible in the current setup.

Why is the suspension state not reflected in the job's classad? It seems like a very important thing to note.

Preempt them how? Using the RANK or PREEMPT expressions on the machine, or the PREEMPTION_REQUIREMENTS expression at the negotiator?

All of those are evaluated in the context of the machine ad, so you can use the machine's State and Activity attributes to determine whether a job is suspended or running on the machine:

   State == "Claimed" && Activity == "Suspended"

This indicates that a job is in the suspended state on the machine.

Technically, you only need to check Activity == "Suspended", because the state machine should never pair that Activity value with any State other than "Claimed".
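
For anyone who wants to run the same check outside of a policy expression, here is a minimal Python sketch. It assumes the machine ad has already been loaded into a plain dict; the function names and sample ads are hypothetical and are not part of any HTCondor API. The comparison is case-insensitive to mirror how the ClassAd == operator compares strings.

```python
def _attr_equals(ad, name, expected):
    """Case-insensitive string match, mirroring the ClassAd == operator."""
    value = ad.get(name)
    return isinstance(value, str) and value.lower() == expected.lower()


def is_suspended(machine_ad):
    """Short form: Activity alone identifies a suspended job."""
    return _attr_equals(machine_ad, "Activity", "Suspended")


def is_claimed_and_suspended(machine_ad):
    """Long form: State == "Claimed" && Activity == "Suspended"."""
    return _attr_equals(machine_ad, "State", "Claimed") and is_suspended(machine_ad)


if __name__ == "__main__":
    # Hypothetical machine ads, reduced to the two attributes discussed above.
    sample_ads = [
        {"State": "Claimed", "Activity": "Suspended"},
        {"State": "Claimed", "Activity": "Busy"},
        {"State": "Unclaimed", "Activity": "Idle"},
    ]
    for ad in sample_ads:
        print(ad["State"], ad["Activity"], is_suspended(ad))
```

If the state machine behaves as described, both functions should agree for every ad, so the short form is enough in practice.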

Though, having said all of that, a job status code for "suspended" that distinguishes it from running (JobStatus == 2) wouldn't be a bad idea.

Regards,
- Ian

-- 
Ian Chesal
ichesal@xxxxxxxxxxxxxxxxxx
http://www.cyclecomputing.com/