
Re: [HTCondor-users] Huge pile of jobs in "C" state



Steffen,

There are two likely candidates that come to mind. Either the jobs
have LeaveJobInQueue set, or they were submitted via a SOAP call and
they're still waiting for FilesRetrieved to be true (even if there's
not actually anything to do).

condor_q -const 'JobStatus==4' -af ClusterId ProcId LeaveJobInQueue FilesRetrieved

That should give you an indication of which, if either, of those
theories is correct. The short-term solution would be to use
condor_qedit to change the appropriate attribute on the affected jobs.
The long-term solution, of course, is to figure out where the errant
attribute is coming from and correct it. If it's a SOAP submission, you
could simply set SOAP_LEAVE_IN_QUEUE to false in the configuration
(assuming that's appropriate for your site).
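
If the list of completed jobs is long, a small filter can sort them by
theory. The following is only a rough sketch: classify_completed is a
made-up helper name, and it assumes the four columns arrive in the order
requested in the condor_q query above, with -af printing evaluated values
such as true, false, or undefined. Check a few lines of raw output by
hand before trusting it.

```shell
# Hypothetical helper (not part of HTCondor): reads lines of the form
#   ClusterId ProcId LeaveJobInQueue FilesRetrieved
# on stdin and prints one verdict per job.
classify_completed() {
    awk '{
        id    = $1 "." $2        # job id as cluster.proc
        leave = tolower($3)      # assumed: true / false / undefined
        files = tolower($4)      # assumed: true / false / undefined
        if (files == "false")
            print id, "waiting on FilesRetrieved"
        else if (leave == "true")
            print id, "held by LeaveJobInQueue"
        else
            print id, "neither theory applies"
    }'
}
```

To use it, pipe the query into the function:

condor_q -const 'JobStatus==4' -af ClusterId ProcId LeaveJobInQueue FilesRetrieved | classify_completed

FilesRetrieved is checked first because a SOAP-submitted job would
presumably show LeaveJobInQueue as true as well. Once you know which
attribute is responsible, the condor_qedit fix would be something along
the lines of the following (try it on a single job id first):

condor_qedit -constraint 'JobStatus==4' LeaveJobInQueue False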

For what it's worth, I seem to recall seeing this behavior when moving
from 7.8 to 8.0, but at the time I assumed it was because of the
constantly-changing configuration of my test setup.


Thanks,
BC

-- 
Ben Cotton
main: 888.292.5320

Cycle Computing
Better Answers. Faster.

http://www.cyclecomputing.com
twitter: @cyclecomputing