Re: [HTCondor-users] Job CPU usage updates
On Tue, Sep 24, 2013 at 1:26 AM, Wilkins, David
<David.Wilkins@xxxxxxxxxxxxxx> wrote:
> Is there anything in the job ClassAds that would tell us that the update has
> occurred?
You could use the job's CommittedTime as a check. If CommittedTime
is greater than STARTER_UPDATE_INTERVAL (and I'd add a little extra
padding to be sure), then at least one update should already have
arrived, so a job that still shows none may be in a hung state like
you describe. The caveat here is that if you ever increase
STARTER_UPDATE_INTERVAL, you'll want to make sure to adjust the
threshold in your submit file accordingly.
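
To make the comparison concrete, here is a minimal sketch of that check in
Python. It is not an HTCondor API: the function name, the padding default,
and the sample numbers are all hypothetical, and the interval is passed in
by hand rather than read from any pool configuration.

```python
def update_should_have_occurred(committed_time, update_interval, padding=60):
    """Return True once the job has run long enough that at least one
    starter update should have been reported.

    committed_time  -- the job's CommittedTime attribute, in seconds
    update_interval -- the pool's STARTER_UPDATE_INTERVAL, in seconds
    padding         -- extra slack in seconds (hypothetical default)
    """
    return committed_time > update_interval + padding


# Illustrative values only; 300 is an assumed interval, not read from a pool.
print(update_should_have_occurred(200, 300))  # False: 200 <= 300 + 60
print(update_should_have_occurred(400, 300))  # True: 400 > 300 + 60
```

If the interval changes, the value passed as update_interval has to change
with it, which is the same caveat as above.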
Here's the description of CommittedTime from Appendix A of the manual.
CommittedTime:
The number of seconds of wall clock time that the job has been
allocated a machine, excluding the time spent on run attempts that
were evicted without a checkpoint. Like RemoteWallClockTime, this
includes time the job spent in a suspended state, so the total
committed wall time spent running is
CommittedTime - CommittedSuspensionTime
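
As a quick worked example of that subtraction, the sketch below treats a job
ad as a plain Python dict. The dict, the function name, and the numbers are
made up for illustration; only the two attribute names come from the manual
text above.

```python
def committed_running_time(job_ad):
    """Committed wall time actually spent running, in seconds.

    job_ad is a plain dict standing in for a job ClassAd (an assumption
    made for this sketch). A missing CommittedSuspensionTime is treated
    as zero.
    """
    return job_ad["CommittedTime"] - job_ad.get("CommittedSuspensionTime", 0)


# Made-up job: 5400 s committed, of which 600 s were spent suspended.
example_ad = {"CommittedTime": 5400, "CommittedSuspensionTime": 600}
print(committed_running_time(example_ad))  # 4800
```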
--
Ben Cotton
Senior Support Engineer
Cycle Computing, LLC
The Leader in Utility Supercomputing Software