Re: [HTCondor-users] History file with incorrect JobCurrentStartDate? - version windows 8.4.4
- Date: Thu, 5 Apr 2018 15:46:16 +0000
- From: John M Knoeller <johnkn@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] History file with incorrect JobCurrentStartDate? - version windows 8.4.4
We've done some digging in the code, and the only thing we can think of that would explain this is if the clock jumped ahead by about 2 days
just at the time the job started, then jumped back again for the remainder of the job execution.
JobCurrentStartDate is set by the schedd at the time it creates the shadow;
the rest of the numbers are set or calculated by the shadow based on the current system clock time.
So in order for this to happen, the schedd would need to get 1521637146 from the system clock when it creates the shadow
and the shadow would need to get 1521445871 from the system clock when the job actually begins running just a few seconds
later. Otherwise, we can't explain how the CommittedSlotTime is negative: it is calculated by the shadow, which means that the shadow
would have to SEE a JobCurrentStartDate that is in the future relative to its own clock.
If this were a case of all of the values but JobCurrentStartDate being stale somehow, then the CommittedSlotTime would not be negative.
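To make the size of the suspected clock jump concrete, here is a minimal Python sketch using only the two timestamps quoted in this thread. The helper names (schedd_shadow_skew, to_utc) are illustrative and not part of HTCondor; the sketch assumes both values are plain Unix epoch seconds.

```python
from datetime import datetime, timezone

# Timestamps quoted in this thread (seconds since the Unix epoch).
ad = {
    "JobCurrentStartDate": 1521637146,           # set by the schedd
    "JobCurrentStartExecutingDate": 1521445871,  # set by the shadow
}

def schedd_shadow_skew(ad):
    """Seconds by which the schedd's start time leads the shadow's."""
    return ad["JobCurrentStartDate"] - ad["JobCurrentStartExecutingDate"]

def to_utc(ts):
    """Render an epoch timestamp as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()

skew = schedd_shadow_skew(ad)
print("schedd clock:", to_utc(ad["JobCurrentStartDate"]))
print("shadow clock:", to_utc(ad["JobCurrentStartExecutingDate"]))
print(f"skew: {skew} s ({skew / 86400:.2f} days)")
```

A positive skew of this size means the schedd's reading was ahead of the shadow's, which is the situation described above.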
-tj
-----Original Message-----
From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of John M Knoeller
Sent: Thursday, April 5, 2018 10:12 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] History file with incorrect JobCurrentStartDate? - version windows 8.4.4
This is not a known bug, but it seems similar to this bug
https://htcondor-wiki.cs.wisc.edu/index.cgi/tktview?tn=6626
which will be fixed in the upcoming 8.6.11 release.
Could you send me the entire job ad so I can have a look?
Thanks
-tj
-----Original Message-----
From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of Greg.Hitchen@xxxxxxxx
Sent: Thursday, April 5, 2018 2:00 AM
To: htcondor-users@xxxxxxxxxxx
Subject: [HTCondor-users] History file with incorrect JobCurrentStartDate? - version windows 8.4.4
Hi All
We have a number of Windows submit nodes in our HTCondor pool.
We have reporting scripts on a Linux machine that download all the history files and can
provide monthly (or daily) usage on a per-user basis.
In our latest report for March 2018 we noticed something strange: negative run times!
After tracking back through the scripts we eventually found the answer in the history files themselves.
e.g. for one job these are the relevant values in the history file
JobCurrentStartDate 1521637146
CompletionDate 1521450290
EnteredCurrentStatus 1521450290
JobCurrentStartTransferOutputDate 1521450290
JobFinishedHookDone 1521450290
LastJobLeaseRenewal 1521450290
JobCurrentStartExecutingDate 1521445871
LastMatchTime 1521445870
LastVacateTime 1521444418
JobLastStartDate 1521442616
LastRejMatchTime 1521436370
JobStartDate 1521192534
QDate 1521181266
CumulativeSlotTime 45124
RemoteWallClockTime 45124
RemoteUserCpu 2970
RemoteSysCpu 1286
CommittedSuspensionTime 0
CumulativeSuspensionTime 0
CommittedSlotTime -186856
CommittedTime -186856
Note the 2 negative committed time values. These are equal to (CompletionDate - JobCurrentStartDate).
In fact the JobCurrentStartDate is nearly 2 days AFTER the CompletionDate!?
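The arithmetic can be checked directly. This short Python sketch uses only the values listed above; the variable names are illustrative.

```python
# Values copied from the history file excerpt above.
completion_date = 1521450290
job_current_start_date = 1521637146
reported_committed_slot_time = -186856

# Recompute the committed time from the two dates.
computed = completion_date - job_current_start_date

print("matches reported value:", computed == reported_committed_slot_time)
print(f"start date is {-computed / 86400:.2f} days after completion")
```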
For the moment we will need to change our scripts to NOT use the CommittedSlotTime BUT
calculate it from (CompletionDate - JobLastStartDate).
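A minimal sketch of such a workaround in Python, assuming each job record has already been parsed into a dict of integer attributes (the parsing step is not shown, and both function names are hypothetical, not taken from our actual scripts):

```python
def run_time_from_dates(ad):
    """Run time in seconds, computed as CompletionDate - JobLastStartDate."""
    return ad["CompletionDate"] - ad["JobLastStartDate"]

def has_suspect_committed_time(ad):
    """True if the recorded CommittedSlotTime is negative."""
    return ad.get("CommittedSlotTime", 0) < 0

# The job from this message.
ad = {
    "CompletionDate": 1521450290,
    "JobLastStartDate": 1521442616,
    "CommittedSlotTime": -186856,
}

if has_suspect_committed_time(ad):
    print("negative CommittedSlotTime, using dates instead")
print("run time (s):", run_time_from_dates(ad))
```

Flagging the affected records separately makes it easy to count how many jobs in a month show the problem.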
Is this a known bug? Or something that someone has come across before?
I've had a look through the 8.2.* and 8.4.* release notes and bug fixes but couldn't see anything.
Thanks for any help.
Cheers
Greg
P.S. we will soon(ish) be upgrading our submit nodes from win2008 to win2016 and will upgrade
to the latest 8.6.* version then. Meanwhile the history files already exist so we will kludge
a workaround.
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/