Re: [HTCondor-users] Unable to unset monitors in claim destructor. The StartOfJob* attributes will be stale.



08/18/19 15:26:23 slot2_1: Unable to unset monitors in claim destructor. The StartOfJob* attributes will be stale. (0x1d215c0, (nil))
	If GPU monitoring is turned on, GPU usage information is recorded 
in the slot ad, and assigned to a job as it runs on that slot.  When a job 
starts, we record the slot's current usage in the slot ad; then we compute 
the job's usage by subtracting this from the ongoing accumulation of 
usage, until the job ends.  Of course, if the claim is deleted, we need to 
make sure that the information we recorded about the start of the job is 
deleted, too; otherwise, the slot will report usage for a job that's no 
longer running.  (This won't screw up accounting, because that only counts 
assigned resources, not actual usage.)
	However, in some cases, a claim will be deleted whose ClassAd has 
already been deleted.  In those cases, we can't (presently) determine 
which monitors to unset, and so we do nothing.  This _should_ only happen 
when the slot is being destroyed, in which case it's harmless, but I've
been unable to prove that's the case.
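
To make the bookkeeping above concrete, here is a rough Python sketch. It is not the startd code: the attribute name GPUsUsage, the "monitors" key, and all three function names are made up for illustration, and it assumes the list of monitors is only reachable through the claim's ClassAd. Only the StartOfJob prefix comes from the log message.

```python
# Illustrative sketch only. Names here are hypothetical and do not
# correspond to the actual HTCondor startd implementation.

START_PREFIX = "StartOfJob"  # prefix taken from the warning in the log


def start_job(slot_ad, monitors):
    """Snapshot each monitored counter in the slot ad when a job starts."""
    for name in monitors:
        slot_ad[START_PREFIX + name] = slot_ad.get(name, 0.0)


def job_usage(slot_ad, name):
    """Job usage is the running total minus the start-of-job snapshot."""
    return slot_ad[name] - slot_ad[START_PREFIX + name]


def release_claim(slot_ad, claim_ad):
    """Unset the snapshots when the claim is deleted.

    Returns False if the claim's ad is already gone. In this sketch the
    monitor list lives in that ad (an assumption), so nothing can be unset
    and the StartOfJob* values are left stale in the slot ad.
    """
    if claim_ad is None:
        return False
    for name in claim_ad["monitors"]:
        slot_ad.pop(START_PREFIX + name, None)
    return True


if __name__ == "__main__":
    slot = {"GPUsUsage": 100.0}
    start_job(slot, ["GPUsUsage"])
    slot["GPUsUsage"] = 130.0           # usage keeps accumulating
    print(job_usage(slot, "GPUsUsage"))
    print(release_claim(slot, {"monitors": ["GPUsUsage"]}))
    print(release_claim(slot, None))    # the case behind the warning
```

The last call is the situation the warning describes: with no ad to consult, the sketch does nothing and reports that it could not clean up.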

However, in the course of refreshing my memory about this, Jaime found a place in the code where a one-line change might substantially reduce the occurrence of these warnings; we'll see how that goes.
- ToddM