Dear all,
In our HTCondor cluster running 9.0.12 we have a few machines with GPUs.
We would like to make sure that users requesting GPUs are really using them. For that reason, we are interested in creating an expression that says something like: if the average GPU usage is still 0.0 after 4 hours, the job is put on hold.
Our first question is where we can get the average GPU usage from. There is the DeviceGpusAverageUsage attribute, and the documentation says that it measures the GPU usage of the slot over the time since the startd started up. However, there is also a GpusAverageUsage attribute that is undefined most of the time, although in some cases we have seen it defined, with values slightly different from DeviceGpusAverageUsage. What is the difference between DeviceGpusAverageUsage and GpusAverageUsage?
Thank you in advance.
Best regards,
Carles Acosta i Silva
PIC (Port d'Informació Científica)
Campus UAB, Edifici D
E-08193 Bellaterra, Barcelona
Tel: +34 93 581 33 08
Fax: +34 93 581 41 10