
Re: [HTCondor-users] aggregating job statistics over all job instances



Hi Thomas,

At the moment there isn't an especially elegant solution: you either need to set up a post-run analysis for the job, or have some other program or script that understands job states query information about the jobs. As Matthew stated, the Python bindings may be useful for this sort of monitoring and job management.
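As a rough illustration of the "other program/script" route, here is a minimal sketch of the aggregation step in plain Python. It assumes you already have one dict-like record per execution attempt; the function name `aggregate_runs` and the metric names `WallClockSeconds` and `CpuSeconds` are hypothetical placeholders, not HTCondor attributes.

```python
from collections import defaultdict

# Hypothetical per-run metric names, used only for illustration.
METRICS = ("WallClockSeconds", "CpuSeconds")


def aggregate_runs(run_ads, metrics=METRICS):
    """Sum per-run metrics for each job, keyed by (ClusterId, ProcId).

    run_ads is any iterable of dict-like records, one per execution
    attempt. A metric missing from a record counts as 0.
    """
    totals = defaultdict(lambda: {"runs": 0, **{m: 0.0 for m in metrics}})
    for ad in run_ads:
        key = (ad["ClusterId"], ad["ProcId"])
        totals[key]["runs"] += 1
        for m in metrics:
            totals[key][m] += float(ad.get(m, 0))
    return dict(totals)


if __name__ == "__main__":
    # Made-up sample records: job 1.0 ran twice, job 2.0 ran once.
    sample = [
        {"ClusterId": 1, "ProcId": 0, "WallClockSeconds": 10, "CpuSeconds": 4},
        {"ClusterId": 1, "ProcId": 0, "WallClockSeconds": 5},
        {"ClusterId": 2, "ProcId": 0, "WallClockSeconds": 7, "CpuSeconds": 1},
    ]
    for job, stats in aggregate_runs(sample).items():
        print(job, stats)
```

With the Python bindings, ads returned by `htcondor.Schedd().history(...)` are dict-like and could in principle be fed to such a function, but keep in mind that the normal job history holds one final ad per job rather than one per run, so the per-run records still have to come from somewhere else.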

However, I have been working for a while on a feature for exactly this kind of query, in which the shadow writes the job ad to a file, much like the normal job history. It should hopefully be officially announced within a couple of feature series releases.

Cheers,
Cole Bollig

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Thomas Hartmann <thomas.hartmann@xxxxxxx>
Sent: Thursday, January 19, 2023 10:03 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] aggregating job statistics over all job instances
 
Hi all,

I would like to collect job metrics for all runs of a job. So far my
approach would be a postCMD, but that does not seem very elegant/condor-like.
I.e., a postCMD script that follows the payload job, chirps a class ad
array, attaches a new element, and chirps the updated job ad again to the
collector. However, one would need to be careful not to drop user postCMDs.

Maybe there is a better way?

Cheers,
  Thomas