SYSTEM_JOB_MACHINE_ATTRS is a list of machine attributes that are copied from the match ad (which the Negotiator sends to the Schedd) into the job ad when a job starts running. The Schedd does this, and only when a job starts.

If the value of an attribute that SYSTEM_JOB_MACHINE_ATTRS copies changes on the execute node, the change will not be reflected in jobs until the next time a job starts on that machine *as the result of a full negotiation cycle*. This can take a very long time to propagate, and the value will never change while a job is running.
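As a rough sketch of the setup being discussed (not taken from any real configuration): the thread does not show how nodehealth gets into the machine ad, so the STARTD_ATTRS lines below are an assumption, and the value is just the one from the example output further down.

```
# Execute node (assumption: the attribute is defined in the config and
# advertised through STARTD_ATTRS; the thread does not show this part)
nodehealth = "False1"
STARTD_ATTRS = $(STARTD_ATTRS) nodehealth

# Submit node: when a job starts, the Schedd copies the matched machine's
# nodehealth into the job ad as MachineAttrnodehealth0
SYSTEM_JOB_MACHINE_ATTRS = $(SYSTEM_JOB_MACHINE_ATTRS) nodehealth
```

With this layout, changing nodehealth and running condor_reconfig on the execute node updates the machine ad, but MachineAttrnodehealth0 in an already running job keeps the value captured at job start.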

For something like node health, which can change as the job runs, I think you want to configure STARTD_JOB_ATTRS on the execute node instead of SYSTEM_JOB_MACHINE_ATTRS on the submit node.

-tj

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of ervikrant06@xxxxxxxxx
Sent: Tuesday, June 2, 2020 6:40 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] Adding custom job classads on condor_starter nodes

Hello Experts,

We are running HTCondor jobs on preemptible Google Cloud instances. I want to add an attribute to the job ClassAd to identify the jobs affected by preempted instances.

In the schedd configuration file:

SYSTEM_JOB_MACHINE_ATTRS = $(SYSTEM_JOB_MACHINE_ATTRS) nodehealth

On the startd, the attribute is advertised in the machine ClassAd:

test.example:/etc/condor/config.d# condor_status -compact `hostname` -af machine nodehealth
test.example.com False1

I can see the following in the job ClassAd:

$ condor_q -run -af jobruncount MachineAttrnodehealth0 MachineAttrnodehealth1
1 False1 undefined
1 False1 undefined

But when I change the value of the attribute on the execute node (by directly modifying the condor configuration and running condor_reconfig), the change is not reflected in the job ClassAd.

I have seen this message in the log file. Our executor directory is only

06/02/20 06:49:38 slot1_1: Failed to open '/spare/condor/dir_418909/.update.ad.tmp' for writing update ad: No such file or directory (2).

However, I do see that the .update.ad file inside the execution directory has the updated value, while the machine and job ads still show the old value, as they can't change dynamically.

# grep nodehealth .update.ad
nodehealth = "False4"
# grep nodehealth .job.ad
MachineAttrnodehealth0 = "False1"
# grep nodehealth .machine.ad
nodehealth = "False1"
# condor_status -compact `hostname` -af machine nodehealth
test.example.com False4

After a hold/release the job picks up the new value, but I want the value updated in the running instance of the job.
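The mismatch above can be checked across all running jobs with a small shell sketch. It assumes that each running job's RemoteHost attribute holds a slot name that condor_status accepts as an argument, and that nodehealth is advertised in the slot ad. The function name stale_health_jobs is made up for illustration.

```shell
# Hypothetical helper: list running jobs whose recorded nodehealth
# (copied at job start) differs from the value the slot advertises now.
stale_health_jobs() {
    condor_q -run -af ClusterId ProcId RemoteHost MachineAttrnodehealth0 |
    while read -r cluster proc host recorded; do
        # Assumption: RemoteHost (e.g. slot1_1@test.example.com) is a
        # name that condor_status can look up directly.
        current=$(condor_status "$host" -af nodehealth </dev/null)
        if [ "$recorded" != "$current" ]; then
            echo "$cluster.$proc on $host: job ad has $recorded, slot ad has $current"
        fi
    done
}
```

Calling stale_health_jobs on the submit node prints one line per running job whose job ad still carries an older value than the slot ad.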

I have gone through link [1], but it was not useful either.

Any input is highly appreciated.

Thanks & Regards,
Vikrant Aggarwal