Hi again,
Sorry for not explaining my goal.
Soon we will have a few Nvidia GPUs for deep learning jobs. The problem is that jobs will run for a long time, probably more than 48 hours.
To provide reasonable service for all users, I would like to enable preemption, but I only want to preempt jobs that created a checkpoint in the last 30 minutes.
I'm trying to update a ClassAd attribute using chirp so that the negotiator can decide whether to preempt the job, for example "Checkpoint = epoch time".
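To make the idea concrete, here is a minimal Python sketch of the job side, not something I have working end to end. It assumes the job was submitted with the I/O proxy enabled (for example `+WantIOProxy = true` in the submit file) and that `condor_chirp` can be found on the job's PATH, which may not be the case on every install. The helper names and the `run` parameter are made up for illustration; only `condor_chirp set_job_attr` and the attribute name `Checkpoint` come from the setup described here.

```python
import subprocess
import time

WINDOW_SECONDS = 30 * 60  # "created a checkpoint in the last 30 minutes"


def chirp_set_attr_cmd(attr, value):
    """Build the condor_chirp command line that sets one job-ad attribute."""
    return ["condor_chirp", "set_job_attr", attr, str(value)]


def record_checkpoint(now=None, run=subprocess.run):
    """Publish the checkpoint time (epoch seconds) as Checkpoint in the job ad.

    `run` is injectable only so the function can be exercised without HTCondor.
    """
    epoch = int(time.time() if now is None else now)
    cmd = chirp_set_attr_cmd("Checkpoint", epoch)
    run(cmd, check=True)  # raises CalledProcessError if condor_chirp fails
    return cmd


def checkpoint_is_recent(checkpoint_epoch, now, window=WINDOW_SECONDS):
    """True if the checkpoint was written within the last `window` seconds."""
    return 0 <= now - checkpoint_epoch < window
```

The training loop would call `record_checkpoint()` right after each checkpoint is saved. The same freshness test could presumably be written as a ClassAd expression along the lines of `(time() - Checkpoint) < 1800`, but that only helps if the attribute is visible where the preemption policy is evaluated.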
So far I have been unable to publish the modified ClassAd.
Maybe there is a better way to accomplish it?
Many Thanks
David
From: Dudu Handelman <duduhandelman@xxxxxxxxxxx>
Sent: 28 October 2023 19:02
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>; Greg Thain <gthain@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Update classad and STARTD_JOB_ATTRS
Greg.
Sorry, I wrote condor_q but obviously it's condor_status.
Thanks
David
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Dudu Handelman <duduhandelman@xxxxxxxxxxx>
Sent: Thursday, October 26, 2023 12:25:14 PM
To: Greg Thain <gthain@xxxxxxxxxxx>; HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Update classad and STARTD_JOB_ATTRS
Thanks Greg.
When using chirp to update a ClassAd attribute, the .job.ad file is not updated with the new value.
I have tried setting STARTD_CRON_AUTOPUBLISH = If_Changed,
but the attribute keeps its original value when looking at the slot with condor_q.
Maybe I need another startd cron knob?
Thanks a million
David
From: Greg Thain <gthain@xxxxxxxxxxx>
Sent: Wednesday, October 25, 2023 11:39:51 PM
To: Dudu Handelman <duduhandelman@xxxxxxxxxxx>; HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Update classad and STARTD_JOB_ATTRS
On 10/25/23 11:34, Dudu Handelman wrote:
> Thanks Greg.
> We all love knobs :-)
> For some reason it's not copying the chirp changes. Tomorrow I will
> verify that chirp is writing to the job ad file.
Ah -- just to be clear, STARTD_JOB_ATTRS copies the attributes as they exist at job start time, and doesn't update them subsequently, even if chirp updates those same attributes in the copy of the job ad in the schedd. If you want to change attributes in the startd ad dynamically, you'll need startd cron.
-greg
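For illustration only, a startd cron job along these lines could be a small script that prints `Attribute = value` lines on stdout, which the startd then merges into the machine ad. The Python sketch below assumes the running job touches a marker file after each checkpoint; the path `/var/tmp/last_checkpoint` and the function name are hypothetical, and a machine with several slots would need one marker per slot. The script would be registered through the STARTD_CRON_JOBLIST and matching STARTD_CRON_<JobName>_EXECUTABLE / _PERIOD knobs; check the manual for the exact settings on your version.

```python
import os

# Hypothetical marker file that the job touches after writing a checkpoint.
MARKER = "/var/tmp/last_checkpoint"


def checkpoint_ad_lines(marker=MARKER):
    """Return the ClassAd lines to publish, or nothing if no checkpoint exists."""
    try:
        mtime = int(os.path.getmtime(marker))
    except OSError:
        return []  # marker missing or unreadable: publish no attribute
    return [f"Checkpoint = {mtime}"]


if __name__ == "__main__":
    # Startd cron reads attribute assignments from the script's stdout.
    for line in checkpoint_ad_lines():
        print(line)
```

With the attribute in the slot ad, it should show up in condor_status output and be usable in policy expressions, though that part is untested here.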