
Re: [HTCondor-users] PSLOT preemption - state change Claimed/Busy -> Preempting/Killing - Skipping "Preempting/Vacating"



Hi Joachim,

What are WANT_VACATE and KILL set to on the EP?

-Cole Bollig

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Beyer, Christoph <christoph.beyer@xxxxxxx>
Sent: Wednesday, March 25, 2026 12:40 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] PSLOT preemption - state change Claimed/Busy -> Preempting/Killing - Skipping "Preempting/Vacating"
 
Agreed - MachineMaxVacateTime sounds like exactly what you want ...
 

--
Christoph Beyer
DESY Hamburg
IT-Department

Notkestr. 85
Building 02b, Room 009
22607 Hamburg

phone:+49-(0)40-8998-2317
mail: christoph.beyer@xxxxxxx
 

From: Joachim <jmeyer@xxxxxxxxxxxxxxxxxx>
To: htcondor-users <htcondor-users@xxxxxxxxxxx>
Date: Wednesday, March 25, 2026 18:29 CET
Subject: Re: [HTCondor-users] PSLOT preemption - state change Claimed/Busy -> Preempting/Killing - Skipping "Preempting/Vacating"

Hey,

That seemed like an interesting suggestion :)

I have now also added the following (with test timeouts, obviously):

job_max_vacate_time = 120
kill_sig = SIGTERM
kill_sig_timeout = 120
want_graceful_removal = true

But this also doesn't seem to fix it. :/

My understanding is that MachineMaxVacateTime (in conjunction with JobMaxVacateTime) should define the interval between the SIGTERM and the SIGKILL.

I'm open to trying any other suggestions!

Best,
- Joachim

 

On 25.03.26 at 17:47, Beyer, Christoph wrote:
Hi,
 
I think you need to put a signal request into the job ClassAd:
 
kill_sig = < ... > (e.g. SIGTSTP)
 
A job with this setup will receive the signal you configured when it is preempted, followed by a SIGKILL. I fear the time between the two signals is short, but you will need to test that.
 
Also, I have no idea whether you can alter the time between the two signals; the knob to do so may be missing ...
 
Best
christoph 
 
 
 

 

From: "Joachim Meyer" <jmeyer@xxxxxxxxxxxxxxxxxx>
To: "htcondor-users" <htcondor-users@xxxxxxxxxxx>
Sent: Wednesday, March 25, 2026 16:35:52
Subject: Re: [HTCondor-users] PSLOT preemption - state change Claimed/Busy -> Preempting/Killing - Skipping "Preempting/Vacating"
 

Hi,

thanks for the response!

I did add "(time() - Target.EnteredCurrentStatus) >= 3000" as a 50-minute delay, so that already-running jobs can keep running for 50 minutes after a higher-priority job comes along.

However, the running job is not informed about its situation, i.e. it has not received a SIGTERM or anything else yet and thus doesn't know it's about to be killed.
I'd love to give those to-be-killed jobs the 10-minute vacate time to potentially write out a checkpoint or similar. At least that's my understanding of what the Max(Machine)VacateTime is actually intended for:
first send a SIGTERM, wait for the vacate time, and then send a SIGKILL.
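
As an illustration of what a job could do with such a vacate window, here is a minimal Python sketch of a job that traps SIGTERM and writes a checkpoint before exiting. The class name GracefulJob, the checkpoint file layout, and the placeholder work loop are all hypothetical and not taken from the jobs discussed in this thread; it only assumes that SIGTERM arrives first and SIGKILL follows after the vacate time.

```python
import json
import os
import signal


class GracefulJob:
    """Toy job that checkpoints once a soft-kill signal (SIGTERM) was seen."""

    def __init__(self, checkpoint_path):
        self.checkpoint_path = checkpoint_path  # hypothetical file name
        self.stop_requested = False
        self.step = 0

    def on_sigterm(self, signum, frame):
        # Keep the handler tiny: only record that a stop was requested.
        self.stop_requested = True

    def install_handler(self):
        # Must be called from the main thread of the job.
        signal.signal(signal.SIGTERM, self.on_sigterm)

    def write_checkpoint(self):
        # Write to a temporary file, then rename, so that a SIGKILL in the
        # middle of the write never leaves a truncated checkpoint behind.
        tmp_path = self.checkpoint_path + ".tmp"
        with open(tmp_path, "w") as fh:
            json.dump({"step": self.step}, fh)
        os.replace(tmp_path, self.checkpoint_path)

    def run(self, total_steps):
        # Work until done or until SIGTERM was seen, then checkpoint.
        while self.step < total_steps and not self.stop_requested:
            self.step += 1  # placeholder for one unit of real work
        self.write_checkpoint()
        return self.stop_requested
```

A job would call install_handler() once at startup and then run(); the checkpoint only helps if it can be written within the vacate time that is actually granted.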

But that only happens in the Preempting/Vacating activity. In our case, the slot goes directly to Preempting/Killing, and I don't quite understand why it skips the Vacating activity.
Am I missing something needed to make the vacate time take effect?

Cheers,
- Joachim

On 25.03.26 at 15:04, Beyer, Christoph wrote:
Hi,
 
I think you should rather fold the delay into PREEMPTION_REQUIREMENTS, because as soon as the requirement is fulfilled it will preempt. We do not use preemption actively, hence this is just a wild guess ...
 
 
Best
christoph
 

 

From: "Joachim Meyer" <jmeyer@xxxxxxxxxxxxxxxxxx>
To: "HTCondor-Users Mail List" <htcondor-users@xxxxxxxxxxx>
Sent: Wednesday, March 25, 2026 14:55:56
Subject: [HTCondor-users] PSLOT preemption - state change Claimed/Busy -> Preempting/Killing - Skipping "Preempting/Vacating"
 

Hi all,

we are trying to implement preemption for a small subset of our nodes.
All our nodes have a single partitionable slot - most with GPUs (we actually only care about GPU nodes for preemption).
We are on 24.12 across all EPs, APs and the negotiator.

The setup generally seems to work with the following configuration.
For the negotiator:
ALLOW_PSLOT_PREEMPTION = True
PREEMPTION_REQUIREMENTS = Target.AcctGroup =?= "prio" && !regexp("prio\..+@", My.AccountingGroup) && (time() - Target.EnteredCurrentStatus) >= 3000 && My.Machine =?= "the_special_machine"

On the EP:
RANK = 0
MachineMaxVacateTime = 60*10
MAXJOBRETIREMENTTIME = 0
START = ... && (Target.MaxJobRetirementTime == 0 || AcctGroup =?= "prio")

So we basically get a guaranteed access delay of 1 hour for the group "prio" to their machine.
That's at least the idea.
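
The 1-hour figure is simply the sum of the two configured times. A small Python sketch of that arithmetic, assuming the preemption delay and the full vacate window are the only contributions (negotiation cycle time and similar overheads are ignored here, and the function name is purely illustrative):

```python
# Values taken from the configuration above.
PREEMPTION_DELAY_S = 3000            # (time() - Target.EnteredCurrentStatus) >= 3000
MACHINE_MAX_VACATE_TIME_S = 60 * 10  # MachineMaxVacateTime


def worst_case_access_delay(preemption_delay_s, vacate_time_s):
    """Upper bound in seconds: wait out the delay, then the whole vacate window."""
    return preemption_delay_s + vacate_time_s


delay_s = worst_case_access_delay(PREEMPTION_DELAY_S, MACHINE_MAX_VACATE_TIME_S)
print(f"{delay_s} s = {delay_s / 60:.0f} min")
```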

The 3000s maximum in the queue is working as I intended. However, the dynamic slots get killed immediately once the preemption decision has been made.
There is no vacate time at all, even though MachineMaxVacateTime is set to 60*10.
When jobs are put on hold with condor_hold, they do get a SIGTERM first and then a SIGKILL after a timeout. From what I understand, I would expect preemption to behave similarly.

We deliberately set MAXJOBRETIREMENTTIME to 0 because, as I understand it, we would otherwise not get a guaranteed start after 1 hour (a new job could always sneak in with the PSLOT preemption).
But I at least want the jobs to have their vacate time between the SIGTERM and the SIGKILL.

The StartLog from the EP shows:

 

03/25/26 12:04:10 slot1: Schedd addr = <10.143.248.61:9618?addrs=10.143.248.61-9618&alias=conduit2&noUDP&sock=schedd_2302500_1092>
03/25/26 12:04:10 slot1: Alive interval = 300
03/25/26 12:04:10 slot1: Schedd sending 5 preempting claims.
03/25/26 12:04:10 slot1_2: Canceled ClaimLease timer (1859)
03/25/26 12:04:10 slot1_2: Changing state and activity: Claimed/Busy -> Preempting/Killing
03/25/26 12:04:10 ResMgr  update_needed(0x2) -> 0x2 queuing timer
03/25/26 12:04:10 slot1_2: unbind DevIds for slot1.2 before : GPUs:{GPU-694f8794=1_7, GPU-937c3d9c=1_3, GPU-de2e358d=1_4, GPU-4353d652=1_5, GPU-a9035eaf=1_6, GPU-036117e2=1_7, GPU-7e6bcf8a=1_7, GPU-db1fcb33=1_7, }
03/25/26 12:04:10 slot1_2: unbind DevIds for slot1.2 after : GPUs:{GPU-694f8794=1_7, GPU-937c3d9c=1_3, GPU-de2e358d=1_4, GPU-4353d652=1_5, GPU-a9035eaf=1_6, GPU-036117e2=1_7, GPU-7e6bcf8a=1_7, GPU-db1fcb33=1_7, }
03/25/26 12:04:10 slot1_3: Canceled ClaimLease timer (1876)
03/25/26 12:04:10 slot1_3: Changing state and activity: Claimed/Busy -> Preempting/Killing
03/25/26 12:04:10 ResMgr  update_needed(0x2) -> 0x2 timer already queued
03/25/26 12:04:10 slot1_3: unbind DevIds for slot1.3 before : GPUs:{GPU-694f8794=1_7, GPU-937c3d9c=1_3, GPU-de2e358d=1_4, GPU-4353d652=1_5, GPU-a9035eaf=1_6, GPU-036117e2=1_7, GPU-7e6bcf8a=1_7, GPU-db1fcb33=1_7, }
03/25/26 12:04:10 slot1_3: ubind DevIds for slot1.3 unbind GPU-937c3d9c 1 OK
03/25/26 12:04:10 slot1_3: unbind DevIds for slot1.3 after : GPUs:{GPU-694f8794=1_7, GPU-937c3d9c=1, GPU-de2e358d=1_4, GPU-4353d652=1_5, GPU-a9035eaf=1_6, GPU-036117e2=1_7, GPU-7e6bcf8a=1_7, GPU-db1fcb33=1_7, }
03/25/26 12:04:10 slot1_4: Canceled ClaimLease timer (1878)
03/25/26 12:04:10 slot1_4: Changing state and activity: Claimed/Busy -> Preempting/Killing
03/25/26 12:04:10 ResMgr  update_needed(0x2) -> 0x2 timer already queued
03/25/26 12:04:10 slot1_4: unbind DevIds for slot1.4 before : GPUs:{GPU-694f8794=1_7, GPU-937c3d9c=1, GPU-de2e358d=1_4, GPU-4353d652=1_5, GPU-a9035eaf=1_6, GPU-036117e2=1_7, GPU-7e6bcf8a=1_7, GPU-db1fcb33=1_7, }
03/25/26 12:04:10 slot1_4: ubind DevIds for slot1.4 unbind GPU-de2e358d 1 OK
03/25/26 12:04:10 slot1_4: unbind DevIds for slot1.4 after : GPUs:{GPU-694f8794=1_7, GPU-937c3d9c=1, GPU-de2e358d=1, GPU-4353d652=1_5, GPU-a9035eaf=1_6, GPU-036117e2=1_7, GPU-7e6bcf8a=1_7, GPU-db1fcb33=1_7, }
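
To check many slots at once, the affected transitions can be pulled out of a StartLog with a small script. The following Python sketch assumes exactly the "Changing state and activity" line format shown in the excerpt above; other transition line formats are not handled, and the function name find_hard_kills is made up for this example.

```python
import re

# Matches e.g.:
# 03/25/26 12:04:10 slot1_2: Changing state and activity: Claimed/Busy -> Preempting/Killing
TRANSITION = re.compile(
    r"^(?P<stamp>\S+ \S+) (?P<slot>slot[\w.]+): "
    r"Changing state and activity: "
    r"(?P<old>\S+) -> (?P<new>\S+)\s*$"
)


def find_hard_kills(lines):
    """Return (timestamp, slot) for each Claimed/Busy -> Preempting/Killing line."""
    hits = []
    for line in lines:
        match = TRANSITION.match(line.strip())
        if (
            match
            and match["old"] == "Claimed/Busy"
            and match["new"] == "Preempting/Killing"
        ):
            hits.append((match["stamp"], match["slot"]))
    return hits
```

Feeding it the lines of a StartLog, for example via open(path).readlines() with a site-specific path, returns one entry per dynamic slot that skipped Preempting/Vacating.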

Can anyone advise whether it is possible with pslot preemption to implement a policy that gives a guaranteed maximum access delay for the "prio" group / the owner of a machine?
And more specifically, is there anything I am missing that would make jobs get their vacate time when being preempted?

Thanks!
- Joachim Meyer

--

Joachim Meyer
HPC Coordination & Support

Universität des Saarlandes FR Informatik | HPC

Postal address: Postfach 15 11 50 | 66041 Saarbrücken

Visitor address: Campus E1 3 | Room 4.03, 66123 Saarbrücken

T: +49 681 302-57522 jmeyer@xxxxxxxxxxxxxxxxxx www.uni-saarland.de


_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe

The archives can be found at: https://www-auth.cs.wisc.edu/lists/htcondor-users/
