
Re: [HTCondor-users] Prioritizing jobs using the RANK expression on the execute point (EP) side



Hey Arshad,

I hit a similar issue a while back. Take a look at the CLAIM_WORKLIFE [1] configuration macro. Setting it to 0 expires the claim as soon as the job running under it exits, while the default is 20 minutes (so your test might actually have behaved the way you intended if you had submitted enough jobs to outlast the claim). As I understand it, the intent of claim reuse is to let many small jobs recycle the claimed slot without going through additional negotiation cycles, but I found it easier to show the people who purchased the hardware how their jobs get preferential treatment with claim reuse turned off.
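If you want to try it, the EP-side change is just the one knob below (a sketch; the value is in seconds, and 0 means a claim is only ever used for a single job):

    # Expire the claim as soon as the job running under it exits, so the
    # next job on that slot has to win a negotiation cycle again.
    # The default is 1200 seconds (20 minutes) of claim reuse.
    CLAIM_WORKLIFE = 0

If I remember right, a condor_reconfig on the EP is enough for this to apply to newly created claims.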

-Zach

Reference URLs:
1. https://htcondor.readthedocs.io/en/latest/admin-manual/configuration-macros.html#CLAIM_WORKLIFE

________________________________________
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Arshad Ahmad via HTCondor-users <htcondor-users@xxxxxxxxxxx>
Sent: Friday, September 12, 2025 8:36 AM
To: HTCondor-Users Mail List
Cc: Arshad Ahmad
Subject: [HTCondor-users] Prioritizing jobs using the RANK expression on the execute point (EP) side

Hello HTCondor team,
We are testing scheduling behavior on a subset of execute nodes in our cluster. These nodes used to be dedicated to a single group, but now we want them to be preferred for that group while still being available to others. To do this, we are experimenting with the startd RANK expression.
We configured our test EP so that preferred jobs receive a rank of 1000, while all other jobs receive a rank of 0. PREEMPTION_REQUIREMENTS is set to False.
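To illustrate, the EP-side configuration looks roughly like the sketch below (how we identify preferred jobs, by accounting group here, is simplified, and the group name is only a placeholder):

    # Startd RANK: prefer jobs from the formerly dedicated group;
    # everything else ranks 0 but can still match.
    RANK = ifThenElse(TARGET.AccountingGroup =?= "group_dedicated", 1000, 0)

    # Negotiator side: do not preempt running jobs.
    PREEMPTION_REQUIREMENTS = False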
In the test, we first submitted 56 non-preferred jobs, each a 10-minute sleep job. Since these were 4-core jobs, 28 of them filled the node completely, leaving the other 28 waiting idle in the queue. After that, we submitted 56 preferred jobs. Our expectation was that when the first batch of non-preferred jobs finished, some of the preferred jobs would start, since they have a higher rank.
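For reference, each test batch was submitted with something like the following (a sketch; the preferred batch differs only in whatever marks its jobs as preferred):

    # sleep.sub -- 56 ten-minute, 4-core sleep jobs
    executable   = /bin/sleep
    arguments    = 600
    request_cpus = 4
    queue 56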
What actually happened was that more non-preferred jobs from the same autocluster started, and this continued until all non-preferred jobs had run. Only then did the preferred jobs begin running. From looking at the startd log, the rank evaluation itself appears to be working correctly (preferred jobs at 1000, non-preferred jobs at 0). But it seems that once the first negotiation happens, the schedd obtains a claim on the partitionable slot for the whole non-preferred autocluster. That claim does not get released or re-negotiated as jobs complete, so the preferred jobs don't get a chance to match until the non-preferred autocluster is drained.
Are we interpreting this behavior correctly? And is there a recommended configuration that would cause rank preferences to be re-evaluated as jobs finish, instead of holding the claim for the entire autocluster?

Thank you,

Arshad