
Re: [HTCondor-users] Prioritizing jobs using the RANK expression on the execute point (EP) side



Hi Arshad,

The Condor negotiator only compares one job/autocluster against all available StartDs and makes a match decision before continuing with the next job/autocluster. It does not compare all jobs to one, or even all, available StartDs. So during regular negotiation the StartDs cannot really express a preference between jobs, since they are only ever offered one at a time.
By default, the StartD RANK is just folded into NEGOTIATOR_PRE_JOB_RANK, which decides which machine the job currently being matched gets assigned to.
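
For reference, in recent HTCondor versions the default NEGOTIATOR_PRE_JOB_RANK already folds the machine's RANK in as My.Rank, roughly like below (check with condor_config_val NEGOTIATOR_PRE_JOB_RANK on your negotiator; the exact constants vary by version):

    # approximate default: the machine's own RANK dominates, then empty slots,
    # then a term that packs jobs onto fewer CPUs / less memory
    NEGOTIATOR_PRE_JOB_RANK = (10000000 * My.Rank) + \
                              (1000000 * (RemoteOwner =?= UNDEFINED)) - \
                              (100000 * Cpus) - Memory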

So even if you get the StartD RANK evaluated per job instead of per autocluster, it won't do what you want.

You will either have to switch on PREEMPTION (in which case the StartD gets to decide between the current and the candidate job) or have other machines that the non-preferred jobs actually prefer.
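
As a rough sketch of the first option (assuming the preferred group can be identified via the job's AcctGroup attribute; substitute whatever attribute you actually key on), the EP side would look something like:

    # EP (startd) side; "group_a" is just a placeholder for your preferred group
    RANK = 1000 * (TARGET.AcctGroup =?= "group_a")

With preemption allowed, a candidate job that the slot ranks higher than the currently running one can then displace it; how gently that happens is governed by the usual knobs (MaxJobRetirementTime, the PREEMPT and WANT_VACATE expressions, and so on).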

Cheers,
Max

On 12. Sep 2025, at 17:36, Arshad Ahmad via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:

Hello HTCondor team,
We are testing scheduling behavior with a subset of execute nodes in our cluster. These nodes used to be dedicated to a single group, but now we want them to be preferred for that group while still remaining available to others. To do this, we are experimenting with the RANK expression.
We configured our test EP so that preferred jobs receive rank 1000, while non-preferred jobs receive rank 0. PREEMPTION_REQUIREMENTS is set to False.
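Concretely, the EP config is along these lines ("our_group" is a placeholder for the real group name, and the attribute we match on may differ):
    # EP (startd) side on the test node
    RANK = ifThenElse(TARGET.AcctGroup =?= "our_group", 1000, 0)
    # negotiator side
    PREEMPTION_REQUIREMENTS = False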
In the test, we first submitted 56 non-preferred jobs, each a 10-minute sleep job. Since these were 4-core jobs, 28 of them filled the node completely, leaving the other 28 waiting idle in the queue. After that, we submitted 56 preferred jobs. Our expectation was that when the first batch of non-preferred jobs finished, some of the preferred jobs would start, since they have a higher rank.
What actually happened was that more non-preferred jobs from the same autocluster started, and this continued until all non-preferred jobs had run. Only then did the preferred jobs begin running. From looking at the STARTD log, the rank evaluation itself appears to be working correctly (preferred jobs at 1000, non-preferred jobs at 0). But it seems that once the first negotiation happens, the schedd obtains a claim on the partitionable slot for the whole non-preferred autocluster. That claim does not get released or re-negotiated as jobs complete, so the preferred jobs don't get a chance to match until the non-preferred autocluster is drained.
Are we interpreting this behavior correctly? And is there a recommended configuration that would cause rank preferences to be re-evaluated as jobs finish, instead of holding the claim for the entire autocluster?
Thank you,
Arshad

