Re: [HTCondor-users] Trying to figure out how rank works when submitting
- Date: Fri, 12 Jun 2020 17:22:15 +0000
- From: Michael Pelletier <michael.v.pelletier@xxxxxxxxxxxx>
- Subject: Re: [HTCondor-users] Trying to figure out how rank works when submitting
Hi Greg, thanks for the clarification.
I'm trying to do something similar to Fabrice, but with large-memory and GPU-equipped machines instead of different countries. I'm having trouble getting my pre-job rank to steer matches as expected, and I'm not sure why.
To keep things simple, I'll focus on one goal: non-GPU jobs should consider GPU machines only as a last resort.
To that end, I have a clause in my pre-job rank to reduce the rank of all GPU-equipped machines:
( -10e6 * (!isUndefined(MY.TotalGPUs) && MY.TotalGPUs > 0) )
Since jobs which require a GPU will only match to machines which have a GPU, this expression simply sets a -10e6 baseline rank for all eligible machines for a GPU job, and thus makes no difference in the calculations when applied to GPU-required jobs.
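As a rough illustration of what that clause is meant to do, here is a minimal Python sketch that mirrors it outside of HTCondor. The machine names and the TotalGPUs value of 2 are hypothetical, the dictionaries only stand in for machine ads, and a missing key plays the role of UNDEFINED. It is not how the negotiator evaluates ClassAds, only a model of the intended arithmetic.

```python
# Hypothetical stand-ins for machine ads; only TotalGPUs matters for this clause.
def gpu_penalty(machine_ad):
    """Model of: -10e6 * (!isUndefined(MY.TotalGPUs) && MY.TotalGPUs > 0)."""
    total_gpus = machine_ad.get("TotalGPUs")  # None stands in for UNDEFINED
    has_gpu = total_gpus is not None and total_gpus > 0
    return -10e6 if has_gpu else 0.0

# 16 non-GPU machines and 4 GPU machines, as in the pool described here.
machines = (
    [{"Name": f"cpu{i:02d}"} for i in range(16)]
    + [{"Name": f"gpu{i:02d}", "TotalGPUs": 2} for i in range(4)]
)
penalties = {m["Name"]: gpu_penalty(m) for m in machines}

# A GPU job can only match GPU machines, so every eligible machine
# carries the same offset and the clause cannot change their order.
gpu_job_offsets = {gpu_penalty(m) for m in machines if m.get("TotalGPUs", 0) > 0}
```

In this model the non-GPU machines keep a contribution of zero, the GPU machines all drop by the same amount, and gpu_job_offsets contains a single value, which is why the clause should be neutral for GPU-required jobs.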
I am using partitionable slots, and I have claim_partitionable_leftovers set to true, through the 8.8 "use feature : PartitionableSlot(1)" config.
I have 16 non-GPU machines and 4 GPU machines.
When I submit a "sleep" job consisting of 20 procs:
Executable = /bin/sleep
Arguments = 5m
Queue 20
... instead of all the jobs ending up on non-GPU machines, some of them are matched to GPU machines.
My understanding from the manual is that only the machines tied for the top pre-job rank are then ordered by the job rank, and after that by the post-job rank expression.
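To make that understanding concrete, here is a small Python sketch of the ordering as I read it: a lexicographic comparison in which each later rank only breaks ties in the earlier one. The Candidate type, the best_match helper, and all of the rank values are hypothetical and exist only for illustration; this models my reading of the manual, not the negotiator's actual code.

```python
from typing import NamedTuple

class Candidate(NamedTuple):
    # Hypothetical record of one machine's three rank values for a given job.
    name: str
    pre_job_rank: float
    job_rank: float
    post_job_rank: float

def best_match(candidates):
    # Higher is better at every level; tuple comparison means the job rank
    # matters only among machines tied on pre-job rank, and the post-job
    # rank matters only among machines tied on both.
    return max(candidates, key=lambda c: (c.pre_job_rank, c.job_rank, c.post_job_rank))

candidates = [
    Candidate("gpu01", -10e6, 5.0, 0.0),  # penalized GPU machine, high job rank
    Candidate("cpu01", 0.0, 1.0, 0.0),
    Candidate("cpu02", 0.0, 1.0, 3.0),    # ties cpu01 until the post-job rank
]
winner = best_match(candidates)
```

Under this model the GPU machine loses despite its higher job rank, because it is already behind on the pre-job rank, and the two non-GPU machines are separated only by the post-job rank.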
I had thought this might have something to do with GROUP_QUOTA_ROUND_ROBIN_RATE, based on its description in the manual ("Setting GROUP_QUOTA_ROUND_ROBIN_RATE to a value that is small compared to the size of subsets of machines"), but setting the rate to 16, or to 12, didn't prevent my jobs from matching GPU machines.
I'd appreciate any insights.
Michael V Pelletier
Principal Engineer
Raytheon Technologies
Information Technology
Digital Transformation & Innovation