Re: [HTCondor-users] Arbitrarily limit the number of dynamic slots
- Date: Wed, 18 Jan 2023 15:22:30 -0600
- From: Todd Tannenbaum <tannenba@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] Arbitrarily limit the number of dynamic slots
On 1/18/2023 9:06 AM, Charles Goyard wrote:
Hi all,
I just started deploying dynamic slots on my pool (Debian, HTCondor 10),
and it works well. However, I would like to set an arbitrary limit on
the number of slots that can be created on a single machine.
This is because our sweet spot is somewhere around 6 to 8 jobs running
in parallel.
I could use predefined, static slots to the same effect, but I was
wondering if there is a way to keep dynamic slots and state: "this
machine can be cut in 6 at most".
My setup for dynamic slots is very simple, as it is only:
use feature:PartitionableSlot
Hi Charles,
To achieve the policy you describe, you could leverage the fact
that all slots have an attribute "TotalSlots", which is the number of
slots being advertised by that startd. So perhaps just use a startd
START expression that looks at this attribute, keeping in mind that the
partitionable slot itself counts as one, so to have at most 6 jobs
running you want TotalSlots kept at 7 or below.
So you could drop the following into your config (and do a
condor_reconfig or restart of the startd):
# Set up the Execution Point (EP, i.e. the startd) to use dynamic slots, but
# never run more than 6 jobs (no matter how the EP is carved up).
use feature:PartitionableSlot
START = $(START) && ( TotalSlots <= 7 )
How did I come up with the above? I just did a "condor_status -l"
on a partitionable slot machine and looked at the piles of
attributes I could reference :). Instead of comparing against a
fixed number, you could make the limit a function of any attribute
in the slot ad (number of cores, type of processors, amount of RAM,
whatever), and/or you could customize it for each and every server
if you wanted. Documentation about the "START" config knob is at:
https://htcondor.readthedocs.io/en/latest/admin-manual/policy-configuration.html#the-start-expression
but basically the START config knob is a ClassAd expression that
must evaluate to True in order for the startd to allow a new job
activation.
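As a rough sketch of the "function of slot attributes" and per-server
ideas, the limit could be pulled out into a config macro, or derived from
a machine attribute. MAX_JOBS_PER_EP is a hypothetical macro name chosen
for this example (it is not an HTCondor built-in), the one-job-per-4-cores
ratio is arbitrary, and the second variant assumes TotalCpus is advertised
in the slot ad (worth checking with "condor_status -l"). Both variants keep
the same "+ 1 for the partitionable slot" convention as the example above,
and neither has been tested on a live pool.

```
# Variant 1: keep the job limit in one macro so each server can override it
# in its own local config. MAX_JOBS_PER_EP is a made-up name for illustration.
use feature:PartitionableSlot
MAX_JOBS_PER_EP = 6
START = $(START) && ( TotalSlots <= $(MAX_JOBS_PER_EP) + 1 )

# Variant 2 (use instead of variant 1, not in addition to it): scale the limit
# with the machine size, here roughly one job per 4 cores. Assumes TotalCpus
# is present in the slot ad.
# START = $(START) && ( TotalSlots <= (TotalCpus / 4) + 1 )
```

With variant 1, a larger server would only need a different
MAX_JOBS_PER_EP value in its local config; the START expression itself
stays the same everywhere.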
Hope the above helps,
regards,
Todd T
--
Todd Tannenbaum <tannenba@xxxxxxxxxxx> University of Wisconsin-Madison
Center for High Throughput Computing Department of Computer Sciences
Calendar: https://tinyurl.com/yd55mtgd 1210 W. Dayton St. Rm #4257