
Re: [HTCondor-users] Arbitrarily limit the number of dynamic slots



On 1/18/2023 9:06 AM, Charles Goyard wrote:
Hi all,


I just started deploying dynamic slots on my pool (Debian, HTCondor 10),
and it works well.  However, I would like to set an arbitrary limit on
the number of slots that can be created on a single machine.

This is because our sweet spot is somewhere around 6 to 8 jobs running
in parallel.

I could use predefined, static slots to the same effect, but I was
wondering if there is a way to keep dynamic slots and state: "this
machine can be cut into 6 at most".

My setup for dynamic slots is very simple, as it is only:

use feature:PartitionableSlot

Hi Charles,

To achieve this policy, you could leverage the fact that every slot has an attribute "TotalSlots", which is the number of slots being advertised by that startd.  So perhaps just have a startd START expression look at this attribute, keeping in mind that the partitionable slot itself counts as one: to have at most 6 jobs running, you want TotalSlots kept at 7 or below.

So you could drop the following into your config (and do a condor_reconfig or restart of the startd):
# Set up the Execution Point (EP, i.e. the startd) to use dynamic slots, but
# never run more than 6 jobs (no matter how the EP is carved up).
use feature:PartitionableSlot
START = $(START) && ( TotalSlots <= 7 ) 

How did I come up with the above?  I just did a "condor_status -l" on a partitionable slot machine and looked at the piles of attributes I could reference :).  Instead of a comparison against a fixed number, you could make the limit a function of any attribute in the slot ad (number of cores, type of processors, amount of RAM, whatever), and/or you could customize it for each server if you wanted.  Documentation about the "START" config knob is at:

   https://htcondor.readthedocs.io/en/latest/admin-manual/policy-configuration.html#the-start-expression

but basically the START config knob is a classad expression that must evaluate to True in order for the startd to allow a new job activation.
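
As a rough, untested sketch of the "function of a slot attribute" idea, the limit could be pulled out into a macro, or tied to the core count.  MAX_JOBS_PER_EP is a made-up macro name for this example, not a built-in HTCondor knob, and the "one job per 4 cores" ratio is just an arbitrary illustration:

```
use feature:PartitionableSlot

# Hypothetical helper macro (not a built-in knob): the most jobs this EP should run.
MAX_JOBS_PER_EP = 6

# TotalSlots counts the partitionable slot itself, hence the "+ 1".
START = $(START) && ( TotalSlots <= $(MAX_JOBS_PER_EP) + 1 )

# Alternative (assumption: one job per 4 cores suits the workload).
# TotalCpus is the machine's core count as advertised in the slot ad.
# START = $(START) && ( TotalSlots <= (TotalCpus / 4) + 1 )
```

Putting a different MAX_JOBS_PER_EP value in each server's local config file would be one way to customize the limit per machine.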


Hope the above helps,
regards,
Todd T

--
Todd Tannenbaum <tannenba@xxxxxxxxxxx>  University of Wisconsin-Madison
Center for High Throughput Computing    Department of Computer Sciences
Calendar: https://tinyurl.com/yd55mtgd  1210 W. Dayton St. Rm #4257