
[HTCondor-users] Partitionable Slots



Howdy -

We have been using partitionable slots to run multi-core jobs for the
last few months.  We are set up with a single partitionable slot and
no static slots, divided by CPU.  Our users submit a mix of jobs,
using request_cpus to select the desired slot size.
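
For illustration, a two-core job would be submitted with a description
file along these lines (a minimal sketch; the executable name is a
placeholder, not one of our actual jobs):

universe     = vanilla
executable   = my_job.sh
request_cpus = 2
queue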

When first turned on, this works as expected.  A slot is created at
the exact size of each job, so that, for example, a two-core job is
matched to a two-core slot.  After a while, however, jobs begin to be
matched to slots that are too big.  For example, we see lots of
one-core jobs running in four-core slots.

How do we fix this so that jobs only run in slots of the appropriate size?

(Based on some previous discussions, we set CLAIM_WORKLIFE = 0 to
force claims to expire at the end of each job, so that the resources
are returned to the parent partitionable slot.  But that doesn't seem
to be happening.)

The relevant configuration is:

NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = cpus=100%
SLOT_TYPE_1_PARTITIONABLE = true
CLAIM_WORKLIFE = 0

Any suggestions?

Doug

P.S. We have a neat little display that shows slot size based on CPU:
http://condor.cse.nd.edu/condor_matrix.cgi