Hello,
I’m running an HTCondor pool in which all machines are available to run
tasks at all times, i.e., they do not become unavailable when the
keyboard or mouse is moved on those machines. I’m interested to know
what the best practice is for setting up resource slots on these
machines. By default, HTCondor seems to create one slot per core and
evenly divide the machine’s memory among the slots. For example, one of
the machines in the pool has 8 GB of RAM and 2 cores, and HTCondor
created two slots on that machine, each with 1 core and 4 GB of RAM.
My worry is that if both of these slots are fully utilized by
user-submitted jobs, no cores or memory are left free for the
operating system itself, the HTCondor daemons, etc. What is the standard
practice here? Do HTCondor pool administrators typically customize the
slot allocation on worker machines to leave 1 or 2 cores and some
fraction of RAM free for the OS itself, or is HTCondor’s default
behavior of evenly dividing all of the machine’s resources among the
job slots considered reasonable?
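
For concreteness, the kind of customization I have in mind for the
8 GB, 2-core machine is sketched below. This is untested, and I am
assuming that NUM_CPUS and RESERVED_MEMORY (in MB) are the right knobs
for this; the specific values are only illustrative.

```
# Hypothetical local config for the 2-core, 8 GB worker (values are guesses).
# Advertise only 1 of the 2 cores to HTCondor, leaving 1 for the OS and daemons.
NUM_CPUS = 1
# Hold back 1024 MB of RAM from what HTCondor advertises to jobs.
RESERVED_MEMORY = 1024
```

If I understand correctly, this would yield a single slot with 1 core
and roughly 7 GB of RAM.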