We're running a fairly small condor cluster (16 dual-core CPUs, so
32 nodes) on machines that are also general-purpose compute servers.
I think we're running into a problem where condor marks a CPU as
busy whenever users run other (non-condor) processes on the machines.
The cause seems to be that some users are bypassing condor and
running their jobs directly on the machines. This sets CpuIsBusy =
TRUE (as shown by, e.g., condor_status -l s11) and prevents these
machines from getting matched to jobs. Meanwhile, condor_q
misleadingly reports these machines as idle.
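
To make the mechanism concrete, here is a minimal sketch of how I believe
the busy flag is derived. It assumes the stock policy, where the non-condor
load is LoadAvg minus CondorLoadAvg and is compared against a "high load"
threshold. The 0.5 and 0.3 values and all function names below are my
assumptions for illustration, not values read from our configuration.

```python
# Sketch of the busy/idle test as I understand it (not condor's own code).
# HIGH_LOAD and BACKGROUND_LOAD are assumed thresholds for illustration.
HIGH_LOAD = 0.5
BACKGROUND_LOAD = 0.3


def non_condor_load(load_avg: float, condor_load_avg: float) -> float:
    """Load on the machine that is NOT caused by condor jobs."""
    return load_avg - condor_load_avg


def cpu_busy(load_avg: float, condor_load_avg: float,
             high_load: float = HIGH_LOAD) -> bool:
    """True when the non-condor load reaches the high-load threshold."""
    return non_condor_load(load_avg, condor_load_avg) >= high_load


def cpu_idle(load_avg: float, condor_load_avg: float,
             background_load: float = BACKGROUND_LOAD) -> bool:
    """True when the non-condor load is at or below the background level."""
    return non_condor_load(load_avg, condor_load_avg) <= background_load


if __name__ == "__main__":
    # One user process saturating a core, no condor job running.
    print(cpu_busy(1.0, 0.0))
    # Same load, but with a hypothetically raised threshold.
    print(cpu_busy(1.0, 0.0, high_load=1.5))
```

If this picture is right, a single directly-run user process is enough to
cross the threshold, and raising the threshold would stop the flag from
tripping for that load.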
Is there a way around this? I've read the Preemption and scheduling
sections of the manual, and they all appear to deal with how to handle
scheduling WITHIN condor. Is there a way to make condor's threshold for
flagging a CPU as busy significantly higher? How about raising the
priority at which ALL condor jobs run?