John (TJ) Knoeller wrote:
If you are running HTCondor 7.9.5 or later, you can run:
    condor_q -better-analyze -reverse -machine <slotname> <jobid>
If not, run condor_status and check the load averages. If idle nodes
show a load of 1.0 or higher, check which processes are running on
those nodes and eating CPU. I recently had to go through my pool and
disable the avahi-daemon process, which has a tendency to hang and pin
a CPU core. Killing the daemons freed up about half the nodes in my
pool.
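
If the pool is too big to eyeball, the load check can be scripted. The following is only a rough sketch, not something from the post above: the function names are made up, it assumes the Name, State and LoadAvg machine attributes printed through condor_status -format, and it treats any slot whose State is not "Claimed" as idle, which may not suit every pool's policy.

```python
import subprocess


def find_busy_idle_slots(status_text, threshold=1.0):
    """Return (slot_name, load) pairs for slots that are not Claimed
    but whose load average is at or above the threshold.

    Expects one slot per line in the form: <Name> <State> <LoadAvg>
    """
    hits = []
    for line in status_text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        name, state, load = fields
        try:
            load_value = float(load)
        except ValueError:
            continue  # LoadAvg was missing or undefined for this slot
        # Assumption: "not Claimed" is a good enough stand-in for "idle".
        if state != "Claimed" and load_value >= threshold:
            hits.append((name, load_value))
    return hits


def query_pool():
    """Ask condor_status for name, state and load average of every slot.

    Hypothetical helper; needs condor_status on the PATH.
    """
    cmd = [
        "condor_status",
        "-format", "%s ", "Name",
        "-format", "%s ", "State",
        "-format", "%f\n", "LoadAvg",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

Calling find_busy_idle_slots(query_pool()) on a submit or central manager host should give a short list of nodes worth logging into to see what is eating the CPU (top or ps will show whether it is avahi-daemon or something else).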