Re: [HTCondor-users] Half of the slots are remaining 'owner' on all machines



John (TJ) Knoeller wrote:
> If you are running HTCondor 7.9.5 or later, you can run
> 
> condor_q -better-analyze -reverse -machine <slotname> <jobid>

And if not, run condor_status and check the load averages. If nodes that
should be idle show loads of 1.0 or higher, check which processes are
running on those nodes and eating CPU; with a typical desktop-style START
policy, that non-Condor load is enough to keep a slot in the Owner state.
I recently had to go through my pool and disable the avahi-daemon process.
The Avahi daemon has a tendency to hang and pin a CPU core. Killing the
daemons freed up about half the nodes in my pool.
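
For a larger pool, the load check can be scripted rather than read off
the condor_status table by eye. What follows is only a rough sketch, not
something from my own setup: it assumes the standard machine ClassAd
attributes Name, State and LoadAvg, uses the older -format option so it
should also work before 7.9.5, and the helper names (busy_owner_slots,
query_pool) and the 1.0 threshold are made up for illustration.

```python
import subprocess


def busy_owner_slots(status_text, threshold=1.0):
    """Return (slot name, load) pairs for Owner-state slots at or above threshold.

    Expects one slot per line in the form "<Name> <State> <LoadAvg>".
    Lines that do not match that shape are skipped.
    """
    busy = []
    for line in status_text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        name, state, load = parts
        try:
            load_value = float(load)
        except ValueError:
            continue
        if state == "Owner" and load_value >= threshold:
            busy.append((name, load_value))
    return busy


def query_pool():
    """Run condor_status and return its output as text (needs HTCondor installed)."""
    cmd = [
        "condor_status",
        "-format", "%s ", "Name",
        "-format", "%s ", "State",
        "-format", r"%f\n", "LoadAvg",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

On a submit or execute node, something like
`for name, load in busy_owner_slots(query_pool()): print(name, load)`
would list the slots worth logging into to see what is eating the CPU.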

-- 
Rich Pieri <ratinox@xxxxxxx>
MIT Laboratory for Nuclear Science