[Condor-users] condor_q -global stressing our schedulers?


Date: Fri, 4 Feb 2005 14:29:58 -0500
From: "Ian Chesal" <ICHESAL@xxxxxxxxxx>
Subject: [Condor-users] condor_q -global stressing our schedulers?
I'm getting a lot of timeouts from our schedd machines when I query them
directly with condor_q, and I have a sneaking suspicion that it's due to a
fair number of our users calling condor_q -global in scripts
that parse the output to display overall system state.

Is it possible for condor_q -global to stress schedulers to the point
where calls for queue status start getting dropped? Even a plain
"condor_q" on the larger schedd machines (machines with 1000+ jobs
queued) is returning "failed to fetch ads" messages. This is all with
6.7.3 on a mix of Windows and Linux machines.

Is there a recommended (non-stressful) approach I can guide my users
towards so they can see who has what running, queued and held in the
system and what the JobPrio of those jobs is?

- Ian


--
Ian R. Chesal <ichesal@xxxxxxxxxx>
Senior Software Engineer

Altera Corporation
Toronto Technology Center
Tel: (416) 926-8300



