Re: [Condor-users] having multiple schedulers and collectors



Is there a way to tell whether the schedd is backed up? How can I see
its real status?

It seems I get this problem whenever I submit many jobs (even if they
are not running).


On Wed, May 26, 2010 at 2:16 PM, Matthew Farrellee <matt@xxxxxxxxxx> wrote:
> On 05/25/2010 09:05 PM, Mag Gam wrote:
>> Previously, I had 1 scheduler and 1 collector for 2000 nodes (each
>> with 16 cores), giving me 32000 slots. Everything was functioning fine;
>> however, I used to get a lot of '???????' when I did condor_q -run.
>>
>> Recently, I added an extra scheduler and a collector to complement my
>> previous scheduler. I noticed the '?????' are completely gone! I was
>> wondering if there is a relation between this problem and having an
>> extra collector and scheduler in my pool.
>
> It's entirely possible that the "???"s were because your Schedd was backed up. It could have marked the jobs as running but info about the Startd where the job was running had not made its way back yet.
>
> Best,
>
>
> matt
>