RE: [Condor-users] schedd problems?
- Date: Thu, 24 Feb 2005 11:34:31 -0500
- From: "Ian Chesal" <ICHESAL@xxxxxxxxxx>
- Subject: RE: [Condor-users] schedd problems?
> Hi,
> I've got a strange problem (aren't they all?), and could use
> guidance on how to figure out what's wrong. I have a submit
> machine that can no longer tell what jobs are in its own
> queue. I upgraded condor to 6.7.3 (from 6.6.7) on Feb 10;
> yesterday (Feb 23), it was noticed that condor_q would return:
>
> -- Failed to fetch ads from: <129.89.201.232:38456> : hydra.phys.uwm.edu
>
> SchedLog doesn't seem to show anything interesting...
>
> How can I debug what's failing?
Hi Paul,
We've seen similar messages when a single schedd instance has LOTS of
ports open in the 6.7.3 builds. Can you check the number of open network
connections on the machine? Is the schedd currently preempting a lot of
startd machines in your cluster?
- Ian
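
One possible way to answer the question about open network connections, sketched here under the assumption that the submit machine runs Linux and exposes /proc/net/tcp. This is not a Condor tool; the function names (count_tcp_states, read_tcp_tables) are made up for illustration. It tallies TCP sockets by state for the whole machine, not for the schedd alone, so an unusually large total is only a hint to look further.

```python
from collections import Counter

# TCP state codes as they appear (in hex) in the "st" column of
# Linux's /proc/net/tcp and /proc/net/tcp6.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state from the text of /proc/net/tcp."""
    counts = Counter()
    # The first line is a column header, so skip it.
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue  # blank or malformed line
        # Fields: sl, local_address, rem_address, st, ...
        counts[TCP_STATES.get(fields[3].upper(), "UNKNOWN")] += 1
    return counts


def read_tcp_tables(paths=("/proc/net/tcp", "/proc/net/tcp6")):
    """Sum the per-state counts over the IPv4 and IPv6 tables."""
    total = Counter()
    for path in paths:
        try:
            with open(path) as f:
                total += count_tcp_states(f.read())
        except OSError:
            pass  # table not present on this system
    return total


if __name__ == "__main__":
    counts = read_tcp_tables()
    for state, n in counts.most_common():
        print(f"{state:12} {n}")
    print("total", sum(counts.values()))
```

Running it a few times while condor_q is failing and comparing the totals against a healthy submit machine would show whether the connection count is out of the ordinary.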