[Condor-users] timeout reading buffer
- Date: Tue, 28 Feb 2006 16:00:53 -0500
- From: Preston Smith <psmith@xxxxxxxxxx>
- Subject: [Condor-users] timeout reading buffer
Just as our Condor pools reach about 100% capacity, one of the busiest
schedds basically stops running jobs; almost all of its jobs drop back to idle.
The negotiator logs:
2/28 15:44:45 Got NO_MORE_JOBS; done negotiating
2/28 15:44:45 Negotiating with user@xxxxxxxxxxxxxxx at <128.211.128.11:59684>
2/28 15:45:15 condor_read(): timeout reading buffer.
2/28 15:45:15 Failed to get reply from schedd
2/28 15:45:15 Error: Ignoring schedd for this cycle
condor_q on that schedd shows:
3342 jobs; 3330 idle, 10 running, 2 held
ShadowLog on 128.211.128.11 shows:
2/28 15:48:08 (21939.0) (32200): condor_read(): timeout reading buffer.
2/28 15:48:08 (21939.0) (32200): AUTHENTICATE: handshake failed!
2/28 15:48:08 (21939.0) (32200): Authentication Error
AUTHENTICATE:1002:Failure performing handshake
Any suggestions on troubleshooting these timeouts?
We're running Condor 6.6.10.
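
For anyone comparing similar logs, here is a minimal Python sketch that measures how long a daemon waited before each "timeout reading buffer" entry. It uses the negotiator excerpt above as sample data, with the wrapped line joined. The helper names parse and timeout_gaps are hypothetical, and the year 2006 is an assumption taken from the Date header, since the log timestamps omit it.

```python
import re
from datetime import datetime

# Negotiator log excerpt copied from this post (the wrapped line is joined).
NEGOTIATOR_LOG = """\
2/28 15:44:45 Got NO_MORE_JOBS; done negotiating
2/28 15:44:45 Negotiating with user@xxxxxxxxxxxxxxx at <128.211.128.11:59684>
2/28 15:45:15 condor_read(): timeout reading buffer.
2/28 15:45:15 Failed to get reply from schedd
2/28 15:45:15 Error: Ignoring schedd for this cycle
"""

# Matches the "M/D HH:MM:SS message" layout seen in the excerpt.
STAMP = re.compile(r"^(\d{1,2}/\d{1,2} \d{2}:\d{2}:\d{2}) (.*)$")


def parse(log_text, year=2006):
    """Return (datetime, message) pairs; the year is assumed, not logged."""
    entries = []
    for line in log_text.splitlines():
        m = STAMP.match(line)
        if m:
            when = datetime.strptime(f"{year} {m.group(1)}", "%Y %m/%d %H:%M:%S")
            entries.append((when, m.group(2)))
    return entries


def timeout_gaps(entries):
    """For each timeout entry, report seconds elapsed since the previous entry."""
    gaps = []
    for (prev_when, prev_msg), (when, msg) in zip(entries, entries[1:]):
        if "timeout reading buffer" in msg:
            gaps.append(((when - prev_when).total_seconds(), prev_msg))
    return gaps


gaps = timeout_gaps(parse(NEGOTIATOR_LOG))
for seconds, before in gaps:
    print(f"{seconds:.0f}s wait after: {before}")
```

On the excerpt above this reports a 30 second wait between the "Negotiating with" line and the timeout. Running the same script over a longer stretch of the log would show whether every timeout follows the same wait.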
-Preston
--
Preston Smith <psmith@xxxxxxxxxx>
Systems Research Engineer
Rosen Center for Advanced Computing, Purdue University