Re: [Condor-users] 6.9.2 startup error
- Date: Thu, 24 May 2007 15:21:07 +0200
- From: Horvátth Szabolcs <szabolcs@xxxxxxxxxxxxx>
- Subject: Re: [Condor-users] 6.9.2 startup error
Hi Dan,
Yes, gadget is the scheduler and the log was produced by that machine.
I took a look at the negotiator's log to see some trace of this
communication problem and I found this:
5/24 14:52:47 ---------- Started Negotiation Cycle ----------
5/24 14:52:47 Phase 1: Obtaining ads from collector ...
...
5/24 14:52:47 Negotiating with szabolcs@xxxxxxxxxxxxxxxxxxx at <192.168.0.50:3661>
5/24 14:52:47 0 seconds so far
5/24 14:52:47 condor_read(): recv() returned -1, errno = 10054, assuming failure reading 5 bytes from <192.168.0.50:3661>.
5/24 14:52:47 Failed to get reply from schedd
5/24 14:52:47 Error: Ignoring schedd for this cycle
5/24 14:52:47 ---------- Finished Negotiation Cycle ----------
I guess if the negotiator is negotiating with the machine at
192.168.0.50, then it must have managed to connect to it somehow.
So what might cause the failure while it waits for the reply?
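To see how often this happens and with which peers, the read failures can be pulled out of the negotiator's log with a small script. The following is only a minimal Python sketch, assuming the message format shown in the excerpt above; the function name find_read_failures and the lookup table are illustrative and not part of Condor. Errno 10054 is the Winsock code WSAECONNRESET, meaning the remote side reset the connection.

```python
import re

# Winsock error codes worth naming; 10054 is WSAECONNRESET.
# (Illustrative table, extend as needed.)
WINSOCK_ERRNOS = {10054: "WSAECONNRESET (connection reset by peer)"}

# Matches the condor_read() failure message in the format seen in the log excerpt.
READ_FAILURE = re.compile(
    r"condor_read\(\): recv\(\) returned (-?\d+), errno = (\d+), "
    r"assuming failure reading (\d+) bytes from <([\d.]+):(\d+)>"
)

def find_read_failures(log_text):
    """Return one dict per condor_read() failure found in log_text."""
    # Collapse whitespace so messages that were wrapped across lines still match.
    flat = re.sub(r"\s+", " ", log_text)
    failures = []
    for m in READ_FAILURE.finditer(flat):
        code = int(m.group(2))
        failures.append({
            "recv_result": int(m.group(1)),
            "errno": code,
            "meaning": WINSOCK_ERRNOS.get(code, "unknown"),
            "bytes_expected": int(m.group(3)),
            "peer": f"{m.group(4)}:{m.group(5)}",
        })
    return failures

# Sample taken from the log excerpt, including the mail-wrapped line break.
sample = (
    "5/24 14:52:47 condor_read(): recv() returned -1, errno = 10054, assuming\n"
    "failure reading 5 bytes from <192.168.0.50:3661>.\n"
)
for failure in find_read_failures(sample):
    print(failure["peer"], failure["errno"], failure["meaning"])
```

A reset at this point suggests the TCP connection was established and then dropped by the other end (or by something in between), rather than never being made at all.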
Cheers,
Szabolcs
Is gadget.digicpictures.local the name of the host that this SchedLog
was produced on? If so, then this sounds to me like the schedd trying
to directly claim its "local" startd, because it hasn't successfully
communicated with the negotiator for a long time. How long is
controlled by SCHEDD_ASSUME_NEGOTIATOR_GONE, which defaults to 1200 seconds.
--Dan