Re: [Condor-users] Issue with connecting nodes to pool/master
- Date: Tue, 27 Jun 2006 17:31:31 -0500
- From: Erik Paulson <epaulson@xxxxxxxxxxx>
- Subject: Re: [Condor-users] Issue with connecting nodes to pool/master
On Tue, Jun 27, 2006 at 05:44:02PM -0400, Robert Wright wrote:
> > First - some terminology - the "master" is a program, which runs on
> > every machine. In a condor pool, there is one machine called the
> > "central manager", which runs a condor_collector and condor_negotiator.
> > You probably mean for your 192.168.1.102 to be your central manager.
> 1.102 is the central manager
>
> > What machine is the log file below from? You should only have
> > a NegotiatorLog on one machine, the central manager.
> 1.101.
>
Well, there's problem 1. Turn off your negotiator on 101. You should only
have a negotiator on 102.
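
One way to check this is to look at which daemons each machine is told to start. The following is only a rough sketch, assuming the daemons are chosen through a comma-separated DAEMON_LIST setting in the config files; the helper name and the sample config text are hypothetical, not taken from your machines.

```python
def daemon_list(config_text):
    """Return the daemon names from the last DAEMON_LIST line in config_text."""
    daemons = []
    for line in config_text.splitlines():
        name, sep, value = line.partition("=")
        # Commented-out lines fail this comparison because of the leading '#'.
        if sep and name.strip().upper() == "DAEMON_LIST":
            daemons = [d.strip().upper() for d in value.split(",") if d.strip()]
    return daemons

# Hypothetical local config text for an execute node such as 101.
node_config = "DAEMON_LIST = MASTER, STARTD\n"
print("NEGOTIATOR" in daemon_list(node_config))
```

If NEGOTIATOR shows up in the list on any machine other than 102, that is the entry to remove.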
> > Errno 113 is "No Route To Host". Do you have your networking properly
> > configured (i.e., can you ping your central manager from all your other
> > machines?)
> All ICMP, UDP, and TCP traffic is passing properly. You name the service
> and I am able to pass traffic: 21/22/23/80, etc.
>
And 9618, the collector's port? :)
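
A quick way to test that from an execute node is a plain TCP connect. This is a minimal sketch, assuming the collector listens on its default port 9618; the function name is made up for illustration, and it only exercises TCP, not UDP.

```python
import socket

def can_connect(host, port=9618, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and "No route to host".
        return False

# Example, run from an execute node (192.168.1.102 is the central manager
# in this thread):
#   can_connect("192.168.1.102")
```

If this returns False while the other services work, look for a firewall rule that blocks 9618 specifically.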
> > The interesting logfiles are the CollectorLog from your central
> > manager, and a StartdLog file from an execute node.
>
> StartdLog on node0 (execute)
> 6/21 09:24:07 ******************************************************
> 6/21 09:24:07 ** condor_startd (CONDOR_STARTD) STARTING UP
> 6/21 09:24:07 ** /usr/local/condor/sbin/condor_startd
> 6/21 09:24:07 ** $CondorVersion: 6.6.11 Mar 23 2006 $
> 6/21 09:24:07 ** $CondorPlatform: I386-LINUX_RH9 $
> 6/21 09:24:07 ** PID = 19916
> 6/21 09:24:07 ******************************************************
> 6/21 09:24:07 Using config file: /usr/local/condor/etc/condor_config
> 6/21 09:24:07 Using local config files: /usr/local/condor/local.node0/condor_config.local
> 6/21 09:24:07 DaemonCore: Command Socket at <192.168.1.101:46451>
> 6/21 09:24:14 ERROR "Required attribute "START" is not defined" at line 255 in file util.C
>
There's another problem. Make sure that you have a line of the form
  START = <some valid expression>
in either
  /usr/local/condor/etc/condor_config
or
  /usr/local/condor/local.node0/condor_config.local
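
To see whether either file actually defines it, something like the following could be used. It is a rough sketch, not a real config parser: it assumes simple one-line `NAME = value` settings, treats names as case-insensitive, and uses a hypothetical helper name. A value such as `START = TRUE` (always willing to run jobs) is a common simple choice.

```python
import re

def defines_macro(config_text, name="START"):
    """Return True if config_text has an uncommented `name = value` line."""
    pattern = re.compile(r"^\s*" + re.escape(name) + r"\s*=\s*\S", re.IGNORECASE)
    for line in config_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip comment lines
        if pattern.match(line):
            return True
    return False

# Example with inline text; in practice, read the two files named above.
print(defines_macro("START = TRUE\n"))
```

Because the name must be followed directly by `=`, settings such as STARTD_DEBUG are not mistaken for START.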
-Erik