Re: [condor-users] Problems Re-adding a Condor Client
- Date: Thu, 11 Sep 2003 10:49:20 -0500
- From: Erik Paulson <epaulson@xxxxxxxxxxx>
- Subject: Re: [condor-users] Problems Re-adding a Condor Client
On Thu, Sep 11, 2003 at 09:45:31AM -0400, Jess Cannata wrote:
> I have a 48-node Linux cluster running Condor. One node's hard
> drive crashed and was rebuilt. I have tried, unsuccessfully, to get
> Condor running again on the rebuilt node (it worked fine before the node
> crashed, and it works fine for the other 47 machines). The Condor base
> install is on /home/condor, which is shared across all of the nodes via
> NFS. The condor user exists on the new node. All I did was create
> /var/lock/condor/InstanceLock with the proper permissions and make it so
> the Condor services would start.
>
> The Condor services start on the rebuilt node without any errors, but
> the Condor master never sees the new node (condor_status doesn't report
> the rebuilt node). It is as if no information were being sent from the
> rebuilt node to the master node. However, I know that the network
> communication is fine (the new node is loading the Condor services off
> the NFS mount).
>
> The following services are running on the rebuilt node:
>
> condor_master
> condor_startd
> condor_schedd
>
> and their log files show no errors.
>
> Has anyone seen this problem before? How exactly do the Condor clients
> communicate with the master node? Is it via a specific TCP/UDP port?
Yes - The Condor central manager (the collector daemon) listens on port 9618
for updates. The rest of Condor knows which machine this is from the
COLLECTOR_HOST setting in the config file (which, by default, is set to the
value of CONDOR_HOST).
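
As a rough way to rule out basic network trouble, something like the
following Python sketch can be run on the rebuilt node. It is not part of
Condor; the function name and the example host name are made up for
illustration, and you would pass the collector port mentioned above. As far
as I know Condor sends these updates over UDP by default, so a successful TCP
connect only shows that the host and port are reachable, not that updates are
actually arriving.

```python
import socket


def can_reach_tcp(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only tests TCP reachability.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example call (placeholder host; use your own COLLECTOR_HOST value and port):
#   can_reach_tcp("central-manager.example.org", collector_port)
```

A False result from the rebuilt node, together with True from one of the
working nodes, would point at name resolution or routing on the rebuilt
machine rather than at Condor itself.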
> I've
> disabled IPTABLES on both the master and the client to no avail. The
> weird thing is that all of the other clients are showing up.
>
> Any help would be appreciated.
>
If everything looks OK on the missing node, the next place to check is the
CollectorLog file on the central manager.
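
If the CollectorLog is large, a small filter helps find anything that
mentions the rebuilt node. The sketch below is only an illustration, assuming
the log is plain text and that the node would appear by host name or IP
address. The function name, the log path, and the node names in the example
call are placeholders, not values from your pool.

```python
def lines_mentioning(log_path, needles):
    """Return (line_number, line) pairs for lines containing any needle.

    Hypothetical helper: plain substring matching, no knowledge of the
    log's actual format.
    """
    hits = []
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(log_path, "r", errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if any(needle in line for needle in needles):
                hits.append((number, line.rstrip("\n")))
    return hits


# Example call (placeholder path and names):
#   for number, line in lines_mentioning("/path/to/CollectorLog",
#                                        ["node17", "192.168.1.17"]):
#       print(number, line)
```

If nothing at all turns up for the rebuilt node, the updates are probably not
reaching the central manager. If lines do turn up, whatever they say is the
next clue.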
-Erik