Mailing List Archives, UW Madison Computer Sciences Department Computer Systems Lab

Re: [Condor-users] Problem installing condor on cluster.

Date: Wed, 21 Sep 2005 11:04:05 +0530
From: Prashant Lal <lalp@xxxxxxxxxxx>
Subject: Re: [Condor-users] Problem installing condor on cluster.

Run:
condor_config_val LOCK

Does it point to something like this?
LOCK = /var/lock/condor

If not, search for LOCK in the global config file, add an entry like the one above,
and then create the directory:

mkdir -p /var/lock/condor
chmod 777 /var/lock/condor

You have to create the directory on every node.
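
Doing this by hand on 24 nodes is tedious, so a small loop can help. The sketch below only prints the commands rather than running them. It assumes the nodes are named node1.cluster.int through node24.cluster.int, as listed in the original post, and that ssh logins from the manager work without a password prompt (an assumption, not something stated in this thread).

```shell
#!/bin/sh
# Print the commands that would create the lock directory on every node.
# Assumptions: nodes are node1..node24.cluster.int (from the original post)
# and passwordless ssh is available (hypothetical setup).
# Nothing is executed remotely; pipe the output to sh once it looks right.
LOCK_DIR=/var/lock/condor
NODE_COUNT=24

lock_dir_commands() {
    i=1
    while [ "$i" -le "$NODE_COUNT" ]; do
        echo "ssh node$i.cluster.int 'mkdir -p $LOCK_DIR && chmod 777 $LOCK_DIR'"
        i=$((i + 1))
    done
}

lock_dir_commands
```

The manager node needs the directory as well; there the mkdir and chmod commands above can simply be run locally.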

On Tue, 2005-09-20 at 21:01 +0100, Chris Miles wrote:

I am still having the same problem.

Here is a partial listing of condor_config:

/home/condor/condor_config
CONDOR_HOST  = thebeast

## Where have you installed the bin, sbin and lib condor directories?
RELEASE_DIR = /home/condor/release

## Where is the local condor directory for each host?
LOCAL_DIR  = /home/condor/hosts/$(HOSTNAME)

## Where is the machine-specific local config file for each host?
LOCAL_CONFIG_FILE = /home/condor/release/etc/$(HOSTNAME).local

------

So, for $(HOSTNAME).local, I have edited the following file for my central manager:

/home/condor/release/etc/thebeast.local

DAEMON_LIST   = MASTER, COLLECTOR, NEGOTIATOR, STARTD, SCHEDD

The HOSTNAME environment variable resolves fine:

thebeast:/home/condor # hostname
thebeast
thebeast:/home/condor # echo $HOSTNAME
thebeast
thebeast:/home/condor #
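
A further check that may be worth making: as far as I know, $(HOSTNAME) in condor_config is a macro that Condor fills in itself, not the shell's environment variable, so the two can disagree on a machine with an alias. The sketch below asks condor_config_val for HOSTNAME when the tool is on the PATH, falls back to the system's short node name otherwise, and reports whether the matching .local file exists. The directory comes from the LOCAL_CONFIG_FILE line above; the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: find out which per-host config file Condor will look for.
# CONFIG_DIR is taken from LOCAL_CONFIG_FILE in the listing above.
CONFIG_DIR=/home/condor/release/etc

# Expand $(HOSTNAME).local for a given short host name (hypothetical helper).
local_config_path() {
    echo "$CONFIG_DIR/$1.local"
}

# Prefer Condor's own idea of the host name when the tool is available;
# otherwise fall back to the short node name reported by the system.
if command -v condor_config_val >/dev/null 2>&1; then
    name=$(condor_config_val HOSTNAME)
else
    name=$(uname -n | cut -d. -f1)
fi

path=$(local_config_path "$name")
if [ -r "$path" ]; then
    echo "found: $path"
else
    echo "missing: $path"
fi
```

If this reports mgmnt.local rather than thebeast.local, Condor would be looking for a different file, and the DAEMON_LIST set in thebeast.local would not be applied.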

So I assumed I could run the condor daemons on the central manager machine successfully.

thebeast:/home/condor # condor_master
thebeast:/home/condor # ps -fe | grep condor
condor    5272     1 0 16:34 ?        00:00:00 condor_master
condor    5273 5272 0 16:34 ?        00:00:00 condor_schedd -f -n root@xxxxxxxxxxxxxxxxxxx

This is all I'm getting. Starting the collector and negotiator manually doesn't make any difference:
they stay running, but no node can connect. condor_status on any node reports that
the collector on thebeast cannot be contacted, even after I started them manually.
Each node can ping thebeast fine, and the host read and write settings are set properly.
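
To narrow this kind of problem down, it can help to print the settings Condor actually resolved on the manager and on one node and compare them. This is a minimal sketch, assuming condor_config_val is on the PATH; the helper name is made up for illustration, and the list of settings is only a starting point.

```shell
#!/bin/sh
# Sketch: print the settings that decide which daemons start and where
# the nodes look for the collector. show_setting is a hypothetical helper.
show_setting() {
    if command -v condor_config_val >/dev/null 2>&1; then
        printf '%s = %s\n' "$1" "$(condor_config_val "$1" 2>&1)"
    else
        printf '%s = (condor_config_val not on PATH)\n' "$1"
    fi
}

for name in CONDOR_HOST COLLECTOR_HOST DAEMON_LIST LOG; do
    show_setting "$name"
done
```

If DAEMON_LIST on the manager does not include COLLECTOR and NEGOTIATOR, the local config file is probably not being read. The MasterLog and CollectorLog files in the LOG directory usually say why a daemon did not start or why a connection was refused.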

any ideas?

thanks

Chris

----- Original Message -----
From: Chris Miles
To: Condor-Users Mail List
Sent: Tuesday, September 20, 2005 12:19 AM
Subject: [Condor-users] Problem installing condor on cluster.

I have a cluster with 24 nodes and a manager node.

names of each are

mgmnt.cluster.int
node1.cluster.int
node2.cluster.int
..
node24.cluster.int

mgmnt.cluster.int has an alias, thebeast.cluster.int (it was set up this way by the company that installed the cluster).
So when I log into the manager I get the prompt [thebeast], but if I ping thebeast it pings mgmnt.cluster.int (192.168.1.1).

On the cluster, /home/condor is shared between all nodes.

I ran condor_install and set up the various options, selecting that /home/condor was shared and choosing the relevant options for config files etc.

During condor_install I set the condor central manager to mgmnt.cluster.int

The cluster manager machine is also my condor central manager.

I then ran condor_init and then condor_master, but the collector and a few other processes associated with the central manager
are not running. I can only presume that when I execute condor_master on this machine, it recognises itself only as a normal
node and not as the central manager, which is what I want it to be.

Is there something simple I am missing or overlooking?

Many thanks in advance

Chris Miles
University Of Paisley, Scotland

_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/condor-users

--
Prashant Lal <lalp@xxxxxxxxxxx>
Cadence Design Systems
