
Re: [HTCondor-users] condor_status does not show working nodes...



Under Debian, on all nodes, make sure that /etc/hostname and /etc/hosts use the same fully qualified domain name. For example, on a worker node /etc/hosts is just:

 

127.0.0.1    localhost
127.0.1.1    workernode1.yourdomain
<ip addr>    condor-master.yourdomain

 

On condor-master, /etc/hosts should be:

 

127.0.0.1    localhost
<ip addr>    condor-master.yourdomain
<ip addr>    workernode1.yourdomain
<ip addr>    workernode2.yourdomain

Etc

 

And after mucking around, run "sudo service condor restart" on the master first, then on the worker nodes.

 

If condor_status shows blanks after repeated service restarts, the names in /etc/hosts and /etc/hostname probably aren't the same.
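A quick way to test for that mismatch on each node is sketched below. This is only a rough helper, not part of HTCondor: the function names hosts_has_name and check_node are made up for illustration, and it assumes /etc/hostname holds a single name that should appear as an exact entry in /etc/hosts.

```shell
#!/bin/sh
# Hypothetical helpers (not HTCondor tools) for comparing /etc/hostname
# with /etc/hosts on one node.

# Succeeds if NAME appears as a host name or alias in HOSTS_FILE.
# Usage: hosts_has_name NAME [HOSTS_FILE]
hosts_has_name() {
    wanted=$1
    hosts_path=${2:-/etc/hosts}
    # Drop comments, skip the address in field 1, and compare every
    # remaining field (names and aliases) exactly against NAME.
    awk -v n="$wanted" '
        { sub(/#.*/, "") }
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }
    ' "$hosts_path"
}

# Reads the name from HOSTNAME_FILE and reports whether HOSTS_FILE lists it.
# Usage: check_node [HOSTNAME_FILE] [HOSTS_FILE]
check_node() {
    hostname_path=${1:-/etc/hostname}
    hosts_file=${2:-/etc/hosts}
    node_name=$(tr -d '[:space:]' < "$hostname_path")
    if hosts_has_name "$node_name" "$hosts_file"; then
        echo "OK: $node_name"
    else
        echo "MISMATCH: $node_name"
        return 1
    fi
}
```

Source the file and run check_node with no arguments on the master and on each worker. A MISMATCH line typically means /etc/hostname holds a short name (workernode1) while /etc/hosts only lists the fully qualified one, or the other way round.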

 

Dan

 

From: HTCondor-users [mailto:htcondor-users-bounces@xxxxxxxxxxx] On Behalf Of john alexander sanabria ordonez
Sent: Tuesday, April 29, 2014 3:40 PM
To: htcondor-users@xxxxxxxxxxx
Subject: [HTCondor-users] condor_status does not show working nodes...

 

hi,

I just installed a cluster with two worker nodes, but condor_status does not show either of them.

In the master node the following processes are running:

[vagrant@condor-master ~]$ ps ax | grep condor
24355 ?        Ss     0:00 /opt/condor806/sbin/condor_master
24356 ?        S      0:00 condor_procd -A /tmp/condor-lock.0.178442601421015/procd_pipe -L /opt/condor806/local.condor-master/log/ProcLog -R 10000000 -S 60 -C 503
24357 ?        Ss     0:00 condor_collector -f
24358 ?        Ss     0:00 condor_negotiator -f
24359 ?        Ss     0:00 condor_schedd -f

and one working node is running the following processes:

[vagrant@wn-02 ~]$ ps ax | grep condor
24334 ?        Ss     0:00 /opt/condor806/sbin/condor_master
24335 ?        S      0:00 condor_procd -A /tmp/condor-lock.0.403517255167575/procd_pipe -L /opt/condor806/local.wn-02/log/ProcLog -R 10000000 -S 60 -C 503
24336 ?        Ss     0:00 condor_startd -f

Any ideas what is going wrong?

Thanks,