Re: [Condor-users] No collector -- no connection to 9618


Date: Mon, 07 Feb 2005 16:58:21 -0800
From: Michael Hannon <jmh@xxxxxxxxxxxxxxxxxxx>
Subject: Re: [Condor-users] No collector -- no connection to 9618
Alain Roy wrote:

>>> You say that they are not running, but the errors you show indicate that you cannot connect to them. So are they running?
>>> If you look at all of the processes on the central manager (where they should be running) do you see the collector and negotiator running? Are there any error messages in the log files that they produce?
>>
>> They are not running:
>
> OK, what is DAEMON_LIST defined to be on that host? It should list the collector and negotiator.
>
> 1) If they aren't listed, add them. Just add COLLECTOR and NEGOTIATOR to the list, separated with commas. Given that you already have the master, startd, and schedd, it would look like this:
>
> DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR, STARTD, SCHEDD
>
> 2) If they are listed but you don't see them running, then they crashed. Look for a CollectorLog and a NegotiatorLog and see if they tell us anything good. Also look for core files in the log directory.

Indeed, as you implied, those daemons were NOT in the DAEMON_LIST. After I added them to the list and restarted condor, the collector and the negotiator both show up in the output of "ps aux", and condor_status works as expected. Now on to the real problems. Thanks.
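
For anyone hitting the same symptom, the DAEMON_LIST check described above can be sketched in a few lines of Python. This is only an illustration, not part of Condor: the function names parse_daemon_list and missing_daemons are made up here, and the parser is deliberately simplified. It assumes the setting sits on a single line, that a later assignment overrides an earlier one, and it ignores macros, line continuations, and local config files.

```python
import re

# Daemons the central manager needs, per the advice in this thread.
REQUIRED = ("COLLECTOR", "NEGOTIATOR")


def parse_daemon_list(config_text):
    """Return the daemons named by the last DAEMON_LIST line, upper-cased.

    Simplified on purpose: no macro expansion, no line continuations.
    """
    daemons = []
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        match = re.match(r"DAEMON_LIST\s*=\s*(.*)$", line, re.IGNORECASE)
        if match:
            # Assumption: a later assignment replaces an earlier one.
            daemons = [d.upper() for d in re.split(r"[,\s]+", match.group(1)) if d]
    return daemons


def missing_daemons(config_text, required=REQUIRED):
    """Return the required daemons that DAEMON_LIST does not mention."""
    present = set(parse_daemon_list(config_text))
    return [d for d in required if d not in present]


if __name__ == "__main__":
    before = "DAEMON_LIST = MASTER, STARTD, SCHEDD\n"
    after = "DAEMON_LIST = MASTER, COLLECTOR, NEGOTIATOR, STARTD, SCHEDD\n"
    print("missing before:", missing_daemons(before))
    print("missing after: ", missing_daemons(after))
```

On a live host, running "condor_config_val DAEMON_LIST" is the more reliable check, since it reports the value Condor actually resolves from all of its config files.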


					- Mike
--
Michael Hannon            mailto:hannon@xxxxxxxxxxxxxxxxxxx
Dept. of Physics          530.752.4966
University of California  530.752.4717 FAX
Davis, CA 95616-8677

