Re: [condor-users] Troubleshooting process
- Date: Tue, 02 Mar 2004 19:00:43 -0500 (EST)
- From: kge2@xxxxxxxx
- Subject: Re: [condor-users] Troubleshooting process
condor_status on a client returns nothing. The clients have only condor_master
running; condor_startd keeps exiting with result '4'.
condor_config_val returned "MASTER,STARTD". The daemons are trying to connect to
the server at 172.16.0.1, which is correct.
Maybe I have the firewall misconfigured. I thought I had disabled everything
firewall-related, since this is a totally isolated network.
Another error I've found is that the file "condor_starter.pvm" is missing.
Where can I find that?
Quoting Jaime Frey <jfrey@xxxxxxxxxxx>:
> On Mon, 1 Mar 2004 kge2@xxxxxxxx wrote:
> * Run condor_status and see if the hostname appears in the output. If the
> hostname doesn't appear, then Condor isn't aware of it as an execute node.
>
> * On the machine, run ps and look for any process named condor_startd.
> condor_startd is the daemon that makes a machine an execute node.
>
> * On the machine, run condor_config_val -master DAEMON_LIST and see if
> "STARTD" appears in the results. This will tell you if Condor is
> configured to run the condor_startd daemon. You can also look in the
> config file (which is where you'd change DAEMON_LIST if STARTD isn't
> listed).
>
> * On the machine, look for a file StartLog in the Condor log directory. If
> it's present and has recent entries in it, the condor_startd is probably
> running.
>
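The DAEMON_LIST check in the list above can be sketched in Python. This is only an illustration: it assumes the value printed by condor_config_val is a comma- or space-separated string such as "MASTER,STARTD", and the function names daemons_in and runs_startd are hypothetical, not part of Condor.

```python
import re

def daemons_in(daemon_list_value):
    """Split a DAEMON_LIST value such as "MASTER,STARTD" into daemon names.

    Assumption: entries are separated by commas and/or whitespace.
    """
    parts = re.split(r"[,\s]+", daemon_list_value.strip())
    return [name.upper() for name in parts if name]

def runs_startd(daemon_list_value):
    """True if the configured daemon list includes STARTD."""
    return "STARTD" in daemons_in(daemon_list_value)

print(runs_startd("MASTER,STARTD"))   # True
print(runs_startd("MASTER, SCHEDD"))  # False
```

A True result only means the master is configured to start condor_startd; it says nothing about whether the daemon stays up.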
> As for what interface the Condor daemons are using, every Condor daemon
> writes something like the following to its log when it starts:
>
> 12/31 16:58:38 ******************************************************
> 12/31 16:58:38 ** condor_master (CONDOR_MASTER) STARTING UP
> 12/31 16:58:38 ** $CondorVersion: 6.6.1 Dec 30 2003 RH9-BRANCH-PRE-RELEASE $
> 12/31 16:58:38 ** $CondorPlatform: I386-LINUX-RH9 $
> 12/31 16:58:38 ** PID = 7125
> 12/31 16:58:38 ******************************************************
> 12/31 16:58:38 Using config file: /some/path/name
> 12/31 16:58:38 Using local config files: /some/other/path/name
> 12/31 16:58:38 DaemonCore: Command Socket at <128.105.111.110:32873>
>
> That last line tells you the ip:port the daemon is listening on. All
> outgoing connections will be made on the same network interface. If it
> reads 127.0.0.1, you're going to have problems. :-)
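The loopback check described above can be sketched in Python. This is a minimal illustration that assumes the log line has exactly the "DaemonCore: Command Socket at <ip:port>" shape shown in the sample; the function names command_sockets and is_loopback are hypothetical helpers, not Condor tools.

```python
import ipaddress
import re

# Matches lines shaped like the sample above:
#   12/31 16:58:38 DaemonCore: Command Socket at <128.105.111.110:32873>
SOCKET_RE = re.compile(
    r"DaemonCore: Command Socket at <(\d{1,3}(?:\.\d{1,3}){3}):(\d+)>"
)

def command_sockets(log_text):
    """Return (ip, port) pairs for every Command Socket line in log_text."""
    return [(m.group(1), int(m.group(2))) for m in SOCKET_RE.finditer(log_text)]

def is_loopback(ip):
    """True for addresses in 127.0.0.0/8, which other machines cannot reach."""
    return ipaddress.ip_address(ip).is_loopback

sample = "12/31 16:58:38 DaemonCore: Command Socket at <128.105.111.110:32873>\n"
for ip, port in command_sockets(sample):
    note = "loopback: expect problems" if is_loopback(ip) else "ok"
    print(f"{ip}:{port} {note}")
```

To use it on a real machine, read the daemon's log file into a string and pass it to command_sockets; the path to that file depends on the local Condor log directory.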
>
> +------------------------------------+-------------------------------+
> | Jaime Frey |There are 10 types of people in|
> | jfrey@xxxxxxxxxxx |the world: Those who understand|
> | http://www.cs.wisc.edu/~jfrey/ | binary, and those who don't |
> +------------------------------------+-------------------------------+
> Condor Support Information:
> http://www.cs.wisc.edu/condor/condor-support/
> To Unsubscribe, send mail to majordomo@xxxxxxxxxxx with
> unsubscribe condor-users <your_email_address>