On an execute node, I can run condor_master no problem from the command
line, but my init script condor.boot generates an error. Below is a trace.

# shows that CONDOR_CONFIG is set and points to a file which exists and is not empty
[root@mackenzie condor]# ls -Fla $CONDOR_CONFIG
-rw-r--r-- 1 root root 93644 Mar 20 2008 /se/app/shared/condor-7.0.1/etc/condor_config

# shows failed startup script
[root@mackenzie condor]# service condor start
Starting up Condor
Neither the environment variable CONDOR_CONFIG, /etc/condor/, nor ~condor/ contain a condor_config source.
Either set CONDOR_CONFIG to point to a valid config source,
or put a "condor_config" file in /etc/condor or ~condor/
Exiting.

# shows that condor_master from the command line works
[root@mackenzie sbin]# ./condor_master
[root@nahanni sbin]# ps -ef | grep condor
condor 5990 1 0 17:38 ? 00:00:00 ./condor_master
condor 5991 5990 82 17:38 ? 00:00:02 condor_startd -f

On the "head" node, when I run condor_status I get an error that the
collector cannot be found, even though it is running.

[root@abitibi sbin]# condor_status
Error: Could not fetch ads --- can't find collector
[root@abitibi sbin]# ps -ef | grep condor
condor 28500 1 1 17:45 ? 00:00:00 ./condor_master
condor 28501 28500 0 17:45 ? 00:00:00 condor_collector -f
condor 28503 28500 1 17:45 ? 00:00:00 condor_negotiator -f
condor 28504 28500 1 17:45 ? 00:00:00 condor_schedd -f
condor 28505 28500 86 17:45 ? 00:00:01 condor_startd -f
root 28506 28504 1 17:45 ? 00:00:00 condor_procd -A /tmp/condor-lock.abitibi0.0513363986547155/procd_pipe.SCHEDD -S 60 -C 9422

Any hints as to what might be going wrong would be greatly appreciated.
It seems like very strange behavior.

--
Ian Stokes-Rees                 W: http://sbgrid.org
ijstokes@xxxxxxxxxxxxxxxxxxx    T: +1 617 418-4168
SBGrid, Harvard Medical School  F: +1 617 432-5600