Hi,

Something along these lines:

apt-get install condor

firewall: - we only open a range of ports 9600 to 9700

ufw allow 8614
ufw allow 8618
ufw allow 9600:9700/tcp
ufw allow 9600:9700/udp

/etc/condor_config.local-----------------
CONDOR_ADMIN = your email
COLLECTOR_NAME = your pool name
#so that it will work through firewall
#usually these are random
HIGHPORT = 9700
LOWPORT = 9600
HOSTALLOW_READ = $(FULL_HOSTNAME), your ip range
HOSTALLOW_WRITE = $(FULL_HOSTNAME), your ip range
--------------------------------------------

/etc/init.d/condor restart

===================================================
ON Linux Client

apt-get install condor

firewall:

ufw allow 8614
ufw allow 8618
ufw allow 9600:9700/tcp
ufw allow 9600:9700/udp

/etc/condor_config.local-------------------------
CONDOR_ADMIN = your email
CONDOR_HOST = address or ip of collector machine
DAEMON_LIST = MASTER, STARTD, SCHEDD
#so that it will work through firewall
#usually these are random
HIGHPORT = 9700
LOWPORT = 9600
HOSTALLOW_READ = $(FULL_HOSTNAME) your ip range
HOSTALLOW_WRITE = $(FULL_HOSTNAME) $(CONDOR_HOST)

Kevan

From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx]
On Behalf Of Sónia Liléo

Hi Kevan,

Thanks for your answer. I downloaded the Condor 7.4.3 deb package from Condor's website, but of course I should have used apt-get install condor instead to get the appropriate version for Ubuntu.

I have now installed Condor 7.2.4 from Ubuntu's repository and, as you said, condor_config.local is in the /etc directory. But I have noticed that this file is empty. Is it correct to make the configuration changes in the /etc/condor/condor_config file instead?

Does Condor look first in the condor_config.local file and, if it doesn’t find the configuration variables there, then look in the condor_config file?

Regards,
Sónia

From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx]
On Behalf Of Wilding, Kevan A

Sónia,

You can install Condor on Ubuntu using something like ‘sudo apt-get install condor’ in a command window; it just works for us. Config files are in the /etc directory.

Kevan

From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx]
On Behalf Of Sónia Liléo

Hi again!

I have uninstalled Condor and installed it again (version condor 7.4.3 x86_64-LINUX_DEBIAN50) on the Ubuntu 4.4.3 platform. Once again I get the same problem. The directory /var/run/condor does not exist (it is not created), and therefore I get the error message:

11/03 09:55:44 error opening watchdog pipe /var/run/condor/procd_pipe.STARTD.watchdog: No such file or directory (2)

Furthermore, the STARTD daemon is not started automatically after rebooting. I have to start it manually, although STARTD is included in the DAEMON_LIST variable.

root@noc-desktop:~# condor_config_val -v DAEMON_LIST
DAEMON_LIST: MASTER, STARTD
  Defined in '/etc/condor/condor_config.local', line 33.

Has anyone used the condor 7.4.3 x86_64-LINUX_DEBIAN50 version on the Ubuntu platform before? How did it work? Should I install another Condor version instead?

As I mentioned before, I have installed condor 7.4.3 x86_64-LINUX_DEBIAN50 on both Debian 2.6.32 and Debian 2.6.26-25lenny1 and it’s working fine.

Thanks,
/Sónia

From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx]
On Behalf Of Sónia Liléo

Hello,

I have installed condor 7.4.3 x86_64-LINUX_DEBIAN50 on Ubuntu. However, I have noticed that there is no /var/run/condor directory, and therefore procd_pipe.STARTD.watchdog is not created. Why is the /var/run/condor directory missing?

I have tried to uninstall Condor in order to install it again, but this is not possible since the directory /var/run/condor does not exist:

FATAL: Required directory /var/run/condor does not exist, or is not a directory.

I have tried to create this directory. Then the files procd_pipe.STARTD and procd_pipe.STARTD.watchdog are created, but there is no condor.pid. The StartLog registers the following:

11/02 20:09:50 mkfifo of /var/run/condor/procd_pipe.STARTD.2320.0 error: Permission denied (13)
11/02 20:09:50 failed to initialize named pipe at /var/run/condor/procd_pipe.STARTD.2320.0
11/02 20:09:50 LocalClient: error initializing NamedPipeReader
11/02 20:09:50 ProcFamilyClient: failed to start connection with ProcD
11/02 20:09:50 register_subfamily: ProcD communication error
11/02 20:09:50 Create_Process: error registering family for pid 2926
11/02 20:09:50 Create_Process(/usr/sbin/condor_starter): child failed because it failed to register itself with the ProcD
11/02 20:09:50 slot1: ERROR: exec_starter failed!
11/02 20:09:50 slot1: ERROR: exec_starter returned 0

What should I do? Which Condor version should be installed on the Ubuntu platform? I have installed condor 7.4.3 x86_64-LINUX_DEBIAN50 on both Debian 2.6.32 and Debian 2.6.26-25lenny1 and it’s working fine.

Regards,
Sónia

From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx]
On Behalf Of Sónia Liléo

Hi!

What does the following error message mean?

11/01 20:16:17 error opening watchdog pipe /var/run/condor/procd_pipe.STARTD.watchdog: No such file or directory (2)

StartLog, execute node

11/01 20:16:16 slot1: Request accepted.
11/01 20:16:16 slot1: Remote owner is o2f_sonlil@xxxxxxxxxxxxxxxxxxxx
11/01 20:16:16 slot1: State change: claiming protocol successful
11/01 20:16:16 slot1: Changing state: Matched -> Claimed
11/01 20:16:16 slot1: Started ClaimLease timer (17) w/ 1800 second lease duration
11/01 20:16:17 slot1: Got activate_claim request from shadow (<10.110.44.78:55118>)
11/01 20:16:17 slot1: Read request ad and starter from shadow.
11/01 20:16:17 Swap space: 917496
11/01 20:16:17 13367628 kbytes available for "/var/lib/condor/execute"
11/01 20:16:17 slot1: Total execute space: 13362508
11/01 20:16:17 13367628 kbytes available for "/var/lib/condor/execute"
11/01 20:16:17 slot2: Total execute space: 13362508
11/01 20:16:17 slot1: Remote job ID is 116.0
11/01 20:16:17 slot1: Remote global job ID is o2f-sth-lap-016.un.dr.dgcsystems.net#116.0#1288638373
11/01 20:16:17 slot1: JobLeaseDuration defined in job ClassAd: 1200
11/01 20:16:17 slot1: Resetting ClaimLease timer (17) with new duration
11/01 20:16:17 slot1: Sending Machine Ad to Starter
11/01 20:16:17 slot1: About to Create_Process "condor_starter -f -a slot1 o2f-sth-lap-014.un.dr.dgcsystems.net"
11/01 20:16:17 Create_Process: using fast clone() to create child process.
11/01 20:16:17 error opening watchdog pipe /var/run/condor/procd_pipe.STARTD.watchdog: No such file or directory (2)
11/01 20:16:17 ProcFamilyClient: error initializing LocalClient
11/01 20:16:17 ProcFamilyProxy: error initializing ProcFamilyClient
11/01 20:16:17 ERROR "ProcD has failed" at line 599 in file proc_family_proxy.cpp
11/01 20:16:17 CronMgr: 0 jobs alive
11/01 20:16:17 slot1: Canceled ClaimLease timer (17)
11/01 20:16:17 slot1: Changing state and activity: Claimed/Idle -> Preempting/Killing
11/01 20:16:17 Entered vacate_client <10.110.44.79:53584> o2f-sth-lap-014.un.dr.dgcsystems.net...
11/01 20:16:17 slot1: State change: No preempting claim, returning to owner
11/01 20:16:17 slot1: Changing state and activity: Preempting/Killing -> Owner/Idle
11/01 20:16:17 slot1: State change: IS_OWNER is false
11/01 20:16:17 slot1: Changing state: Owner -> Unclaimed
11/01 20:16:17 startd exiting because of fatal exception.

StarterLog, execute node

11/01 20:20:30 Reading from /proc/cpuinfo
11/01 20:20:30 Found: Physical-IDs:False; Core-IDs:False
11/01 20:20:30 Using processor count: 2 processors, 2 CPUs, 0 HTs
11/01 20:20:30 Reading condor configuration from '/etc/condor/condor_config'

If I do condor_restart -startd at this execute node, I get

Can't connect to local startd

Is this “Found: Physical-IDs:False; Core-IDs:False” a problem?

Regards,
Sónia

Sónia Liléo
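
[Editor's note] The two errors quoted in this thread, "No such file or directory (2)" and "Permission denied (13)", both concern /var/run/condor, so one way to narrow things down might be to check the state of that directory before starting the daemons. The following is only a minimal sketch: check_run_dir is a hypothetical helper, not a Condor tool, and the idea that the directory should belong to an account named "condor" is an assumption that is not confirmed anywhere in the thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of Condor): report the state of a runtime
# directory such as /var/run/condor.
# Usage: check_run_dir DIR OWNER
# OWNER is the account expected to own DIR; "condor" is only an assumption.
check_run_dir() {
    dir=$1
    owner=$2
    if [ ! -e "$dir" ]; then
        # Would be consistent with "No such file or directory (2)"
        echo "missing"
    elif [ ! -d "$dir" ]; then
        echo "not-a-directory"
    elif [ "$(ls -ld -- "$dir" | awk '{print $3}')" != "$owner" ]; then
        # Might explain "Permission denied (13)" when mkfifo is attempted
        echo "wrong-owner"
    else
        echo "ok"
    fi
}
```

For the case discussed here, the call would be something like check_run_dir /var/run/condor condor. The function only reports; it changes nothing on disk. If it prints "missing" or "wrong-owner", creating the directory as root and handing it to the daemon account with mkdir and chown may be worth trying, again assuming that the daemons really do run under that account.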