It’s working!

Thanks Kevan,
Sónia

From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Wilding, Kevan A

Hi,
Something along these lines:

apt-get install condor

firewall:
- we only open a range of ports 9600 to 9700

ufw allow 8614
ufw allow 8618
ufw allow 9600:9700/tcp
ufw allow 9600:9700/udp

/etc/condor_config.local-----------------
CONDOR_ADMIN = your email
COLLECTOR_NAME = your pool name
#so that it will work through firewall
#usually these are random
HIGHPORT = 9700
LOWPORT = 9600
HOSTALLOW_READ = $(FULL_HOSTNAME), your ip range
HOSTALLOW_WRITE = $(FULL_HOSTNAME), your ip range
--------------------------------------------

/etc/init.d/condor restart

===================================================
ON Linux Client

apt-get install condor

firewall:

ufw allow 8614
ufw allow 8618
ufw allow 9600:9700/tcp
ufw allow 9600:9700/udp

/etc/condor_config.local-------------------------
CONDOR_ADMIN = your email
CONDOR_HOST = address or ip of collector machine
DAEMON_LIST = MASTER, STARTD, SCHEDD
#so that it will work through firewall
#usually these are random
HIGHPORT = 9700
LOWPORT = 9600
HOSTALLOW_READ = $(FULL_HOSTNAME) your ip range
HOSTALLOW_WRITE = $(FULL_HOSTNAME) $(CONDOR_HOST)

Kevan

From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Sónia Liléo

Hi Kevan,

Thanks for your answer.

I downloaded the Condor 7.4.3 deb package from Condor’s website, but of course I should have used apt-get install condor instead to get the appropriate version for Ubuntu.

I have now installed Condor 7.2.4 from Ubuntu’s repository and, as you said, condor_config.local is in the /etc directory. But I have noticed that this file is empty.

Is it nevertheless correct to make the configuration changes in the /etc/condor/condor_config file instead? Does Condor look first in the condor_config.local file and, if it doesn’t find the configuration variables there, fall back to the condor_config file?

Regards,
Sónia

From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Wilding, Kevan A

Sónia,
You can install condor on Ubuntu using something like ‘sudo apt-get install condor’ in a command window – it just works for us. Config files are in the /etc directory.

Kevan

From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx]
On Behalf Of Sónia Liléo

Hi again!

I have uninstalled condor and installed it again (version condor 7.4.3 x86 64-LINUX_DEBIAN50) on the Ubuntu 4.4.3 platform. Once again I get the same problem.

The directory /var/run/condor does not exist (it is not created), and therefore I get the error message:

11/03 09:55:44 error opening watchdog pipe /var/run/condor/procd_pipe.STARTD.watchdog: No such file or directory (2)

Furthermore, the STARTD daemon is not started automatically after rebooting. I have to start it manually, although STARTD is included in the DAEMON_LIST variable.

root@noc-desktop:~# condor_config_val -v DAEMON_LIST
DAEMON_LIST: MASTER, STARTD
  Defined in '/etc/condor/condor_config.local', line 33.

Has anyone used the condor 7.4.3 x86 64-LINUX_DEBIAN50 version on the Ubuntu platform before? How did it work?

Should I install another condor version instead?

As I mentioned before, I have installed condor 7.4.3 x86 64-LINUX_DEBIAN50 on both Debian 2.6.32 and Debian 2.6.26-25lenny1 and it’s working fine.

Thanks,
/Sónia

From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of
Sónia Liléo

Hello,

I have installed condor 7.4.3 x86 64-LINUX_DEBIAN50 on Ubuntu.

However, I have noticed that there is no /var/run/condor directory. Therefore procd_pipe.STARTD.watchdog is not created.

Why is the /var/run/condor directory missing?

I have tried to uninstall condor in order to install it again, but this is not possible since the directory /var/run/condor does not exist.

FATAL: Required directory /var/run/condor does not exist, or is not a directory.

I have tried to create this directory. Then the files procd_pipe.STARTD and procd_pipe.STARTD.watchdog are created, but there is no condor.pid. The StartLog registers the following:

11/02 20:09:50 mkfifo of /var/run/condor/procd_pipe.STARTD.2320.0 error: Permission denied (13)
11/02 20:09:50 failed to initialize named pipe at /var/run/condor/procd_pipe.STARTD.2320.0
11/02 20:09:50 LocalClient: error initializing NamedPipeReader
11/02 20:09:50 ProcFamilyClient: failed to start connection with ProcD
11/02 20:09:50 register_subfamily: ProcD communication error
11/02 20:09:50 Create_Process: error registering family for pid 2926
11/02 20:09:50 Create_Process(/usr/sbin/condor_starter): child failed because it failed to register itself with the ProcD
11/02 20:09:50 slot1: ERROR: exec_starter failed!
11/02 20:09:50 slot1: ERROR: exec_starter returned 0

What should I do? Which condor version should be installed on the Ubuntu platform?

I have installed condor 7.4.3 x86 64-LINUX_DEBIAN50 on both Debian 2.6.32 and Debian 2.6.26-25lenny1 and it’s working fine.

Regards,
Sónia

From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of
Sónia Liléo

Hi!

What does the following error message mean?

11/01 20:16:17 error opening watchdog pipe /var/run/condor/procd_pipe.STARTD.watchdog: No such file or directory (2)

StartLog, execute node

11/01 20:16:16 slot1: Request accepted.
11/01 20:16:16 slot1: Remote owner is o2f_sonlil@xxxxxxxxxxxxxxxxxxxx
11/01 20:16:16 slot1: State change: claiming protocol successful
11/01 20:16:16 slot1: Changing state: Matched -> Claimed
11/01 20:16:16 slot1: Started ClaimLease timer (17) w/ 1800 second lease duration
11/01 20:16:17 slot1: Got activate_claim request from shadow (<10.110.44.78:55118>)
11/01 20:16:17 slot1: Read request ad and starter from shadow.
11/01 20:16:17 Swap space: 917496
11/01 20:16:17 13367628 kbytes available for "/var/lib/condor/execute"
11/01 20:16:17 slot1: Total execute space: 13362508
11/01 20:16:17 13367628 kbytes available for "/var/lib/condor/execute"
11/01 20:16:17 slot2: Total execute space: 13362508
11/01 20:16:17 slot1: Remote job ID is 116.0
11/01 20:16:17 slot1: Remote global job ID is o2f-sth-lap-016.un.dr.dgcsystems.net#116.0#1288638373
11/01 20:16:17 slot1: JobLeaseDuration defined in job ClassAd: 1200
11/01 20:16:17 slot1: Resetting ClaimLease timer (17) with new duration
11/01 20:16:17 slot1: Sending Machine Ad to Starter
11/01 20:16:17 slot1: About to Create_Process "condor_starter -f -a slot1 o2f-sth-lap-014.un.dr.dgcsystems.net"
11/01 20:16:17 Create_Process: using fast clone() to create child process.
11/01 20:16:17 error opening watchdog pipe /var/run/condor/procd_pipe.STARTD.watchdog: No such file or directory (2)
11/01 20:16:17 ProcFamilyClient: error initializing LocalClient
11/01 20:16:17 ProcFamilyProxy: error initializing ProcFamilyClient
11/01 20:16:17 ERROR "ProcD has failed" at line 599 in file proc_family_proxy.cpp
11/01 20:16:17 CronMgr: 0 jobs alive
11/01 20:16:17 slot1: Canceled ClaimLease timer (17)
11/01 20:16:17 slot1: Changing state and activity: Claimed/Idle -> Preempting/Killing
11/01 20:16:17 Entered vacate_client <10.110.44.79:53584> o2f-sth-lap-014.un.dr.dgcsystems.net...
11/01 20:16:17 slot1: State change: No preempting claim, returning to owner
11/01 20:16:17 slot1: Changing state and activity: Preempting/Killing -> Owner/Idle
11/01 20:16:17 slot1: State change: IS_OWNER is false
11/01 20:16:17 slot1: Changing state: Owner -> Unclaimed
11/01 20:16:17 startd exiting because of fatal exception.

StarterLog, execute node

11/01 20:20:30 Reading from /proc/cpuinfo
11/01 20:20:30 Found: Physical-IDs:False; Core-IDs:False
11/01 20:20:30 Using processor count: 2 processors, 2 CPUs, 0 HTs
11/01 20:20:30 Reading condor configuration from '/etc/condor/condor_config'

If I do condor_restart -startd at this execute node, I get

Can't connect to local startd

Is this “Found: Physical-IDs:False; Core-IDs:False” a problem?

Regards,
Sónia

Sónia Liléo