Ben,

You were right, the directory did not exist. I have created it. CONDOR_ADMIN is root@localhost.

labounek@magellan:/var/run$ sudo mkdir condor

Root owns that folder. Output of ls -l:

drwxr-xr-x 2 root root 80 Feb 26 16:35 condor

After running sudo condor_master, the condor folder looks like this, and condor_master (as user condor) and condor_procd (as user root) are running on magellan:

labounek@magellan:/var/run/condor$ ls -l
total 0
prw------- 1 condor root 0 Feb 26 16:40 procd_pipe
prw------- 1 condor root 0 Feb 26 16:40 procd_pipe.watchdog
labounek@magellan:/var/run/condor$

But condor_status still sees only emperor's 12 cores, I suppose because condor_startd is still not running. The new MasterLog is included below. The file /var/lock/condor/InstanceLock exists on magellan:

labounek@magellan:/var/lock/condor$ ls -l InstanceLock
-rw------- 1 condor condor 0 Feb 26 16:43 InstanceLock
labounek@magellan:/var/lock/condor$

Regards,
Rene

02/26/16 16:43:31 ******************************************************
02/26/16 16:43:31 ** condor_master (CONDOR_MASTER) STARTING UP
02/26/16 16:43:31 ** /usr/sbin/condor_master
02/26/16 16:43:31 ** SubsystemInfo: name=MASTER type=MASTER(2) class=DAEMON(1)
02/26/16 16:43:31 ** Configuration: subsystem:MASTER local:<NONE> class:DAEMON
02/26/16 16:43:31 ** $CondorVersion: 8.4.0 Sep 23 2015 BuildID: Debian-8.4.0~dfsg.1-1~nd80+1 Debian-8.4.0~dfsg.1-1~nd80+1 $
02/26/16 16:43:31 ** $CondorPlatform: X86_64-Debian_8 $
02/26/16 16:43:31 ** PID = 4791
02/26/16 16:43:31 ** Log last touched 2/26 16:43:19
02/26/16 16:43:31 ******************************************************
02/26/16 16:43:31 Using config source: /etc/condor/condor_config
02/26/16 16:43:31 Using local config sources:
02/26/16 16:43:31 /etc/condor/config.d/00debconf
02/26/16 16:43:31 /etc/condor/condor_config.local
02/26/16 16:43:31 config Macros = 62, Sorted = 62, StringBytes = 1664, TablesBytes = 2288
02/26/16 16:43:31 CLASSAD_CACHING is OFF
02/26/16 16:43:31 Daemon Log is logging: D_ALWAYS D_ERROR
02/26/16 16:43:31 lock_file returning ERROR, errno=11 (Resource temporarily unavailable)
02/26/16 16:43:31 FileLock::obtain(1) failed - errno 11 (Resource temporarily unavailable)
02/26/16 16:43:31 ERROR "Can't get lock on "/var/lock/condor/InstanceLock"" at line 1106 in file /tmp/buildd/condor-8.4.0~dfsg.1/src/condor_master.V6/master.cpp
02/26/16 16:45:31 mkfifo of /var/run/condor/procd_pipe.4744.0 error: Permission denied (13)
02/26/16 16:45:31 failed to initialize named pipe at /var/run/condor/procd_pipe.4744.0
02/26/16 16:45:31 LocalClient: error initializing NamedPipeReader
02/26/16 16:45:31 ProcFamilyClient: failed to start connection with ProcD
02/26/16 16:45:31 register_subfamily: ProcD communication error
02/26/16 16:45:31 Create_Process: error registering family for pid 4808
02/26/16 16:45:31 Create_Process(/usr/sbin/condor_startd): child failed because it failed to register itself with the ProcD
02/26/16 16:45:31 ERROR: Create_Process failed trying to start /usr/sbin/condor_startd
02/26/16 16:45:31 restarting /usr/sbin/condor_startd in 521 seconds

On 26.2.2016 at 16:20, Ben Cotton wrote:
Rene,

I think this is the important line:

02/26/16 14:37:17 error opening watchdog pipe /var/run/condor/procd_pipe.watchdog: No such file or directory (2)

Does /var/run/condor/ exist, and is it writable by the condor user? If not, try creating that directory and see if HTCondor will start. I'm not as familiar with Debian systems, but I know that on RHEL7 that directory is created by systemd at boot time.

Thanks,
BC
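
The check asked for above (does /var/run/condor exist, and can the condor user write to it?) can be sketched in shell roughly as follows. This is only a minimal sketch, not something taken from the thread: the function name report_dir is hypothetical, it assumes GNU stat (as shipped on Debian), and the chown mentioned in the comments is one possible fix under the assumption that the daemons needing write access run as user condor.

```shell
#!/bin/sh
# report_dir DIR (hypothetical helper): print "owner group mode" for DIR,
# or "missing" if DIR does not exist. Relies on GNU stat's -c format option.
report_dir() {
    if [ -d "$1" ]; then
        stat -c '%U %G %a' "$1"
    else
        echo "missing"
    fi
}

# Example: report_dir /var/run/condor
# If the owner is root and the mode is 755, user condor cannot create
# new files (such as named pipes) in that directory. One possible fix,
# assumed here rather than confirmed in the thread:
#   sudo chown condor:condor /var/run/condor
```

A mode of 755 on a root-owned directory gives write permission to root only, which would be consistent with the "Permission denied (13)" reported for mkfifo in the MasterLog.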