Dear all,
I recently installed condor 8.3.2 via dpkg in a clean Ubuntu 14.04 OS. I added the lines:
use feature : GPUs
GPU_DISCOVERY_EXTRA = -extra
into the condor_config.local file located in /etc/condor.
Problems arise when I start condor via sudo service condor start. All of the daemons in the DAEMON_LIST in the local file start except for STARTD.
Here is the StartLog. I'm not sure how to fix this.
01/06/15 19:04:02 ******************************************************
01/06/15 19:04:02 ** condor_startd (CONDOR_STARTD) STARTING UP
01/06/15 19:04:02 ** /usr/sbin/condor_startd
01/06/15 19:04:02 ** SubsystemInfo: name=STARTD type=STARTD(7) class=DAEMON(1)
01/06/15 19:04:02 ** Configuration: subsystem:STARTD local:<NONE> class:DAEMON
01/06/15 19:04:02 ** $CondorVersion: 8.3.2 Dec 16 2014 BuildID: 288596 $
01/06/15 19:04:02 ** $CondorPlatform: x86_64_Ubuntu14 $
01/06/15 19:04:02 ** PID = 16066
01/06/15 19:04:02 ** Log last touched 1/6 18:59:37
01/06/15 19:04:02 ******************************************************
01/06/15 19:04:02 Using config source: /etc/condor/condor_config
01/06/15 19:04:02 Using local config sources:
01/06/15 19:04:02 /etc/condor/condor_config.local
01/06/15 19:04:02 config Macros = 88, Sorted = 88, StringBytes = 2901, TablesBytes = 3216
01/06/15 19:04:02 CLASSAD_CACHING is ENABLED
01/06/15 19:04:02 Daemon Log is logging: D_ALWAYS D_ERROR
01/06/15 19:04:02 Daemoncore: Listening at <0.0.0.0:43738> on TCP (ReliSock) and UDP (SafeSock).
01/06/15 19:04:02 DaemonCore: command socket at <192.168.6.108:43738>
01/06/15 19:04:02 DaemonCore: private command socket at <192.168.6.108:43738>
01/06/15 19:04:02 my_popenv failed
01/06/15 19:04:02 Failed to run hibernation plugin '/usr/libexec/condor_power_state ad'
01/06/15 19:04:02 VM-gahp server reported an internal error
01/06/15 19:04:02 VM universe will be tested to check if it is available
01/06/15 19:04:02 History file rotation is enabled.
01/06/15 19:04:02 Maximum history file size is: 20971520 bytes
01/06/15 19:04:02 Number of rotated history files is: 2
01/06/15 19:04:02 ERROR "Failed to execute local resource 'GPUs' inventory script "/usr/libexec/condor_gpu_discovery -properties -extra"" at line 625 in file /slots/01/dir_53959/userdir/src/condor_startd.V6/ResAttributes.cpp
The condor_gpu_discovery script is located in /usr/lib/condor/libexec/, not in /usr/libexec. What variable do I need to set for condor to find this file in its correct location? The relevant variables from the global config file are as follows:
##--------------------------------------------------------------------
## Pathnames:
##--------------------------------------------------------------------
## Where have you installed the bin, sbin and lib condor directories?
RELEASE_DIR = /usr

## Where is the local condor directory for each host?
## This is where the local config file(s), logs and
## spool/execute directories are located
LOCAL_DIR = /var/condor
#LOCAL_DIR = $(RELEASE_DIR)/hosts/$(HOSTNAME)

## Where is the machine-specific local config file for each host?
CONFIG_DIR = /etc/condor
LOCAL_CONFIG_FILE = $(CONFIG_DIR)/condor_config.local

## Where are optional machine-specific local config files located?
## Config files are included in lexicographic order.
LOCAL_CONFIG_DIR = $(LOCAL_DIR)/config

## Blacklist for file processing in the LOCAL_CONFIG_DIR
## LOCAL_CONFIG_DIR_EXCLUDE_REGEXP = ^((\..*)|(.*~)|(#.*)|(.*\.rpmsave)|(.*\.rpmnew))$

## If the local config file is not present, is it an error?
## WARNING: This is a potential security issue.
## If not specified, the default is True
#REQUIRE_LOCAL_CONFIG_FILE = TRUE
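
For reference, one override that might address this is sketched below. It rests on an assumption the excerpt above does not confirm: that the GPUs feature builds the discovery script path from the LIBEXEC macro, and that LIBEXEC currently resolves to /usr/libexec. If so, a line like the following in /etc/condor/condor_config.local would point it at the directory where the script actually lives:

```
## Assumption: the GPU inventory script path is derived from $(LIBEXEC).
## The directory below is the one where condor_gpu_discovery was found.
LIBEXEC = /usr/lib/condor/libexec
```

Running condor_config_val LIBEXEC should show the value currently in effect. The failed hibernation plugin path in the log (/usr/libexec/condor_power_state) is consistent with this guess, but it is still only a guess, and changing LIBEXEC would affect every helper program that is looked up through it.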
Any help would be greatly appreciated.
Michael McInerny Murphy Engineer IERUS Technologies, Inc. 2904 Westcorp Blvd., Suite 210 (256) 319-2026 x 107