Dear all,

We have updated from the stable condor version 8.2.3 to the development version 8.3.5 (installing the new condor-all rpm). With version 8.2.3 we were submitting jobs from an ARC-CE to our condor test environment without any significant issues. However, we are facing problems with version 8.3.5.

It seems that condor is not finding the libraries placed in /usr/lib64/condor, although that directory is defined as LIB in our condor_config file:

    # cat condor_config | egrep "RELEASE|LIB" | grep -v ^#
    RELEASE_DIR = /usr
    BIN = $(RELEASE_DIR)/bin
    LIB = $(RELEASE_DIR)/lib64/condor
    INCLUDE = $(RELEASE_DIR)/include/condor
    SBIN = $(RELEASE_DIR)/sbin
    LIBEXEC = $(RELEASE_DIR)/libexec/condor
    SHARE = $(RELEASE_DIR)/share/condor

In the ShadowLog of the schedd, we can see:

    06/23/15 15:48:44 (228.0) (24598): Request to run on slot1@xxxxxxxxxxxx <192.168.101.5:41149> was ACCEPTED
    06/23/15 15:48:44 (225.0) (24572): ERROR "Error from slot1@xxxxxxxxxxxx: Failed to transfer files" at line 562 in file /slots/02/dir_64384/userdir/.tmpmTpznX/BUILD/condor-8.3.5/src/condor_shadow.V6.1/pseudo_ops.cpp
    06/23/15 15:48:44 (225.0) (24572): ReliSock::put_x509_delegation(): delegation failed: x509_send_delegation failed at line 1422
    06/23/15 15:48:45 (225.0) (24572): DoUpload: SHADOW at 193.109.175.11 failed to send file(s) to <192.168.101.5:36996>: error sending /var/spool/arc/jobstatus/job.EOXNDm5ccRmnl6QwDoplpaQmABFKDmABFKDm3hIKDmABFKDm86o53m.proxy
    06/23/15 15:48:45 (228.0) (24598): ERROR "Error from slot1@xxxxxxxxxxxx: Failed to transfer files" at line 562 in file /slots/02/dir_64384/userdir/.tmpmTpznX/BUILD/condor-8.3.5/src/condor_shadow.V6.1/pseudo_ops.cpp

These are the errors in the StarterLog of the execute machine:

    06/23/15 15:53:00 (pid:9435) ReliSock::get_x509_delegation(): delegation failed: Failed to open GSI libraries: libglobus_common.so.0: cannot open shared object file: No such file or directory
    06/23/15 15:53:00 (pid:9435) DoDownload: STARTER at 192.168.101.5 failed to receive file /home/execute/dir_9432/job.IvCMDm9mdRmnl6QwDoplpaQmABFKDmABFKDmIwLKDmABFKDm8ojIzm.proxy
    06/23/15 15:53:00 (pid:9432) File transfer failed (status=0).
    06/23/15 15:53:00 (pid:9432) ERROR "Failed to transfer files" at line 2301 in file /slots/02/dir_64384/userdir/.tmpmTpznX/BUILD/condor-8.3.5/src/condor_starter.V6.1/jic_shadow.cpp
    06/23/15 15:53:00 (pid:9432) ShutdownFast all jobs.
    06/23/15 15:53:01 (pid:9432) condor_read() failed: recv(fd=6) returned -1, errno = 104 Connection reset by peer, reading 21 bytes from <193.109.175.11:58180>.
    06/23/15 15:53:01 (pid:9432) IO: Failed to read packet header
    06/23/15 15:53:01 (pid:9432) Lost connection to shadow, waiting 1200 secs for reconnect

We have been running some tests, changing the configuration files, playing with LD_LIBRARY_PATH, etc., with no luck. We have found that the libraries are searched for in /usr/lib64 and not in /usr/lib64/condor, which is both the directory where the rpm installed the libraries and the directory defined in our condor_config file.

If we downgrade to version 8.2.3, without touching the configuration, everything works fine again.

Are we missing something?

Thank you in advance.

Best regards,

Carles

--
Carles Acosta i Silva
PIC (Port d'Informació Científica)
Campus UAB, Edifici D
E-08193 Bellaterra, Barcelona
Tel: +34 93 581 33 22
Fax: +34 93 581 41 10
http://www.pic.es
Avís - Aviso - Legal Notice: http://www.ifae.es/legal.html