Hi again Tim and all,
We finally solved the issue by adding a condor.conf file to /etc/ld.so.conf.d:
# cat /etc/ld.so.conf.d/condor.conf
/usr/lib64/condor
After running ldconfig and service condor reload, the worker node
is working fine.
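For anyone hitting the same problem, the steps above can be sketched as a small shell helper. This is only a sketch: the function name add_condor_ld_path is made up for illustration, and the conf directory is passed as a parameter so the helper can be tried in a scratch directory before touching /etc (where it needs root).

```shell
#!/bin/sh
# Sketch only: register HTCondor's library directory with the dynamic linker.
# add_condor_ld_path is a hypothetical helper name, not part of HTCondor.

add_condor_ld_path() {
    conf_dir="$1"   # e.g. /etc/ld.so.conf.d
    lib_dir="$2"    # e.g. /usr/lib64/condor
    conf="$conf_dir/condor.conf"

    # Leave the file alone if it already lists the directory.
    if [ -f "$conf" ] && grep -qxF "$lib_dir" "$conf"; then
        return 0
    fi
    printf '%s\n' "$lib_dir" > "$conf"
}

# On the worker node, as root:
#   add_condor_ld_path /etc/ld.so.conf.d /usr/lib64/condor
#   ldconfig                              # rebuild the linker cache
#   ldconfig -p | grep libglobus_common   # confirm the library is now listed
#   service condor reload
```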
Thank you very much.
Regards,
Carles
On 06/24/2015 05:22 PM, Tim Theisen wrote:
Hi Carles,
Could you send me your configuration files off list? I suspect
that something might be wrong with the new RPM packaging and I'd
like to investigate this problem.
...Tim
On 06/23/2015 09:04 AM, Carles Acosta wrote:
Dear all,
We have updated HTCondor from the stable version 8.2.3 to the
development version 8.3.5 (installing the new condor-all RPM).
With version 8.2.3 we were submitting jobs from an ARC-CE to our
HTCondor test environment without any significant issues.
However, we are facing problems with version 8.3.5.
HTCondor does not seem to find the libraries installed in
/usr/lib64/condor, although that directory is defined as LIB in
our condor_config file:
# cat condor_config | egrep "RELEASE|LIB" | grep -v ^#
RELEASE_DIR = /usr
BIN = $(RELEASE_DIR)/bin
LIB = $(RELEASE_DIR)/lib64/condor
INCLUDE = $(RELEASE_DIR)/include/condor
SBIN = $(RELEASE_DIR)/sbin
LIBEXEC = $(RELEASE_DIR)/libexec/condor
SHARE = $(RELEASE_DIR)/share/condor
In the ShadowLog of the schedd, we can see:
06/23/15 15:48:44 (228.0) (24598): Request to run on slot1@xxxxxxxxxxxx
<192.168.101.5:41149> was ACCEPTED
06/23/15 15:48:44 (225.0) (24572): ERROR "Error from slot1@xxxxxxxxxxxx:
Failed to transfer files" at line 562 in file
/slots/02/dir_64384/userdir/.tmpmTpznX/BUILD/condor-8.3.5/src/condor_shadow.V6.1/pseudo_ops.cpp
06/23/15 15:48:44 (225.0) (24572):
ReliSock::put_x509_delegation(): delegation failed:
x509_send_delegation failed at line 1422
06/23/15 15:48:45 (225.0) (24572): DoUpload: SHADOW at
193.109.175.11 failed to send file(s) to
<192.168.101.5:36996>: error sending
/var/spool/arc/jobstatus/job.EOXNDm5ccRmnl6QwDoplpaQmABFKDmABFKDm3hIKDmABFKDm86o53m.proxy
06/23/15 15:48:45 (228.0) (24598): ERROR "Error from slot1@xxxxxxxxxxxx:
Failed to transfer files" at line 562 in file
/slots/02/dir_64384/userdir/.tmpmTpznX/BUILD/condor-8.3.5/src/condor_shadow.V6.1/pseudo_ops.cpp
These are the errors in the StarterLog of the execute machine:
06/23/15 15:53:00 (pid:9435) ReliSock::get_x509_delegation():
delegation failed: Failed to open GSI libraries:
libglobus_common.so.0: cannot open shared object file: No such
file or directory
06/23/15 15:53:00 (pid:9435) DoDownload: STARTER at
192.168.101.5 failed to receive file
/home/execute/dir_9432/job.IvCMDm9mdRmnl6QwDoplpaQmABFKDmABFKDmIwLKDmABFKDm8ojIzm.proxy
06/23/15 15:53:00 (pid:9432) File transfer failed (status=0).
06/23/15 15:53:00 (pid:9432) ERROR "Failed to transfer files" at
line 2301 in file
/slots/02/dir_64384/userdir/.tmpmTpznX/BUILD/condor-8.3.5/src/condor_starter.V6.1/jic_shadow.cpp
06/23/15 15:53:00 (pid:9432) ShutdownFast all jobs.
06/23/15 15:53:01 (pid:9432) condor_read() failed: recv(fd=6)
returned -1, errno = 104 Connection reset by peer, reading 21
bytes from <193.109.175.11:58180>.
06/23/15 15:53:01 (pid:9432) IO: Failed to read packet header
06/23/15 15:53:01 (pid:9432) Lost connection to shadow, waiting
1200 secs for reconnect
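The name of the library the loader could not open can be pulled out of a StarterLog with grep and sed. A minimal sketch, assuming each log entry sits on a single line in the real file (the excerpt above is wrapped by the mail client); missing_libs is a hypothetical helper name, and the log path in the usage comment is an assumption.

```shell
#!/bin/sh
# Sketch only: list libraries reported as "cannot open shared object file".
# missing_libs is a hypothetical helper name, not an HTCondor tool.

missing_libs() {
    # Reads log text on stdin; prints each missing library name once.
    grep -o '[^ ]*: cannot open shared object file' \
        | sed 's/: cannot open shared object file$//' \
        | sort -u
}

# Example (the log location is an assumption; check LOG in your condor_config):
#   missing_libs < /var/log/condor/StarterLog
```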
We have run some tests, changing the configuration files,
setting LD_LIBRARY_PATH, etc., with no luck. We found that the
libraries are searched for in /usr/lib64 and not in
/usr/lib64/condor, which is both the directory where the RPM
installed the libraries and the directory defined in our
condor_config file.
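One quick way to confirm this kind of mismatch is to compare where the file actually lives with what the dynamic linker has cached. A sketch, with locate_lib as a hypothetical helper name; the library name and directories are the ones from this report.

```shell
#!/bin/sh
# Sketch only: print which of the given directories contain a library file.
# locate_lib is a hypothetical helper name, not an HTCondor tool.

locate_lib() {
    name="$1"
    shift
    for dir in "$@"; do
        if [ -e "$dir/$name" ]; then
            printf '%s\n' "$dir"
        fi
    done
    return 0
}

# Example:
#   locate_lib libglobus_common.so.0 /usr/lib64 /usr/lib64/condor
#   ldconfig -p | grep libglobus_common   # what the linker cache knows about
```

If locate_lib reports only /usr/lib64/condor while the ldconfig -p line prints nothing, the library is installed but is not in the linker cache.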
If we downgrade to version 8.2.3 without touching the
configuration, everything works fine again.
Are we missing something?
Thank you in advance.
Best regards,
Carles
--
Carles Acosta i Silva
PIC (Port d'Informació Científica)
Campus UAB, Edifici D
E-08193 Bellaterra, Barcelona
Tel: +34 93 581 33 22
Fax: +34 93 581 41 10
http://www.pic.es
Avís - Aviso - Legal Notice: http://www.ifae.es/legal.html
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/
--
Tim Theisen
Release Manager
HTCondor & Open Science Grid
Center for High Throughput Computing
Department of Computer Sciences
University of Wisconsin - Madison
4261 Computer Sciences and Statistics
1210 W Dayton St
Madison, WI 53706-1685
+1 608 265 5736