Hi,
the proxy renewal still doesn't work. Condor still updates "Not Before" (= activation timestamp) instead of "Not After" (= expiration timestamp). See job output below. But at least I have found some errors in StarterLog.slot1. The errors appear in the log when Condor tries to renew the proxy for the very first time (30 mins before expiration). Subsequent renewal attempts do not produce any log entries (although they still fail). Do you have an idea what could be the problem? StarterLog.slot1
05/30/18 09:54:54 (pid:25402) Got SIGTERM. Performing graceful shutdown. 05/30/18 09:54:54 (pid:25402) ShutdownGraceful all jobs. 05/30/18 09:54:54 (pid:25402) Process exited, pid=25406, signal=15 05/30/18 09:54:54 (pid:25402) Last process exited, now Starter is exiting 05/30/18 09:54:54 (pid:25402) **** condor_starter (condor_STARTER) pid 25402 EXITING WITH STATUS 0 05/30/18 09:58:37 (pid:23142) ****************************************************** 05/30/18 09:58:37 (pid:23142) ** condor_starter (CONDOR_STARTER) STARTING UP 05/30/18 09:58:37 (pid:23142) ** /usr/sbin/condor_starter 05/30/18 09:58:37 (pid:23142) ** SubsystemInfo: name=STARTER type=STARTER(8) class=DAEMON(1) 05/30/18 09:58:37 (pid:23142) ** Configuration: subsystem:STARTER local:<NONE> class:DAEMON 05/30/18 09:58:37 (pid:23142) ** $CondorVersion: 8.6.10 Mar 12 2018 BuildID: 435200 $ 05/30/18 09:58:37 (pid:23142) ** $CondorPlatform: x86_64_RedHat7 $ 05/30/18 09:58:37 (pid:23142) ** PID = 23142 05/30/18 09:58:37 (pid:23142) ** Log last touched 5/30 09:54:54 05/30/18 09:58:37 (pid:23142) ****************************************************** 05/30/18 09:58:37 (pid:23142) Using config source: /etc/condor/condor_config 05/30/18 09:58:37 (pid:23142) Using local config sources: 05/30/18 09:58:37 (pid:23142) /etc/condor/config.d/aixvipmap.conf 05/30/18 09:58:37 (pid:23142) /etc/condor/config.d/slot.conf 05/30/18 09:58:37 (pid:23142) /etc/condor/config.d/slot_user.conf 05/30/18 09:58:37 (pid:23142) /etc/condor/condor_config.local 05/30/18 09:58:37 (pid:23142) config Macros = 167, Sorted = 166, StringBytes = 4850, TablesBytes = 6076 05/30/18 09:58:37 (pid:23142) CLASSAD_CACHING is OFF 05/30/18 09:58:37 (pid:23142) Daemon Log is logging: D_ALWAYS D_ERROR 05/30/18 09:58:37 (pid:23142) SharedPortEndpoint: waiting for connections to named socket 25276_fa7d_4 05/30/18 09:58:37 (pid:23142) DaemonCore: command socket at <137.226.130.81:9618?addrs=137.226.130.81-9618+[--1]-9618&noUDP&sock=25276_fa7d_4> 05/30/18 09:58:37 (pid:23142) DaemonCore: private command socket at <137.226.130.81:9618?addrs=137.226.130.81-9618+[--1]-9618&noUDP&sock=25276_fa7d_4> 05/30/18 09:58:37 (pid:23142) Communicating with shadow <137.226.130.73:9618?addrs=137.226.130.73-9618+[--1]-9618&noUDP&sock=378_15c8_8> 05/30/18 09:58:37 (pid:23142) Submitting machine is "tux223.iehk.rwth-aachen.de" 05/30/18 09:58:37 (pid:23142) setting the orig job name in starter 05/30/18 09:58:37 (pid:23142) setting the orig job iwd in starter 05/30/18 09:58:37 (pid:23142) Chirp config summary: IO false, Updates false, Delayed updates true. 05/30/18 09:58:37 (pid:23142) Initialized IO Proxy. 05/30/18 09:58:37 (pid:23142) Done setting resource limits 05/30/18 09:58:37 (pid:23142) File transfer completed successfully. 05/30/18 09:58:38 (pid:23142) Job 253.0 set to execute immediately 05/30/18 09:58:38 (pid:23142) Starting a VANILLA universe job with ID: 253.0 05/30/18 09:58:38 (pid:23142) IWD: /var/lib/condor/execute/dir_23142 05/30/18 09:58:38 (pid:23142) Output file: streaming to remote file x509-test-token-refresh.output 05/30/18 09:58:38 (pid:23142) Error file: streaming to remote file x509-test-token-refresh.error 05/30/18 09:58:38 (pid:23142) Renice expr "0" evaluated to 0 05/30/18 09:58:38 (pid:23142) About to exec /bin/bash x509-test-token-refresh.sh 05/30/18 09:58:38 (pid:23142) Running job as user lkosch 05/30/18 09:58:38 (pid:23142) Create_Process succeeded, pid=23146 05/30/18 10:20:38 (pid:23142) condor_write(): Socket closed when trying to write 13 bytes to , fd is 15 05/30/18 10:20:38 (pid:23142) Buf::write(): condor_write() failed 05/30/18 10:20:38 (pid:23142) SharedPortEndpoint: failed to send final status (success) for SHARED_PORT_PASS_SOCK 05/30/18 10:20:38 (pid:23142) condor_write(): Socket closed when trying to write 434 bytes to <137.226.130.73:2574>, fd is 25 05/30/18 10:20:38 (pid:23142) Buf::write(): condor_write() failed 05/30/18 10:20:38 (pid:23142) relisock_gsi_get (read from socket) failure 05/30/18 10:20:38 (pid:23142) ReliSock::get_x509_delegation_finish(): delegation failed to complete: x509_receive_delegation failed at line 1697 05/30/18 10:20:38 (pid:23142) rename(x509up_u6854.tmp, x509up_u6854) failed with errno 2 05/30/18 10:20:38 (pid:23142) updateX509Proxy failed, could not rename file 05/30/18 10:20:38 (pid:23142) condor_write(): Socket closed when trying to write 29 bytes to <137.226.130.73:2574>, fd is 25 05/30/18 10:20:38 (pid:23142) Buf::write(): condor_write() failed 05/30/18 10:20:38 (pid:23142) Attempt to refresh X509 proxy FAILED. 05/30/18 10:20:38 (pid:23142) condor_write(): Socket closed when trying to write 29 bytes to <137.226.130.73:32728>, fd is 25 05/30/18 10:20:38 (pid:23142) Buf::write(): condor_write() failed Job output Notice: CEST=GMT+2
-- Wed May 30 09:58:38 CEST 2018 /var/lib/condor/execute/dir_23142/x509up_u6854 Validity Not Before: May 30 07:53:37 2018 GMT # The initial timestamp is okay Not After : May 30 08:54:40 2018 GMT -- Wed May 30 10:33:38 CEST 2018 /var/lib/condor/execute/dir_23142/x509up_u6854 Validity Not Before: May 30 08:25:38 2018 GMT # First renewal at 10:20:38 CEST / 08:20:38 GMT Not After : May 30 08:54:40 2018 GMT # "Not Before" has been updated instead of "Not After" -- Wed May 30 10:43:38 CEST 2018 /var/lib/condor/execute/dir_23142/x509up_u6854 Validity Not Before: May 30 08:35:38 2018 GMT Not After : May 30 08:54:40 2018 GMT -- Wed May 30 10:53:38 CEST 2018 /var/lib/condor/execute/dir_23142/x509up_u6854 Validity Not Before: May 30 08:45:38 2018 GMT Not After : May 30 08:54:40 2018 GMT -- Wed May 30 11:03:38 CEST 2018 /var/lib/condor/execute/dir_23142/x509up_u6854 Validity Not Before: May 30 08:55:38 2018 GMT Not After : May 30 08:54:40 2018 GMT -- Wed May 30 11:13:38 CEST 2018 /var/lib/condor/execute/dir_23142/x509up_u6854 Validity Not Before: May 30 09:05:38 2018 GMT Not After : May 30 08:54:40 2018 GMT Best regards, Lukas --
Lukas Koschmieder Steel Institute IEHK RWTH Aachen University Intzestraße 1 52072 Aachen Germany From: Koschmieder, Lukas
Sent: Tuesday, May 22, 2018 19:03 To: HTCondor-Users Mail List Subject: x509UserProxy not refreshing proxy (correctly) Hi Condor users,
I'm using x509UserProxy to delegate my GSI credentials to my execute nodes. Condor is supposed to refresh the proxy before it expires. But for some reason, the proxy's "expiration date" (Validity > Not Before) is not updated at all. Condor only keeps updating the "activation date" (Validity > Not After) as soon as the remaining time left hits the 30 minutes mark. -- Tue May 22 18:25:43 CEST 2018 Validity Not Before: May 22 17:55:22 2018 CEST Not After : May 22 19:00:10 2018 CEST -- Tue May 22 18:30:43 CEST 2018 Validity Not Before: May 22 17:55:22 2018 CEST Not After : May 22 19:00:10 2018 CEST -- Tue May 22 18:35:43 CEST 2018 Validity Not Before: May 22 18:27:43 2018 CEST Not After : May 22 19:00:10 2018 CEST -- Tue May 22 18:40:43 CEST 2018 Validity Not Before: May 22 18:27:43 2018 CEST Not After : May 22 19:00:10 2018 CEST -- Tue May 22 18:45:43 CEST 2018 Validity Not Before: May 22 18:37:43 2018 CEST Not After : May 22 19:00:10 2018 CEST I've already tried to enable ignore_globus_limited_proxy_flag in /etc/myproxy-server.config. But this doesn't make any difference. Do you have any ideas? This is the test job which produces the above log: #!/bin/bash while true; do echo -- date openssl x509 -in $X509_USER_PROXY -text | grep -A2 Valid sleep 300 done Cheers, Lukas --
Lukas Koschmieder Steel Institute IEHK RWTH Aachen University Intzestraße 1 52072 Aachen Germany |