[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [HTCondor-users] x509UserProxy not refreshing proxy (correctly)



Hi,

the proxy renewal still doesn't work. Condor still updates "Not Before" (= activation timestamp) instead of "Not After" (= expiration timestamp). See job output below.

But at least I have found some errors in StarterLog.slot1. The errors appear in the log when Condor tries to renew the proxy for the very first time (30 mins before expiration). Subsequent renewal attempts do not produce any log entries (although they still fail). Do you have an idea what could be the problem?

StarterLog.slot1

05/30/18 09:54:54 (pid:25402) Got SIGTERM. Performing graceful shutdown.
05/30/18 09:54:54 (pid:25402) ShutdownGraceful all jobs.
05/30/18 09:54:54 (pid:25402) Process exited, pid=25406, signal=15
05/30/18 09:54:54 (pid:25402) Last process exited, now Starter is exiting
05/30/18 09:54:54 (pid:25402) **** condor_starter (condor_STARTER) pid 25402 EXITING WITH STATUS 0
05/30/18 09:58:37 (pid:23142) ******************************************************
05/30/18 09:58:37 (pid:23142) ** condor_starter (CONDOR_STARTER) STARTING UP
05/30/18 09:58:37 (pid:23142) ** /usr/sbin/condor_starter
05/30/18 09:58:37 (pid:23142) ** SubsystemInfo: name=STARTER type=STARTER(8) class=DAEMON(1)
05/30/18 09:58:37 (pid:23142) ** Configuration: subsystem:STARTER local:<NONE> class:DAEMON
05/30/18 09:58:37 (pid:23142) ** $CondorVersion: 8.6.10 Mar 12 2018 BuildID: 435200 $
05/30/18 09:58:37 (pid:23142) ** $CondorPlatform: x86_64_RedHat7 $
05/30/18 09:58:37 (pid:23142) ** PID = 23142
05/30/18 09:58:37 (pid:23142) ** Log last touched 5/30 09:54:54
05/30/18 09:58:37 (pid:23142) ******************************************************
05/30/18 09:58:37 (pid:23142) Using config source: /etc/condor/condor_config
05/30/18 09:58:37 (pid:23142) Using local config sources:
05/30/18 09:58:37 (pid:23142)    /etc/condor/config.d/aixvipmap.conf
05/30/18 09:58:37 (pid:23142)    /etc/condor/config.d/slot.conf
05/30/18 09:58:37 (pid:23142)    /etc/condor/config.d/slot_user.conf
05/30/18 09:58:37 (pid:23142)    /etc/condor/condor_config.local
05/30/18 09:58:37 (pid:23142) config Macros = 167, Sorted = 166, StringBytes = 4850, TablesBytes = 6076
05/30/18 09:58:37 (pid:23142) CLASSAD_CACHING is OFF
05/30/18 09:58:37 (pid:23142) Daemon Log is logging: D_ALWAYS D_ERROR
05/30/18 09:58:37 (pid:23142) SharedPortEndpoint: waiting for connections to named socket 25276_fa7d_4
05/30/18 09:58:37 (pid:23142) DaemonCore: command socket at <137.226.130.81:9618?addrs=137.226.130.81-9618+[--1]-9618&noUDP&sock=25276_fa7d_4>
05/30/18 09:58:37 (pid:23142) DaemonCore: private command socket at <137.226.130.81:9618?addrs=137.226.130.81-9618+[--1]-9618&noUDP&sock=25276_fa7d_4>
05/30/18 09:58:37 (pid:23142) Communicating with shadow <137.226.130.73:9618?addrs=137.226.130.73-9618+[--1]-9618&noUDP&sock=378_15c8_8>
05/30/18 09:58:37 (pid:23142) Submitting machine is "tux223.iehk.rwth-aachen.de"
05/30/18 09:58:37 (pid:23142) setting the orig job name in starter
05/30/18 09:58:37 (pid:23142) setting the orig job iwd in starter
05/30/18 09:58:37 (pid:23142) Chirp config summary: IO false, Updates false, Delayed updates true.
05/30/18 09:58:37 (pid:23142) Initialized IO Proxy.
05/30/18 09:58:37 (pid:23142) Done setting resource limits
05/30/18 09:58:37 (pid:23142) File transfer completed successfully.
05/30/18 09:58:38 (pid:23142) Job 253.0 set to execute immediately
05/30/18 09:58:38 (pid:23142) Starting a VANILLA universe job with ID: 253.0
05/30/18 09:58:38 (pid:23142) IWD: /var/lib/condor/execute/dir_23142
05/30/18 09:58:38 (pid:23142) Output file: streaming to remote file x509-test-token-refresh.output
05/30/18 09:58:38 (pid:23142) Error file: streaming to remote file x509-test-token-refresh.error
05/30/18 09:58:38 (pid:23142) Renice expr "0" evaluated to 0
05/30/18 09:58:38 (pid:23142) About to exec /bin/bash x509-test-token-refresh.sh
05/30/18 09:58:38 (pid:23142) Running job as user lkosch
05/30/18 09:58:38 (pid:23142) Create_Process succeeded, pid=23146
05/30/18 10:20:38 (pid:23142) condor_write(): Socket closed when trying to write 13 bytes to , fd is 15
05/30/18 10:20:38 (pid:23142) Buf::write(): condor_write() failed
05/30/18 10:20:38 (pid:23142) SharedPortEndpoint: failed to send final status (success) for SHARED_PORT_PASS_SOCK
05/30/18 10:20:38 (pid:23142) condor_write(): Socket closed when trying to write 434 bytes to <137.226.130.73:2574>, fd is 25
05/30/18 10:20:38 (pid:23142) Buf::write(): condor_write() failed
05/30/18 10:20:38 (pid:23142) relisock_gsi_get (read from socket) failure
05/30/18 10:20:38 (pid:23142) ReliSock::get_x509_delegation_finish(): delegation failed to complete: x509_receive_delegation failed at line 1697
05/30/18 10:20:38 (pid:23142) rename(x509up_u6854.tmp, x509up_u6854) failed with errno 2
05/30/18 10:20:38 (pid:23142) updateX509Proxy failed, could not rename file
05/30/18 10:20:38 (pid:23142) condor_write(): Socket closed when trying to write 29 bytes to <137.226.130.73:2574>, fd is 25
05/30/18 10:20:38 (pid:23142) Buf::write(): condor_write() failed
05/30/18 10:20:38 (pid:23142) Attempt to refresh X509 proxy FAILED.
05/30/18 10:20:38 (pid:23142) condor_write(): Socket closed when trying to write 29 bytes to <137.226.130.73:32728>, fd is 25
05/30/18 10:20:38 (pid:23142) Buf::write(): condor_write() failed

Job output

Notice: CEST=GMT+2

--
Wed May 30 09:58:38 CEST 2018
/var/lib/condor/execute/dir_23142/x509up_u6854
        Validity
            Not Before: May 30 07:53:37 2018 GMT  # The initial timestamp is okay
            Not After : May 30 08:54:40 2018 GMT
--
Wed May 30 10:33:38 CEST 2018
/var/lib/condor/execute/dir_23142/x509up_u6854
        Validity
            Not Before: May 30 08:25:38 2018 GMT  # First renewal at 10:20:38 CEST / 08:20:38 GMT
            Not After : May 30 08:54:40 2018 GMT  # "Not Before" has been updated instead of "Not After"
--
Wed May 30 10:43:38 CEST 2018
/var/lib/condor/execute/dir_23142/x509up_u6854
        Validity
            Not Before: May 30 08:35:38 2018 GMT
            Not After : May 30 08:54:40 2018 GMT
--
Wed May 30 10:53:38 CEST 2018
/var/lib/condor/execute/dir_23142/x509up_u6854
        Validity
            Not Before: May 30 08:45:38 2018 GMT
            Not After : May 30 08:54:40 2018 GMT
--
Wed May 30 11:03:38 CEST 2018
/var/lib/condor/execute/dir_23142/x509up_u6854
        Validity
            Not Before: May 30 08:55:38 2018 GMT
            Not After : May 30 08:54:40 2018 GMT
--
Wed May 30 11:13:38 CEST 2018
/var/lib/condor/execute/dir_23142/x509up_u6854
        Validity
            Not Before: May 30 09:05:38 2018 GMT
            Not After : May 30 08:54:40 2018 GMT

Best regards,
Lukas

--
Lukas Koschmieder
Steel Institute IEHK
RWTH Aachen University
Intzestraße 1
52072 Aachen
Germany



From: Koschmieder, Lukas
Sent: Tuesday, May 22, 2018 19:03
To: HTCondor-Users Mail List
Subject: x509UserProxy not refreshing proxy (correctly)
 

Hi Condor users,

I'm using x509UserProxy to delegate my GSI credentials to my execute nodes. Condor is supposed to refresh the proxy before it expires. But for some reason, the proxy's "expiration date" (Validity > Not Before) is not updated at all. Condor only keeps updating the "activation date" (Validity > Not After) as soon as the remaining time left hits the 30 minutes mark.

--
Tue May 22 18:25:43 CEST 2018
        Validity
            Not Before: May 22 17:55:22 2018 CEST
            Not After : May 22 19:00:10 2018 CEST
--
Tue May 22 18:30:43 CEST 2018
        Validity
            Not Before: May 22 17:55:22 2018 CEST
            Not After : May 22 19:00:10 2018 CEST
--
Tue May 22 18:35:43 CEST 2018
        Validity
            Not Before: May 22 18:27:43 2018 CEST
            Not After : May 22 19:00:10 2018 CEST
--
Tue May 22 18:40:43 CEST 2018
        Validity
            Not Before: May 22 18:27:43 2018 CEST
            Not After : May 22 19:00:10 2018 CEST
--
Tue May 22 18:45:43 CEST 2018
        Validity
            Not Before: May 22 18:37:43 2018 CEST
            Not After : May 22 19:00:10 2018 CEST


I've already tried to enable ignore_globus_limited_proxy_flag in /etc/myproxy-server.config. But this doesn't make any difference.

Do you have any ideas?

This is the test job which produces the above log:

#!/bin/bash
while true; do
echo --
date
openssl x509 -in $X509_USER_PROXY -text | grep -A2 Valid
sleep 300
done

Cheers,
Lukas

--
Lukas Koschmieder
Steel Institute IEHK
RWTH Aachen University
Intzestraße 1
52072 Aachen
Germany