[Condor-users] error delegating credential to startd: delegateX509Proxy
- Date: Fri, 12 Jan 2007 17:53:31 -0600
- From: Renzo Borgatti <borgatti@xxxxxxxx>
- Subject: [Condor-users] error delegating credential to startd: delegateX509Proxy
Hi,
Nearly all sections running on our Condor pool (6.9.1, with gLExec
enabled) suffer from frequent restarts (10-15 per hour) of the start
daemon on the worker node. The ShadowLog shows the error below.
Do you have any idea what is causing this problem?
1/12 17:32:40 (167275.0) (18120): UserLog = /export/CafCondor/cafIn/submit_fabhap_medium_218532_8401/job.log
1/12 17:32:40 (167275.0) (18120): *** Reserved Swap = 5120
1/12 17:32:40 (167275.0) (18120): *** Free Swap = 2070936
1/12 17:32:40 (167275.0) (18120): in RemoteResource::initStartdInfo()
1/12 17:32:40 (167275.0) (18120): Granting remote host "131.225.212.184" (<131.225.212.184:33443>) WRITE and DAEMON permission.
1/12 17:32:40 (167275.0) (18120): trying early delegation (for glexec) of proxy: /export/CafCondor/tickets/x509cc_fabhap
1/12 17:32:40 (167275.0) (18120): Entering DCStartd::delegateX509Proxy()
1/12 17:32:40 (167673.0) (1479): Proxy timestamps: remote estimated 1168634306, local 1168558121 (-76185 difference)
1/12 17:32:40 (167275.0) (18120): attempt to connect to <131.225.212.184:33443> failed: Connection refused (connect errno = 111).
1/12 17:32:40 (167275.0) (18120): error delegating credential to startd: DCStartd::delegateX509Proxy: Failed to send command DELEGATE_GSI_CRED_STARTD to the startd
1/12 17:32:40 (167275.0) (18120): Entering DCStartd::activateClaim()
1/12 17:32:40 (167275.0) (18120): attempt to connect to <131.225.212.184:33443> failed: Connection refused (connect errno = 111).
1/12 17:32:40 (167275.0) (18120): DCStartd::activateClaim: Failed to send command ACTIVATE_CLAIM to the startd
1/12 17:32:40 (167275.0) (18120): setting exit reason on vm2@8302@fcdfcaf1035.fnal.gov to 108
1/12 17:32:40 (167275.0) (18120): Resource vm2@8302@fcdfcaf1035.fnal.gov changing state from PRE to FINISHED
1/12 17:32:40 (167275.0) (18120): Job 167275.0 is being evicted
1/12 17:32:40 (167275.0) (18120): Entering DCStartd::deactivateClaim (forceful)
1/12 17:32:40 (167275.0) (18120): attempt to connect to <131.225.212.184:33443> failed: Connection refused (connect errno = 111).
1/12 17:32:40 (167275.0) (18120): RemoteResource::killStarter(): Could not send command to startd
1/12 17:32:40 (167275.0) (18120): logEvictEvent with unknown reason (108), aborting
1/12 17:32:40 (167275.0) (18120): STARTCOMMAND: starting 1111 to <131.225.240.106:32903> on TCP port 45499.
1/12 17:32:40 (167275.0) (18120): SECMAN: command 1111 to <131.225.240.106:32903> on TCP port 45499 (blocking).
1/12 17:32:40 (167275.0) (18120): SECMAN: no cached key for {<131.225.240.106:32903>,<1111>}.
Many Thanks
Renzo