Hi again,
I am really lost. I cannot find the reason why my Condor-G job
submission does not work on one machine. The two machines I use are
called gtx01 and gtx02.
Here is a table that shows the situation in detail:
     condor_submit   WS-GRAM grid    condor-G
     on              resource on     gridmanager on   result
-------------------------------------------------------------------
a)   gtx02           gtx02           gtx02            OK
b)   gtx01           gtx01           gtx01            does not work
c)   gtx02           gtx01           gtx02            OK
In case (c) I see that the gt4 container on gtx01 logs authorization
events:
2007-04-17 12:54:10,502 INFO authorization.ServiceAuthorizationChain
[ServiceThread-16,authorize:285] Authorized
"/O=Grid/OU=GlobusTest/OU=simpleCA-gtx01.esrf.fr/OU=esrf.fr/CN=W.-D
Klotz" to invoke
"{http://www.globus.org/08/2004/delegationService}requestSecurityToken".
2007-04-17 12:54:11,031 INFO authorization.ServiceAuthorizationChain
[ServiceThread-11,authorize:285] Authorized
"/O=Grid/OU=GlobusTest/OU=simpleCA-gtx01.esrf.fr/OU=esrf.fr/CN=W.-D
Klotz" to invoke
"{http://www.globus.org/namespaces/2004/10/rft}createReliableFileTransfer".
2007-04-17 12:54:11,573 INFO authorization.ServiceAuthorizationChain
[ServiceThread-10,authorize:285] Authorized
"/O=Grid/OU=GlobusTest/OU=simpleCA-gtx01.esrf.fr/OU=esrf.fr/CN=W.-D
Klotz" to invoke
"{http://www.globus.org/namespaces/2004/10/rft}subscribe".
2007-04-17 12:54:11,770 INFO authorization.ServiceAuthorizationChain
[ServiceThread-16,authorize:285] Authorized
"/O=Grid/OU=GlobusTest/OU=simpleCA-gtx01.esrf.fr/OU=esrf.fr/CN=W.-D
Klotz" to invoke
"{http://www.globus.org/namespaces/2004/10/rft}setTerminationTime".
2007-04-17 12:54:11,888 INFO authorization.ServiceAuthorizationChain
[ServiceThread-11,authorize:285] Authorized
"/O=Grid/OU=GlobusTest/OU=simpleCA-gtx01.esrf.fr/OU=esrf.fr/CN=W.-D
Klotz" to invoke "{http://www.globus.org/namespaces/2004/10/rft}start".
2007-04-17 12:54:41,544 INFO authorization.ServiceAuthorizationChain
[ServiceThread-11,authorize:285] Authorized
"/O=Grid/OU=GlobusTest/OU=simpleCA-gtx01.esrf.fr/OU=esrf.fr/CN=W.-D
Klotz" to invoke
"{http://www.globus.org/08/2004/delegationService}requestSecurityToken".
2007-04-17 12:54:42,113 INFO authorization.ServiceAuthorizationChain
[ServiceThread-10,authorize:285] Authorized
"/O=Grid/OU=GlobusTest/OU=simpleCA-gtx01.esrf.fr/OU=esrf.fr/CN=W.-D
Klotz" to invoke
"{http://www.globus.org/namespaces/2004/10/rft}createReliableFileTransfer".
2007-04-17 12:54:42,862 INFO authorization.ServiceAuthorizationChain
[ServiceThread-16,authorize:285] Authorized
"/O=Grid/OU=GlobusTest/OU=simpleCA-gtx01.esrf.fr/OU=esrf.fr/CN=W.-D
Klotz" to invoke
"{http://www.globus.org/namespaces/2004/10/rft}subscribe".
2007-04-17 12:54:43,016 INFO authorization.ServiceAuthorizationChain
[ServiceThread-11,authorize:285] Authorized
"/O=Grid/OU=GlobusTest/OU=simpleCA-gtx01.esrf.fr/OU=esrf.fr/CN=W.-D
Klotz" to invoke
"{http://www.globus.org/namespaces/2004/10/rft}setTerminationTime".
2007-04-17 12:54:43,192 INFO authorization.ServiceAuthorizationChain
[ServiceThread-10,authorize:285] Authorized
"/O=Grid/OU=GlobusTest/OU=simpleCA-gtx01.esrf.fr/OU=esrf.fr/CN=W.-D
Klotz" to invoke "{http://www.globus.org/namespaces/2004/10/rft}start".
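To compare what the container sees in the different cases, the authorization events above can be reduced to a list of (time, operation) pairs. This is only a minimal sketch in Python, assuming the container log has the same layout as the excerpt above (possibly wrapped over several lines); the function name authorized_operations is made up for this illustration and is not part of Globus.

```python
import re

# One event: timestamp, the ServiceAuthorizationChain marker, and the
# invoked operation in the form "{namespace}operation". DOTALL lets the
# pattern span events that were wrapped over several lines.
EVENT_RE = re.compile(
    r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d+\s+INFO\s+'
    r'authorization\.ServiceAuthorizationChain.*?'
    r'to invoke\s+"\{([^}]*)\}(\w+)"',
    re.DOTALL,
)


def authorized_operations(log_text):
    """Return a list of (timestamp, operation) for each authorization event."""
    return [(m.group(1), m.group(3)) for m in EVENT_RE.finditer(log_text)]
```

Applied to the excerpt above, this would list requestSecurityToken followed by the RFT operations (createReliableFileTransfer, subscribe, setTerminationTime, start), which makes it easier to see at a glance which calls reached the container.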
In case (b), when I submit the same job from gtx01 to the WS-GRAM on
the same machine, I never see any log output from the Globus container.
There seems to be no connection between the Globus container and
Condor-G.
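Since the container on gtx01 logs nothing in case (b), one basic thing to rule out is whether gtx01 can reach its own container port at all. Below is a minimal sketch in Python, assuming the container listens on port 8443 as in the grid_resource URL. The function names resolve_addresses and can_connect are made up for this illustration and are not part of Condor or Globus. The sketch only tests name resolution and a plain TCP connect; it says nothing about the GSI handshake or the delegation step.

```python
import socket


def resolve_addresses(host):
    """Return the sorted set of IP addresses that `host` resolves to."""
    try:
        infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # Name could not be resolved at all.
        return []
    return sorted({info[4][0] for info in infos})


def can_connect(host, port, timeout=5.0):
    """Return True if a plain TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and resolution failures.
        return False
```

Running print(resolve_addresses("gtx01.esrf.fr"), can_connect("gtx01.esrf.fr", 8443)) on both gtx01 and gtx02 would show whether the name resolves to the same address on both machines and whether the port is reachable from each of them.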
I do not know where else to look for the cause of this failure.
(See my earlier message below for more details.)
Thanks for any help.
WD Klotz
Hi,
I have two identical installations: same hardware, same OS (Suse90),
globus 4.0.3, condor 6.7.19 I386-LINUX_RH9. On both machines I
installed WS-GRAM. Machine gtx01.esrf.fr is condor central manager with
startd and schedd, machine gtx02.esrf.fr is condor node with startd and
schedd. When I submit a condor-G job on gtx02, everything works fine.
The same thing on gtx01 does not work.
Here is the job description file:
universe = grid
grid_resource = gt4 https://gtx02.esrf.fr:8443/wsrf/services/ManagedJobFactoryService Condor
executable = /users/klotz/pybench/pybench.py
transfer_executable = False
output = $(Cluster).$(Process).pybench.globus.out
stream_output = False
error = $(Cluster).$(Process).pybench.globus.err
stream_error = False
log = $(Cluster).pybench.globus.log
notification = Error
requirements = (Arch == "x86_64")
rank = Mips
initialdir = /users/klotz/tmp
queue
Here is a table that shows the situation in detail:
condor_submit   WS-GRAM grid    condor-G
on              resource on     gridmanager on   result
-------------------------------------------------------------------
gtx02           gtx02           gtx02            OK
gtx01           gtx01           gtx01            does not work
gtx02           gtx01           gtx02            OK
On both machines I logged the gridmanager (with
GRIDMANAGER_DEBUG=D_FULLDEBUG). The log files start to deviate when
the job is submitted to WS-GRAM by the gahp server.
Here is the output where things go well:
VVVVVVVVVVVVVVVVVVVVVVVVVV
4/16 01:11:33 [18223] found ProxyDelegation
4/16 01:11:33 [18223] GAHP[18272] <- 'RESULTS'
4/16 01:11:33 [18223] GAHP[18272] -> 'R'
4/16 01:11:33 [18223] GAHP[18272] -> 'S' '1'
4/16 01:11:33 [18223] GAHP[18272] -> '3' '0'
'https://160.103.6.173:8443/wsrf/services/DelegationService?a6fb09d0-eba6-11db-b037-fdb6d96494eb'
'NULL'
4/16 01:11:33 [18223] *** checkDelegation()
4/16 01:11:33 [18223] new delegation
4/16 01:11:33 [18223] https://160.103.6.173:8443/wsrf/services/DelegationService?a6fb09d0-eba6-11db-b037-fdb6d96494eb
4/16 01:11:33 [18223] signalling jobs for https://160.103.6.173:8443/wsrf/services/DelegationService?a6fb09d0-eba6-11db-b037-fdb6d96494eb
4/16 01:11:33 [18223] (2684.0) doEvaluateState called: gmState
GM_DELEGATE_PROXY, globusState 32
4/16 01:11:33 [18223] *** getDelegationURI(/tmp/x509up_u202)
4/16 01:11:33 [18223] found ProxyDelegation
4/16 01:11:33 [18223] (2684.0) gm state change: GM_DELEGATE_PROXY ->
GM_GENERATE_ID
4/16 01:11:33 [18223] GAHP[18272] <- 'GT4_GENERATE_SUBMIT_ID 5 '
4/16 01:11:33 [18223] GAHP[18272] -> 'S'
4/16 01:11:33 [18223] GAHP[18272] <- 'RESULTS'
4/16 01:11:33 [18223] GAHP[18272] -> 'R'
4/16 01:11:33 [18223] GAHP[18272] -> 'S' '1'
4/16 01:11:33 [18223] GAHP[18272] -> '5'
'uuid:a7108da0-eba6-11db-8c87-828f9e54ef05'
4/16 01:11:33 [18223] (2684.0) doEvaluateState called: gmState
GM_GENERATE_ID, globusState 32
4/16 01:11:33 [18223] (2684.0) gm state change: GM_GENERATE_ID ->
GM_SUBMIT_ID_SAVE
4/16 01:11:33 [18223] in doContactSchedd()
4/16 01:11:33 [18223] GRIDMANAGER_TIMEOUT_MULTIPLIER is undefined,
using default value of 0
4/16 01:11:33 [18223] SEC_DEBUG_PRINT_KEYS is undefined, using default
value of False
4/16 01:11:33 [18223] querying for removed/held jobs
4/16 01:11:33 [18223] Using constraint
((Owner=?="klotz"&&JobUniverse==9)) && ((Managed =!=
"ScheddDone")) && (JobStatus == 3 || JobStatus == 4 ||
(JobStatus == 5 && Managed =?= "External"))
4/16 01:11:33 [18223] Fetched 0 job ads from schedd
4/16 01:11:33 [18223] Updating classad values for 2684.0:
4/16 01:11:33 [18223] GridftpUrlBase = "gsiftp://gtx02.esrf.fr"
4/16 01:11:33 [18223] GlobusDelegationUri = "https://160.103.6.173:8443/wsrf/services/DelegationService?a6fb09d0-eba6-11db-b037-fdb6d96494eb"
4/16 01:11:33 [18223] GlobusSubmitId =
"uuid:a7108da0-eba6-11db-8c87-828f9e54ef05"
4/16 01:11:33 [18223] leaving doContactSchedd()
4/16 01:11:33 [18223] (2684.0) doEvaluateState called: gmState
GM_SUBMIT_ID_SAVE, globusState 32
4/16 01:11:33 [18223] (2684.0) gm state change: GM_SUBMIT_ID_SAVE ->
GM_SUBMIT
4/16 01:11:33 [18223] GAHP[18272] <- 'USE_CACHED_PROXY 1'
4/16 01:11:33 [18223] GAHP[18272] -> 'S'
4/16 01:11:33 [18223] GAHP[18272] <- 'GT4_GRAM_JOB_SUBMIT 6
uuid:a7108da0-eba6-11db-8c87-828f9e54ef05 https://gtx02.esrf.fr:8443/wsrf/services/ManagedJobFactoryService
Condor 1
<job><executable>/users/klotz/pybench/pybench.py</executable><directory>/${GLOBUS_SCRATCH_DIR}/job_a7108da0-eba6-11db-8c87-828f9e54ef05/</directory><argument>-n</argument><argument>100</argument><stdout>/${GLOBUS_SCRATCH_DIR}/job_a7108da0-eba6-11db-8c87-828f9e54ef05//2684.0.pybench.globus.out</stdout><stderr>/${GLOBUS_SCRATCH_DIR}/job_a7108da0-eba6-11db-8c87-828f9e54ef05//2684.0.pybench.globus.err</stderr><fileStageIn><maxAttempts>5</maxAttempts><transferCredentialEndpoint\
xsi:type="ns1:EndpointReferenceType"\
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"\
xmlns:ns1="http://schemas.xmlsoap.org/ws/2004/03/addressing"><ns1:Address\
xsi:type="ns1:AttributedURI">https://160.103.6.173:8443/wsrf/services/DelegationService</ns1:Address><ns1:ReferenceProperties\
xsi:type="ns1:ReferencePropertiesType"><ns1:DelegationKey\
xmlns:ns1="http://www.globus.org/08/2004/delegationService">a6fb09d0-eba6-11db-b037-fdb6d96494eb</ns1:DelegationKey></ns1:ReferenceProperties><ns1:ReferenceParameters\
xsi:type="ns1:ReferenceParametersType"/></transferCredentialEndpoint><transfer><sourceUrl>gsiftp://gtx02.esrf.fr/tmp/condor_g_scratch.0x85c9368.2988/empty_dir_u202/</sourceUrl><destinationUrl>file:///${GLOBUS_SCRATCH_DIR}</destinationUrl></transfer><transfer><sourceUrl>gsiftp://gtx02.esrf.fr/tmp/condor_g_scratch.0x85c9368.2988/empty_dir_u202/</sourceUrl><destinationUrl>file:///${GLOBUS_SCRATCH_DIR}/job_a7108da0-eba6-11db-8c87-828f9e54ef05/</destinationUrl></transfer></fileStageIn><fileStageOut><maxAttempts>5</maxAttempts><transferCredentialEndpoint\
xsi:type="ns1:EndpointReferenceType"\
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"\
xmlns:ns1="http://schemas.xmlsoap.org/ws/2004/03/addressing"><ns1:Address\
xsi:type="ns1:AttributedURI">https://160.103.6.173:8443/wsrf/services/DelegationService</ns1:Address><ns1:ReferenceProperties\
xsi:type="ns1:ReferencePropertiesType"><ns1:DelegationKey\
xmlns:ns1="http://www.globus.org/08/2004/delegationService">a6fb09d0-eba6-11db-b037-fdb6d96494eb</ns1:DelegationKey></ns1:ReferenceProperties><ns1:ReferenceParameters\
xsi:type="ns1:ReferenceParametersType"/></transferCredentialEndpoint><transfer><sourceUrl>file:///${GLOBUS_SCRATCH_DIR}/job_a7108da0-eba6-11db-8c87-828f9e54ef05/2684.0.pybench.globus.out</sourceUrl><destinationUrl>gsiftp://gtx02.esrf.fr/users/klotz/tmp/2684.0.pybench.globus.out</destinationUrl></transfer><transfer><sourceUrl>file:///${GLOBUS_SCRATCH_DIR}/job_a7108da0-eba6-11db-8c87-828f9e54ef05/2684.0.pybench.globus.err</sourceUrl><destinationUrl>gsiftp://gtx02.esrf.fr/users/klotz/tmp/2684.0.pybench.globus.err</destinationUrl></transfer></fileStageOut><fileCleanUp><transferCredentialEndpoint\
xsi:type="ns1:EndpointReferenceType"\
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"\
xmlns:ns1="http://schemas.xmlsoap.org/ws/2004/03/addressing"><ns1:Address\
xsi:type="ns1:AttributedURI">https://160.103.6.173:8443/wsrf/services/DelegationService</ns1:Address><ns1:ReferenceProperties\
xsi:type="ns1:ReferencePropertiesType"><ns1:DelegationKey\
xmlns:ns1="http://www.globus.org/08/2004/delegationService">a6fb09d0-eba6-11db-b037-fdb6d96494eb</ns1:DelegationKey></ns1:ReferenceProperties><ns1:ReferenceParameters\
xsi:type="ns1:ReferenceParametersType"/></transferCredentialEndpoint><deletion><file>file:///${GLOBUS_SCRATCH_DIR}/job_a7108da0-eba6-11db-8c87-828f9e54ef05/</file></deletion></fileCleanUp><jobCredentialEndpoint\
xsi:type="ns1:EndpointReferenceType"\
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"\
xmlns:ns1="http://schemas.xmlsoap.org/ws/2004/03/addressing"><ns1:Address\
xsi:type="ns1:AttributedURI">https://160.103.6.173:8443/wsrf/services/DelegationService</ns1:Address><ns1:ReferenceProperties\
xsi:type="ns1:ReferencePropertiesType"><ns1:DelegationKey\
xmlns:ns1="http://www.globus.org/08/2004/delegationService">a6fb09d0-eba6-11db-b037-fdb6d96494eb</ns1:DelegationKey></ns1:ReferenceProperties><ns1:ReferenceParameters\
xsi:type="ns1:ReferenceParametersType"/></jobCredentialEndpoint><stagingCredentialEndpoint\
xsi:type="ns1:EndpointReferenceType"\
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"\
xmlns:ns1="http://schemas.xmlsoap.org/ws/2004/03/addressing"><ns1:Address\
xsi:type="ns1:AttributedURI">https://160.103.6.173:8443/wsrf/services/DelegationService</ns1:Address><ns1:ReferenceProperties\
xsi:type="ns1:ReferencePropertiesType"><ns1:DelegationKey\
xmlns:ns1="http://www.globus.org/08/2004/delegationService">a6fb09d0-eba6-11db-b037-fdb6d96494eb</ns1:DelegationKey></ns1:ReferenceProperties><ns1:ReferenceParameters\
xsi:type="ns1:ReferenceParametersType"/></stagingCredentialEndpoint><holdState>StageIn</holdState></job>
NULL'
4/16 01:11:33 [18223] GAHP[18272] -> 'S'
4/16 01:11:36 [18223] GAHP[18272] <- 'RESULTS'
4/16 01:11:36 [18223] GAHP[18272] -> 'R'
4/16 01:11:36 [18223] GAHP[18272] -> 'S' '1'
4/16 01:11:36 [18223] GAHP[18272] -> '6' '0'
'https://160.103.6.173:8443/wsrf/services/ManagedExecutableJobService?a7108da0-eba6-11db-8c87-828f9e54ef05'
'NULL'
4/16 01:11:36 [18223] (2684.0) doEvaluateState called: gmState
GM_SUBMIT, globusState 32
4/16 01:11:36 [18223] (2684.0) gm state change: GM_SUBMIT ->
GM_SUBMIT_SET_LIFETIME
4/16 01:11:36 [18223] Starting sent lease
4/16 01:11:36 [18223] *** (2684.0) CalculateLease: new lease should
expire at 1176721896
4/16 01:11:36 [18223] GAHP[18272] <- 'GT4_SET_TERMINATION_TIME 7 https://160.103.6.173:8443/wsrf/services/ManagedExecutableJobService?a7108da0-eba6-11db-8c87-828f9e54ef05
43200'
4/16 01:11:36 [18223] GAHP[18272] -> 'S'
4/16 01:11:36 [18223] GAHP[18272] <- 'RESULTS'
4/16 01:11:36 [18223] GAHP[18272] -> 'R'
4/16 01:11:36 [18223] GAHP[18272] -> 'S' '1'
4/16 01:11:36 [18223] GAHP[18272] -> '7' '0' '1176721896' 'NULL'
4/16 01:11:36 [18223] (2684.0) doEvaluateState called: gmState
GM_SUBMIT_SET_LIFETIME, globusState 32
4/16 01:11:36 [18223] (2684.0) UpdateJobLeaseSent(1176721896)
4/16 01:11:36 [18223] (2684.0) SetJobLeaseTimers()
... and so on ...
And here is the output where the job is never submitted to WS-GRAM:
VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV
4/16 00:51:00 [16900] found ProxyDelegation
4/16 00:51:07 [16900] DaemonCore::IsPidAlive(): kill returned EPERM,
assuming pid 25846 is alive.
4/16 00:51:52 [16900] Received CHECK_LEASES signal
4/16 00:51:52 [16900] Evaluating periodic job policy expressions.
4/16 00:51:52 [16900] TOUCH_LOG_INTERVAL is undefined, using default
value of 60
4/16 00:51:52 [16900] in doContactSchedd()
4/16 00:51:52 [16900] GRIDMANAGER_TIMEOUT_MULTIPLIER is undefined,
using default value of 0
4/16 00:51:52 [16900] SEC_DEBUG_PRINT_KEYS is undefined, using default
value of False
4/16 00:51:52 [16900] querying for renewed leases
4/16 00:51:52 [16900] querying for removed/held jobs
4/16 00:51:52 [16900] Using constraint
((Owner=?="klotz"&&JobUniverse==9)) && ((Managed =!=
"ScheddDone")) && (JobStatus == 3 || JobStatus == 4 ||
(JobStatus == 5 && Managed =?= "External"))
4/16 00:51:52 [16900] Fetched 0 job ads from schedd
4/16 00:51:52 [16900] leaving doContactSchedd()
4/16 00:51:55 [16900] GAHP[16901] <- 'RESULTS'
4/16 00:51:55 [16900] GAHP[16901] -> 'S' '0'
4/16 00:52:52 [16900] Received CHECK_LEASES signal
4/16 00:52:52 [16900] Evaluating periodic job policy expressions.
4/16 00:52:52 [16900] TOUCH_LOG_INTERVAL is undefined, using default
value of 60
4/16 00:52:52 [16900] in doContactSchedd()
4/16 00:52:52 [16900] GRIDMANAGER_TIMEOUT_MULTIPLIER is undefined,
using default value of 0
4/16 00:52:52 [16900] SEC_DEBUG_PRINT_KEYS is undefined, using default
value of False
4/16 00:52:52 [16900] querying for renewed leases
4/16 00:52:52 [16900] querying for removed/held jobs
4/16 00:52:52 [16900] Using constraint
((Owner=?="klotz"&&JobUniverse==9)) && ((Managed =!=
"ScheddDone")) && (JobStatus == 3 || JobStatus == 4 ||
(JobStatus == 5 && Managed =?= "External"))
4/16 00:52:52 [16900] Fetched 0 job ads from schedd
4/16 00:52:52 [16900] leaving doContactSchedd()
4/16 00:52:55 [16900] GAHP[16901] <- 'RESULTS'
4/16 00:52:55 [16900] GAHP[16901] -> 'S' '0'
4/16 00:53:07 [16900] DaemonCore::IsPidAlive(): kill returned EPERM,
assuming pid 25846 is alive.
4/16 00:53:52 [16900] Received CHECK_LEASES signal
4/16 00:53:52 [16900] Evaluating periodic job policy expressions.
4/16 00:53:52 [16900] TOUCH_LOG_INTERVAL is undefined, using default
value of 60
4/16 00:53:52 [16900] in doContactSchedd()
4/16 00:53:52 [16900] GRIDMANAGER_TIMEOUT_MULTIPLIER is undefined,
using default value of 0
4/16 00:53:52 [16900] SEC_DEBUG_PRINT_KEYS is undefined, using default
value of False
4/16 00:53:52 [16900] querying for renewed leases
4/16 00:53:52 [16900] querying for removed/held jobs
4/16 00:53:52 [16900] Using constraint
((Owner=?="klotz"&&JobUniverse==9)) && ((Managed =!=
"ScheddDone")) && (JobStatus == 3 || JobStatus == 4 ||
(JobStatus == 5 && Managed =?= "External"))
4/16 00:53:52 [16900] Fetched 0 job ads from schedd
4/16 00:53:52 [16900] leaving doContactSchedd()
4/16 00:53:55 [16900] GAHP[16901] <- 'RESULTS'
4/16 00:53:55 [16900] GAHP[16901] -> 'S' '0'
4/16 00:54:52 [16900] Getting monitoring info for pid 16900
4/16 00:54:52 [16900] Received CHECK_LEASES signal
4/16 00:54:52 [16900] Evaluating periodic job policy expressions.
4/16 00:54:52 [16900] TOUCH_LOG_INTERVAL is undefined, using default
value of 60
4/16 00:54:52 [16900] in doContactSchedd()
4/16 00:54:52 [16900] GRIDMANAGER_TIMEOUT_MULTIPLIER is undefined,
using default value of 0
4/16 00:54:52 [16900] SEC_DEBUG_PRINT_KEYS is undefined, using default
value of False
4/16 00:54:52 [16900] querying for renewed leases
4/16 00:54:52 [16900] querying for removed/held jobs
4/16 00:54:52 [16900] Using constraint
((Owner=?="klotz"&&JobUniverse==9)) && ((Managed =!=
"ScheddDone")) && (JobStatus == 3 || JobStatus == 4 ||
(JobStatus == 5 && Managed =?= "External"))
4/16 00:54:52 [16900] Fetched 0 job ads from schedd
4/16 00:54:52 [16900] leaving doContactSchedd()
4/16 00:54:55 [16900] GAHP[16901] <- 'RESULTS'
4/16 00:54:55 [16900] GAHP[16901] -> 'S' '0'
4/16 00:55:07 [16900] DaemonCore::IsPidAlive(): kill returned EPERM,
assuming pid 25846 is alive.
4/16 00:55:52 [16900] Received CHECK_LEASES signal
4/16 00:55:52 [16900] Evaluating periodic job policy expressions.
4/16 00:55:52 [16900] TOUCH_LOG_INTERVAL is undefined, using default
value of 60
4/16 00:55:52 [16900] in doContactSchedd()
4/16 00:55:52 [16900] GRIDMANAGER_TIMEOUT_MULTIPLIER is undefined,
using default value of 0
4/16 00:55:52 [16900] SEC_DEBUG_PRINT_KEYS is undefined, using default
value of False
4/16 00:55:52 [16900] querying for renewed leases
4/16 00:55:52 [16900] querying for removed/held jobs
4/16 00:55:52 [16900] Using constraint
((Owner=?="klotz"&&JobUniverse==9)) && ((Managed =!=
"ScheddDone")) && (JobStatus == 3 || JobStatus == 4 ||
(JobStatus == 5 && Managed =?= "External"))
4/16 00:55:52 [16900] Fetched 0 job ads from schedd
4/16 00:55:52 [16900] leaving doContactSchedd()
4/16 00:55:55 [16900] GAHP[16901] <- 'RESULTS'
4/16 00:55:55 [16900] GAHP[16901] -> 'S' '0'
4/16 00:55:57 [16900] *** checkDelegation()
4/16 00:55:57 [16900] new delegation
4/16 00:55:58 [16900] *** checkDelegation()
4/16 00:55:58 [16900] new delegation
4/16 00:55:58 [16900]
delegate_credentials(https://gtx01.esrf.fr:8443/wsrf/services/DelegationFactoryService)
failed!
4/16 00:56:52 [16900] Received CHECK_LEASES signal
4/16 00:56:52 [16900] Evaluating periodic job policy expressions.
4/16 00:56:52 [16900] TOUCH_LOG_INTERVAL is undefined, using default
value of 60
4/16 00:56:52 [16900] in doContactSchedd()
4/16 00:56:52 [16900] GRIDMANAGER_TIMEOUT_MULTIPLIER is undefined,
using default value of 0
4/16 00:56:52 [16900] SEC_DEBUG_PRINT_KEYS is undefined, using default
value of False
4/16 00:56:52 [16900] querying for renewed leases
4/16 00:56:52 [16900] querying for removed/held jobs
4/16 00:56:52 [16900] Using constraint
((Owner=?="klotz"&&JobUniverse==9)) && ((Managed =!=
"ScheddDone")) && (JobStatus == 3 || JobStatus == 4 ||
(JobStatus == 5 && Managed =?= "External"))
4/16 00:56:52 [16900] Fetched 0 job ads from schedd
4/16 00:56:52 [16900] leaving doContactSchedd()
4/16 00:56:55 [16900] GAHP[16901] <- 'RESULTS'
4/16 00:56:55 [16900] GAHP[16901] -> 'S' '0'
4/16 00:57:07 [16900] DaemonCore::IsPidAlive(): kill returned EPERM,
assuming pid 25846 is alive.
4/16 00:57:52 [16900] Received CHECK_LEASES signal
4/16 00:57:52 [16900] Evaluating periodic job policy expressions.
4/16 00:57:52 [16900] TOUCH_LOG_INTERVAL is undefined, using default
value of 60
4/16 00:57:52 [16900] in doContactSchedd()
4/16 00:57:52 [16900] GRIDMANAGER_TIMEOUT_MULTIPLIER is undefined,
using default value of 0
4/16 00:57:52 [16900] SEC_DEBUG_PRINT_KEYS is undefined, using default
value of False
4/16 00:57:52 [16900] querying for renewed leases
4/16 00:57:52 [16900] querying for removed/held jobs
4/16 00:57:52 [16900] Using constraint
((Owner=?="klotz"&&JobUniverse==9)) && ((Managed =!=
"ScheddDone")) && (JobStatus == 3 || JobStatus == 4 ||
(JobStatus == 5 && Managed =?= "External"))
... and so on forever ...
condor_q -globus on the machine where the job stays in state 'I'
forever shows "UNSUBMITTED".
Does anybody know why it works on one machine and not on the other?
Can anybody tell me how the gahp communicates with WS-GRAM, or how to
get more logging from the gahp?
I would be glad to get a hint about where to look to solve this problem.
Cheers.....
--
Dr.W-D Klotz - Europ. Synch. Rad. Facility (ESRF) - 6 r Jules Horowitz,
BP 220, 38043 Grenoble, FRANCE
work: +33(0)4.76.88.29.21 fax:...24.27 mobile: +33(0)6.87.38.59.27
mail: wdklotz@xxxxxxxxx or klotz@xxxxxxx chat: skype
Please avoid sending me Word(.doc) or PowerPoint(.ppt) attachments.