Hi all, especially Jaime,

I'm still having trouble submitting grid-universe jobs to a GT4-WS gatekeeper. Gory details below.

On 2 Oct 2006, at 16:58, Jaime Frey wrote:
On Sep 29, 2006, at 10:46 AM, Andrew Walker wrote:
On 27 Sep 2006, at 16:37, Jaime Frey wrote:
On Sep 26, 2006, at 8:30 AM, Andrew Walker wrote:

Having recently upgraded from condor 6.6 to 6.8(.0), I'm trying to submit a grid universe gt4 job to a remote gatekeeper in front of a condor pool. Currently my job is failing with the error "Failed to create proxy delegation" (which is Code 0 Subcode 0 in the user log file). Does anybody have any idea how to debug this?

The gatekeeper is running globus 4.0.1 and I can successfully submit jobs using the pre-WS gram (using both the gt2 grid universe and the globus universe). At the moment I have pre-staged the executable and am not attempting to recover the output back to the submit machine - all I want to do is run a shell script on a condor node and return the output to the gatekeeper. I think my problem is with the condor-g submit machine, but I have access to log and configuration files at both ends.

snip...

One possibility is that gridftp is not correctly traversing the firewalls between the gatekeeper and the condor submit machine (I have two firewalls to worry about - both filter traffic in both directions). What are the network requirements for a gt4 resource? I guess the gatekeeper has to connect back to the submitting machine on TCP port 2811. However, I don't think this is the immediate problem as I'm not seeing any activity (or failing outbound network connections) from the gatekeeper.

The problem is not with the gridftp server, but with delegating your proxy to the Delegation service on the gatekeeper machine. The best way to debug this is to try Globus' WS GRAM client to submit an equivalent job.
Try this:

  globusrun-ws -submit -job-delegate -factory cartman.niees.group.cam.ac.uk -factory-type Condor -job-command /bin/date

This will delegate a credential, then submit a job that uses that credential. If this fails, then you know that the problem is not related to Condor-G.

A couple other notes:

The 'globus_rsl' attribute doesn't work for WS GRAM jobs. Instead, there's a globus_xml attribute, for use with WS GRAM's XML-based RSL description.

The gridftp server Condor-G starts up for WS GRAM file transfers listens on a dynamic port, not 2811. If you have a hole in your firewall and LOWPORT/HIGHPORT set appropriately in your Condor config file, then the gridftp server shouldn't have any problems.

Jaime,

Thanks for the info - it turned out that this was a firewall issue resolved by moving my tests to a new pair of machines. However, I have now run up against a new problem. (I'm now submitting from a 6.8.1 condor machine to a gatekeeper running globus 4.0.2 in front of a 6.8.1 condor pool; firewalls between the two machines have been set to allow any traffic in either direction free access). I have simplified my script a bit too in order to try and work out what is going on - all I want to see is the hostname of the execute node on the remote condor pool:

  Universe = grid
  grid_resource = gt4 cete.niees.group.cam.ac.uk Condor
  Executable = /bin/hostname
  Notification = NEVER
  Output = host_$(PROCESS).out
  Error = host.err
  Log = host.log
  Queue 1

Again the job enters the local queue, the gridftp server starts up and then the job fails and enters the held state. This time I have a different error in the log (Globus error: Staging error for RSL element fileStageIn):

000 (192.000.000) 09/29 16:31:55 Job submitted from host: <131.111.20.163:9661>
...
017 (192.000.000) 09/29 16:32:50 Job submitted to Globus
    RM-Contact: cete.niees.group.cam.ac.uk
    JM-Contact: https://128.232.232.28:8443/wsrf/services/ManagedExecutableJobService?b8486b60-4fcf-11db-ba9e-8b423672fa7f
    Can-Restart-JM: 0
...
027 (192.000.000) 09/29 16:32:50 Job submitted to grid resource
    GridResource: gt4 cete.niees.group.cam.ac.uk Condor
    GridJobId: gt4 https://128.232.232.28:8443/wsrf/services/ManagedExecutableJobService?b8486b60-4fcf-11db-ba9e-8b423672fa7f
...
012 (192.000.000) 09/29 16:32:53 Job was held.
    Globus error: Staging error for RSL element fileStageIn.
    Code 0 Subcode 0
...

However, running the equivalent command using the globus client works (and the returned output file shows that the job ran on a condor execute node):

  globusrun-ws -streaming -stdout-file testout -submit -job-delegate -factory cete.niees.group.cam.ac.uk -factory-type Condor -job-command /bin/hostname
  Delegating user credentials...Done.
  Submitting job...Done.
  Job ID: uuid:3cc015da-4faa-11db-8c27-00042388e7a7
  Termination time: 09/30/2006 11:04 GMT
  Current job state: Pending
  Current job state: Active
  Current job state: CleanUp-Hold
  Current job state: CleanUp
  Current job state: Done
  Destroying job...Done.
  Cleaning up any delegated credentials...Done.

Using condor's GT2 interface also works as expected:

  Universe = grid
  grid_resource = gt2 cete.niees.group.cam.ac.uk/jobmanager-condor
  Executable = /bin/hostname
  Notification = NEVER
  Output = host_$(PROCESS).out
  Error = host.err
  Log = host.log
  Queue 1

And I see exactly the same behavior replacing all the condor jobmanager commands with fork commands. Again I'm after some help finding a starting place for debugging. Does anybody have any idea where to start?

Condor is trying to transfer /bin/hostname to the GRAM server. globusrun-ws is using the /bin/hostname that's already there. Something about the transfer is failing. You can confirm this by adding 'transfer_executable = false' to your submit file.
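For anyone following along, the user-log excerpts quoted above can be scanned mechanically for hold events. What follows is only a minimal sketch: it assumes the log layout seen in this thread (a three-digit event code, a cluster.proc.subproc triple in parentheses, a date and time, detail lines, and a "..." terminator), and the names parse_user_log and hold_reasons are hypothetical helpers, not part of Condor.

```python
import re

# Event header as it appears in the excerpts in this thread, e.g.
# "012 (192.000.000) 09/29 16:32:53 Job was held."
EVENT_HEADER = re.compile(
    r"^(?P<code>\d{3}) \((?P<cluster>\d+)\.(?P<proc>\d+)\.(?P<sub>\d+)\) "
    r"(?P<date>\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) (?P<text>.*)$"
)


def parse_user_log(text):
    """Split user-log text into a list of event dictionaries."""
    events = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "...":
            # "..." closes the event currently being collected.
            if current is not None:
                events.append(current)
                current = None
            continue
        match = EVENT_HEADER.match(line)
        if match:
            if current is not None:
                events.append(current)
            current = {
                "code": match["code"],
                "cluster": int(match["cluster"]),
                "proc": int(match["proc"]),
                "when": match["date"] + " " + match["time"],
                "text": match["text"],
                "details": [],
            }
        elif current is not None and line:
            # Any other non-empty line is a detail line of the open event.
            current["details"].append(line)
    if current is not None:
        events.append(current)
    return events


def hold_reasons(events):
    """Return the joined detail lines of every 'Job was held' (012) event."""
    return [" ".join(e["details"]) for e in events if e["code"] == "012"]


if __name__ == "__main__":
    sample = (
        "012 (192.000.000) 09/29 16:32:53 Job was held.\n"
        "    Globus error: Staging error for RSL element fileStageIn.\n"
        "    Code 0 Subcode 0\n"
        "...\n"
    )
    for reason in hold_reasons(parse_user_log(sample)):
        print(reason)
```

Run against host.log, this would list one hold reason per held job, which makes it easier to compare attempts such as the runs with and without transfer_executable.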
Interestingly adding "transfer_executable = false" to the submit file gives exactly the same behavior - that is I see:

012 (200.000.000) 10/10 14:35:34 Job was held.
    Globus error: Staging error for RSL element fileStageIn.
    Code 0 Subcode 0

in the user log file.

Can you transfer files from your submit machine to cete.niees.group.cam.ac.uk using globus-url-copy?

This appears to be working correctly. The command:

  globus-url-copy file:///home/amw75/testfile gsiftp://cete.niees.group.cam.ac.uk:2811/home/andreww/testfile

creates a copy of "testfile" on cete. Is that what you wanted me to test?

If you have GRIDMANAGER_DEBUG=D_FULLDEBUG in your condor config file, you should see a java exception stack trace in your gridmanager daemon log. That may give us more detail on what exactly is failing. The server's container output should contain the same stack trace.

The log is below (and quite long) - job 200 is the submission that ends up in the "held" state, 201 is the gsiftp server that gets started on the submit machine and 128.232.232.28 is cete - the gatekeeper. The interesting line seems to be in the java stack trace:

10/10 14:35:33 [13000] GAHP[13003] (stderr) -> Error authenticating user at source/dest hostnull. Caused by java.io.EOFException
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.ftp.extended.GridFTPInputStream.readMsg(GridFTPInputStream.java:100)

but I'm not clear why authentication is failing at this stage.
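Since the gridmanager log is long, a small filter can pull the non-frame GAHP stderr messages (including any "Caused by" line) out of it. This is a rough sketch only: it assumes one log entry per line in the "MM/DD HH:MM:SS [pid] GAHP[pid] (stderr) -> text" form quoted in this message, and gahp_messages and root_causes are hypothetical names, not Condor tools.

```python
import re

# Matches GAHP stderr entries in the form quoted in this message, e.g.
# "10/10 14:35:33 [13000] GAHP[13003] (stderr) -> stackTrace:"
GAHP_STDERR = re.compile(
    r"^(?P<stamp>\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[\d+\] GAHP\[\d+\] "
    r"\(stderr\) ->\s*(?P<text>.*)$"
)


def gahp_messages(log_lines):
    """Yield (timestamp, text) for GAHP stderr entries that are not stack frames."""
    for line in log_lines:
        match = GAHP_STDERR.match(line.rstrip("\n"))
        if not match:
            continue  # not a GAHP stderr entry
        text = match["text"].strip()
        # Skip blank entries and Java stack frames ("at package.Class.method(...)").
        if text and not text.startswith("at "):
            yield match["stamp"], text


def root_causes(log_lines):
    """Return the stderr messages that name an underlying exception."""
    return [text for _, text in gahp_messages(log_lines) if "Caused by" in text]


if __name__ == "__main__":
    sample = [
        "10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.exec.service.exec.RunThread.run(RunThread.java:93)",
        "10/10 14:35:33 [13000] GAHP[13003] (stderr) -> Error authenticating user at source/dest hostnull. Caused by java.io.EOFException",
    ]
    for cause in root_causes(sample):
        print(cause)
```

To use it on a real log, pass an open file object (for example the gridmanager log for the submitting user) in place of the sample list; the filter reads it line by line.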
Cheers,
Andrew

10/10 14:34:51 ******************************************************
10/10 14:34:51 ** condor_gridmanager (CONDOR_GRIDMANAGER) STARTING UP
10/10 14:34:51 ** /Condor/RH9/condor-6.8.1-dynamic/sbin/condor_gridmanager
10/10 14:34:51 ** $CondorVersion: 6.8.1 Sep 17 2006 $
10/10 14:34:51 ** $CondorPlatform: I386-LINUX_RHEL3 $
10/10 14:34:51 ** PID = 13000
10/10 14:34:51 ** Log last touched time unavailable (No such file or directory)
10/10 14:34:51 ******************************************************
10/10 14:34:51 Using config source: /home/condor/condor_config
10/10 14:34:51 Using local config sources:
10/10 14:34:51 /home/condor/condor_config.local
10/10 14:34:51 DaemonCore: Command Socket at <131.111.20.163:9652>
10/10 14:34:51 Welcome to the all-singing, all dancing, "amazing" GridManager!
10/10 14:34:51 [13000] Getting monitoring info for pid 13000
10/10 14:34:51 [13000] Checking proxies
10/10 14:34:52 [13000] DaemonCore: in SendAliveToParent()
10/10 14:34:52 [13000] DaemonCore: attempting to connect to '<131.111.20.163:9661>'
10/10 14:34:54 [13000] Received ADD_JOBS signal
10/10 14:34:54 [13000] in doContactSchedd()
10/10 14:34:54 [13000] querying for new jobs
10/10 14:34:54 [13000] Using constraint ((Owner=?="amw75"&&JobUniverse==9)) && (Managed =!= "ScheddDone") && (((Matched =!= FALSE) && (JobStatus != 5)) || (Managed =?= "External"))
10/10 14:34:54 [13000] Using job type GT4 for job 200.0
10/10 14:34:54 [13000] (200.0) SetJobLeaseTimers()
10/10 14:34:54 [13000] Found job 200.0 --- inserting
10/10 14:34:54 [13000] Fetched 1 new job ads from schedd
10/10 14:34:54 [13000] querying for removed/held jobs
10/10 14:34:54 [13000] Using constraint ((Owner=?="amw75"&&JobUniverse==9)) && ((Managed =!= "ScheddDone")) && (JobStatus == 3 || JobStatus == 4 || (JobStatus == 5 && Managed =?= "External"))
10/10 14:34:54 [13000] Fetched 0 job ads from schedd
10/10 14:34:54 [13000] leaving doContactSchedd()
10/10 14:34:54 [13000] gahp server not up yet, delaying ping
10/10 14:34:54 [13000] *** UpdateLeases called
10/10 14:34:54 [13000] Leases not supported, cancelling timer
10/10 14:34:54 [13000] *** checkDelegation()
10/10 14:34:54 [13000] gahp server not up yet, delaying checkDelegation
10/10 14:34:54 [13000] GridftpServer: Scanning schedd for previously submitted gridftp server jobs
10/10 14:34:54 [13000] GridftpServer: Submitting job for proxy '/C=UK/O=eScience/OU=Cambridge/L=UCS/CN=andrew walker'
10/10 14:34:54 [13000] entering FileTransfer::SimpleInit
10/10 14:34:54 [13000] entering FileTransfer::UploadFiles (final_transfer=0)
10/10 14:34:54 [13000] entering FileTransfer::Upload
10/10 14:34:54 [13000] entering FileTransfer::DoUpload
10/10 14:34:54 [13000] DoUpload: send file /tmp/condor_g_scratch.0x8572df0.6105/grid-mapfile
10/10 14:34:54 [13000] ReliSock::put_file_with_permissions(): going to send permissions 100644
10/10 14:34:54 [13000] put_file: going to send from filename /tmp/condor_g_scratch.0x8572df0.6105/grid-mapfile
10/10 14:34:54 [13000] put_file: Found file size 61
10/10 14:34:54 [13000] put_file: senting 61 bytes
10/10 14:34:54 [13000] ReliSock: put_file: sent 61 bytes
10/10 14:34:54 [13000] DoUpload: send file /tmp/condor_g_scratch.0x8572df0.6105/master_proxy.2
10/10 14:34:54 [13000] DoUpload: send file /Condor/Debian/condor/libexec/gridftp_wrapper.sh
10/10 14:34:54 [13000] ReliSock::put_file_with_permissions(): going to send permissions 100755
10/10 14:34:54 [13000] put_file: going to send from filename /Condor/Debian/condor/libexec/gridftp_wrapper.sh
10/10 14:34:54 [13000] put_file: Found file size 111
10/10 14:34:54 [13000] put_file: senting 111 bytes
10/10 14:34:54 [13000] ReliSock: put_file: sent 111 bytes
10/10 14:34:54 [13000] DoUpload: exiting at 2090
10/10 14:34:57 [13000] (200.0) doEvaluateState called: gmState GM_INIT, globusState 32
10/10 14:34:57 [13000] GAHP server pid = 13003
10/10 14:34:58 [13000] GAHP server version: $GahpVersion: 1.4.0 Jun 02 2005 GT4 GAHP (GT-4.0.0) $
10/10 14:34:58 [13000] GAHP[13003] <- 'COMMANDS'
10/10 14:34:58 [13000] GAHP[13003] -> 'S' 'ASYNC_MODE_OFF' 'ASYNC_MODE_ON' 'CACHE_PROXY_FROM_FILE' 'COMMANDS' 'GASS_SERVER_INIT' 'GT4_DELEGATE_CREDENTIAL' 'GT4_GENERATE_SUBMIT_ID' 'GT4_GRAM_CALLBACK_ALLOW' 'GT4_GRAM_JOB_CALLBACK_REGISTER' 'GT4_GRAM_JOB_DESTROY' 'GT4_GRAM_JOB_START' 'GT4_GRAM_JOB_STATUS' 'GT4_GRAM_JOB_SUBMIT' 'GT4_GRAM_PING' 'GT4_REFRESH_CREDENTIAL' 'GT4_SET_TERMINATION_TIME' 'INITIALIZE_FROM_FILE' 'QUIT' 'REFRESH_PROXY_FROM_FILE' 'RESPONSE_PREFIX' 'RESULTS' 'UNCACHE_PROXY' 'USE_CACHED_PROXY' 'VERSION'
10/10 14:34:58 [13000] GAHP[13003] <- 'RESPONSE_PREFIX GAHP:'
10/10 14:34:58 [13000] GAHP[13003] -> 'S'
10/10 14:34:58 [13000] GAHP[13003] <- 'ASYNC_MODE_ON'
10/10 14:34:58 [13000] GAHP[13003] -> 'S'
10/10 14:34:58 [13000] GAHP[13003] <- 'INITIALIZE_FROM_FILE /tmp/condor_g_scratch.0x8572df0.6105/master_proxy.2'
10/10 14:34:59 [13000] GAHP[13003] -> 'S'
10/10 14:34:59 [13000] GAHP[13003] <- 'CACHE_PROXY_FROM_FILE 2 /tmp/condor_g_scratch.0x8572df0.6105/master_proxy.2'
10/10 14:34:59 [13000] GAHP[13003] -> 'S'
10/10 14:34:59 [13000] GAHP[13003] <- 'CACHE_PROXY_FROM_FILE 1 /tmp/x509up_u1501'
10/10 14:34:59 [13000] GAHP[13003] -> 'S'
10/10 14:34:59 [13000] GAHP[13003] <- 'GT4_GRAM_CALLBACK_ALLOW 2'
10/10 14:35:00 [13000] GAHP[13003] -> 'S' '1'
10/10 14:35:00 [13000] (200.0) gm state change: GM_INIT -> GM_START
10/10 14:35:00 [13000] (200.0) gm state change: GM_START -> GM_CLEAR_REQUEST
10/10 14:35:00 [13000] (200.0) UpdateJobLeaseSent(-1)
10/10 14:35:00 [13000] (200.0) gm state change: GM_CLEAR_REQUEST -> GM_UNSUBMITTED
10/10 14:35:00 [13000] GridftpServer: Updating job leases for gridftp server jobs
10/10 14:35:00 [13000] GAHP[13003] <- 'GT4_GRAM_PING 3 https://cete.niees.group.cam.ac.uk'
10/10 14:35:00 [13000] GAHP[13003] -> 'S'
10/10 14:35:00 [13000] *** checkDelegation()
10/10 14:35:00 [13000] in doContactSchedd()
10/10 14:35:00 [13000] querying for removed/held jobs
10/10 14:35:00 [13000] Using constraint ((Owner=?="amw75"&&JobUniverse==9)) && ((Managed =!= "ScheddDone")) && (JobStatus == 3 || JobStatus == 4 || (JobStatus == 5 && Managed =?= "External"))
10/10 14:35:00 [13000] Fetched 0 job ads from schedd
10/10 14:35:00 [13000] 201.0 job status: 2
10/10 14:35:00 [13000] leaving doContactSchedd()
10/10 14:35:00 [13000] (200.0) doEvaluateState called: gmState GM_UNSUBMITTED, globusState 32
10/10 14:35:00 [13000] (200.0) gm state change: GM_UNSUBMITTED -> GM_DELEGATE_PROXY
10/10 14:35:00 [13000] getDelegationError(): failed to find ProxyDelegation for proxy /tmp/x509up_u1501
10/10 14:35:00 [13000] *** getDelegationURI(/tmp/x509up_u1501)
10/10 14:35:00 [13000] creating new ProxyDelegation
10/10 14:35:00 [13000] GAHP[13003] <- 'RESULTS'
10/10 14:35:00 [13000] GAHP[13003] -> 'R'
10/10 14:35:00 [13000] GAHP[13003] -> 'S' '1'
10/10 14:35:00 [13000] GAHP[13003] -> '3' '0' 'NULL'
10/10 14:35:00 [13000] *** checkDelegation()
10/10 14:35:00 [13000] new delegation
10/10 14:35:00 [13000] GAHP[13003] <- 'USE_CACHED_PROXY 1'
10/10 14:35:00 [13000] GAHP[13003] -> 'S'
10/10 14:35:00 [13000] GAHP[13003] <- 'GT4_DELEGATE_CREDENTIAL 4 https://cete.niees.group.cam.ac.uk/wsrf/services/DelegationFactoryService'
10/10 14:35:00 [13000] GAHP[13003] -> 'S'
10/10 14:35:00 [13000] resource https://cete.niees.group.cam.ac.uk is now up
10/10 14:35:00 [13000] (200.0) doEvaluateState called: gmState GM_DELEGATE_PROXY, globusState 32
10/10 14:35:00 [13000] *** getDelegationURI(/tmp/x509up_u1501)
10/10 14:35:00 [13000] found ProxyDelegation
10/10 14:35:06 [13000] DaemonCore::IsPidAlive(): kill returned EPERM, assuming pid 6105 is alive.
10/10 14:35:19 [13000] GAHP[13003] <- 'RESULTS'
10/10 14:35:19 [13000] GAHP[13003] -> 'R'
10/10 14:35:19 [13000] GAHP[13003] -> 'S' '1'
10/10 14:35:19 [13000] GAHP[13003] -> '4' '0' 'https://128.232.232.28:8443/wsrf/services/DelegationService?4045c610-5864-11db-b222-fa0bb9bed964' 'NULL'
10/10 14:35:19 [13000] *** checkDelegation()
10/10 14:35:19 [13000] new delegation
10/10 14:35:19 [13000] https://128.232.232.28:8443/wsrf/services/DelegationService?4045c610-5864-11db-b222-fa0bb9bed964
10/10 14:35:19 [13000] signalling jobs for https://128.232.232.28:8443/wsrf/services/DelegationService?4045c610-5864-11db-b222-fa0bb9bed964
10/10 14:35:19 [13000] (200.0) doEvaluateState called: gmState GM_DELEGATE_PROXY, globusState 32
10/10 14:35:19 [13000] *** getDelegationURI(/tmp/x509up_u1501)
10/10 14:35:19 [13000] found ProxyDelegation
10/10 14:35:19 [13000] (200.0) gm state change: GM_DELEGATE_PROXY -> GM_GENERATE_ID
10/10 14:35:19 [13000] GAHP[13003] <- 'GT4_GENERATE_SUBMIT_ID 5 '
10/10 14:35:19 [13000] GAHP[13003] -> 'S'
10/10 14:35:19 [13000] GAHP[13003] <- 'RESULTS'
10/10 14:35:19 [13000] GAHP[13003] -> 'R'
10/10 14:35:19 [13000] GAHP[13003] -> 'S' '1'
10/10 14:35:19 [13000] GAHP[13003] -> '5' 'uuid:2bbf41d0-5864-11db-b2c4-d34c60b0c3c6'
10/10 14:35:19 [13000] (200.0) doEvaluateState called: gmState GM_GENERATE_ID, globusState 32
10/10 14:35:19 [13000] (200.0) gm state change: GM_GENERATE_ID -> GM_SUBMIT_ID_SAVE
10/10 14:35:19 [13000] in doContactSchedd()
10/10 14:35:19 [13000] querying for removed/held jobs
10/10 14:35:19 [13000] Using constraint ((Owner=?="amw75"&&JobUniverse==9)) && ((Managed =!= "ScheddDone")) && (JobStatus == 3 || JobStatus == 4 || (JobStatus == 5 && Managed =?= "External"))
10/10 14:35:19 [13000] Fetched 0 job ads from schedd
10/10 14:35:19 [13000] Updating classad values for 200.0:
10/10 14:35:19 [13000] GridftpUrlBase = "gsiftp://holbein.escience.cam.ac.uk:41225"
10/10 14:35:19 [13000] GlobusDelegationUri = "https://128.232.232.28:8443/wsrf/services/DelegationService?4045c610-5864-11db-b222-fa0bb9bed964"
10/10 14:35:19 [13000] GlobusSubmitId = "uuid:2bbf41d0-5864-11db-b2c4-d34c60b0c3c6"
10/10 14:35:19 [13000] leaving doContactSchedd()
10/10 14:35:19 [13000] (200.0) doEvaluateState called: gmState GM_SUBMIT_ID_SAVE, globusState 32
10/10 14:35:19 [13000] (200.0) gm state change: GM_SUBMIT_ID_SAVE -> GM_SUBMIT
10/10 14:35:19 [13000] GAHP[13003] <- 'GT4_GRAM_JOB_SUBMIT 6 uuid:2bbf41d0-5864-11db-b2c4-d34c60b0c3c6 https://cete.niees.group.cam.ac.uk Condor 1 <job><executable>/bin/date</executable><directory>/${GLOBUS_SCRATCH_DIR}/job_2bbf41d0-5864-11db-b2c4-d34c60b0c3c6/</directory><stdout>/${GLOBUS_SCRATCH_DIR}/job_2bbf41d0-5864-11db-b2c4-d34c60b0c3c6//date_0.out</stdout><stderr>/${GLOBUS_SCRATCH_DIR}/job_2bbf41d0-5864-11db-b2c4-d34c60b0c3c6//date.err</stderr><fileStageIn><maxAttempts>5</maxAttempts><transferCredentialEndpoint\ xsi:type="ns1:EndpointReferenceType"\ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"\ xmlns:ns1="http://schemas.xmlsoap.org/ws/2004/03/addressing"><ns1:Address\ xsi:type="ns1:AttributedURI">https://128.232.232.28:8443/wsrf/services/DelegationService</ns1:Address><ns1:ReferenceProperties\ xsi:type="ns1:ReferencePropertiesType"><ns1:DelegationKey\ xmlns:ns1="http://www.globus.org/08/2004/delegationService">4045c610-5864-11db-b222-fa0bb9bed964</ns1:DelegationKey></ns1:ReferenceProperties><ns1:ReferenceParameters\ xsi:type="ns1:ReferenceParametersType"/></transferCredentialEndpoint><transfer><sourceUrl>gsiftp://holbein.escience.cam.ac.uk:41225/tmp/condor_g_empty_dir_u1501/</sourceUrl><destinationUrl>file:///${GLOBUS_SCRATCH_DIR}</destinationUrl><rftOptions><sourceSubjectName>/C=UK/O=eScience/OU=Cambridge/L=UCS/CN=andrew\ walker</sourceSubjectName></rftOptions></transfer><transfer><sourceUrl>gsiftp://holbein.escience.cam.ac.uk:41225/tmp/condor_g_empty_dir_u1501/</sourceUrl><destinationUrl>file:///${GLOBUS_SCRATCH_DIR}/job_2bbf41d0-5864-11db-b2c4-d34c60b0c3c6/</destinationUrl><rftOptions><sourceSubjectName>/C=UK/O=eScience/OU=Cambridge/L=UCS/CN=andrew\ walker</sourceSubjectName></rftOptions></transfer></fileStageIn><fileStageOut><maxAttempts>5</maxAttempts><transferCredentialEndpoint\ xsi:type="ns1:EndpointReferenceType"\ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"\ xmlns:ns1="http://schemas.xmlsoap.org/ws/2004/03/addressing"><ns1:Address\ xsi:type="ns1:AttributedURI">https://128.232.232.28:8443/wsrf/services/DelegationService</ns1:Address><ns1:ReferenceProperties\ xsi:type="ns1:ReferencePropertiesType"><ns1:DelegationKey\ xmlns:ns1="http://www.globus.org/08/2004/delegationService">4045c610-5864-11db-b222-fa0bb9bed964</ns1:DelegationKey></ns1:ReferenceProperties><ns1:ReferenceParameters\ xsi:type="ns1:ReferenceParametersType"/></transferCredentialEndpoint><transfer><sourceUrl>file:///${GLOBUS_SCRATCH_DIR}/job_2bbf41d0-5864-11db-b2c4-d34c60b0c3c6/date_0.out</sourceUrl><destinationUrl>gsiftp://holbein.escience.cam.ac.uk:41225/home/amw75/cete_grid_gt4/date_0.out</destinationUrl><rftOptions><destinationSubjectName>/C=UK/O=eScience/OU=Cambridge/L=UCS/CN=andrew\ walker</destinationSubjectName></rftOptions></transfer><transfer><sourceUrl>file:///${GLOBUS_SCRATCH_DIR}/job_2bbf41d0-5864-11db-b2c4-d34c60b0c3c6/date.err</sourceUrl><destinationUrl>gsiftp://holbein.escience.cam.ac.uk:41225/home/amw75/cete_grid_gt4/date.err</destinationUrl><rftOptions><destinationSubjectName>/C=UK/O=eScience/OU=Cambridge/L=UCS/CN=andrew\ walker</destinationSubjectName></rftOptions></transfer></fileStageOut><fileCleanUp><transferCredentialEndpoint\ xsi:type="ns1:EndpointReferenceType"\ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"\ xmlns:ns1="http://schemas.xmlsoap.org/ws/2004/03/addressing"><ns1:Address\ xsi:type="ns1:AttributedURI">https://128.232.232.28:8443/wsrf/services/DelegationService</ns1:Address><ns1:ReferenceProperties\ xsi:type="ns1:ReferencePropertiesType"><ns1:DelegationKey\ xmlns:ns1="http://www.globus.org/08/2004/delegationService">4045c610-5864-11db-b222-fa0bb9bed964</ns1:DelegationKey></ns1:ReferenceProperties><ns1:ReferenceParameters\ xsi:type="ns1:ReferenceParametersType"/></transferCredentialEndpoint><deletion><file>file:///${GLOBUS_SCRATCH_DIR}/job_2bbf41d0-5864-11db-b2c4-d34c60b0c3c6/</file></deletion></fileCleanUp><jobCredentialEndpoint\ xsi:type="ns1:EndpointReferenceType"\ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"\ xmlns:ns1="http://schemas.xmlsoap.org/ws/2004/03/addressing"><ns1:Address\ xsi:type="ns1:AttributedURI">https://128.232.232.28:8443/wsrf/services/DelegationService</ns1:Address><ns1:ReferenceProperties\ xsi:type="ns1:ReferencePropertiesType"><ns1:DelegationKey\ xmlns:ns1="http://www.globus.org/08/2004/delegationService">4045c610-5864-11db-b222-fa0bb9bed964</ns1:DelegationKey></ns1:ReferenceProperties><ns1:ReferenceParameters\ xsi:type="ns1:ReferenceParametersType"/></jobCredentialEndpoint><stagingCredentialEndpoint\ xsi:type="ns1:EndpointReferenceType"\ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"\ xmlns:ns1="http://schemas.xmlsoap.org/ws/2004/03/addressing"><ns1:Address\ xsi:type="ns1:AttributedURI">https://128.232.232.28:8443/wsrf/services/DelegationService</ns1:Address><ns1:ReferenceProperties\ xsi:type="ns1:ReferencePropertiesType"><ns1:DelegationKey\ xmlns:ns1="http://www.globus.org/08/2004/delegationService">4045c610-5864-11db-b222-fa0bb9bed964</ns1:DelegationKey></ns1:ReferenceProperties><ns1:ReferenceParameters\ xsi:type="ns1:ReferenceParametersType"/></stagingCredentialEndpoint><holdState>StageIn</holdState></job> NULL'
10/10 14:35:19 [13000] GAHP[13003] -> 'S'
10/10 14:35:22 [13000] GAHP[13003] <- 'RESULTS'
10/10 14:35:22 [13000] GAHP[13003] -> 'R'
10/10 14:35:22 [13000] GAHP[13003] -> 'S' '1'
10/10 14:35:22 [13000] GAHP[13003] -> '6' '0' 'https://128.232.232.28:8443/wsrf/services/ManagedExecutableJobService?2bbf41d0-5864-11db-b2c4-d34c60b0c3c6' 'NULL'
10/10 14:35:22 [13000] (200.0) doEvaluateState called: gmState GM_SUBMIT, globusState 32
10/10 14:35:22 [13000] (200.0) gm state change: GM_SUBMIT -> GM_SUBMIT_SET_LIFETIME
10/10 14:35:22 [13000] Starting sent lease
10/10 14:35:22 [13000] *** (200.0) CalculateLease: new lease should expire at 1160530522
10/10 14:35:22 [13000] GAHP[13003] <- 'GT4_SET_TERMINATION_TIME 7 https://128.232.232.28:8443/wsrf/services/ManagedExecutableJobService?2bbf41d0-5864-11db-b2c4-d34c60b0c3c6 43200'
10/10 14:35:22 [13000] GAHP[13003] -> 'S'
10/10 14:35:24 [13000] in doContactSchedd()
10/10 14:35:24 [13000] querying for removed/held jobs
10/10 14:35:24 [13000] Using constraint ((Owner=?="amw75"&&JobUniverse==9)) && ((Managed =!= "ScheddDone")) && (JobStatus == 3 || JobStatus == 4 || (JobStatus == 5 && Managed =?= "External"))
10/10 14:35:24 [13000] Fetched 0 job ads from schedd
10/10 14:35:24 [13000] Updating classad values for 200.0:
10/10 14:35:24 [13000] GridJobId = "gt4 https://128.232.232.28:8443/wsrf/services/ManagedExecutableJobService?2bbf41d0-5864-11db-b2c4-d34c60b0c3c6"
10/10 14:35:24 [13000] leaving doContactSchedd()
10/10 14:35:24 [13000] GAHP[13003] <- 'RESULTS'
10/10 14:35:24 [13000] GAHP[13003] -> 'R'
10/10 14:35:24 [13000] GAHP[13003] -> 'S' '1'
10/10 14:35:24 [13000] GAHP[13003] -> '7' '0' '1160530523' 'NULL'
10/10 14:35:24 [13000] (200.0) doEvaluateState called: gmState GM_SUBMIT_SET_LIFETIME, globusState 32
10/10 14:35:24 [13000] (200.0) UpdateJobLeaseSent(1160530523)
10/10 14:35:24 [13000] (200.0) SetJobLeaseTimers()
10/10 14:35:24 [13000] (200.0) gm state change: GM_SUBMIT_SET_LIFETIME -> GM_SUBMIT_SAVE
10/10 14:35:29 [13000] in doContactSchedd()
10/10 14:35:29 [13000] querying for removed/held jobs
10/10 14:35:29 [13000] Using constraint ((Owner=?="amw75"&&JobUniverse==9)) && ((Managed =!= "ScheddDone")) && (JobStatus == 3 || JobStatus == 4 || (JobStatus == 5 && Managed =?= "External"))
10/10 14:35:29 [13000] Fetched 0 job ads from schedd
10/10 14:35:29 [13000] Updating classad values for 200.0:
10/10 14:35:29 [13000] JobLeaseExpiration = 1160530523
10/10 14:35:29 [13000] leaving doContactSchedd()
10/10 14:35:29 [13000] (200.0) doEvaluateState called: gmState GM_SUBMIT_SAVE, globusState 32
10/10 14:35:29 [13000] (200.0) gm state change: GM_SUBMIT_SAVE -> GM_SUBMIT_COMMIT
10/10 14:35:29 [13000] GAHP[13003] <- 'GT4_GRAM_JOB_START 8 https://128.232.232.28:8443/wsrf/services/ManagedExecutableJobService?2bbf41d0-5864-11db-b2c4-d34c60b0c3c6'
10/10 14:35:29 [13000] GAHP[13003] -> 'S'
10/10 14:35:29 [13000] GAHP[13003] <- 'RESULTS'
10/10 14:35:29 [13000] GAHP[13003] -> 'R'
10/10 14:35:29 [13000] GAHP[13003] -> 'S' '1'
10/10 14:35:29 [13000] GAHP[13003] -> '8' '0' 'NULL'
10/10 14:35:29 [13000] (200.0) doEvaluateState called: gmState GM_SUBMIT_COMMIT, globusState 32
10/10 14:35:29 [13000] (200.0) gm state change: GM_SUBMIT_COMMIT -> GM_SUBMITTED
10/10 14:35:29 [13000] *** (200.0) CalculateLease: no new lease at present
10/10 14:35:31 [13000] GAHP[13003] <- 'RESULTS'
10/10 14:35:31 [13000] GAHP[13003] -> 'R'
10/10 14:35:31 [13000] GAHP[13003] -> 'S' '1'
10/10 14:35:31 [13000] GAHP[13003] -> '2' 'https://128.232.232.28:8443/wsrf/services/ManagedExecutableJobService?2bbf41d0-5864-11db-b2c4-d34c60b0c3c6' 'StageIn' 'NULL' '0'
10/10 14:35:31 [13000] (200.0) gram callback: state StageIn, fault (null), exit code 0
10/10 14:35:31 [13000] (200.0) doEvaluateState called: gmState GM_SUBMITTED, globusState 32
10/10 14:35:31 [13000] (200.0) globus state change: Unsubmitted -> StageIn
10/10 14:35:31 [13000] (200.0) Writing globus submit record to user logfile
10/10 14:35:31 [13000] (200.0) Writing grid submit record to user logfile
10/10 14:35:31 [13000] *** (200.0) CalculateLease: no new lease at present
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> Full fault for job https://128.232.232.28:8443/wsrf/services/ManagedExecutableJobService?2bbf41d0-5864-11db-b2c4-d34c60b0c3c6:
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> fault type: org.globus.exec.generated.StagingFaultType:
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> attribute: fileStageIn
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> description:
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> Staging error for RSL element fileStageIn.
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> faultReason:
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> faultString:
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> gt2ErrorCode: 0
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> originator: Address: https://128.232.232.28:8443/wsrf/services/ManagedJobFactoryService
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> Reference property[0]:
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> <ns1:ResourceID xmlns:ns1="http://www.globus.org/namespaces/2004/10/gram/job">2bbf41d0-5864-11db-b2c4-d34c60b0c3c6</ns1:ResourceID>
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> stackTrace:
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> org.globus.exec.generated.StagingFaultType: Staging error for RSL element fileStageIn.
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> Timestamp: Tue Oct 10 14:36:05 BST 2006
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> Originator: Address: https://128.232.232.28:8443/wsrf/services/ManagedJobFactoryService
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> Reference property[0]:
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> <ns1:ResourceID xmlns:ns1="http://www.globus.org/namespaces/2004/10/gram/job">2bbf41d0-5864-11db-b2c4-d34c60b0c3c6</ns1:ResourceID>
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at java.lang.reflect.Constructor.newInstance(Constructor.java:274)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at java.lang.Class.newInstance0(Class.java:308)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at java.lang.Class.newInstance(Class.java:261)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.exec.utils.FaultUtils.makeFault(FaultUtils.java:485)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.exec.utils.FaultUtils.createStagingFault(FaultUtils.java:363)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.exec.service.exec.StateMachine.processStageInResponseState(StateMachine.java:995)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at java.lang.reflect.Method.invoke(Method.java:324)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.exec.service.exec.StateMachine.processState(StateMachine.java:367)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.exec.service.exec.RunThread.run(RunThread.java:93)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> Error authenticating user at source/dest hostnull. Caused by java.io.EOFException
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.ftp.extended.GridFTPInputStream.readMsg(GridFTPInputStream.java:100)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.gsi.gssapi.net.GssInputStream.hasData(GssInputStream.java:81)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.gsi.gssapi.net.GssInputStream.read(GssInputStream.java:55)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at sun.nio.cs.StreamDecoder$CharsetSD.readBytes(StreamDecoder.java:408)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at sun.nio.cs.StreamDecoder$CharsetSD.implRead(StreamDecoder.java:450)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:182)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at java.io.InputStreamReader.read(InputStreamReader.java:167)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at java.io.BufferedReader.fill(BufferedReader.java:136)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at java.io.BufferedReader.readLine(BufferedReader.java:299)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at java.io.BufferedReader.readLine(BufferedReader.java:362)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.ftp.vanilla.Reply.<init>(Reply.java:66)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.ftp.vanilla.FTPControlChannel.read(FTPControlChannel.java:257)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.ftp.extended.GridFTPControlChannel.authenticate(GridFTPControlChannel.java:278)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.ftp.GridFTPClient.authenticate(GridFTPClient.java:99)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.ftp.GridFTPClient.authenticate(GridFTPClient.java:84)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.transfer.reliable.service.TransferClient.authenticateSource(TransferClient.java:538)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.transfer.reliable.service.TransferClient.authenticate(TransferClient.java:527)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.transfer.reliable.service.TransferWork.getNewClient(TransferWork.java:432)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.transfer.reliable.service.TransferWork.getTransferClient(TransferWork.java:369)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.transfer.reliable.service.TransferWork.run(TransferWork.java:692)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.wsrf.impl.work.WorkManagerImpl$WorkWrapper.run(WorkManagerImpl.java:345)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at java.lang.Thread.run(Thread.java:534)
10/10 14:35:33 [13000] GAHP[13003] (stderr) ->
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at java.lang.reflect.Constructor.newInstance(Constructor.java:494)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at java.lang.Class.newInstance0(Class.java:350)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at java.lang.Class.newInstance(Class.java:303)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.apache.axis.encoding.ser.BeanDeserializer.<init>(BeanDeserializer.java:90)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.apache.axis.encoding.ser.BeanDeserializer.<init>(BeanDeserializer.java:76)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.exec.generated.StagingFaultType.getDeserializer(StagingFaultType.java:152)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at java.lang.reflect.Method.invoke(Method.java:585)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.apache.axis.encoding.DeserializationContext.getDeserializerForClass(DeserializationContext.java:510)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.apache.axis.encoding.ser.BeanDeserializer.onStartChild(BeanDeserializer.java:250)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.apache.axis.encoding.DeserializationContext.startElement(DeserializationContext.java:1035)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.apache.xerces.parsers.AbstractSAXParser.startElement(Unknown Source)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at javax.xml.parsers.SAXParser.parse(SAXParser.java:375)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.apache.axis.encoding.DeserializationContext.parse(DeserializationContext.java:227)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.wsrf.encoding.ObjectDeserializer.toObject(ObjectDeserializer.java:59)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at condor.gahp.gt4.JobListener.deliver(JobListener.java:157)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.wsrf.impl.notification.NotificationConsumerProvider.notify(NotificationConsumerProvider.java:109)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at java.lang.reflect.Method.invoke(Method.java:585)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.apache.axis.providers.java.RPCProvider.invokeMethod(RPCProvider.java:384)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.apache.axis.providers.java.RPCProvider.processMessage(RPCProvider.java:281)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.apache.axis.providers.java.JavaProvider.invoke(JavaProvider.java:319)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.apache.axis.strategies.InvocationStrategy.visit(InvocationStrategy.java:32)
10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at
org.apache.axis.SimpleChain.doVisiting(SimpleChain.java:118) 10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.apache.axis.SimpleChain.invoke(SimpleChain.java:83) 10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.apache.axis.handlers.soap.SOAPService.invoke(SOAPService.java:450) 10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.apache.axis.server.AxisServer.invoke(AxisServer.java:285) 10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.wsrf.container.ServiceThread.doPost(ServiceThread.java:665) 10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.wsrf.container.ServiceThread.process(ServiceThread.java:396) 10/10 14:35:33 [13000] GAHP[13003] (stderr) -> at org.globus.wsrf.container.ServiceThread.run(ServiceThread.java:300) 10/10 14:35:33 [13000] GAHP[13003] (stderr) -> 10/10 14:35:33 [13000] GAHP[13003] (stderr) -> stateWhenFailureOccurred: StageIn 10/10 14:35:33 [13000] GAHP[13003] (stderr) -> timestamp: java.util.GregorianCalendar[time=1160487365693,areFieldsSet=true,areAllFieldsSet=true,lenient=true,zone=sun.util.calendar.ZoneInfo[id="GMT",offset=0,dstSavings=0,useDaylight=false,transitions=0,lastRule=null],firstDayOfWeek=2,minimalDaysInFirstWeek=4,ERA=1,YEAR=2006,MONTH=9,WEEK_OF_YEAR=41,WEEK_OF_MONTH=2,DAY_OF_MONTH=10,DAY_OF_YEAR=283,DAY_OF_WEEK=3,DAY_OF_WEEK_IN_MONTH=2,AM_PM=1,HOUR=1,HOUR_OF_DAY=13,MINUTE=36,SECOND=5,MILLISECOND=693,ZONE_OFFSET=0,DST_OFFSET=0] 10/10 14:35:33 [13000] GAHP[13003] (stderr) -> Message: 10/10 14:35:33 [13000] GAHP[13003] (stderr) -> org.globus.exec.generated.StagingFaultType: Staging error for RSL element fileStageIn. 10/10 14:35:33 [13000] GAHP[13003] <- 'RESULTS' 10/10 14:35:33 [13000] GAHP[13003] -> 'R' 10/10 14:35:33 [13000] GAHP[13003] -> 'S' '1' 10/10 14:35:33 [13000] GAHP[13003] -> '2' 'https://128.232.232.28:8443/wsrf/services/ManagedExecutableJobService?2bbf41d0-5864-11db-b2c4-d34c60b0c3c6' 'Failed' 'Staging error for RSL element fileStageIn.' 
'0' 10/10 14:35:33 [13000] (200.0) gram callback: state Failed, fault Staging error for RSL element fileStageIn., exit code 0 10/10 14:35:33 [13000] (200.0) doEvaluateState called: gmState GM_SUBMITTED, globusState 64 10/10 14:35:33 [13000] (200.0) globus state change: StageIn -> Failed 10/10 14:35:33 [13000] (200.0) gm state change: GM_SUBMITTED -> GM_FAILED 10/10 14:35:33 [13000] GAHP[13003] <- 'GT4_GRAM_JOB_DESTROY 9 https://128.232.232.28:8443/wsrf/services/ManagedExecutableJobService?2bbf41d0-5864-11db-b2c4-d34c60b0c3c6' 10/10 14:35:33 [13000] GAHP[13003] -> 'S' 10/10 14:35:33 [13000] (200.0) doEvaluateState called: gmState GM_FAILED, globusState 4 10/10 14:35:33 [13000] GAHP[13003] (stderr) -> Cmd 9: gramJob.cancel() 10/10 14:35:34 [13000] GAHP[13003] (stderr) -> Cmd 9: CallbackSing.getAllCallbackSinks() 10/10 14:35:34 [13000] GAHP[13003] (stderr) -> Cmd 9: iter.removeJobListener() 10/10 14:35:34 [13000] in doContactSchedd() 10/10 14:35:34 [13000] querying for removed/held jobs 10/10 14:35:34 [13000] Using constraint ((Owner=?="amw75"&&JobUniverse==9)) && ((Managed =!= "ScheddDone")) && (JobStatus == 3 || JobStatus == 4 || (JobStatus == 5 && Managed =?= "External")) 10/10 14:35:34 [13000] Fetched 0 job ads from schedd 10/10 14:35:34 [13000] Updating classad values for 200.0: 10/10 14:35:34 [13000] NumGlobusSubmits = 1 10/10 14:35:34 [13000] GlobusStatus = 4 10/10 14:35:34 [13000] leaving doContactSchedd() 10/10 14:35:34 [13000] (200.0) doEvaluateState called: gmState GM_FAILED, globusState 4 10/10 14:35:34 [13000] GAHP[13003] (stderr) -> Cmd 9: Done 10/10 14:35:34 [13000] GAHP[13003] <- 'RESULTS' 10/10 14:35:34 [13000] GAHP[13003] -> 'R' 10/10 14:35:34 [13000] GAHP[13003] -> 'S' '1' 10/10 14:35:34 [13000] GAHP[13003] -> '9' '0' 'NULL' 10/10 14:35:34 [13000] (200.0) doEvaluateState called: gmState GM_FAILED, globusState 4 10/10 14:35:34 [13000] (200.0) gm state change: GM_FAILED -> GM_HOLD 10/10 14:35:34 [13000] (200.0) Writing hold record to user logfile 
10/10 14:35:34 [13000] (200.0) gm state change: GM_HOLD -> GM_DELETE 10/10 14:35:39 [13000] in doContactSchedd() 10/10 14:35:39 [13000] querying for removed/held jobs 10/10 14:35:39 [13000] Using constraint ((Owner=?="amw75"&&JobUniverse==9)) && ((Managed =!= "ScheddDone")) && (JobStatus == 3 || JobStatus == 4 || (JobStatus == 5 && Managed =?= "External")) 10/10 14:35:39 [13000] Fetched 0 job ads from schedd 10/10 14:35:39 [13000] Updating classad values for 200.0: 10/10 14:35:39 [13000] GlobusDelegationUri = UNDEFINED 10/10 14:35:39 [13000] GridftpUrlBase = UNDEFINED 10/10 14:35:39 [13000] GlobusSubmitId = UNDEFINED 10/10 14:35:39 [13000] GridJobId = UNDEFINED 10/10 14:35:39 [13000] GlobusStatus = 32 10/10 14:35:39 [13000] JobStatus = 5 10/10 14:35:39 [13000] EnteredCurrentStatus = 1160487334 10/10 14:35:39 [13000] HoldReason = "Globus error: Staging error for RSL element fileStageIn." 10/10 14:35:39 [13000] HoldReasonCode = 0 10/10 14:35:39 [13000] HoldReasonSubCode = 0 10/10 14:35:39 [13000] ReleaseReason = UNDEFINED 10/10 14:35:39 [13000] NumSystemHolds = 1 10/10 14:35:39 [13000] Managed = "Schedd" 10/10 14:35:39 [13000] No jobs left, shutting down 10/10 14:35:39 [13000] leaving doContactSchedd() 10/10 14:35:39 [13000] Got SIGTERM. Performing graceful shutdown. 10/10 14:35:39 [13000] Started timer to call main_shutdown_fast in 1800 seconds 10/10 14:35:39 [13000] **** condor_gridmanager (condor_GRIDMANAGER) EXITING WITH STATUS 0 Dr Andrew Walker Department of Earth Sciences University of Cambridge Downing Street Cambridge CB2 3EQ UK phone +44 (0)1223 333432 |