[Condor-users] a job submitted to condor-g never stops running
- Date: Tue, 11 Jul 2006 18:03:45 -0400
- From: Olga Kornievskaia <aglo@xxxxxxxxxxxxxx>
- Subject: [Condor-users] a job submitted to condor-g never stops running
Hi,
Has anybody ever had a problem like this? I submit the following script:
universe = grid
grid_type = gt2
globusscheduler = zen.citi.umich.edu/jobmanager-condor
executable = sh_loop
arguments = 600
x509userproxy = /tmp/x509_proxy_aglo
should_transfer_files = true
whentotransferoutput = on_exit
MyProxyHost = yoga.citi.umich.edu:7512
MyProxyPassword = foobar
MyProxyServerDN = /C=US/ST=Michigan/L=Ann Arbor/O=University of Michigan/OU=CITI Production KCA/CN=myproxy/yoga.citi.umich.edu/emailAddress=aglo@xxxxxxxxxxxxxx
MyProxyNewProxyLifetime = 5
MyProxyRefreshThreshold = 180
error = script6.err
output = script6.out
log = script6.log
queue
The job successfully starts and finishes in Condor. StarterLog has:
7/11 17:19:33 Starting a VANILLA universe job with ID: 42.0
7/11 17:19:33 IWD: /home/aglo/gram_scratch_UW0W4vAkqk
7/11 17:19:33 Output file: /home/aglo/.globus/job/zen.citi.umich.edu/28687.1152652478/stdout
7/11 17:19:33 Error file: /home/aglo/.globus/job/zen.citi.umich.edu/28687.1152652478/stderr
7/11 17:19:33 Using wrapper /usr/local/bin/spkm3-wrapper to exec condor_exec.exe /home/aglo/.globus/.gass_cache/local/md5/32/0e2d7cf69a8d3a612c2a49374d3087/md5/5a/53a49f318032335ea7bf518346b3f4/data 600
7/11 17:19:33 Create_Process succeeded, pid=28812
7/11 17:29:41 Process exited, pid=28812, status=0
7/11 17:29:42 Got SIGQUIT. Performing fast shutdown.
7/11 17:29:42 ShutdownFast all jobs.
7/11 17:29:42 **** condor_starter (condor_STARTER) EXITING WITH STATUS 0
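As a sanity check on the excerpt above, the following minimal sketch computes how long the process ran from the two relevant StarterLog lines. The year is not in the log and is taken from the date of this message; the helper name parse_timestamp is made up for illustration. If sh_loop loops for roughly as many seconds as its argument (an assumption, nothing above states what it does), the result should be close to 600.

```python
from datetime import datetime

# Two lines copied from the StarterLog excerpt above.
log_lines = [
    "7/11 17:19:33 Create_Process succeeded, pid=28812",
    "7/11 17:29:41 Process exited, pid=28812, status=0",
]

def parse_timestamp(line, year=2006):
    """Parse the leading 'M/D HH:MM:SS' stamp.

    The log carries no year, so one is assumed (2006, from the message date).
    """
    month_day, clock = line.split()[:2]
    return datetime.strptime(f"{year}/{month_day} {clock}", "%Y/%m/%d %H:%M:%S")

start = parse_timestamp(log_lines[0])
end = parse_timestamp(log_lines[1])

# Wall-clock time between process creation and process exit.
elapsed_seconds = int((end - start).total_seconds())
print(elapsed_seconds)
```

The elapsed time only covers the process on the execute side; it says nothing about when the submit side learns that the job has finished.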
Yet the job's status in Condor's queue never changes from "running".
GridmanagerLog happily spews out a continuous stream of messages that
look like queries for results:
7/11 17:59:30 [28682] Using constraint ((Owner=?="aglo"&&JobUniverse==9)) && ((Managed =!= "ScheddDone")) && (JobStatus == 3 || JobStatus == 4 || (JobStatus == 5 && Managed =?= "External"))
7/11 17:59:30 [28682] Fetched 0 job ads from schedd
7/11 17:59:30 [28682] leaving doContactSchedd()
7/11 17:59:33 [28682] GAHP[28683] <- 'RESULTS'
7/11 17:59:33 [28682] GAHP[28683] -> 'S' '0'
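One way to read the "Fetched 0 job ads" line: the logged constraint only selects jobs whose JobStatus is 3 (removed), 4 (completed), or 5 (held), so a job the schedd still considers running (JobStatus 2) would not match it. The sketch below is a rough illustration of that logic on plain Python dicts, not a real ClassAd evaluation; the function name and the sample job ads are hypothetical, and the meta-operators =?= and =!= are approximated with ordinary equality.

```python
# Standard Condor JobStatus codes.
JOB_STATUS = {1: "Idle", 2: "Running", 3: "Removed", 4: "Completed", 5: "Held"}

def matches_gridmanager_query(ad, owner="aglo"):
    """Approximate the constraint from the GridmanagerLog on a dict-based job ad.

    Simplification: =?= and =!= are treated as plain == and != here.
    """
    status = ad.get("JobStatus")
    managed = ad.get("Managed")
    return (
        ad.get("Owner") == owner
        and ad.get("JobUniverse") == 9
        and managed != "ScheddDone"
        and (status in (3, 4) or (status == 5 and managed == "External"))
    )

# Hypothetical job ads; the attribute values are assumptions for illustration.
running_ad = {"Owner": "aglo", "JobUniverse": 9, "Managed": "External", "JobStatus": 2}
completed_ad = dict(running_ad, JobStatus=4)

for ad in (running_ad, completed_ad):
    print(JOB_STATUS[ad["JobStatus"]], matches_gridmanager_query(ad))
```

Under these assumptions the running ad is not selected and the completed one is, which would be consistent with the query returning nothing while the queue still shows the job as running.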
The job never gets done. Any ideas why? Thanks.