[HTCondor-users] Job Status not Updated When Using The BOINC GAHP
- Date: Tue, 09 Aug 2016 19:51:34 +0200
- From: Laurence Field <Laurence.Field@xxxxxxx>
- Subject: [HTCondor-users] Job Status not Updated When Using The BOINC GAHP
When using the BOINC GAHP, the job status is still not being updated.
Below is a detailed explanation of the situation, which leads to a few
questions. Does anyone have any idea why the first requests have
min_mod_time = 0 while the subsequent ones carry a timestamp? And why,
even though the first requests seem to return the status, are the
HTCondor job statuses not being updated?
The condor_q command shows the jobs being idle.
-- Schedd: boinc-submitter.cern.ch : <188.184.165.253:13440?...
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
17.0 test 7/31 21:22 0+00:00:00 I 0 0.0 Sixtrack
18.0 test 8/1 12:10 0+00:00:00 I 0 0.0 Sixtrack
19.0 test 8/1 13:02 0+00:00:00 I 0 0.0 Sixtrack
20.0 test 8/1 13:06 0+00:00:00 I 0 0.0 Sixtrack
21.0 test 8/1 13:10 0+00:00:00 I 0 0.0 Sixtrack
22.0 test 8/2 00:03 0+00:00:00 I 0 0.0 Sixtrack
23.0 test 8/2 10:21 0+00:00:00 I 0 0.0 Sixtrack
26.0 test 8/6 23:45 0+00:00:00 I 0 0.0 Sixtrack
27.0 test 8/9 14:48 0+00:00:00 I 0 0.0 Sixtrack
The GAHP requests an update, as can be seen in the GridmanagerLog.
BOINC_QUERY_BATCHES 4 0 9
condor#boinc-submitter.cern.ch#Sixtrack#1470046242
condor#boinc-submitter.cern.ch#Sixtrack#1470049845
condor#boinc-submitter.cern.ch#Sixtrack#1470746903
condor#boinc-submitter.cern.ch#Sixtrack#1470042521
condor#boinc-submitter.cern.ch#Sixtrack#1470049582
condor#boinc-submitter.cern.ch#Sixtrack#1470126076
condor#boinc-submitter.cern.ch#Sixtrack#1470519953
condor#boinc-submitter.cern.ch#Sixtrack#1470049378
condor#boinc-submitter.cern.ch#Sixtrack#1470089006
We do get the following response, which seems correct.
08/09/16 16:37:17 [157885] GAHP[157887] -> '4' 'NULL'
'1470753439.291700' '1'
'condor#boinc-submitter.cern.ch#Sixtrack#1470046242#18.0' 'ERROR' '1'
'condor#boinc-submitter.cern.ch#Sixtrack#1470049845#21.0' 'ERROR' '1'
'condor#boinc-submitter.cern.ch#Sixtrack#1470746903#27.0' 'IN_PROGRESS'
'1' 'condor#boinc-submitter.cern.ch#Sixtrack#1470042521#17.0' 'ERROR'
'1' 'condor#boinc-submitter.cern.ch#Sixtrack#1470049582#20.0' 'ERROR'
'1' 'condor#boinc-submitter.cern.ch#Sixtrack#1470126076#23.0' 'ERROR'
'1' 'condor#boinc-submitter.cern.ch#Sixtrack#1470519953#26.0'
'IN_PROGRESS' '1'
'condor#boinc-submitter.cern.ch#Sixtrack#1470049378#19.0' 'ERROR' '1'
'condor#boinc-submitter.cern.ch#Sixtrack#1470089006#22.0' 'ERROR'
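To make the layout of that reply easier to read, here is a minimal Python sketch that splits it into batches. The layout it assumes (request id, error field, server time, then for each batch a job count followed by that many name/status pairs) is inferred only from the log lines in this message, not from the GAHP protocol documentation, and the function name parse_query_batches_reply is made up for illustration.

```python
import shlex


def parse_query_batches_reply(line):
    """Split a quoted BOINC_QUERY_BATCHES reply into its fields.

    Assumed layout (inferred from the log excerpt, not from any spec):
      request id, error field, server time,
      then per batch: job count N followed by N (job name, status) pairs.
    """
    tokens = shlex.split(line)  # strips the single quotes around each field
    request_id, error = tokens[0], tokens[1]
    server_time = float(tokens[2])

    batches = []
    i = 3
    while i < len(tokens):
        size = int(tokens[i])
        i += 1
        jobs = []
        for _ in range(size):
            jobs.append((tokens[i], tokens[i + 1]))  # (job name, status)
            i += 2
        batches.append(jobs)
    return request_id, error, server_time, batches


if __name__ == "__main__":
    reply = (
        "'4' 'NULL' '1470753439.291700' "
        "'1' 'condor#boinc-submitter.cern.ch#Sixtrack#1470046242#18.0' 'ERROR' "
        "'1' 'condor#boinc-submitter.cern.ch#Sixtrack#1470746903#27.0' 'IN_PROGRESS'"
    )
    for batch in parse_query_batches_reply(reply)[3]:
        print(batch)
```

Read this way, the reply to request 4 carries one job per batch, each with a status.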
However, subsequent requests (BOINC_QUERY_BATCHES 14 and so on) include
a timestamp where previously there was 0.
08/09/16 19:32:27 [158462] GAHP[158465] <- 'BOINC_QUERY_BATCHES 14
1470763888.431700 9 condor#boinc-submitter.cern.ch#Sixtrack#1470046242
condor#boinc-submitter.cern.ch#Sixtrack#1470049845
condor#boinc-submitter.cern.ch#Sixtrack#1470746903
condor#boinc-submitter.cern.ch#Sixtrack#1470042521
condor#boinc-submitter.cern.ch#Sixtrack#1470049582
condor#boinc-submitter.cern.ch#Sixtrack#1470126076
condor#boinc-submitter.cern.ch#Sixtrack#1470519953
condor#boinc-submitter.cern.ch#Sixtrack#1470049378
condor#boinc-submitter.cern.ch#Sixtrack#1470089006'
And the following is returned.
08/09/16 16:39:18 [157885] GAHP[157887] -> '14' 'NULL'
'1470753559.707100' '0' '0' '0' '0' '0' '0' '0' '0' '0'
The difference in the XML submitted is:
<min_mod_time>0.000000</min_mod_time>
<min_mod_time>1470763888.431700</min_mod_time>
And the XML returned for each, respectively, is:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<query_batch2>
<server_time>1470764185.2719</server_time>
<batch_size>1</batch_size>
<job>
<job_name>condor#boinc-submitter.cern.ch#Sixtrack#1470042521#17.0</job_name>
<status>ERROR</status>
</job>
<batch_size>1</batch_size>
<job>
<job_name>condor#boinc-submitter.cern.ch#Sixtrack#1470046242#18.0</job_name>
<status>ERROR</status>
</job>
<batch_size>1</batch_size>
<job>
<job_name>condor#boinc-submitter.cern.ch#Sixtrack#1470049378#19.0</job_name>
<status>ERROR</status>
</job>
<batch_size>1</batch_size>
<job>
<job_name>condor#boinc-submitter.cern.ch#Sixtrack#1470049582#20.0</job_name>
<status>ERROR</status>
</job>
<batch_size>1</batch_size>
<job>
<job_name>condor#boinc-submitter.cern.ch#Sixtrack#1470049845#21.0</job_name>
<status>ERROR</status>
</job>
<batch_size>1</batch_size>
<job>
<job_name>condor#boinc-submitter.cern.ch#Sixtrack#1470089006#22.0</job_name>
<status>ERROR</status>
</job>
<batch_size>1</batch_size>
<job>
<job_name>condor#boinc-submitter.cern.ch#Sixtrack#1470126076#23.0</job_name>
<status>ERROR</status>
</job>
<batch_size>1</batch_size>
<job>
<job_name>condor#boinc-submitter.cern.ch#Sixtrack#1470169805#24.0</job_name>
<status>DONE</status>
</job>
<batch_size>1</batch_size>
<job>
<job_name>condor#boinc-submitter.cern.ch#Sixtrack#1470314864#25.0</job_name>
<status>DONE</status>
</job>
<batch_size>1</batch_size>
<job>
<job_name>condor#boinc-submitter.cern.ch#Sixtrack#1470519953#26.0</job_name>
<status>IN_PROGRESS</status>
</job>
</query_batch2>
and
<?xml version="1.0" encoding="ISO-8859-1" ?>
<query_batch2>
<server_time>1470764518.9648</server_time>
<batch_size>0</batch_size>
<batch_size>0</batch_size>
<batch_size>0</batch_size>
<batch_size>0</batch_size>
<batch_size>0</batch_size>
<batch_size>0</batch_size>
<batch_size>0</batch_size>
<batch_size>0</batch_size>
<batch_size>0</batch_size>
</query_batch2>
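For anyone who wants to compare the two replies side by side, here is a minimal Python sketch that summarises a query_batch2 document. It uses only the element names visible in the XML above (server_time, batch_size, job, job_name, status). The function name summarize_reply and the shortened sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET


def summarize_reply(xml_bytes):
    """Return (server_time, batch sizes, {job_name: status}) for one reply."""
    root = ET.fromstring(xml_bytes)  # bytes, so the declared encoding is honoured
    server_time = float(root.findtext("server_time"))
    sizes = [int(e.text) for e in root.findall("batch_size")]
    statuses = {
        job.findtext("job_name"): job.findtext("status")
        for job in root.findall("job")
    }
    return server_time, sizes, statuses


if __name__ == "__main__":
    # Shortened stand-ins for the two replies quoted above.
    first = (
        b'<?xml version="1.0" encoding="ISO-8859-1" ?>'
        b"<query_batch2><server_time>1470764185.2719</server_time>"
        b"<batch_size>1</batch_size><job>"
        b"<job_name>condor#boinc-submitter.cern.ch#Sixtrack#1470519953#26.0</job_name>"
        b"<status>IN_PROGRESS</status></job></query_batch2>"
    )
    second = (
        b'<?xml version="1.0" encoding="ISO-8859-1" ?>'
        b"<query_batch2><server_time>1470764518.9648</server_time>"
        b"<batch_size>0</batch_size><batch_size>0</batch_size></query_batch2>"
    )
    print(summarize_reply(first))
    print(summarize_reply(second))
```

With min_mod_time set to 0 every batch reports its job and a status, whereas with the timestamp every batch_size is 0 and no job elements come back at all.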