Re: [Condor-users] about condor-g to globus to condor problem


Date: Fri, 11 Feb 2005 18:03:40 +0800 (HKT)
From: "Carson Hung" <carson@xxxxxxxxxxxxxx>
Subject: Re: [Condor-users] about condor-g to globus to condor problem
Hi,

I have tried removing the globusRSL command, but after submission the job
just seems to stay there without running. I do not see any particular error
message in the GridmanagerLog.kshung file:

2/11 17:57:54 [4178] Found job 127.0 --- inserting
2/11 17:57:54 [4178] ***Trying job type Mirror
2/11 17:57:54 [4178] ***Trying job type INFNBatch
2/11 17:57:54 [4178] ***Trying job type Condor
2/11 17:57:54 [4178] ***Trying job type GT3
2/11 17:57:54 [4178] ***Trying job type Globus
2/11 17:57:54 [4178] Using job type Globus for job 128.0
2/11 17:57:54 [4178] Found job 128.0 --- inserting
2/11 17:57:54 [4178] Fetched 2 new job ads from schedd
2/11 17:57:54 [4178] querying for removed/held jobs
2/11 17:57:54 [4178] Using constraint ((Owner=?="kshung"&&JobUniverse==9))
&& (JobStatus == 3 || JobStatus == 4 || (JobStatus == 5 && Managed =?=
TRUE))
2/11 17:57:54 [4178] Fetched 0 job ads from schedd
2/11 17:57:54 [4178] leaving doContactSchedd()
2/11 17:57:54 [4178] (128.0) doEvaluateState called: gmState GM_INIT,
globusState 32
2/11 17:57:54 [4178] GAHP server pid = 4179
2/11 17:57:54 [4178] GAHP server version: $GahpVersion: 1.0.12 Dec 28 2004
UW Gahp $
2/11 17:57:54 [4178] GAHP[4179] <- 'COMMANDS'
2/11 17:57:54 [4178] GAHP[4179] -> 'S' 'COMMANDS' 'GASS_SERVER_INIT'
'GRAM_CALLBACK_ALLOW' 'GRAM_ERROR_STRING' 'GRAM_JOB_CALLBACK_REGISTER'
'GRAM_JOB_CANCEL' 'GRAM_JOB_REQUEST' 'GRAM_JOB_SIGNAL' 'GRAM_JOB_STATUS'
'GRAM_PING' 'INITIALIZE_FROM_FILE' 'QUIT' 'RESULTS' 'VERSION'
'ASYNC_MODE_ON' 'ASYNC_MODE_OFF' 'RESPONSE_PREFIX'
'REFRESH_PROXY_FROM_FILE' 'CACHE_PROXY_FROM_FILE' 'USE_CACHED_PROXY'
'UNCACHE_PROXY' 'GRAM_JOB_REFRESH_PROXY' ''
2/11 17:57:54 [4178] GAHP[4179] <- 'RESPONSE_PREFIX GAHP:'
2/11 17:57:54 [4178] GAHP[4179] -> 'S'
2/11 17:57:54 [4178] GAHP[4179] <- 'ASYNC_MODE_ON'
2/11 17:57:54 [4178] GAHP[4179] -> 'S'
2/11 17:57:54 [4178] GAHP[4179] <- 'INITIALIZE_FROM_FILE
/tmp/condor_g_scratch.0x83e03f8.2782/master_proxy.2'
2/11 17:57:54 [4178] GAHP[4179] -> 'S'
2/11 17:57:54 [4178] GAHP[4179] <- 'CACHE_PROXY_FROM_FILE 2
/tmp/condor_g_scratch.0x83e03f8.2782/master_proxy.2'
2/11 17:57:54 [4178] GAHP[4179] -> 'S'
2/11 17:57:54 [4178] GAHP[4179] <- 'CACHE_PROXY_FROM_FILE 1 /tmp/x509up_u502'
2/11 17:57:54 [4178] GAHP[4179] -> 'S'
2/11 17:57:54 [4178] GAHP[4179] <- 'GRAM_CALLBACK_ALLOW 2 0'
2/11 17:57:54 [4178] GAHP[4179] -> 'S' 'https://pc-0242.eee.hku.hk:35693/'
2/11 17:57:54 [4178] GAHP[4179] <- 'GASS_SERVER_INIT 3 0'
2/11 17:57:54 [4178] GAHP[4179] -> 'S'
2/11 17:57:54 [4178] GAHP[4179] <- 'RESULTS'
2/11 17:57:54 [4178] GAHP[4179] -> 'S' '1'
2/11 17:57:54 [4178] GAHP[4179] -> '3' '0' 'https://pc-0242.eee.hku.hk:35694'
2/11 17:57:54 [4178] (128.0) gm state change: GM_INIT -> GM_START
2/11 17:57:54 [4178] (128.0) gm state change: GM_START -> GM_CLEAR_REQUEST
2/11 17:57:54 [4178] (128.0) gm state change: GM_CLEAR_REQUEST -> GM_UNSUBMITTED
2/11 17:57:54 [4178] (128.0) gm state change: GM_UNSUBMITTED -> GM_SUBMIT
2/11 17:57:54 [4178] GRIDMANAGER_GLOBUS_COMMIT_TIMEOUT is undefined, using
default value of 600
2/11 17:57:54 [4178] Final RSL: &(rsl_substitution=(GRIDMANAGER_GASS_URL
https://pc-0242.eee.hku.hk:35694))(executable='/bin/ls')(scratchdir='')(directory=$(SCRATCH_DIRECTORY))(proxy_timeout=240)(save_state=yes)(two_phase=600)(remote_io_url=$(GRIDMANAGER_GASS_URL))
2/11 17:57:54 [4178] GAHP[4179] <- 'USE_CACHED_PROXY 1'
2/11 17:57:54 [4178] GAHP[4179] -> 'S'
2/11 17:57:54 [4178] GAHP[4179] <- 'GRAM_JOB_REQUEST 4
pc-0242.eee.hku.hk/jobmanager-condor https://pc-0242.eee.hku.hk:35693/ 1
&(rsl_substitution=(GRIDMANAGER_GASS_URL\
https://pc-0242.eee.hku.hk:35694))(executable='/bin/ls')(scratchdir='')(directory=$(SCRATCH_DIRECTORY))(proxy_timeout=240)(save_state=yes)(two_phase=600)(remote_io_url=$(GRIDMANAGER_GASS_URL))'
2/11 17:57:54 [4178] GAHP[4179] -> 'S'
2/11 17:57:54 [4178] (128.0) doEvaluateState called: gmState GM_SUBMIT,
globusState 32
2/11 17:57:54 [4178] GAHP[4179] <- 'RESULTS'
2/11 17:57:54 [4178] GAHP[4179] -> 'S' '0'
2/11 17:57:54 [4178] (127.0) doEvaluateState called: gmState GM_INIT,
globusState 32
2/11 17:57:54 [4178] (127.0) gm state change: GM_INIT -> GM_START
2/11 17:57:54 [4178] (127.0) gm state change: GM_START -> GM_REGISTER
2/11 17:57:54 [4178] GAHP[4179] <- 'GRAM_JOB_CALLBACK_REGISTER 5
https://pc-0242.eee.hku.hk:35566/4147/1108115531/
https://pc-0242.eee.hku.hk:35693/'
2/11 17:57:54 [4178] GAHP[4179] -> 'S'
2/11 17:57:54 [4178] grid_monitor for pc-0242.eee.hku.hk:2119 entering
CheckMonitor
2/11 17:57:54 [4178] grid_monitor for pc-0242.eee.hku.hk:2119: first ping
not done yet, will retry later
2/11 17:57:54 [4178] GAHP[4179] <- 'USE_CACHED_PROXY 2'
2/11 17:57:54 [4178] GAHP[4179] -> 'S'
2/11 17:57:54 [4178] GAHP[4179] <- 'GRAM_PING 6 pc-0242.eee.hku.hk:2119'
2/11 17:57:54 [4178] GAHP[4179] -> 'S'
2/11 17:57:54 [4178] GAHP[4179] <- 'RESULTS'
2/11 17:57:54 [4178] GAHP[4179] -> 'S' '1'
2/11 17:57:54 [4178] GAHP[4179] -> '5' '0' '0' '32'
2/11 17:57:54 [4178] (127.0) doEvaluateState called: gmState GM_REGISTER,
globusState 32
2/11 17:57:54 [4178] (127.0) gm state change: GM_REGISTER -> GM_STDIO_UPDATE
2/11 17:57:55 [4178] GAHP[4179] <- 'RESULTS'
2/11 17:57:55 [4178] GAHP[4179] -> 'S' '1'
2/11 17:57:55 [4178] GAHP[4179] -> '4' '110'
'https://pc-0242.eee.hku.hk:35718/4182/1108115874/'
2/11 17:57:55 [4178] (128.0) doEvaluateState called: gmState GM_SUBMIT,
globusState 32
2/11 17:57:55 [4178] (128.0) gm state change: GM_SUBMIT -> GM_SUBMIT_SAVE
2/11 17:57:55 [4178] GAHP[4179] <- 'RESULTS'
2/11 17:57:55 [4178] GAHP[4179] -> 'S' '1'
2/11 17:57:55 [4178] GAHP[4179] -> '6' '0'
2/11 17:57:55 [4178] resource pc-0242.eee.hku.hk:2119 is now up
2/11 17:57:55 [4178] (128.0) doEvaluateState called: gmState
GM_SUBMIT_SAVE, globusState 32
2/11 17:57:55 [4178] (127.0) doEvaluateState called: gmState
GM_STDIO_UPDATE, globusState 32
2/11 17:57:55 [4178] GAHP[4179] <- 'GRAM_JOB_SIGNAL 7
https://pc-0242.eee.hku.hk:35566/4147/1108115531/ 7
&(remote_io_url=https://pc-0242.eee.hku.hk:35694)(stdout='https://pc-0242.eee.hku.hk:35694/home/kshung/file6.out')(stdout_position=0)'
2/11 17:57:55 [4178] GAHP[4179] -> 'S'
2/11 17:57:59 [4178] in doContactSchedd()
2/11 17:57:59 [4178] GRIDMANAGER_TIMEOUT_MULTIPLIER is undefined, using
default value of 0
2/11 17:57:59 [4178] SEC_DEBUG_PRINT_KEYS is undefined, using default
value of False
2/11 17:57:59 [4178] AUTHENTICATE_FS: used file /tmp/qmgr_D8YHGz, status: 1
2/11 17:57:59 [4178] querying for removed/held jobs
2/11 17:57:59 [4178] Using constraint ((Owner=?="kshung"&&JobUniverse==9))
&& (JobStatus == 3 || JobStatus == 4 || (JobStatus == 5 && Managed =?=
TRUE))
2/11 17:57:59 [4178] Fetched 0 job ads from schedd
2/11 17:57:59 [4178] Updating classad values for 128.0:
2/11 17:57:59 [4178]    GlobusGramVersion = 3
2/11 17:57:59 [4178]    GlobusContactString =
"https://pc-0242.eee.hku.hk:35718/4182/1108115874/";
2/11 17:57:59 [4178] leaving doContactSchedd()
2/11 17:57:59 [4178] (128.0) doEvaluateState called: gmState
GM_SUBMIT_SAVE, globusState 32
2/11 17:57:59 [4178] (128.0) gm state change: GM_SUBMIT_SAVE -> GM_SUBMIT_COMMIT
2/11 17:57:59 [4178] GAHP[4179] <- 'GRAM_JOB_SIGNAL 8
https://pc-0242.eee.hku.hk:35718/4182/1108115874/ 5 NULL'
2/11 17:57:59 [4178] GAHP[4179] -> 'S'
2/11 17:57:59 [4178] grid_monitor for pc-0242.eee.hku.hk:2119 entering
CheckMonitor
2/11 17:57:59 [4178] GRIDMANAGER_MAX_PENDING_REQUESTS is undefined, using
default value of 50
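
For anyone reading a log like the one above, it can help to pull out just
the "gm state change" lines per job and see where each job stops. The
following is only a rough sketch, not a Condor tool: the function names
trace_states and last_state are made up for illustration, and the regular
expression assumes the line format shown in this excerpt. It tolerates the
line wrapping introduced by mail clients.

```python
import re
from collections import defaultdict

# Matches e.g. "(128.0) gm state change: GM_INIT -> GM_START".
# \s* around the arrow lets a transition span a wrapped line.
STATE_CHANGE = re.compile(
    r"\((\d+\.\d+)\)\s+gm state change:\s*(GM_[A-Z_]+)\s*->\s*(GM_[A-Z_]+)"
)


def trace_states(log_text):
    """Return {job_id: [(from_state, to_state), ...]} in log order."""
    transitions = defaultdict(list)
    for match in STATE_CHANGE.finditer(log_text):
        job_id, src, dst = match.groups()
        transitions[job_id].append((src, dst))
    return dict(transitions)


def last_state(transitions, job_id):
    """Return the most recent gm state logged for job_id, or None."""
    steps = transitions.get(job_id)
    return steps[-1][1] if steps else None
```

To use it, read the whole log file into a string (for example
open("GridmanagerLog.kshung").read()), pass it to trace_states, and then
call last_state with a job id such as "128.0".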


I have also tried another submit file, shown below, where file6 is an
executable produced with condor_compile. With this one, the job ends up
being held.

executable = file6
universe = grid
grid_type=gt2
globusscheduler = pc-0242.eee.hku.hk/jobmanager-condor
globusRSL = (universe =  standard)
output = file6.out
log = file6.log
queue

Thanks,
Carson

>
>
>
> "Carson Hung" <carson@xxxxxxxxxxxxxx> wrote:
> __________
>>Hi,
>>
>>Does anyone notice any problem about this submission file?
>
> yes, get rid of that 'globusRSL' line in your submit file below.  There
> is no 'condor' universe...  Since you are submitting ls, i assume you
> want globus to submit the job as a vanilla universe job.  That should be
> what you will get by default if you simply remove the line from the
> submit file below. Hope this helps you out.
>
> Regards
> Todd
>
> -+-+-+-+-+-+-+-+-+-+-+-+-+-
> Todd Tannenbaum, Condor Project
> Department of Computer Science, Univ of Wisconsin-Madison
>
>
>
> or any setting
>>about condor-g will affect the condor queue after globus gatekeeper?
>>
>>Thanks for any suggestion,
>>Carson
>>
>>> Hi,
>>>
>>> I have tried submitting jobs using condor-g to globus to condor. I
>>> was able to submit jobs by using globus-job-run.
>>>
>>> but when I try using condor-g, it fails.
>>>
>>> the submit file is:
>>> executable = /bin/ls
>>> transfer_executable = false
>>> universe = grid
>>> grid_type=gt2
>>> globusscheduler = pc-0242.eee.hku.hk/jobmanager-condor
>>> globusRSL = (condor_submit =(universe condor))
>>> output = file6.out
>>> log = file6.log
>>> queue
>>>
>>>
>>> Thanks for any suggestions,
>>> carson
>>>
>>>
>>> _______________________________________________
>>> Condor-users mailing list
>>> Condor-users@xxxxxxxxxxx
>>> http://lists.cs.wisc.edu/mailman/listinfo/condor-users



