Re: [Condor-users] Condor-G and GT4
Thanks Dan, that helped, but I am still not able to solve it.
$ condor_q -held gives:
Submitter: advaitha.ad.infosys.com : <172.25.243.135:33879> : advaitha.ad.infosys.com
ID OWNER HELD_SINCE HOLD_REASON
17.0 digz 12/15 20:24 Failed to acquire proxy
1 jobs; 0 idle, 0 running, 1 held
I am able to do all normal operations on the grid using this username
(digz), so it's probably not a proxy issue. On googling I found that
/tmp/Gridmanager.$(USERNAME) should help, but in my temp directory only
Gridmanager.condor is created; there is no Gridmanager.digz. Based on
what I could make out, I started condor_c-gahp, but that didn't help
either. I could not see anything in the config file about PROXY or
GRIDMANAGER.
Any pointers?
Digz
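
For anyone checking many held jobs at once, the listing above can be pulled apart with a short script. This is only a sketch: it assumes the column layout shown in this message (ID, OWNER, HELD_SINCE, HOLD_REASON), and the names HeldJob and parse_held are made up for illustration, not part of Condor.

```python
import re
from typing import List, NamedTuple


class HeldJob(NamedTuple):
    job_id: str
    owner: str
    held_since: str
    hold_reason: str


# Matches rows such as: "17.0 digz 12/15 20:24 Failed to acquire proxy".
# The layout is assumed from the output quoted in this thread.
ROW = re.compile(r"^\s*(\d+\.\d+)\s+(\S+)\s+(\d+/\d+\s+\d+:\d+)\s+(.*\S)\s*$")


def parse_held(text: str) -> List[HeldJob]:
    """Return one HeldJob per job row; header and summary lines are skipped."""
    jobs = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            jobs.append(HeldJob(*match.groups()))
    return jobs


SAMPLE = """\
ID OWNER HELD_SINCE HOLD_REASON
17.0 digz 12/15 20:24 Failed to acquire proxy
1 jobs; 0 idle, 0 running, 1 held
"""

if __name__ == "__main__":
    for job in parse_held(SAMPLE):
        print(job.job_id, job.owner, job.hold_reason)
```

In practice the text would come from saving the output of condor_q -held to a file and passing its contents to parse_held.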
-----Original Message-----
From: condor-users-bounces@xxxxxxxxxxx
[mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Dan Bradley
Sent: Thursday, December 15, 2005 10:02 PM
To: Condor-Users Mail List
Subject: Re: [Condor-users] Condor-G and GT4
What reason is given for the job going on hold? You can find out by
running 'condor_q -held' or 'condor_q -l'.
--Dan
Digvijoy Chatterjee wrote:
> Hi List,
>
> I have 51 globus-wsrf-services running on 4 Linux boxes [one of them
> is an IA-64, the other 3 are i686 (INTEL in Condor parlance)] at port
> 8443 (all GSI etc. is configured).
>
> I did this:
>
> $ make gt4-gram-condor
> $ make install
> $ $GLOBUS_LOCATION/setup/globus/setup-globus-job-manager-condor
> $ $GLOBUS_LOCATION/setup/globus/setup-globus-scheduler-provider-condor
>
> On starting, the container spawned two processes like:
>
> /usr/local/globus-4.0.1/libexec/globus-scheduler-event-generator -s fork -t 1134561543
> /usr/local/globus-4.0.1/libexec/globus-scheduler-event-generator -s condor -t 1134561770
>
> What else is needed to configure Condor-G? We are not able to submit
> jobs. For example, when we try to submit a simple shell script like:
>
> ------------------------------------------------------------------------------
> #!/bin/bash
> hostname
> ------------------------------------------------------------------------------
>
> with the submit file:
> Universe = grid
> Grid_Type = gt4
> Jobmanager_Type = Fork
> GlobusScheduler = https://advaitha:8443 (this is the IA64 box)
> Executable = myhostname
> Output = job.output
> Error = job.error
> Log = job.log
> Queue
>
> The job is held in the "H" state and never runs, even though the IA64
> machine is free.
>
> Here is a snippet of the SchedLog file:
>
> 12/15 17:37:01 (pid:18784) warning: setting UserUid to 99, was 520 previosly
> 12/15 17:37:02 (pid:18784) IO: Failed to read packet header
> 12/15 17:37:02 (pid:18784) IO: Failed to read packet header
> 12/15 17:37:02 (pid:18784) condor_gridmanager exited pid=21180 status=0 owner=condor
> 12/15 17:37:02 (pid:18784) condor_gridmanager exited pid=21181 status=0 owner=null
> 12/15 17:37:02 (pid:18784) IO: Failed to read packet header
> 12/15 17:37:02 (pid:18784) condor_gridmanager exited pid=21182 status=0 owner=vinodh
> 12/15 17:37:14 (pid:18784) Received HTTP POST connection from <172.25.243.135:18137>
> 12/15 17:37:14 (pid:18784) About to serve HTTP request...
> 12/15 17:37:14 (pid:18784) Completed servicing HTTP request
> 12/15 17:37:16 (pid:18784) Received HTTP POST connection from <172.25.243.135:18139>
> 12/15 17:37:16 (pid:18784) About to serve HTTP request...
> 12/15 17:37:16 (pid:18784) Completed servicing HTTP request
> 12/15 17:37:19 (pid:18784) Received HTTP POST connection from <172.25.243.135:18144>
> 12/15 17:37:19 (pid:18784) About to serve HTTP request...
> 12/15 17:37:19 (pid:18784) Completed servicing HTTP request
> 12/15 17:41:26 (pid:18784) IO: Failed to read packet header
> 12/15 17:41:40 (pid:18784) DaemonCore: Command received via TCP from host <172.25.243.135:18371>
> 12/15 17:41:40 (pid:18784) DaemonCore: received command 478 (ACT_ON_JOBS), calling handler (actOnJobs)
> 12/15 17:41:43 (pid:18784) IO: Failed to read packet header
> 12/15 17:41:53 (pid:18784) Sent ad to central manager for nobody@xxxxxxxxxxxxxxxxxxxxxxx
> 12/15 17:41:53 (pid:18784) Sent ad to 1 collectors for nobody@xxxxxxxxxxxxxxxxxxxxxxx
> 12/15 17:41:53 (pid:18784) Sent ad to central manager for condor@xxxxxxxxxxxxxxxxxxxxxxx
> 12/15 17:41:53 (pid:18784) Sent ad to 1 collectors for condor@xxxxxxxxxxxxxxxxxxxxxxx
> 12/15 17:41:53 (pid:18784) Sent ad to central manager for null@xxxxxxxxxxxxxxxxxxxxxxx
> 12/15 17:41:53 (pid:18784) Sent ad to 1 collectors for null@xxxxxxxxxxxxxxxxxxxxxxx
> 12/15 17:41:53 (pid:18784) Sent ad to central manager for vinodh@xxxxxxxxxxxxxxxxxxxxxxx
> 12/15 17:41:53 (pid:18784) Sent ad to 1 collectors for vinodh@xxxxxxxxxxxxxxxxxxxxxxx
> 12/15 17:41:53 (pid:18784) Started condor_gmanager for owner nobody pid=21690
> 12/15 17:41:53 (pid:18784) warning: setting UserUid to 522, was 99 previosly
> 12/15 17:41:53 (pid:18784) Started condor_gmanager for owner condor pid=21691
> 12/15 17:41:53 (pid:18784) warning: setting UserUid to 525, was 522 previosly
> 12/15 17:41:53 (pid:18784) Started condor_gmanager for owner null pid=21692
> 12/15 17:41:53 (pid:18784) warning: setting UserUid to 520, was 525 previosly
> 12/15 17:41:53 (pid:18784) Started condor_gmanager for owner vinodh pid=21693
> 12/15 17:41:56 (pid:18784) IO: Failed to read packet header
> 12/15 17:41:56 (pid:18784) IO: Failed to read packet header
> 12/15 17:41:56 (pid:18784) IO: Failed to read packet header
> 12/15 17:41:56 (pid:18784) IO: Failed to read packet header
> 12/15 17:42:01 (pid:18784) IO: Failed to read packet header
> 12/15 17:42:01 (pid:18784) condor_gridmanager exited pid=21690 status=0 owner=nobody
> 12/15 17:42:01 (pid:18784) warning: setting UserUid to 99, was 520 previosly
> 12/15 17:42:01 (pid:18784) IO: Failed to read packet header
> 12/15 17:42:01 (pid:18784) IO: Failed to read packet header
> 12/15 17:42:01 (pid:18784) condor_gridmanager exited pid=21691 status=0 owner=condor
> 12/15 17:42:01 (pid:18784) condor_gridmanager exited pid=21692 status=0 owner=null
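
The scheduler log excerpt above is easier to read when the gridmanager start and exit messages are grouped per owner. The following is a rough sketch of one way to do that. The two message patterns are copied from the lines in this excerpt only and are an assumption, not a documented log format; the function name summarize is hypothetical. It tolerates the email quote markers and the line wrapping seen in this thread.

```python
import re
from collections import defaultdict

# Patterns taken from the SchedLog lines quoted in this thread (assumed format).
START = re.compile(r"Started condor_gmanager for owner\s+(\S+)\s+pid=(\d+)")
EXIT = re.compile(r"condor_gridmanager exited pid=(\d+)\s+status=(\d+)\s+owner=(\S+)")


def summarize(log_text):
    """Group gridmanager start and exit events by owner."""
    # Drop leading "> " quote markers, then join wrapped lines into one string.
    lines = [re.sub(r"^>\s?", "", ln).strip() for ln in log_text.splitlines()]
    flat = " ".join(lines)
    summary = defaultdict(lambda: {"started": [], "exited": []})
    for owner, pid in START.findall(flat):
        summary[owner]["started"].append(int(pid))
    for pid, status, owner in EXIT.findall(flat):
        summary[owner]["exited"].append((int(pid), int(status)))
    return dict(summary)


SAMPLE_LOG = """\
> 12/15 17:41:53 (pid:18784) Started condor_gmanager for owner
> nobody pid=21690
> 12/15 17:41:53 (pid:18784) Started condor_gmanager for owner
> vinodh pid=21693
> 12/15 17:42:01 (pid:18784) condor_gridmanager exited pid=21690
> status=0 owner=nobody
"""

if __name__ == "__main__":
    for owner, events in summarize(SAMPLE_LOG).items():
        print(owner, events)
```

A summary like this makes it quick to see whether a gridmanager was ever started for a particular owner, and how soon after starting it exited.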
>_______________________________________________
>Condor-users mailing list
>Condor-users@xxxxxxxxxxx
>https://lists.cs.wisc.edu/mailman/listinfo/condor-users
>