If you’ve eliminated all the Claimed + Idle machines in your system and you
still don’t get more than three jobs running simultaneously, try running
condor_q -analyze again on one of your clusters to see why the jobs won’t
start. You can paste the condor_q -analyze output for one job in a cluster
back here if you need a hand with it.
- Ian
From: condor-users-bounces@xxxxxxxxxxx [mailto:condor-users-bounces@xxxxxxxxxxx] On Behalf Of Chris Miles
Sent: December 6, 2005 10:16 AM
To: Ian Chesal
Cc: Condor-Users Mail List
Subject: Re: [Condor-users] Problems with jobs
Hi, that command returns nothing but a blank line.
----- Original Message -----
Sent: Tuesday, December 06, 2005 2:58 PM
Subject: RE: [Condor-users] Problems with jobs
Good idea stopping the startd process. The
less you have running on the machine the easier this is to solve. On all the
machines reporting Claimed + Idle -- which schedd in your system has the claim?
Try:
condor_status -const 'State == "Claimed" && Activity == "Idle"' -f "%s\n" GlobalJobId
The GlobalJobId identifies the schedd the
job came from -- is it the same schedd that’s running three jobs? Or is
it a different schedd in your system that’s maybe hung up and hanging on
to those machines?
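
If the command prints a lot of lines, a small script can group them by
schedd. This is only a rough sketch: it assumes the schedd name is the part
of each GlobalJobId before the first '#', and it reads the condor_status
output from stdin. The function names are made up for illustration. Empty
output simply yields an empty tally.

```python
import sys
from collections import Counter


def schedd_of(global_job_id: str) -> str:
    """Return the schedd name, assumed to be the text before the first '#'."""
    return global_job_id.split("#", 1)[0].strip()


def claims_by_schedd(status_output: str) -> Counter:
    """Count Claimed + Idle slots per schedd, given one GlobalJobId per line."""
    counts = Counter()
    for line in status_output.splitlines():
        line = line.strip()
        if line:  # skip blank lines
            counts[schedd_of(line)] += 1
    return counts


if __name__ == "__main__":
    # Pipe the condor_status output into this script.
    for schedd, n in claims_by_schedd(sys.stdin.read()).most_common():
        print(f"{n:4d}  {schedd}")
```

If one schedd other than the one running your three jobs holds most of the
claims, that is the one to look at first.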
- Ian
From: Chris Miles [mailto:chrismiles@xxxxxxxxxxxxxxxx]
Sent: December 5, 2005 7:19 PM
To: Ian Chesal
Cc: Condor-Users Mail List
Subject: Re: [Condor-users] Problems with jobs
I have removed the STARTD daemon from my central manager to try to improve
performance, and have also set TESTINGMODE in the config file.
Still only 3 VMs are actually running jobs.
----- Original Message -----
Sent: Monday, December 05, 2005 5:03 PM
Subject: RE: [Condor-users] Problems with jobs
Ahh, but do you only have the one schedd?
It looks like you have three jobs running (Claimed + Busy) according to your
output -- are they all from the same schedd? Maybe there’s another schedd
in your system that’s not able to respond to its claims in time?
- Ian
From: Chris Miles [mailto:chrismiles@xxxxxxxxxxxxxxxx]
Sent: December 5, 2005 11:55 AM
To: Ian Chesal
Cc: Condor-Users Mail List
Subject: Re: [Condor-users] Problems with jobs
It’s a meaty machine: 2 GB of memory and a 4 GHz CPU (64-bit). It’s an
IBM-built cluster and doesn’t have any resource problems that I can see.
There isn’t much else running on it.
----- Original Message -----
Sent: Monday, December 05, 2005 4:26 PM
Subject: RE: [Condor-users] Problems with jobs
The dreaded Claimed+Idle. It generally
happens to us when our schedd can’t keep up with the processing required
to start our jobs. Check the resources on your schedd machine: can your machine
handle spawning all the necessary shadows? Or is it running out of CPU, memory,
disk, etc?
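
To get a feel for how many slots are stuck, you can tally the State and
Activity pairs across the pool. A rough sketch, assuming each input line
holds a State and an Activity separated by whitespace (for example from
condor_status -format "%s " State -format "%s\n" Activity). The function
names here are made up for illustration.

```python
from collections import Counter


def tally_slots(lines):
    """Count slots per (State, Activity) pair from 'State Activity' lines."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) == 2:  # ignore blank or malformed lines
            counts[(parts[0], parts[1])] += 1
    return counts


def stuck_fraction(counts):
    """Fraction of all tallied slots that are Claimed + Idle (0.0 if none)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return counts[("Claimed", "Idle")] / total


if __name__ == "__main__":
    import sys

    tally = tally_slots(sys.stdin)
    for (state, activity), n in sorted(tally.items()):
        print(f"{state:12s} {activity:12s} {n}")
    print(f"Claimed + Idle fraction: {stuck_fraction(tally):.2f}")
```

Running this a few times in a row shows whether the Claimed + Idle count
drains as shadows start or just sits there.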
- Ian
From: Chris Miles [mailto:chrismiles@xxxxxxxxxxxxxxxx]
Sent: December 5, 2005 11:11 AM
To: Condor-Users Mail List; Ian Chesal
Subject: Re: [Condor-users] Problems with jobs
Hi, thanks for the response.
SUBMIT_SEND_RESCHEDULE is not specified in any of my config files, which
means that it is automatically set to true, does it not?
condor_q -ana says the jobs are being serviced.
It seems a lot of machines go into the Claimed state but stay Idle.
----- Original Message -----
Sent: Monday, December 05, 2005 2:39 PM
Subject: Re: [Condor-users] Problems with jobs
When I submit jobs into my pool, it seems to take ages to start running
them unless I run condor_reschedule. Is there a way to speed
the process up without
My second problem is that job results are not returning to me any quicker
than if I ran my jobs on a one-machine pool. I.e. I’m checking condor_q and
the queue is only going down one at a time, at roughly the same speed as if
there were only one machine in the pool.
It is also slower than if I actually ran my jobs sequentially on one machine
using a batch file or shell script.
[Ian Chesal] What does condor_q -ana say?
Are you setting your job requirements such that only one VM in the system is
able to match with all your jobs in your cluster? What about the
MAX_JOBS_RUNNING setting on your schedd? Make sure that isn’t set to 1.
_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/condor-users