Hi, that command does not return anything but a blank line.
Thanks again,
Chris
----- Original Message -----
Sent: Tuesday, December 06, 2005 2:58 PM
Subject: RE: [Condor-users] Problems with jobs
Good idea stopping the startd process. The less you have running on the machine the easier this
is to solve. On all the machines reporting Claimed + Idle -- which schedd in
your system has the claim? Try:
condor_status -const 'State == "Claimed" && Activity == "Idle"' -f "%s\n" GlobalJobId
The GlobalJobId identifies the schedd the job came from -- is it the same schedd that's
running three jobs? Or is it a different schedd in your system that's maybe
hung up and hanging on to those machines?
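As a rough illustration of reading that output, here is a minimal Python sketch that tallies claims per schedd. It assumes each output line is one GlobalJobId and that the schedd name is the text before the first '#'. The function name and the sample IDs are hypothetical, not taken from this pool.

```python
from collections import Counter


def schedds_holding_claims(lines):
    """Count claims per schedd from GlobalJobId strings.

    Assumption: the schedd name is everything before the first '#'.
    """
    counts = Counter()
    for line in lines:
        job_id = line.strip()
        if not job_id:
            continue  # skip blank lines (no GlobalJobId reported)
        schedd = job_id.split("#", 1)[0]
        counts[schedd] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample; real input would be the condor_status output.
    sample = [
        "submit1.example.com#1133800000#12.0",
        "submit1.example.com#1133800000#12.1",
        "submit2.example.com#1133800500#3.0",
        "",
    ]
    for schedd, n in schedds_holding_claims(sample).most_common():
        print(schedd, n)
```

If one schedd accounts for nearly all of the Claimed + Idle machines, that is the one worth inspecting first.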
- Ian
From: Chris Miles [mailto:chrismiles@xxxxxxxxxxxxxxxx]
Sent: December 5, 2005 7:19 PM
To: Ian Chesal
Cc: Condor-Users Mail List
Subject: Re: [Condor-users] Problems with jobs
I have removed the STARTD daemon from my central manager to try to improve
performance, and have also set TESTINGMODE in the config file.
Still only 3 VMs are actually running jobs.
----- Original Message -----
Sent: Monday, December 05, 2005 5:03 PM
Subject: RE: [Condor-users] Problems with jobs
Ahh, but do you only have the one schedd? It looks like you have three jobs running (Claimed
+ Busy) according to your output -- are they all from the same schedd? Maybe
there's another schedd in your system that's not able to respond to its
claims in time?
- Ian
From: Chris Miles [mailto:chrismiles@xxxxxxxxxxxxxxxx]
Sent: December 5, 2005 11:55 AM
To: Ian Chesal
Cc: Condor-Users Mail List
Subject: Re: [Condor-users] Problems with jobs
It's a meaty machine: 2 GB of memory and a 4 GHz CPU (64-bit). It's an IBM-built
cluster and doesn't have any resource problem that I can see. There isn't much
else running on it.
----- Original Message -----
Sent: Monday, December 05, 2005 4:26 PM
Subject: RE: [Condor-users] Problems with jobs
The dreaded Claimed+Idle. It generally happens to us when our schedd can't keep up
with the processing required to start our jobs. Check the resources on
your schedd machine: can your machine handle spawning all the necessary
shadows? Or is it running out of CPU, memory, disk, etc.?
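To get a sense of how widespread the Claimed+Idle condition is, one could tabulate the State and Activity pairs across the pool. The following is only a sketch: it assumes each input line holds one State and one Activity separated by whitespace (how those lines are produced is left out), and the function name is hypothetical.

```python
from collections import Counter


def tally_slot_states(lines):
    """Count (State, Activity) pairs, one pair per input line.

    Assumption: each useful line has exactly two whitespace-separated
    fields, e.g. "Claimed Idle". Anything else is ignored.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        counts[(fields[0], fields[1])] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample data, not output from a real pool.
    sample = ["Claimed Idle", "Claimed Busy", "Claimed Idle", "Unclaimed Idle"]
    for (state, activity), n in sorted(tally_slot_states(sample).items()):
        print(state, activity, n)
```

A large Claimed/Idle count next to a small Claimed/Busy count would be consistent with a schedd that cannot start shadows fast enough.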
- Ian
From: Chris Miles [mailto:chrismiles@xxxxxxxxxxxxxxxx]
Sent: December 5, 2005 11:11 AM
To: Condor-Users Mail List; Ian Chesal
Subject: Re: [Condor-users] Problems with jobs
Hi, thanks for the response.
SUBMIT_SEND_RESCHEDULE has not been specified in any of my config files, which
means that it's automatically set to true, does it not?
condor_q -ana says the jobs are being serviced.
It seems a lot of machines go into the Claimed state but stay Idle.
----- Original Message -----
Sent: Monday, December 05, 2005 2:39 PM
Subject: Re: [Condor-users] Problems with jobs
When I'm submitting jobs into my pool it seems to take ages to start running
the jobs unless I run condor_reschedule. Is there a way to speed the process up
without
My second problem is that job results are not returning to me any quicker than if I ran
my jobs on a one-machine pool. I.e. I'm checking condor_q and the queue is only
going down one at a time, at roughly the same speed as if there were only one
machine in the pool.
It is also slower than if I actually ran my jobs sequentially on one machine using a batch
file or shell script.
[Ian Chesal]
What does condor_q -ana say? Are you setting your job requirements such
that only one VM in the system is able to match with all your jobs in
your cluster? What about the MAX_JOBS_RUNNING setting on your schedd?
Make sure that isn't set to 1.
_______________________________________________
Condor-users mailing list
Condor-users@xxxxxxxxxxx
https://lists.cs.wisc.edu/mailman/listinfo/condor-users