[Condor-users] condor_status and condor_q disagree about state of vm's
- Date: Fri, 20 Apr 2007 13:49:54 -0400
- From: Bob Kinney <bkinney@xxxxxxxxxxxxxxxx>
- Subject: [Condor-users] condor_status and condor_q disagree about state of vm's
Hi:
I've spent the last couple of days looking for an answer to this issue
and searched the archives, but came up empty-handed. If this has been
addressed before, please excuse the rehash.
I've got a small pool of two SMP machines, each with two dual-core
Opteron processors. In the default configuration that's 8 vm's
(2 machines x 4 cores). I would expect never to have more than 8 jobs
running in this pool at any given time, but I have been able to exceed
that.
For as-yet-undetermined reasons, the schedd does not seem to recognize
that a startd is already running a job on some vm's. See below the
(trimmed) results of a condor_status:
Name           OpSys   Arch     State       Activity
vm1@server-1   LINUX   X86_64   Unclaimed   Idle
vm2@server-1   LINUX   X86_64   Unclaimed   Idle
vm3@server-1   LINUX   X86_64   Claimed     Busy
vm4@server-1   LINUX   X86_64   Unclaimed   Idle
vm1@server-2   LINUX   X86_64   Unclaimed   Idle
vm2@server-2   LINUX   X86_64   Unclaimed   Idle
vm3@server-2   LINUX   X86_64   Claimed     Busy
vm4@server-2   LINUX   X86_64   Claimed     Busy
Now look at the (trimmed) results of a condor_q -running:
ID     HOST(S)
68.0   vm4@server-1
69.0   vm4@server-2
70.0   vm3@server-1
71.0   vm3@server-2
Notice that vm4 on server-1 is running a job, yet condor_status shows it
as Unclaimed/Idle. Does anyone have an explanation for why this might
happen, or suggestions for how I can debug the issue further?
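For anyone wanting to reproduce the comparison, here is a minimal sketch in Python of how the two trimmed listings above could be cross-checked. It only parses the text as shown in this message (header rows left out, columns separated by whitespace); it does not call any Condor tool, and the function names are made up for illustration.

```python
# Trimmed listings copied from this message, header rows omitted.
# Assumption: columns are whitespace-separated exactly as shown.
CONDOR_STATUS = """\
vm1@server-1 LINUX X86_64 Unclaimed Idle
vm2@server-1 LINUX X86_64 Unclaimed Idle
vm3@server-1 LINUX X86_64 Claimed Busy
vm4@server-1 LINUX X86_64 Unclaimed Idle
vm1@server-2 LINUX X86_64 Unclaimed Idle
vm2@server-2 LINUX X86_64 Unclaimed Idle
vm3@server-2 LINUX X86_64 Claimed Busy
vm4@server-2 LINUX X86_64 Claimed Busy
"""

CONDOR_Q_RUNNING = """\
68.0 vm4@server-1
69.0 vm4@server-2
70.0 vm3@server-1
71.0 vm3@server-2
"""


def parse_status(text):
    """Map vm name -> (state, activity) from the status listing."""
    slots = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 5:
            name, _opsys, _arch, state, activity = fields
            slots[name] = (state, activity)
    return slots


def parse_running(text):
    """Map vm name -> job ID from the running-jobs listing."""
    jobs = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            job_id, host = fields
            jobs[host] = job_id
    return jobs


def find_mismatches(slots, jobs):
    """Return (vm, job_id, state) for running jobs whose vm is not Claimed."""
    mismatches = []
    for host, job_id in sorted(jobs.items()):
        state, _activity = slots.get(host, ("Missing", ""))
        if state != "Claimed":
            mismatches.append((host, job_id, state))
    return mismatches


mismatches = find_mismatches(parse_status(CONDOR_STATUS),
                             parse_running(CONDOR_Q_RUNNING))
for host, job_id, state in mismatches:
    print(f"job {job_id} runs on {host}, but condor_status reports {state}")
```

On the data above this flags only job 68.0 on vm4@server-1; the two jobs on server-2 and job 70.0 on server-1 line up with Claimed vm's.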
Some other information that might be relevant:
* server-1 is the central manager for this pool and runs a schedd
* jobs are remotely submitted from other hosts to the schedd on server-1
* server-2 does not seem to have the same issue (i.e. condor_status
always reports the correct results).
* if other jobs are submitted to run on server-1, the vm's that report
Claimed/Busy change (e.g. vm3 becomes Idle and vm4 becomes Busy).
Thanks in advance for any assistance anyone can offer.
Regards,
Bob
--
Earl (Bob) Kinney
UNIX Systems Administrator
Harvard-MIT Data Center