Hi Greg,
Thanks for the quick responses. I think your explanations sound
reasonable. FYI, the command showed either 0 or 2 there.
Thank you again,
Zhuo
Greg Thain wrote on 10/3/2016 11:18 AM:
On 10/03/2016 10:05 AM, Zhuo Zhang wrote:
Hi,
We have a small test cluster with 5 machines. After
submitting jobs to the test cluster, I always see 5 slots
unclaimed even though there are jobs waiting in the queue. I am
not an administrator but a user, and I suspect there is
something wrong with our Condor configuration. See the
experiment I did below; can someone explain why there are jobs
waiting while resources are still available? Is this behavior
expected?
3. But condor_status showed that there were still 5 slots
unclaimed.
rhw1193:25} condor_status

Name                OpSys  Arch   State     Activity LoadAv Mem    ActvtyTime

slot1@xxxxxxxxxxxx  LINUX  X86_64 Unclaimed Idle     0.140  146991  4+19:01:06
slot1@xxxxxxxxxxxx  LINUX  X86_64 Unclaimed Idle     0.020  146991 32+22:31:04
slot1@xxxxxxxxxxxx  LINUX  X86_64 Unclaimed Idle     0.060  146990 32+22:22:23
slot1@xxxxxxxxxxxx  LINUX  X86_64 Unclaimed Idle     0.830  146991  4+18:58:47
slot1@xxxxxxxxxxxx  LINUX  X86_64 Unclaimed Idle     0.300  136879 18+17:51:51
If these five slots are the only Unclaimed ones, then your
configuration is probably OK. These slots are probably
partitionable slots, which means that they hold the "leftover"
resources (CPU, memory, disk, etc.) not currently in use by the
dynamic slots carved out of them (the slot1_xx slots). If you run

condor_status -af Name Cpus

condor will tell you how many CPUs remain in each slot, and I bet
that the partitionable slots have exhausted their CPUs and show
0 there.
-greg
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/