Subject: [Condor-users] Jobs blocked as Idle in Multi-CPU machine
Hi, All:
I have two machines: nodeA has 2 CPUs and nodeB has 1 CPU. Here is the CPU information:
_______________________
ye@nodea:~$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 CPU 6400 @ 2.13GHz
stepping        : 2
cpu MHz         : 1596.000
cache size      : 2048 KB
... ...
bogomips        : 4265.69
clflush size    : 64

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 CPU 6400 @ 2.13GHz
stepping        : 2
cpu MHz         : 1596.000
cache size      : 2048 KB
... ...
bogomips        : 4262.73
clflush size    : 64
_______________________
ye@nodeb:~$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 1
model name      : Intel(R) Pentium(R) 4 CPU 1.70GHz
... ...
bogomips        : 3393.42
clflush size    : 64
_______________________
I followed the Condor (6.8.4) tutorial (http://www.cs.wisc.edu/condor/tutorials/intl-grid-school-3/) to get started. At the step "Submitting your first Condor job", I find that every job submitted on nodeA stays idle:
_______________________
ye@nodea:~$ condor_q
1 jobs; 1 idle, 0 running, 0 held
_______________________
But when I submit the same job on nodeB, it works perfectly. I then checked the Condor status on both machines; here is the output:
_______________________
ye@nodea:~$ condor_status
Name          OpSys  Arch   State      Activity  LoadAv  Mem   ActvtyTime
vm1@xxxxxxxxx LINUX  INTEL  Unclaimed  Idle      0.000   1000  0+03:45:04
vm2@xxxxxxxxx LINUX  INTEL  Unclaimed  Idle      0.000   1000  0+03:45:05
                     Total Owner Claimed Unclaimed Matched Preempting Backfill
         INTEL/LINUX     2     0       0         2       0          0        0
               Total     2     0       0         2       0          0        0
_______________________
ye@nodeb:~$ condor_status
Name          OpSys  Arch   State      Activity  LoadAv  Mem   ActvtyTime
nodeb.gridgro LINUX  INTEL  Unclaimed  Idle      0.000   1011  0+03:09:53
                     Total Owner Claimed Unclaimed Matched Preempting Backfill
         INTEL/LINUX     1     0       0         1       0          0        0
               Total     1     0       0         1       0          0        0
_______________________
I don't know whether this is caused by nodeA having 2 CPUs: are the jobs on nodeA stuck because they don't know where to execute? And how can I fix this problem on nodeA (the multi-processor machine)?