Mailing List Archives
[Condor-users] Scheduling problem
- Date: Fri, 07 Apr 2006 17:35:22 +0200
- From: lrodrig <luisr@xxxxxxxxxxxxxxx>
- Subject: [Condor-users] Scheduling problem
Hello,
I've installed Condor version 6.1.11. For now, I'm working with just
two nodes: a central manager (node01) and a node to run the jobs
(node03).
The problem appears when I try to submit jobs from the central manager.
No job is scheduled on node03; all are rejected. If I set the START
attribute in the central manager config file to FALSE (in order to force
jobs to be executed on node03), no job runs at all.
When I start Condor on both nodes, everything looks fine in the log
files, except:
WARNING: No master ad for < vm2@node03 >
9/15 20:20:45 StartdAd : Inserting ** "< vm2@node03 , 192.168.1.3 >"
9/15 20:20:45 stats: Inserting new hashent for 'Start':'vm2@node03':'192.168.1.3'
I get this message for every CPU on that node. I also get it for the
CPUs on node01 (the central manager), but that node can accept jobs.
When I submit a job, I get this message in the SchedLog:
9/15 20:37:26 Tables are consistent
9/15 20:37:26 Out of servers - 0 jobs matched, 1 jobs idle, 1 jobs rejected
9/15 20:39:05 IO: Failed to read packet header
The output of condor_status is always the same:
Name        OpSys  Arch   State      Activity  LoadAv  Mem  ActvtyTime
vm1@node01  LINUX  INTEL  Owner      Idle      0.000   252  0+00:18:10
vm2@node01  LINUX  INTEL  Owner      Idle      0.000   252  0+00:18:10
vm3@node01  LINUX  INTEL  Owner      Idle      0.000   252  0+00:18:10
vm4@node01  LINUX  INTEL  Owner      Idle      0.000   252  0+00:18:10
vm1@node03  LINUX  INTEL  Unclaimed  Idle      0.000   252  0+00:17:06
vm2@node03  LINUX  INTEL  Unclaimed  Idle      0.000   252  0+00:17:06
vm3@node03  LINUX  INTEL  Unclaimed  Idle      0.000   252  0+00:17:06
vm4@node03  LINUX  INTEL  Unclaimed  Idle      0.000   252  0+00:17:06

             Machines  Owner  Claimed  Unclaimed  Matched  Preempting
INTEL/LINUX         8      4        0          4        0           0
      Total         8      4        0          4        0           0
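As a cross-check of the totals, the listing above can be tallied per node with a short script. This is only a sketch: the function name tally_states is hypothetical, and it assumes that the first column is the slot name in the form vmN@host and that the fourth column is State, as the header suggests. The rows are copied from the listing, with the wrapped ActvtyTime value rejoined.

```python
from collections import Counter, defaultdict

# Rows copied from the condor_status listing (ActvtyTime rejoined).
STATUS_ROWS = """\
vm1@node01 LINUX INTEL Owner Idle 0.000 252 0+00:18:10
vm2@node01 LINUX INTEL Owner Idle 0.000 252 0+00:18:10
vm3@node01 LINUX INTEL Owner Idle 0.000 252 0+00:18:10
vm4@node01 LINUX INTEL Owner Idle 0.000 252 0+00:18:10
vm1@node03 LINUX INTEL Unclaimed Idle 0.000 252 0+00:17:06
vm2@node03 LINUX INTEL Unclaimed Idle 0.000 252 0+00:17:06
vm3@node03 LINUX INTEL Unclaimed Idle 0.000 252 0+00:17:06
vm4@node03 LINUX INTEL Unclaimed Idle 0.000 252 0+00:17:06
"""


def tally_states(text):
    """Count slot states per host from condor_status-style rows.

    Assumes column 1 is 'slot@host' and column 4 is the State.
    """
    counts = defaultdict(Counter)
    for line in text.splitlines():
        fields = line.split()
        # Skip headers, totals, and blank lines: only slot rows contain '@'.
        if len(fields) < 4 or "@" not in fields[0]:
            continue
        host = fields[0].split("@", 1)[1]
        counts[host][fields[3]] += 1
    return counts


if __name__ == "__main__":
    for host, states in sorted(tally_states(STATUS_ROWS).items()):
        print(host, dict(states))
```

Run on the rows above, this reports four Owner slots on node01 and four Unclaimed slots on node03, which matches the totals line.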
I've checked the manual, but I haven't been able to find the cause. Does
anyone know where the problem might be?
Thanks in advance.