Mailing List Archives
[Condor-users] negotiator error?
- Date: Tue, 18 Sep 2012 16:21:39 +0200
- From: "FARKAS, Illes" <fij@xxxxxxx>
- Subject: [Condor-users] negotiator error?
Hello,
One of the computers in our cluster does not seem to be accepting Condor jobs. Thanks in advance for any feedback.
In a small cluster we have 3 computers (1, 32, and 32 CPUs) with
$CondorVersion: 7.2.4 Apr 11 2010 $
$CondorPlatform: X86_64-LINUX_DEBIAN_UNKNOWN $
Linux 2.6.32-41-server #94-Ubuntu SMP Fri Jul 6 18:15:07 UTC 2012 x86_64 GNU/Linux
and 1 computer (32 CPUs) with
$CondorVersion: 7.6.7 Apr 28 2012 BuildID: 422155 $
$CondorPlatform: x86_64_deb_6.0-updated $
Linux 3.2.0-29-generic #46-Ubuntu SMP Fri Jul 27 17:03:23 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
Condor jobs submitted on one of the first three computers do not run on the last one (let's call it the 4th computer).
The output of "condor_q -better-analyze" looks normal, but Condor "can see" only 65 of the total 97 CPUs (65 is exactly the combined CPU count of the first three computers):
24283.000: Run analysis summary. Of 65 machines,
0 are rejected by your job's requirements
47 reject your job because of their own requirements
18 match but are serving users with a better priority in the pool
0 match but reject the job for unknown reasons
0 match but will not currently preempt their existing job
0 are available to run your job
No successful match recorded.
Last failed match: Tue Sep 18 16:11:57 2012
Reason for last match failure: no match found
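The arithmetic behind the "65 of 97" observation can be checked mechanically. The following is only a minimal sketch: it assumes the summary text has exactly the layout pasted above, and the variable names (cpus, summary, seen, and so on) are made up for illustration.

```python
import re

# CPU counts per computer as stated in this post; the last entry is the 4th (7.6.7) computer.
cpus = [1, 32, 32, 32]

# Summary lines copied from the analysis output above.
summary = """\
24283.000: Run analysis summary. Of 65 machines,
0 are rejected by your job's requirements
47 reject your job because of their own requirements
18 match but are serving users with a better priority in the pool
0 match but reject the job for unknown reasons
0 match but will not currently preempt their existing job
0 are available to run your job
"""

# Number of machines (slots) the analysis considered.
seen = int(re.search(r"Of (\d+) machines", summary).group(1))

# Each category line starts with its count followed by a space.
category_counts = [int(n) for n in re.findall(r"^(\d+) ", summary, flags=re.M)]

total = sum(cpus)            # all four computers
first_three = sum(cpus[:3])  # computers 1-3 only
missing = total - seen       # slots the analysis never considered

print(f"seen={seen} total={total} first_three={first_three} missing={missing}")
```

Under these assumptions the categories add up to the 65 machines considered, and the shortfall equals the CPU count of the 4th computer, which suggests (but does not prove) that its slots were not among the machines the analysis looked at.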
On the 4th computer, the log/condor/NegotiatorLog file also looks normal (at least to me). As an example, the last negotiation cycle from that file is shown below. Oddly, the "-better-analyze" output on the 3rd computer reports the last failed match at 4:11 pm, and the 4th computer logged its latest negotiation cycle 3 minutes later (at 4:14 pm), yet still no Condor job is running on the 4th computer.
09/18/12 16:14:38 ---------- Started Negotiation Cycle ----------
09/18/12 16:14:38 Phase 1: Obtaining ads from collector ...
09/18/12 16:14:38 Getting all public ads ...
09/18/12 16:14:38 Sorting 36 ads ...
09/18/12 16:14:38 Getting startd private ads ...
09/18/12 16:14:38 Got ads: 36 public and 32 private
09/18/12 16:14:38 Public ads include 0 submitter, 32 startd
09/18/12 16:14:38 Phase 2: Performing accounting ...
09/18/12 16:14:38 Phase 3: Sorting submitter ads by priority ...
09/18/12 16:14:38 Phase 4.1: Negotiating with schedds ...
09/18/12 16:14:38 negotiateWithGroup resources used scheddAds length 0
09/18/12 16:14:38 ---------- Finished Negotiation Cycle ----------
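The ad counts in the cycle above can be pulled out of the log with a short script. This is a rough sketch that assumes the "Public ads include ..." line always has the format shown here; the function name ad_counts is hypothetical and not part of Condor.

```python
import re

# Relevant lines copied from the negotiation cycle above.
log = """\
09/18/12 16:14:38 Got ads: 36 public and 32 private
09/18/12 16:14:38 Public ads include 0 submitter, 32 startd
09/18/12 16:14:38 negotiateWithGroup resources used scheddAds length 0
"""

def ad_counts(text):
    """Return the submitter and startd ad counts from a NegotiatorLog excerpt."""
    m = re.search(r"Public ads include (\d+) submitter, (\d+) startd", text)
    if m is None:
        raise ValueError("no 'Public ads include' line found")
    return {"submitter": int(m.group(1)), "startd": int(m.group(2))}

counts = ad_counts(log)
print(counts)
```

For this cycle the script reports 0 submitter ads and 32 startd ads. One possible reading, not a confirmed diagnosis: this negotiator received no submitter ads, so it had no jobs to match in this cycle, and the 32 startd ads equal the CPU count of the 4th computer alone.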
Thanks for any suggestions.
Best
Illes