Re: [Condor-users] Negotiator problem? Jobs not assigned to idle machines.



At 12:09 PM 8/1/2006, Rick Lan wrote:
Hi,

Setting NEGOTIATOR_CONSIDER_PREEMPTION = True seems to work. However, at
first jobs would begin to run, then some of them would get stuck as
"match but reject the job for unknown reasons" for about 15 minutes before
starting to run. Now it has been stuck for 2 hours. I've attached SchedLog
and NegotiatorLog below.

8/1 22:06:02       Rejected 93.0 malikr@xxxxxxxx <172.26.30.23:3179>: no match found

The line above is strange, because the previous jobs had identical submit
files apart from the file paths.

Obvious question, but do you have (or did you have?) "Unclaimed" machines
in your pool according to condor_status?
Try running "condor_status -state" and see how long these Unclaimed
machines have been Unclaimed (look at the StateTime column). Perhaps
these machines are being claimed and start running jobs, but then
immediately toss the job off? In that case, whenever you look, you would
typically see the machine Unclaimed and the job idle. This could happen
if, for example, the stdin file specified does not exist, or something
like that.
-Todd
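
One way to test the missing-stdin theory is a small shell sketch like the one below. It pulls the first "input = ..." line out of a submit file and reports whether that file exists. Note that check_submit_input is a hypothetical helper, not part of Condor, and it assumes relative paths are relative to the directory holding the submit file (it ignores initialdir and macros such as $(Process)).

```shell
# Sketch: report whether the stdin file named in a submit file exists.
# check_submit_input is a hypothetical helper, not a Condor tool.
check_submit_input() {
    submit_file=$1

    # Take the value of the first "input = <path>" line.
    # Submit commands are case-insensitive, hence grep -i.
    input_path=$(grep -i -E '^[[:space:]]*input[[:space:]]*=' "$submit_file" \
        | head -n 1 \
        | sed -e 's/^[^=]*=[[:space:]]*//' -e 's/[[:space:]]*$//')

    if [ -z "$input_path" ]; then
        echo "no input file specified"
        return 0
    fi

    # Assumption: a relative path is resolved against the submit file's
    # directory, i.e. condor_submit was run from there.
    case $input_path in
        /*) ;;
        *) input_path=$(dirname "$submit_file")/$input_path ;;
    esac

    if [ -e "$input_path" ]; then
        echo "ok: $input_path"
    else
        echo "missing: $input_path"
        return 1
    fi
}
```

Running "check_submit_input job.sub" (with job.sub standing in for the real submit file) on the submit machine prints either an "ok:" or a "missing:" line. A "missing:" result for the idle jobs would fit the pattern of machines being claimed and then dropping the job right away.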



-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Todd Tannenbaum                       University of Wisconsin-Madison
Condor Project Research               Department of Computer Sciences
tannenba@xxxxxxxxxxx                  1210 W. Dayton St. Rm #4257
http://www.cs.wisc.edu/~tannenba      Madison, WI 53706-1685
Phone: (608) 263-7132  FAX: (608) 262-9777