[HTCondor-users] Fair-share limits reached while there are whole machines are available and idle jobs
- Date: Tue, 20 Nov 2018 15:47:34 -0600
 
- From: Alec Sheperd <alec.sheperd@xxxxxxxxxxxxxxxx>
 
- Subject: [HTCondor-users] Fair-share limits reached while there are whole machines are available and idle jobs
 
Hello,
I recently noticed something strange with our condor pool. There are a 
lot of idle jobs in the queue, yet nearly as many slots are available. 
There are even whole machines with no jobs running, and still none of 
the idle jobs are allocated to these empty slots.
After digging around in the negotiator logs and ClassAds, it seems that 
many jobs are being rejected based on fair-share limits. There are many 
more rejections than matches, and as far as I can tell they are all due 
to fair-share limits.
Judging by the LastNegotiationCycleSubmittersShareLimit* ClassAd 
attribute, every submitter being rejected appears in the list it provides.
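
A minimal sketch of how that comparison could be checked, assuming the attribute value is a string of submitter names separated by commas and/or spaces. The sample value and the set of rejected submitters below are hypothetical placeholders, not data from this pool:

```python
import re
from typing import List, Set


def parse_share_limit_submitters(value: str) -> List[str]:
    """Split a comma/space separated submitter list into individual names."""
    return [name for name in re.split(r"[,\s]+", value.strip()) if name]


def rejected_but_not_listed(rejected: Set[str], attr_value: str) -> Set[str]:
    """Return rejected submitters that are NOT in the share-limit list."""
    return rejected - set(parse_share_limit_submitters(attr_value))


# Hypothetical attribute value and hypothetical rejected submitters.
sample_value = "alice@example.org, bob@example.org"
sample_rejected = {"alice@example.org", "bob@example.org"}

# An empty result means every rejected submitter hit the fair-share limit.
print(rejected_but_not_listed(sample_rejected, sample_value))
```
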
These jobs are all submitted from the default <none> group, which has 
the surplus flag set. The negotiator log shows "Group 
<none> is using its quota 2629 - halting negotiation".
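
To see how often this happens, the halting messages can be tallied per group. This is a rough sketch that assumes the message appears in the log exactly as quoted above; the sample lines are made up for illustration, and reading the real log file is left out:

```python
import re
from collections import Counter

# Pattern built from the message quoted above; the exact log format
# (timestamps, prefixes) is an assumption, so search() is used, not match().
HALT_RE = re.compile(
    r"Group (?P<group>\S+) is using its quota (?P<quota>\d+(?:\.\d+)?)"
    r" - halting negotiation"
)


def count_quota_halts(lines):
    """Count 'halting negotiation' messages per (group, quota) pair."""
    halts = Counter()
    for line in lines:
        m = HALT_RE.search(line)
        if m:
            halts[(m.group("group"), float(m.group("quota")))] += 1
    return halts


# Illustrative lines only; not copied from a real negotiator log.
sample_log = [
    "Group <none> is using its quota 2629 - halting negotiation",
    "some unrelated line",
    "Group <none> is using its quota 2629 - halting negotiation",
]

for (group, quota), n in count_quota_halts(sample_log).items():
    print(group, quota, n)
```
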
Could something be wrong with user priorities and quotas that is 
preventing slot matches? I also wonder whether it's related to a bug 
fixed in 8.7.10:
(https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=6714) 
(https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=6750)
Thanks for any help or thoughts,
Alec