
Re: [Condor-users] Flocking drawback



On Mon, Sep 19, 2005 at 11:26:05PM -0500, Thomas Materna wrote:
> Hi,
> I have major problems with flocking. I have a pool A of 3 computers sharing
> a filesystem. I have a pool B of 20 computers not sharing the same file
> system as the pool A. A flocks to B. I have a bunch of jobs submitted from A
> in standard universe but I have a very bad priority since I've been doing
> that a lot lately. Another user also has a whole bunch of jobs submitted to A,
> but his are in the vanilla universe, and he added a requirement to his submit
> file of the form
> ((Machine==A1) || (Machine==A2)...) where A1, A2 are the machines in the
> pool A. 
>  
> Well, he will never run on pool B, but he prevents me from running on it!
> What happens is that at every cycle, having a better priority, he claims all
> the machines in pool B, so my jobs cannot claim them. Only then do his jobs
> reject the machines for not meeting the requirement. I have 20 machines
> doing nothing!
>  
> How can I get around that? Is there a way to keep jobs from claiming machines
> they won't agree to run on anyway? If not, I consider it a major flaw.

Condor will not match jobs with machines that do not meet the requirements
of the job. Can you give an example of a job (condor_q -l jobid) and a
machine (condor_status -name <machinename> -l) that were matched when they
should not have been?
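
To illustrate the expected behaviour, here is a toy model in Python. It is
only a sketch of requirement-based filtering, not Condor's negotiator code;
the machine names and both helper functions are made up for the example. A
job whose requirements name only pool A machines should produce no matches
against pool B machine ads:

```python
# Toy model of requirement-based matching; NOT Condor's actual negotiator.
# Machine names and helper functions are hypothetical, for illustration only.

def job_requirements(machine_ad):
    """Stand-in for a requirement of the form (Machine == A1) || (Machine == A2)."""
    return machine_ad["Machine"] in {"a1.example.org", "a2.example.org"}


def matching_machines(requirements, machine_ads):
    """Return the names of machines whose ad satisfies the job's requirements."""
    return [ad["Machine"] for ad in machine_ads if requirements(ad)]


# Simplified machine ads: only the Machine attribute is modelled here.
pool_a = [{"Machine": "a1.example.org"}, {"Machine": "a2.example.org"}]
pool_b = [{"Machine": "b1.example.org"}, {"Machine": "b2.example.org"}]

matches_in_a = matching_machines(job_requirements, pool_a)
matches_in_b = matching_machines(job_requirements, pool_b)

print("pool A matches:", matches_in_a)
print("pool B matches:", matches_in_b)
```

If the real job and machine ads show a match where this kind of check would
fail, comparing the two ads attribute by attribute should show why.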

-Erik