
Re: [Condor-users] Job requirements not satisfied even when Requirements = TRUE



Hi Mark:

On Wed, 2011-08-31 at 20:14 -0700, Mark Cafaro wrote:
> David,
> 
> If I understand you correctly, where our issues differ is that this
> occurs intermittently for you but occurs constantly for us. I can't get
> a single job to run.

You may be correct, but my hunch is that we are experiencing the same
issue. Our pool has 1,000+ slots, so a problem confined to one node looks
intermittent here; in a smaller pool it would likely show up far more often.
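
As a rough sketch of that reasoning (not a measurement from either pool),
assume hypothetically that matches are spread evenly across slots and that
a fixed number of slots always reject the claim. The function name and the
numbers below are illustrative assumptions only:

```python
def expected_failure_fraction(bad_slots: int, total_slots: int) -> float:
    """Fraction of matches expected to land on a slot that rejects the job.

    Assumes (hypothetically) that every slot is equally likely to be
    matched and that a bad slot rejects every claim it receives.
    """
    if total_slots <= 0:
        raise ValueError("total_slots must be positive")
    if not 0 <= bad_slots <= total_slots:
        raise ValueError("bad_slots must be between 0 and total_slots")
    return bad_slots / total_slots


# One bad slot in a large pool versus the same bad slot in a tiny pool.
for total in (1000, 10, 1):
    fraction = expected_failure_fraction(1, total)
    print(f"{total:>5} slots: {fraction:.1%} of matches fail")
```

Under those assumptions, one bad slot out of 1,000 rejects about 0.1% of
matches, while a pool consisting only of that slot rejects every match.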

> -Mark
> 
> 
> On Aug 31, 2011, at 8:05 PM, David J. Herzfeld wrote:
> 
> > Hi Garrett:
> > 
> > On Thu, 2011-09-01 at 02:45 +0000, Koller, Garrett wrote:
> >> Mr. Cafaro,
> >> 
> >> I'm confused.  I thought the problem was that the job kept being
> >> rejected with the error "Job requirements not satisfied."  
> > 
> > While I will not speak for Mark, I can speak for the issues that I have
> > encountered (which appear to be at least superficially similar). Yes,
> > the error you quoted is correct. To be clear:
> > 
> > This happens after a successful negotiation and match with an available
> > startd (i.e. the job requirements and machine start expression ClassAds
> > match). The second requirements check, which happens on the execute
> > machine, fails with "Job requirements not satisfied" (the error shows up
> > in the startd log without ever spawning a starter). This is not a
> > negotiator error, so condor_q -analyze would not help.
> > 
> >> If that is so, how could it be matched in the MatchLog?  Was it just
> >> considered in the MatchLog or was it actually assigned to a specific
> >> slot on a specific computer?  If the MatchLog says it found a proper
> >> match and actually assigned it to that computer, check out
> >> http://servo.cs.wlu.edu/dokuwiki/doku.php/condor/submit/troubleshoot
> >> for a possible reason and solution to this problem.
> > 
> > The machines are matched correctly, but the initial execution of the job
> > executable by the starter never occurs, so I don't believe the
> > information on this page is relevant to this issue. Thanks for the
> > suggestion in any case. This clarification would likely be important to
> > any Condor developers looking at this issue.
> > 
> > DJH
> > 
> > _______________________________________________
> > Condor-users mailing list
> > To unsubscribe, send a message to condor-users-request@xxxxxxxxxxx with a
> > subject: Unsubscribe
> > You can also unsubscribe by visiting
> > https://lists.cs.wisc.edu/mailman/listinfo/condor-users
> > 
> > The archives can be found at:
> > https://lists.cs.wisc.edu/archive/condor-users/
> 