Re: [Condor-users] Tracing why nodes reject jobs?
- Date: Fri, 16 Jul 2010 11:10:03 +0100
- From: Ian Cottam <ian.cottam@xxxxxxxxxxxxxxxx>
- Subject: Re: [Condor-users] Tracing why nodes reject jobs?
Is it memory? Try adding the Requirement Memory > 0
and also
Rank = Memory
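
A minimal sketch of how those two lines might sit in the job's submit description file. The executable name is a placeholder and not from the original post; only the standard universe, the Memory requirement, and the Rank expression come from this thread. As far as I understand it, condor_submit appends its own memory clause to Requirements only when the expression does not already mention Memory, so referring to Memory explicitly should keep that automatic clause out of the picture.

```
# Hypothetical submit description file fragment (names are placeholders)
universe     = standard
executable   = my_analysis

# Mention Memory explicitly so any machine reporting memory can match
requirements = (Memory > 0)

# Among matching machines, prefer the ones with more memory
rank         = Memory

queue
```

Comments are kept on their own lines because a "#" after a value would be read as part of that value.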
-Ian
[currently out of office]
On 15 Jul 2010, at 16:52, "Jonathan D. Proulx" <jon@xxxxxxxxxxxxx> wrote:
> Hi All,
>
> I have a user who queued a couple hundred identical Standard Universe
> jobs (well the parameters were a little different but the class ads
> were the same), most completed but 15 are hanging around in idle state
> after having accumulated some runtime, but will no longer match any
> execute nodes:
>
> ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
> ---
> 78745.005: Run analysis summary. Of 429 machines,
> 19 are rejected by your job's requirements
> 410 reject your job because of their own requirements
> 0 match but are serving users with a better priority
> in the pool
> 0 match but reject the job for unknown reasons
> 0 match but will not currently preempt their existing job
> 0 match but are currently offline
> 0 are available to run your job
> Last successful match: Tue Jul 6 17:33:30 2010
> Last failed match: Thu Jul 15 11:46:30 2010
> Reason for last match failure: no match found
>
> The 19 rejected for Job requirements are clear (wrong ARCH), the 410
> for node requirements is odd in several ways:
>
> 1) there are 410 total systems available and 344 are currently claimed
> so I'd expect those to be either "match but are serving users with a
> better priority in the pool" or "match but will not currently preempt
> their existing job"
>
> 2) clearly they used to match some nodes, or the job wouldn't have had runtime
>
> Where/how can I see why a specific node rejects a given job?
>
> -Jon