Re: [HTCondor-users] Not running Parallel-universe jobs?
- Date: Tue, 25 Aug 2015 16:58:37 +0000
- From: "Seering, Adam" <aseering@xxxxxx>
- Subject: Re: [HTCondor-users] Not running Parallel-universe jobs?
Thanks!
How would I go about finding machines that are in this state, though?
For example (we only have one dedicated scheduler):
"""
$ condor_status -constraint 'DedicatedScheduler =!= "DedicatedScheduler@<hostname>"'
$
"""
The command produces no output; I assume that means no machines are
found. (If I change that to "==", or I change the string to something
that's not our scheduler, then it prints out all machines in the
cluster.)
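
To spell out why I read the empty output that way, here is a small Python sketch of how I understand the ClassAd comparison operators to treat a missing attribute. It is only a toy model, not HTCondor code: the pool, slot names, and hostname below are made up, and the helper names (eq, ne, is_identical, is_not_identical, select) are mine. My understanding is that "==" and "!=" evaluate to UNDEFINED when the attribute is missing, that "=?=" and "=!=" always evaluate to true or false, and that a constraint only selects a machine when it evaluates to true.

```python
# Toy model of ClassAd comparison semantics; NOT the HTCondor API.
UNDEFINED = object()  # stands in for the ClassAd UNDEFINED value


def eq(a, b):
    """Model of `==`: UNDEFINED if either side is UNDEFINED."""
    if a is UNDEFINED or b is UNDEFINED:
        return UNDEFINED
    if isinstance(a, str) and isinstance(b, str):
        return a.lower() == b.lower()  # `==` on strings ignores case
    return a == b


def ne(a, b):
    """Model of `!=`: UNDEFINED if either side is UNDEFINED."""
    result = eq(a, b)
    return UNDEFINED if result is UNDEFINED else not result


def is_identical(a, b):
    """Model of `=?=`: always True or False, never UNDEFINED."""
    if a is UNDEFINED or b is UNDEFINED:
        return a is b
    return type(a) is type(b) and a == b


def is_not_identical(a, b):
    """Model of `=!=`: the negation of `=?=`."""
    return not is_identical(a, b)


def select(ads, attr, op, literal):
    """Return names of ads for which `attr <op> literal` is exactly True."""
    return [
        ad["Name"]
        for ad in ads
        if op(ad.get(attr, UNDEFINED), literal) is True
    ]


# Hypothetical pool: the third slot has lost its DedicatedScheduler attribute.
SCHED = "DedicatedScheduler@sched.example.org"
pool = [
    {"Name": "slot1@node1", "DedicatedScheduler": SCHED},
    {"Name": "slot1@node2", "DedicatedScheduler": SCHED},
    {"Name": "slot1@node3"},
]

print(select(pool, "DedicatedScheduler", is_not_identical, SCHED))
print(select(pool, "DedicatedScheduler", eq, SCHED))
print(select(pool, "DedicatedScheduler", ne, SCHED))
```

In this model "=!=" picks out only the slot that is missing the attribute (or has a different value), "==" picks out the slots that have it set to our scheduler, and "!=" picks out nothing for the slot where the attribute is missing. If that matches the real behaviour, then an empty result from my "=!=" query would mean every machine still advertises our dedicated scheduler.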
Adam
On Tue, 2015-08-25 at 11:20 -0400, Michael V Pelletier wrote:
> The -analyze and -better-analyze options still report machines
> which don't have the DedicatedScheduler attribute set as "available"
> to run a parallel universe job:
>
> 008.000: Run analysis summary. Of 6 machines,
> 0 are rejected by your job's requirements
> 0 reject your job because of their own requirements
> 0 match and are already running your jobs
> 0 match but are serving other users
> 6 are available to run your job
>
> This is what shows up when I submit a 3-machine_count parallel job to
> a static-slot pool which only has two slots with DedicatedScheduler
> set. If you set a job requirement of
> ( ! isUndefined(DedicatedScheduler) ), or some sort of more sophisticated
> expression to match the dedicated scheduler to which the job was
> submitted, then the analysis will show you a clearer picture:
>
> 009.000: Run analysis summary. Of 6 machines,
> 4 are rejected by your job's requirements
> 0 reject your job because of their own requirements
> 0 match and are already running your jobs
> 0 match but are serving other users
> 2 are available to run your job
>
> (Feature request for 8.2.10?)
>
> Check section 2.9.2 in the 8.2.9 manual for more details about the
> DedicatedScheduler attribute. A parallel job will only run on a slot
> with the DedicatedScheduler attribute - maybe some of the other
> machines lost that in the wake of your recent disruption if you're
> expecting the job to run on the six available machines.
>
> As to the current job you're waiting on, once those 41 slots which are
> running open up, then your job will be dispatched.
>
> Michael V. Pelletier
> IT Program Execution
> Principal Engineer
> 978.858.9681 (5-9681) NOTE NEW NUMBER
> 339.293.9149 cell
> 339.645.8614 fax
> michael.v.pelletier@xxxxxxxxxxxx