Re: [HTCondor-users] Trying to set-up DedicatedScheduler for parallel universe, but not preempting serial jobs
- Date: Tue, 06 Dec 2016 17:14:30 +0000
- From: Jaime Frey <jfrey@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] Trying to set-up DedicatedScheduler for parallel universe, but not preempting serial jobs
> On Dec 6, 2016, at 9:47 AM, Carsten Aulbert <carsten.aulbert@xxxxxxxxxx> wrote:
>
> Hi all
>
> I'm currently stuck with this problem and don't really know where to
> continue.
>
> I've set-up two execute nodes with four cores each to only run my jobs (
>
> START = Owner == "carsten"
>
> ). The set-up is with fully partitionable slots, i.e.
>
> SLOT_TYPE_1 = ram=15023, swap=0%
> SLOT_TYPE_1_PARTITIONABLE = True
>
> In the absence of any jobs running on the execute machines, I get both via
>
> machine_count = 2
> request_cpus = 4
>
> So far so good. To allow both parallel and serial jobs I followed the
> second policy of
> https://research.cs.wisc.edu/htcondor/manual/v8.4/3_12Setting_Up.html#SECTION004128000000000000000
> with the only exception for START described above and
>
> PREEMPT = Scheduler =!= $(DedicatedScheduler)
>
> after both PREEMPT = True as well as PREEMPT = false did not really work
> out:
>
> I started four single core jobs on one of the execute nodes and
> hoped/expected HTCondor to preempt those to launch the parallel job, but
> so far to no avail. The empty second execute node is fully matched but
> no preemption occurs on the first one.
>
> Other things tried so far:
>
> * setting ALLOW_PSLOT_PREEMPTION = True on negotiator, schedd and
> execute node
> * reducing various timers to check if these have any say in preemption,
> but so far blanks only:
>
> MaxJobRetirementTime = 1
> MachineMaxVacateTime = 10
> CLAIM_WORKLIFE = 60
>
> As I'm running out of options, anyone succeeded with such a set-up?
For this setup to work, you will need ALLOW_PSLOT_PREEMPTION=True, since you want one 4-core request to preempt four 1-core requests.
It appears that pslot preemption doesn't work for parallel jobs, but that would be easy to fix. I will work on getting this in for a future release.
With the current release, you can instead set ALLOW_PSLOT_PREEMPTION=False and have your parallel job request 8 single-core allocations, like this:
machine_count = 8
request_cpus = 1
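For context, here is a minimal sketch of how those two lines might sit in a complete parallel universe submit description. The 8 comes from the setup described above (2 execute nodes x 4 cores each). The idea, as far as I can tell, is that each 1-core request can preempt one existing 1-core dynamic slot directly, so no pslot preemption is needed. The executable, arguments, and file names are placeholders assumed for illustration, not taken from the original job.

```
# Sketch only: executable, arguments, and file names are hypothetical placeholders.
universe      = parallel
executable    = /bin/sleep
arguments     = 60

# 2 nodes x 4 cores = 8 single-core allocations instead of 2 four-core ones
machine_count = 8
request_cpus  = 1

# $(Node) expands to the node number within the parallel job
output        = parallel.$(Node).out
error         = parallel.$(Node).err
log           = parallel.log

queue
```

Note that the eight allocations are not guaranteed to land four per machine in any particular order, so a job that cares about placement would need to check where each node ended up.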
Thanks and regards,
Jaime Frey
UW-Madison HTCondor Project