
Re: [HTCondor-users] Dynamic slots and limits by job profiles



Hi Charles,

this may be a bit hacky, but could you use 'licenses' via concurrency limits or consumption policies? The idea would be to define 'Rendering', 'Simulation', ... as resources that jobs can request, and to include them in the dynamic slot's definition as 'consumable', so that each heavy job landing on a dynamic slot uses up one such consumable.

Cheers,
  Thomas

https://htcondor.readthedocs.io/en/latest/admin-manual/setting-up-special-environments.html#concurrency-limits
https://chtc.cs.wisc.edu/uw-research-computing/licensed-software.html
https://htcondor.readthedocs.io/en/latest/admin-manual/policy-configuration.html
https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=ConsumptionPolicies
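
The 'consumable' idea above can be pictured with a small toy model. This is only a sketch of the accounting, not HTCondor configuration or any HTCondor API: the names `LIMITS`, `MAX_JOBS` and `can_accept` are made up for illustration, and the numbers are taken from the limits in the quoted question (one Simulation, two Rendering, four jobs in total on a high-end machine).

```python
from collections import Counter

# Hypothetical per-machine consumables; job types without an entry are unlimited.
LIMITS = {"Simulation": 1, "Rendering": 2}
MAX_JOBS = 4  # assumed job capacity of a high-end machine


def can_accept(running, job_type, limits=LIMITS, max_jobs=MAX_JOBS):
    """Return True if one more job of job_type fits next to the running jobs."""
    if len(running) >= max_jobs:
        return False  # no free slot left on the machine
    limit = limits.get(job_type)
    # A job type with no limit only needs a free slot.
    return limit is None or Counter(running)[job_type] < limit


# Offer jobs one by one and keep those that still fit.
offered = ["Simulation", "Simulation", "Rendering", "Rendering",
           "Rendering", "Compositing", "Compositing"]
running = []
for job in offered:
    if can_accept(running, job):
        running.append(job)

print(running)
```

In this model the second Simulation and the third Rendering job are turned away because their consumable is used up, and the last Compositing job is turned away only because the machine is full.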

On 23/03/2023 11.26, Charles Goyard wrote:
Hi all,

Once again, I'm requesting some help!


On our HTCondor pool (v10.0), we have dynamic slots enabled.

Smaller machines can accept 1 job, medium ones accept 2 jobs, and the
higher-end computers accept up to 4 jobs.

Say our users run 3 types (or profiles) of jobs:

- Compositing
- Rendering
- Simulation

The jobs declare a property with the job type (in an env var for example).

Running multiple Compositing tasks at once does not cause a problem, but
having 4 Rendering or Simulation jobs on one machine is not really great.

What I would like to achieve is to define an expression on the execute
nodes that says:

- accept at most one Simulation job at once.
- accept at most two Rendering jobs at once.
- accept any number of Compositing jobs.

So, for example, a single execute node could run:

sim.    render  comp.   total
-----------------------------
1       2       1       4
1       1       2       4
1       0       3       4
0       2       2       4
0       1       3       4
0       0       4       4
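
As a quick cross-check of the table above, the full-machine mixes can be enumerated from the three rules. This is a minimal sketch, assuming a machine that takes exactly 4 jobs; the names `MAX_SIM`, `MAX_RENDER`, `TOTAL` and `full_mixes` are illustrative only and have nothing to do with HTCondor itself.

```python
from itertools import product

MAX_SIM = 1     # at most one Simulation job at once
MAX_RENDER = 2  # at most two Rendering jobs at once
TOTAL = 4       # assumed number of jobs a high-end machine accepts


def full_mixes(max_sim=MAX_SIM, max_render=MAX_RENDER, total=TOTAL):
    """List (sim, render, comp) mixes that fill the machine within the limits."""
    mixes = []
    # Count down so the rows come out in the same order as the table.
    for sim, render in product(range(max_sim, -1, -1),
                               range(max_render, -1, -1)):
        comp = total - sim - render  # Compositing fills the remaining slots
        if comp >= 0:
            mixes.append((sim, render, comp))
    return mixes


print("sim.", "render", "comp.", "total", sep="\t")
for row in full_mixes():
    print(*row, sum(row), sep="\t")
```

With these limits the enumeration yields the same six rows as the table, so the list of cases is complete for a fully loaded 4-job machine.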


Since there are only 6 cases, I'm OK with building a super-long
expression :).

I found a recipe that does this kind of thing, but for static
slots, here: https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToReserveSlotForSpecialJobs

But I can't get my head around how to achieve this setup with dynamic
slots. How can I get information on the types of jobs running at a given
time on an execute node? (This sounds a bit like my question about how
to count the number of dynamic slots, discussed in December.)

Thanks,
