[HTCondor-users] Limiting max number of running jobs for a group
- Date: Thu, 21 Sep 2017 11:06:55 +0200
- From: Antonio Delgado Peris <antonio.delgadoperis@xxxxxxxxx>
- Subject: [HTCondor-users] Limiting max number of running jobs for a group
Dear all,
This is my first message to the list, so I'll start by introducing myself
:-) I am writing from the CIEMAT institute in Madrid, Spain, where we have
recently installed an HTCondor cluster (with an HTCondor-CE in front of
it). We're still in the testing phase, but should be moving to
production fairly soon. We'll be serving mostly (but not exclusively) the
LHC CMS experiment.
So, moving on to my question... we've defined some hierarchical dynamic
group quotas, with surplus allowed, which is nice because we want smaller
groups to be able to use the farm if CMS is not running for some reason.
However, we would also like to limit their expansion, so that they
cannot occupy the whole farm (so that CMS can take the farm back faster
when its jobs return).
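
For illustration, the kind of setup described above might look roughly
like the sketch below. The group names and fractions are hypothetical
placeholders, not our actual values, and the hierarchy (sub-groups) is
left out for brevity; the knobs themselves are the negotiator's
group-quota settings.

```
# Hypothetical group names, for illustration only
GROUP_NAMES = group_cms, group_other

# Dynamic quotas: fractions of the pool rather than fixed slot counts
# (the values here are assumptions)
GROUP_QUOTA_DYNAMIC_group_cms   = 0.8
GROUP_QUOTA_DYNAMIC_group_other = 0.2

# Allow groups to use idle capacity beyond their own quota
GROUP_ACCEPT_SURPLUS = True
```

With surplus enabled like this, nothing stops group_other from growing
to fill the whole pool while group_cms is idle, which is the behaviour
we would like to cap.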
Naively, this would be like having both dynamic (soft, fair-share-like)
quotas and static (hard) quotas for some groups. But the manual says
that if you define both a dynamic and a static quota for a group, the
dynamic one is ignored.
I have looked for another parameter like 'MAX_RUNNING_JOBS_PER_GROUP'
but haven't found anything like that. I have also tried to code some
logic in the START expression using 'SubmitterGroupResourcesInUse', but
it didn't work (I think that attribute is only usable for preemption...
which we don't allow).
We have worked around the situation by reserving some named nodes for
CMS, but I am still curious whether there might be a less static solution
to the problem, i.e. one not tied to a fixed set of nodes, but just
stating a maximum number of simultaneously running jobs.
Thanks for any hints. (And sorry if this question has been answered
before... I couldn't find it.)
Cheers,
Antonio