Re: [HTCondor-users] Preempt a job when memory usage to higher than requested, only if total system memory is getting low
- Date: Wed, 15 Feb 2023 20:04:06 +0100 (CET)
- From: "Beyer, Christoph" <christoph.beyer@xxxxxxx>
- Subject: Re: [HTCondor-users] Preempt a job when memory usage to higher than requested, only if total system memory is getting low
Hi Charles,
did you check the option CGROUP_MEMORY_LIMIT_POLICY? I think it does pretty much what you want if you set it to soft.
The configuration variable CGROUP_MEMORY_LIMIT_POLICY controls this. If CGROUP_MEMORY_LIMIT_POLICY is set
to the string hard, the hard limit will be set to the slot size, and the soft limit to 90% of the slot size. If set to soft, the
soft limit will be set to the slot size and the hard limit will be set to the memory size of the whole startd. By default, this
whole size is the detected memory size, minus RESERVED_MEMORY. Or, if MEMORY is defined, that value is
used.
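To make the two policies concrete, here is a minimal Python sketch of the arithmetic described above. It is only an illustration of the quoted text, not HTCondor code: the function name and its parameters are hypothetical, and all sizes are assumed to be in MB.

```python
from typing import Optional, Tuple


def cgroup_limits_mb(policy: str,
                     slot_mb: int,
                     detected_mb: int,
                     reserved_mb: int = 0,
                     memory_mb: Optional[int] = None) -> Tuple[int, int]:
    """Return (soft_limit, hard_limit) in MB for a slot, per the quoted description.

    Hypothetical helper: the names are illustrative, not HTCondor APIs.
    """
    # Size of the whole startd: MEMORY if defined, else detected minus reserved.
    startd_mb = memory_mb if memory_mb is not None else detected_mb - reserved_mb

    if policy == "hard":
        # Hard limit at the slot size, soft limit at 90% of the slot size.
        return slot_mb * 9 // 10, slot_mb
    if policy == "soft":
        # Soft limit at the slot size, hard limit at the whole startd's memory.
        return slot_mb, startd_mb
    raise ValueError(f"unknown policy: {policy!r}")


if __name__ == "__main__":
    print(cgroup_limits_mb("hard", 4096, 65536, reserved_mb=1024))
    print(cgroup_limits_mb("soft", 4096, 65536, reserved_mb=1024))
```

With soft, a job can grow past its slot size as long as the machine as a whole still has memory to give.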
We use the system periodic hold to put a limit on it (up to 3 times the requested memory is tolerated):
HoldOverMem = (ifThenElse(ResidentSetSize =!= UNDEFINED, ResidentSetSize,1) > 3000 * RequestMemory)
HoldOverMemReason = "Memory usage higher than 3 x requested memory"
SYSTEM_PERIODIC_HOLD = $(HoldOverMem)
SYSTEM_PERIODIC_HOLD_REASON = $(HoldOverMemReason)
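The factor 3000 in HoldOverMem can look odd next to "3 x". The Python sketch below mirrors the expression under the assumption that ResidentSetSize is reported in KiB and RequestMemory in MiB, so 3000 KiB per requested MiB is roughly three times the request. The function and parameter names are hypothetical.

```python
from typing import Optional


def hold_over_mem(resident_set_size_kib: Optional[int],
                  request_memory_mib: int) -> bool:
    """Mirror of the HoldOverMem expression (hypothetical helper).

    Assumes ResidentSetSize is in KiB and RequestMemory is in MiB.
    """
    # ifThenElse(ResidentSetSize =!= UNDEFINED, ResidentSetSize, 1):
    # fall back to 1 so an undefined value never triggers a hold.
    rss_kib = resident_set_size_kib if resident_set_size_kib is not None else 1
    # 3000 KiB per requested MiB is slightly under 3 x (3 * 1024 = 3072).
    return rss_kib > 3000 * request_memory_mib


if __name__ == "__main__":
    # A job that requested 1024 MiB is held above 3,072,000 KiB resident.
    print(hold_over_mem(3_100_000, 1024))
    print(hold_over_mem(2_000_000, 1024))
    print(hold_over_mem(None, 1024))
```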
Best
christoph
--
Christoph Beyer
DESY Hamburg
IT-Department
Notkestr. 85
Building 02b, Room 009
22607 Hamburg
phone:+49-(0)40-8998-2317
mail: christoph.beyer@xxxxxxx
----- Original Message -----
From: "Charles Goyard" <cgoyard@xxxxxxx>
To: "HTCondor-Users Mail List" <htcondor-users@xxxxxxxxxxx>
Sent: Wednesday, 15 February 2023 19:49:42
Subject: [HTCondor-users] Preempt a job when memory usage to higher than requested, only if total system memory is getting low
Hi,
now that we have dynamic slots on our pool, we enjoy the noisy neighbor
problem.
That is, some users correctly set their request_memory parameter, and
some don't. This can lead to an unfair situation where badly configured
jobs penalize the good citizens.
I found the configuration template to evict jobs that use more than
requested, and I'm planning to put it to good use. But let's add a grain
of salt.
What I would like to achieve is to allow jobs to eat more cake than
expected, as long as there is no memory pressure at the system level (a
bit like how group quota surplus works).
How can I come up with an expression that evaluates the total free (or
used) memory on a compute node? Can I gather memory information from
other slots?
Something like:
PREEMPT=( (MemoryUsage > Memory) && ( SumOfMemoryUsageAcrossSlots > (
TotalComputerMemory * 0.95 ) ) )
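The intent of that expression can be written as a small Python sketch. This is not HTCondor configuration: SumOfMemoryUsageAcrossSlots and TotalComputerMemory are placeholders in the expression above, and the function and parameter names here are equally hypothetical.

```python
from typing import Sequence


def should_preempt(job_usage_mb: int,
                   slot_memory_mb: int,
                   all_slot_usage_mb: Sequence[int],
                   total_memory_mb: int,
                   pressure_fraction: float = 0.95) -> bool:
    """Preempt only if the job exceeds its slot AND the node is under pressure."""
    # The job uses more than its slot provides (MemoryUsage > Memory).
    over_request = job_usage_mb > slot_memory_mb
    # Usage summed over all slots exceeds 95% of the machine's memory.
    under_pressure = sum(all_slot_usage_mb) > total_memory_mb * pressure_fraction
    return over_request and under_pressure


if __name__ == "__main__":
    # Greedy job on a quiet node is tolerated.
    print(should_preempt(3000, 2048, [3000, 1000], 8192))
    # Same job on a nearly full node is preempted.
    print(should_preempt(3000, 2048, [3000, 5000], 8192))
```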
Thanks!
--
Charles