Re: [HTCondor-users] Memory accounting issue with cgroups
- Date: Sat, 20 May 2023 18:43:03 -0500
- From: Greg Thain <gthain@xxxxxxxxxxx>
On 5/20/23 5:03 AM, Marco van Zwetselaar wrote:
> I guess my mental picture of memory.high as a yellow card, and
> memory.max as the red card was incorrect. It's more like rugby: the
> referee's stare is enough. :-)
Hi Marco:
I'm glad it is working for you now. We don't have a lot of experience
with the policy settings for cgroup v2, and would be eager to hear
experiences or advice on what they should be set to. The kernel docs
are a little vague about the difference between "high" and "max",
saying that usually a cgroup gets OOM killed when it hits "high", but in
some cases can go all the way up to "max" before the OOM arrives. It
isn't clear to me if this means maybe a page or two more memory, in
order to deliver the signal, or potentially some unbounded amount of
memory. Given that, I chose to have condor only set "max".
If you will excuse me stretching your metaphor, "high" is the moment the
red card goes into the air, but "max" is when the guilty party actually
leaves the pitch. "memory.min" is like our youth leagues here, where
there is an unwritten understanding that if one team can't field some
minimum number of players (seven?), the opposing team (if able) will
loan them some players in order that the kids can still get a game in
(despite a forfeit on the books). And I have no good idea right now
what htcondor should set "memory.low" to.
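For anyone who wants to check what a job's cgroup actually ended up with, here is a minimal sketch that reads the memory settings discussed above from a cgroup v2 directory. It is not HTCondor code. The function name and the directory argument are hypothetical, and the only assumption about the kernel interface is that each file holds either a byte count or the literal string "max" (no limit).

```python
from pathlib import Path

# cgroup v2 memory interface files mentioned in this thread.
MEMORY_KNOBS = (
    "memory.min",
    "memory.low",
    "memory.high",
    "memory.max",
    "memory.swap.max",
)


def read_memory_knobs(cgroup_dir):
    """Return a dict mapping each knob found in cgroup_dir to its value.

    A value of None means the file contained "max" (no limit);
    otherwise the value is the limit in bytes. Knobs whose files are
    missing (for example, when swap accounting is unavailable) are skipped.
    """
    settings = {}
    for knob in MEMORY_KNOBS:
        path = Path(cgroup_dir) / knob
        if not path.exists():
            continue
        raw = path.read_text().strip()
        settings[knob] = None if raw == "max" else int(raw)
    return settings
```

Pointing it at the job's directory under /sys/fs/cgroup (the exact path depends on the local setup) shows at a glance which of the limits are set and which are still at "max".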
> On a side note to the Condor devs: my config has 'DISABLE_SWAP_FOR_JOB
> = true'. Shouldn't that translate to 'memory.swap.max = 0' on the
> cgroup (currently shows "max")?
The cgroup v2 code path doesn't set this. I'll write a PR to fix this.
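Until that lands, the effect being asked for can be sketched by hand: writing 0 to memory.swap.max in the job's cgroup directory. This is only an illustration of the kernel interface, not the HTCondor fix. The function name and directory argument are hypothetical, and on a real system the write needs sufficient privileges on the cgroup and a kernel with swap accounting available.

```python
from pathlib import Path


def disable_swap(cgroup_dir):
    """Set memory.swap.max to 0 so the cgroup may not use any swap.

    Returns the value read back from the file, which should be "0"
    if the write was accepted.
    """
    swap_max = Path(cgroup_dir) / "memory.swap.max"
    swap_max.write_text("0\n")
    return swap_max.read_text().strip()
```

Reading the value back is a cheap sanity check: if the file still shows "max" afterwards, the write did not take effect.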
Thanks,
-greg