
Re: [HTCondor-users] Preempt a job when memory usage is higher than requested, only if total system memory is getting low



Hi Todd,

thanks for the explanations and hints.

So I tried setting the policy to hard, to no avail. It looked like
cgroups were not working at all!

Then I looked into the StarterLog (always a good idea, isn't it?) and
found that the Starter was complaining about cgroups not being
available. They are available, though, just in the cgroup2 flavor, not
cgroup1.
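
In case it helps anyone else: the log location can be asked of HTCondor
itself rather than guessed (condor_config_val and the STARTER_LOG knob
are standard):

condor_config_val STARTER_LOG
# per-slot starter logs typically sit next to it, e.g. StarterLog.slot1_1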

The changelog says that cgroup2, which is the default on Debian 11, has
preliminary support starting with HTCondor 10.2. Since I run HTCondor
10.0.1, I changed the kernel parameters to add:

systemd.unified_cgroup_hierarchy=false quiet
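
For reference, one way to make that persistent on a stock Debian GRUB
setup (assuming the usual /etc/default/grub layout):

# /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="systemd.unified_cgroup_hierarchy=false quiet"

# then regenerate the GRUB config and reboot:
#   update-grub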

I then got these mountpoints:

cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
cgroup  on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,name=systemd)
cgroup  on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup  on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup  on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup  on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup  on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma)
cgroup  on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup  on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup  on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup  on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup  on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
cgroup  on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
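
A quicker check than eyeballing the mount table, if you only care about
the cgroup flavor:

stat -fc %T /sys/fs/cgroup/
# prints "cgroup2fs" on a pure cgroup2 host, "tmpfs" on a hybrid/cgroup1
# layout like the one above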


With this submit file:
======================

# memory_hog is an infinite loop of malloc(100000)
Requirements   = (Machine == "work1.sta.buf.com")
executable     = memory_hog
request_memory = 4000
queue
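
For completeness, memory_hog is nothing fancier than this kind of loop
(a minimal sketch matching the description above, not the exact
program; the memset makes the kernel actually commit the pages):

#include <stdlib.h>
#include <string.h>

int main(void)
{
    /* allocate 100000-byte blocks forever and touch them, so resident
       memory grows until the cgroup limit (or the OOM killer) kicks in */
    for (;;) {
        char *p = malloc(100000);
        if (p == NULL)
            return 1;
        memset(p, 1, 100000);
    }
}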


Log before change (cgroup2):
============================

About to exec memory_hog
Create_Process succeeded, pid=1534
Cgroup controller for I/O statistics is not available.
Cgroup controller for memory accounting is not available.
Cgroup controller for CPU is not available.
Unable to set CPU shares because cgroup is invalid.
Unable to set memory limit because cgroup is invalid.
Unable to set memory limit because cgroup is invalid.
Unable to set memory limit because cgroup is invalid.
Memcg is not available; OOM notification disabled for starter.

Log after change (cgroup1):
===========================

About to exec memory_hog
Create_Process succeeded, pid=1327
Limiting (soft) memory usage to 0 bytes
Limiting memsw usage to 9223372036854775807 bytes
Limiting (hard) memory usage to 4294967296 bytes
Limiting (soft) memory usage to 3865051136 bytes
Job was held due to OOM event: Job has gone over memory limit of 3686 megabytes. Peak usage: 3910 megabytes.
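
(The numbers line up: 4294967296 bytes is exactly 4096 MB, so the slot
was provisioned a bit above the 4000 MB requested, and the 3865051136
bytes of the soft limit are exactly 3686 MB, i.e. roughly 90% of the
hard cap, matching the hold message.)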


So now I can implement your advice of CGROUP_MEMORY_LIMIT_POLICY=hard.
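
For the archives, the knob goes in the execute node configuration
(e.g. a file under /etc/condor/config.d, followed by a condor_reconfig):

CGROUP_MEMORY_LIMIT_POLICY = hard
# "soft" instead sets a limit the kernel only enforces under memory
# pressure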

I'll also try the soft policy, to see whether it comes close to the
"enforce limits only when memory pressure is high" behaviour I was
after. I'm not sure I understand whether your 4th point enables this
or not.


To my knowledge, we do not use cgroup limits in other parts of the
system, so there should be little to no side effect from switching the
cgroup version.
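
A quick way to double-check that, assuming any such limit would come
from local systemd units:

grep -rl -e MemoryLimit -e MemoryMax /etc/systemd/system /run/systemd/system 2>/dev/null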



Thanks and enjoy the weekend!

-- 
Charles