Hi,
We moved to memory limits enforced by cgroup v2. The reported memory usage of jobs is higher with cgroup v2 than with cgroup v1 monitoring because the page cache is counted. We would also appreciate it if Condor did not count the page cache in the memory usage of jobs.
Initially, we had problems with the "none" setting of CGROUP_MEMORY_LIMIT_POLICY in Condor, since the page cache was also counted in the memory usage. The CEs killed jobs that actually used less than the requested amount of memory but, with the page cache included, appeared to use more than four times the requested amount.
We now set custom cgroup settings:
CGROUP_MEMORY_LIMIT_POLICY = custom
CGROUP_HARD_MEMORY_LIMIT_EXPR = 2 * Target.RequestMemory
With that, the page cache cleanup gets triggered, and well-behaved jobs do not get killed. Jobs that need more than twice the requested memory still get killed.
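To illustrate what we mean by not counting the page cache, here is a rough Python sketch of how one might estimate a job's usage from the cgroup v2 files. This is not how Condor computes it; the function names are made up for this example, and it assumes that subtracting the "file" counter of memory.stat from memory.current is an acceptable approximation (note that "file" also covers tmpfs and shared memory).

```python
def parse_memory_stat(text):
    """Parse the 'key value' lines of a cgroup v2 memory.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def usage_without_page_cache(memory_current, memory_stat_text):
    """Estimate usage in bytes, excluding file-backed page cache.

    memory_current: integer content of the cgroup's memory.current file.
    memory_stat_text: raw content of the cgroup's memory.stat file.
    Assumption: the 'file' counter is a good enough stand-in for the
    page cache (it also includes tmpfs and shared memory).
    """
    stats = parse_memory_stat(memory_stat_text)
    return max(memory_current - stats.get("file", 0), 0)
```

One would feed it the contents of memory.current and memory.stat read from the job's cgroup directory; the exact path depends on the local cgroup layout.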
--
| Andreas Haupt | E-Mail: andreas.haupt@xxxxxxx | DESY, Zeuthen | WWW: http://www.zeuthen.desy.de/~ahaupt | Platanenallee 6 | Phone: +49/33762/7-7359 | D-15738 Zeuthen |