
Re: [HTCondor-users] Cgroups v2 and memory limits for WLCG sites



Hi,
We have moved to enforcing memory limits with cgroups v2. The reported memory usage of jobs is higher with cgroups v2 than with cgroups v1 monitoring because the page cache is counted. We would also appreciate it if Condor did not count the page cache in the memory usage of the jobs.

Initially, we had problems with the "none" CGROUP_MEMORY_LIMIT_POLICY in Condor, since the page cache was also accounted for in the memory usage. The CEs killed jobs that used less than the requested amount of memory but, with the page cache included in the memory usage, appeared to "use" more than four times the requested amount.
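To illustrate the accounting described above: in cgroups v2 the job's memory.current includes file-backed pages, and memory.stat breaks the total down into counters such as "anon" and "file". The following is a minimal sketch, not HTCondor's own accounting; the function names and the sample numbers are hypothetical.

```python
def parse_memory_stat(text):
    """Parse the 'key value' lines of a cgroup v2 memory.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1].isdigit():
            stats[fields[0]] = int(fields[1])
    return stats


def split_usage(memory_current, stats):
    """Separate the page cache share from the total charged to the cgroup.

    Note: the "file" counter also covers tmpfs and shared memory, so this
    is only an approximation of "memory without page cache".
    """
    file_bytes = stats.get("file", 0)
    return {
        "total": memory_current,
        "page_cache": file_bytes,
        "without_page_cache": max(memory_current - file_bytes, 0),
    }


# Hypothetical job: 2 GiB of anonymous memory, 6 GiB of page cache.
sample_stat = "anon 2147483648\nfile 6442450944\nkernel 104857600\n"
sample_current = 2147483648 + 6442450944 + 104857600

usage = split_usage(sample_current, parse_memory_stat(sample_stat))
print(usage)
```

On a real worker node the two inputs would come from the memory.current and memory.stat files in the job's cgroup directory; the path of that directory depends on the local setup.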
We now set custom cgroup settings:
        CGROUP_MEMORY_LIMIT_POLICY = custom
        CGROUP_HARD_MEMORY_LIMIT_EXPR = 2 * Target.RequestMemory

With that, page cache cleanup gets triggered, and normally behaving jobs do not get killed. Jobs that need more than twice the requested memory still get killed.
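The effect of the factor of two can be sketched with a little arithmetic. This is a simplified model, assuming RequestMemory is given in MiB and that the kernel frees reclaimable page cache before it kills anything at the hard limit; the helper names are hypothetical and real kernel reclaim is more involved.

```python
MIB = 1024 * 1024


def hard_limit_bytes(request_memory_mib, factor=2):
    """Hard limit implied by an expression like factor * RequestMemory."""
    return factor * request_memory_mib * MIB


def exceeds_hard_limit(non_reclaimable_bytes, request_memory_mib, factor=2):
    """True if memory that cannot be reclaimed is already above the limit.

    Page cache is left out on purpose: in this model it is dropped under
    pressure instead of counting against the job.
    """
    return non_reclaimable_bytes > hard_limit_bytes(request_memory_mib, factor)


# Hypothetical job requesting 2000 MiB, so the hard limit is 4000 MiB.
print(exceeds_hard_limit(3000 * MIB, 2000))  # within the limit
print(exceeds_hard_limit(5000 * MIB, 2000))  # above the limit
```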


Regards,

Matthias

On 7/25/24 10:23, Petr Vokac wrote:
Hi,

could you please clarify for us how to use memory limits with HTCondor and cgroups v2? Do we understand correctly that cgroups v2 also accounts page cache (e.g. disk buffers) to the job (process tree) memory? Such behavior makes cgroups v2 unusable for enforcing memory limits, because it is unpredictable how much page cache is used by our jobs (less stressed machine => potentially more memory accounted to the job by cgroups v2).

What are our options to enforce reasonable memory limits?

  • don't enforce memory limits by cgroups v2 at all, as described in https://opensciencegrid.atlassian.net/browse/HTCONDOR-2521
  • sacrifice a bit of performance by aggressively dropping page cache with CGROUP_LOW_MEMORY_LIMIT. Which values should be used? Do you have an idea what the impact on performance is?
  • other options? Recommendations? Could cgroups v2 be configured to enforce just process memory limits and not include page cache?

We have sites that moved to cgroups v2, and we started to observe random job failures that are very tricky to understand; such debugging is very time consuming. We can easily measure how much memory our jobs need (e.g. scouting jobs estimating memory usage), but page cache size is totally unpredictable to us, and this seems to make cgroups v2 memory limits pretty unusable. We would like to have clear and simple instructions for HTCondor batch, because otherwise enforcing memory limits becomes an operational nightmare with distributed infrastructure where each site invents its own solution (or even keeps killing jobs on page cache size).

    Petr
