On 5/31/24 09:04, Thomas Hartmann wrote:
Hi all,

we have been debugging a user's jobs together with him, as they tend to get OOM killed somewhat randomly. It seems to be cgroups v2 related, i.e., it happens on our EL9/Condor23/cgroups v2 workers [1], where the cgroup mount path is /sys/fs/cgroup/htcondor/ with BASE_CGROUP = htcondor.
Hi Thomas:

Just chiming in from the other thread, but we think we have a fix for this in HTCondor 23.7. Would it be possible for you to try that out?
-greg
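
For anyone wanting to check whether the kernel actually recorded OOM kills in the job cgroups, here is a minimal sketch in Python. It assumes the cgroup v2 tree sits under /sys/fs/cgroup/htcondor as described above and reads the standard cgroup v2 memory.events files. The function names are made up for illustration and are not part of HTCondor. Note that memory.events counts are hierarchical (they include descendant cgroups), and that a job's cgroup may already be gone once the job has exited, so this only sees cgroups that still exist.

```python
from pathlib import Path


def parse_memory_events(text):
    """Parse the 'key value' lines of a cgroup v2 memory.events file."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Each valid line looks like "oom_kill 3"; skip anything else.
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def find_oom_killed(base="/sys/fs/cgroup/htcondor"):
    """Return {cgroup_dir: oom_kill_count} for cgroups with recorded OOM kills.

    The default base path is an assumption taken from the BASE_CGROUP
    setting mentioned in this thread; adjust it for other setups.
    """
    hits = {}
    root = Path(base)
    if not root.is_dir():
        return hits
    for events in root.rglob("memory.events"):
        try:
            counters = parse_memory_events(events.read_text())
        except OSError:
            # The cgroup may disappear while we are walking the tree.
            continue
        if counters.get("oom_kill", 0) > 0:
            hits[str(events.parent)] = counters["oom_kill"]
    return hits


if __name__ == "__main__":
    for cgroup, count in sorted(find_oom_killed().items()):
        print(f"{cgroup}: oom_kill={count}")
```

Run on a worker node, this prints one line per cgroup whose oom_kill counter is non-zero, which may help to tell kernel OOM kills apart from jobs removed for other reasons.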