
Re: [HTCondor-users] Condor23/cgroups v2 occasionally busy and/or kernel OOM acting



Hi Greg,

sure - our EL9 cluster is currently still "pre-production".
We can switch the channel and try 23.7.2-1.el9.

Cheers and thanks,
  Thomas

On 31/05/2024 19.49, Greg Thain via HTCondor-users wrote:
On 5/31/24 09:04, Thomas Hartmann wrote:
Hi all,

we have been debugging a user's jobs together with him, as they tend to get OOM killed somewhat randomly. It seems to be cgroups v2 related, i.e., it happens on our EL9/Condor23/cgroups v2 workers [1], where the cgroup mount path is
   /sys/fs/cgroup/htcondor/
with
 BASE_CGROUP = htcondor
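
For anyone wanting to check which cgroups under that path have actually seen kernel OOM kills, here is a minimal sketch. It assumes the cgroup v2 layout described above and relies only on the standard cgroup v2 `memory.events` file, which holds `key value` lines including an `oom_kill` counter. The function names `parse_memory_events` and `find_oom_kills` are hypothetical helpers, not part of HTCondor.

```python
from pathlib import Path


def parse_memory_events(text):
    """Parse the 'key value' lines of a cgroup v2 memory.events file."""
    counts = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only well-formed "name <non-negative integer>" lines.
        if len(parts) == 2 and parts[1].isdigit():
            counts[parts[0]] = int(parts[1])
    return counts


def find_oom_kills(base="/sys/fs/cgroup/htcondor"):
    """Return {cgroup dir: oom_kill count} for cgroups at or below base.

    The default base is the mount path quoted in this thread; adjust it
    if BASE_CGROUP is set differently on your workers.
    """
    hits = {}
    for events in Path(base).rglob("memory.events"):
        try:
            counts = parse_memory_events(events.read_text())
        except OSError:
            continue  # the cgroup may vanish while we are scanning
        if counts.get("oom_kill", 0) > 0:
            hits[str(events.parent)] = counts["oom_kill"]
    return hits


if __name__ == "__main__":
    for path, kills in sorted(find_oom_kills().items()):
        print(f"{path}: oom_kill={kills}")
```

Run on a worker, this only lists cgroups that still exist, so counters of jobs whose cgroup was already removed will not show up.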


Hi Thomas:

Just chiming in from the other thread, but we think we have a fix for this in htcondor 23.7. Would it be possible for you to try that out?

-greg

