Re: [HTCondor-users] More thoughts on memory limits




On 12/2/24 10:10 AM, Beyer, Christoph wrote:
Hi,

  memory.current might be interesting to some, but memory.peak could nonetheless go into another job ClassAd attribute. Not having access to it makes memory management pretty much impossible on many levels.


Note that HTCondor today polls memory.current, keeps the peak value internally, and reports that peak in the job ad. The polling frequency is controlled by the knob STARTER_UPDATE_INTERVAL.
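
To make the polling approach concrete, here is a minimal Python sketch of the general idea: sample memory.current at a fixed interval and remember the largest value seen. This is not HTCondor's actual code. The names read_memory_current, PeakTracker and poll_peak are hypothetical, and the cgroup directory path is whatever the caller supplies.

```python
import time
from pathlib import Path
from typing import Callable


def read_memory_current(cgroup_dir: str) -> int:
    """Read the cgroup v2 memory.current file (bytes) from cgroup_dir.

    cgroup_dir is supplied by the caller; no particular path is assumed.
    """
    return int(Path(cgroup_dir, "memory.current").read_text().strip())


class PeakTracker:
    """Hypothetical helper: remembers the largest sample seen so far."""

    def __init__(self) -> None:
        self.peak = 0

    def update(self, sample: int) -> int:
        # Keep the running maximum of all polled samples.
        self.peak = max(self.peak, sample)
        return self.peak


def poll_peak(
    read_sample: Callable[[], int],
    interval_s: float,
    polls: int,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll read_sample `polls` times, interval_s apart; return the peak."""
    tracker = PeakTracker()
    for i in range(polls):
        tracker.update(read_sample())
        if i + 1 < polls:
            sleep(interval_s)  # nothing is observed while sleeping
    return tracker.peak
```

A caller could pass `lambda: read_memory_current(some_cgroup_dir)` as read_sample. Any spike that rises and falls entirely between two polls is never seen by a loop like this, which is why a kernel-maintained peak value is attractive.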

We are adding support for the notion of a "broken" slot, so that if there is an unkillable process, the slot will go into the "broken" state. When this goes in, I think we can go back to using the cgroup's memory.peak value and reporting that.


-greg