Mailing List Archives
Re: [HTCondor-users] More thoughts on memory limits
- Date: Wed, 4 Dec 2024 07:10:02 +0100 (CET)
- From: "Beyer, Christoph" <christoph.beyer@xxxxxxx>
- Subject: Re: [HTCondor-users] More thoughts on memory limits
Hi,
we definitely need the broken slot code ASAP, as we deal endlessly with unkillable job executables. I had just planned this morning to whine about it here ;)
We need the max memory usage back in the job ClassAds and history even more badly. Couldn't you just add a new ClassAd attribute like memory.current and leave the old one as is?
Best
christoph
--
Christoph Beyer
DESY Hamburg
IT-Department
Notkestr. 85
Building 02b, Room 009
22607 Hamburg
phone:+49-(0)40-8998-2317
mail: christoph.beyer@xxxxxxx
----- Original Message -----
From: "Greg Thain via HTCondor-users" <htcondor-users@xxxxxxxxxxx>
To: "htcondor-users" <htcondor-users@xxxxxxxxxxx>
CC: "Greg Thain" <gthain@xxxxxxxxxxx>
Sent: Monday, 2 December 2024 23:59:02
Subject: Re: [HTCondor-users] More thoughts on memory limits
On 12/2/24 10:10 AM, Beyer, Christoph wrote:
> Hi,
>
> memory.current might be interesting for someone, but memory.peak could nonetheless go into another job ClassAd attribute. Not having access to it makes memory management pretty much impossible on many levels.
Note that HTCondor today polls memory.current, keeps the peak value
internally, and reports that peak in the job ad. The polling frequency
is controlled by the knob STARTER_UPDATE_INTERVAL.
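
As an illustration of the polling approach described above, here is a minimal
sketch. It assumes a cgroup v2 hierarchy where the memory.current file holds
the current usage in bytes. The function names and the cgroup directory are
hypothetical, and this is not HTCondor's actual implementation.

```python
from pathlib import Path
from typing import Iterable


def read_memory_current(cgroup_dir: str) -> int:
    """Read memory.current (bytes in use right now) from a cgroup v2 directory."""
    return int(Path(cgroup_dir, "memory.current").read_text().strip())


def polled_peak(samples: Iterable[int]) -> int:
    """Return the largest value seen across a series of polls."""
    peak = 0
    for value in samples:
        peak = max(peak, value)
    return peak


if __name__ == "__main__":
    # Hypothetical usage trace, one value per second (bytes).
    true_usage = [100, 900, 200, 300]
    # Polling only every second sample misses the short spike to 900.
    polled = true_usage[0::2]
    print(polled_peak(true_usage), polled_peak(polled))
```

A peak derived from polling can only be as accurate as the polling interval
allows: a spike that falls between two polls is never seen, which is why a
kernel-maintained peak value is attractive.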
We are adding support for the notion of a "broken" slot, so that if
there is an unkillable process, the slot will go into the "broken"
state. When this goes in, I think we can go back to using the
cgroup memory.peak usage and reporting that.
-greg
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
The archives can be found at: https://www-auth.cs.wisc.edu/lists/htcondor-users/