
Re: [HTCondor-users] incomplete output files



On Thu, Nov 19, 2020 at 3:30 PM Tim Theisen <tim@xxxxxxxxxxx> wrote:
>
> You don't say which version of HTCondor you are using. However, we have
> improved cgroup memory management in the recent 8.9.9 release.
> Previously, you could choose to set either the hard or the soft limit.
> Now, HTCondor will always set both the hard and soft limit in cgroups.
> Here is a quick summary:

Tim,

Thanks for getting back to me.  Sorry, we're stuck on 8.8.6 until I can
get a fix for the Boost 1.69 requirements.

This is good news; I was having trouble with the 8.8 documentation on
cgroup memory limits.  I think in my case I have two problems:

1. Soft vs. hard: in hindsight, I think it would be better if I had a
soft limit rather than a hard limit.  I believe some kind of buffer
cache accounting is taking place that either is or is not getting
recorded, and it is causing an exhaustion of the cgroup limit.  (I'm
still testing this, though.)

2. The fact that the exhaustion is taking place but not triggering the
OOM is a second issue.  Something seems to be getting trapped
somewhere that causes the fclose or even an fflush to fail, but does
not trigger the OOM when the hard limit is set.  This is really the
root of my concern.  If the user didn't ask for enough memory and then
went over it, that's their issue, but if they did and we failed to
hold/OOM the job, that's mine.
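
One way a job could make this kind of failure visible is to check every
step of the write path and compare the final file size with what was
meant to be written. The following is only a minimal sketch in Python
(the affected job's own code is not shown in this thread, and
write_checked is a hypothetical helper name). Whether the kernel
actually reports an error to the process when the cgroup limit is
exhausted is not established here.

```python
import os


def write_checked(path, data):
    """Write bytes to path and fail loudly if the file ends up incomplete.

    Hypothetical helper for illustration only. In Python, write(), flush(),
    fsync() and close() raise OSError on failure, which plays the role of
    checking the return values of fwrite/fflush/fclose in C.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # push Python's buffer to the OS
        os.fsync(f.fileno())   # ask the OS to push its cache to storage
    size = os.path.getsize(path)
    if size != len(data):
        # No error was raised, yet the file is short: report it explicitly.
        raise OSError(f"{path}: expected {len(data)} bytes, found {size}")
    return size
```

A job that exits non-zero when this raises at least leaves a trace in
the job's exit status instead of a silently truncated output file.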

Right now, the way the system functions, it looks like a total failure
of the O/S or filesystem because nothing gets trapped (that I've found
so far) and the output files are not being completely written out.  In
the meantime, the specific user in question is going to request 1.5x
their memory requirement, which seems to avoid the issue.

As a possible enhancement, it would be nice if I could set a fixed
soft/hard limit on the job.  The 'hard' policy in 8.9 seems like a
good solution, but I routinely have users that underestimate the needs
of their jobs.  It would be nice if I could have a soft limit of 90%
of their request_memory, and then a hard limit that was up to 10-20%
over that limit.  I could then preempt or periodically remove those
jobs that are over for some period of time.  I suppose I could also
just mess with their submitted ClassAds to manually adjust the
request_memory number up as well.
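
To make the proposed numbers concrete, here is a rough sketch of the
arithmetic in Python. It is not an HTCondor feature or API;
proposed_limits and its parameters are hypothetical names, and it
assumes the 10-20% overage is measured against request_memory itself
(the text above could also be read as overage on top of the soft
limit).

```python
def proposed_limits(request_memory_mb, soft_percent=90, hard_overage_percent=20):
    """Return (soft_mb, hard_mb) for a job's request_memory, in MB.

    Hypothetical helper: soft limit is a percentage of the request, hard
    limit is the request plus an overage percentage. Integer arithmetic
    keeps the results exact.
    """
    if request_memory_mb <= 0:
        raise ValueError("request_memory_mb must be positive")
    soft_mb = request_memory_mb * soft_percent // 100
    hard_mb = request_memory_mb * (100 + hard_overage_percent) // 100
    return soft_mb, hard_mb


# A job requesting 4096 MB with the defaults above.
soft, hard = proposed_limits(4096)
print(f"soft={soft} MB hard={hard} MB")
```

With these defaults, a 4096 MB request gives a soft limit of 3686 MB
and a hard limit of 4915 MB.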