Re: [HTCondor-users] Out of memory killer & cgroups
- Date: Thu, 02 Oct 2014 09:58:42 -0500
- From: Greg Thain <gthain@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] Out of memory killer & cgroups
On 10/02/2014 09:19 AM, Rich Pieri wrote:
> The Linux kernel OOM killer is independent of cgroups. The only
> interaction with cgroups is that a cgroup memory limit may cause the
> OOM killer to activate sooner than it would without a constrained
> memory limit. SIGKILL (kill -9) is immediate and it cannot be trapped
> or ignored. The killed process does not have a chance to write out any
> logs or otherwise clean up after itself so there's nothing that it can
> do to let users know why it was killed. What the parent does is up to
> the parent, although right off it has no way to distinguish between a
> KILL signal sent by the kernel and a KILL signal sent by a user. So,
> on the face of it, the behavior that you are seeing is something that
> I would expect to see. Whether or not it's the intended behavior is
> something that I will leave to the Condor devs to address.
Note this isn't entirely correct: a parent can find out whether the
kernel's OOM killer sent the SIGKILL. A process can register to have the
cgroup memory controller notify it when the per-cgroup OOM killer fires.
This is what the condor_starter does, so that it can differentiate
between the OOM killer firing and some other reason the job was sent a
SIGKILL (kill -9).
-Greg