
Re: [HTCondor-users] Limiting memory used on the worker node with c-groups



JM-

When something in the universe goes wrong with HTCondor and CGroups, I feel a little twitch. When you say the processes are in the "deferred" state, do you mean they are in the "D" state according to ps? Or do you mean the actual literal "job deferral" options in HTCondor?

"Job deferral allows the specification of the exact date and time at which a job is to begin executing"

From what you've written, I think you mean "jobs in the D state". Here's the lowdown on that:

https://support.microfocus.com/kb/doc.php?id=7002725

A common reason for a job getting stuck in D is a bad / overloaded remote filesystem (NFS, etc.). Is that a possibility here?

FYI: even if you didn't understand my presentation, you made the type of choice I recommend. Use "soft" but lie a bit about how much RAM you have. It allows more jobs to match while still ensuring that CGroups can do its job.
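
In config terms, the combination I'm recommending looks roughly like the sketch below. The 1.5 factor is only illustrative (it's the one from your own config further down), so tune it to how much overcommit you are comfortable with.

# Startd config sketch: soft cgroup limits plus an overstated MEMORY
CGROUP_MEMORY_LIMIT_POLICY = soft
MEMORY = 1.5 * quantize( $(DETECTED_MEMORY), 1000 )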

Tom

On Fri, Apr 24, 2020 at 1:45 AM Jean-Michel Barbet <jean-michel.barbet@xxxxxxxxxxxxxxxxx> wrote:
Hello,

Having had worker nodes hang many times because of memory exhaustion,
I am trying to figure out how we can prevent this. I believe the memory
exhaustion is due to some kind of pathological job using way more memory
than it should.

The first question would be: does it make sense to use
SYSTEM_PERIODIC_REMOVE in the config of a worker node (startd), or does it
work only on the scheduler (thus reacting with a certain delay)?
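
For context, the kind of expression I have in mind is sketched below; the factor of 2 is only an example, and part of my question is precisely which daemon would evaluate it:

# Sketch of the kind of policy I mean (job ClassAd attributes; the factor of 2 is arbitrary)
SYSTEM_PERIODIC_REMOVE = (MemoryUsage > 2 * RequestMemory)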

Then, I tried different settings of CGROUP_MEMORY_LIMIT_POLICY.

I understand that the default setting is "none". In this case, in
/sys/fs/cgroup/memory/htcondor/condor_dlocal_htcondor_slot1\@worker,
"memory.limit_in_bytes" is set to the node's detected memory divided by
the number of cores and "memory.soft_limit_in_bytes" is 0.

I tried setting CGROUP_MEMORY_LIMIT_POLICY to "soft". It seems to do its
job, with jobs being removed with "Job has gone over memory limit of 6000
megabytes. Peak usage: 5926 megabytes." BUT: the result on the worker
nodes is a number of processes in "Deferred" status, which gives a high
Unix load even though no CPU is consumed. No new jobs are scheduled.
It looks like the jobs are not killed cleanly.

I am now trying with "hard". Let's see...

I have read this presentation:
https://research.cs.wisc.edu/htcondor/HTCondorWeek2017/presentations/WedDownes_cgroups.pdf
... but I do not understand everything. Sorry.

This is HTCondor version 8.6.13. Also, please note that I have made
it so that the threshold is higher than the detected memory:

MEMORY = 1.5 * quantize( $(DETECTED_MEMORY), 1000 )
MODIFY_REQUEST_EXPR_REQUESTMEMORY = quantize(RequestMemory,100)
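
To make the effect concrete (the numbers below are hypothetical, not our actual nodes):

# Worked example of the two expressions above:
#   DETECTED_MEMORY = 64000 MB -> quantize(64000,1000) = 64000 -> MEMORY advertised as 96000 MB
#   RequestMemory   = 5950 MB  -> quantize(5950,100)   = 6000  -> the job gets a 6000 MB limit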

Thank you in advance.

JM

--
------------------------------------------------------------------------
Jean-michel BARBET                    | Tel: +33 (0)2 51 85 84 86
Laboratoire SUBATECH Nantes France    | Fax: +33 (0)2 51 85 84 79
CNRS-IN2P3/Ecole des Mines/Universite | E-Mail: barbet@xxxxxxxxxxxxxxxxx
------------------------------------------------------------------------