Hi Andrew and Alessandra,

at our site, with condor 8.6.5-2 on EL7 (kernel 3.10.0-514.26.2), the cgroup soft limit is set according to the job's requested memory, as far as I can see. E.g., the six 8-core jobs on the node [1] have soft limits of either 4GB or 12GB, as requested.

The node's condor settings for cgroups are currently

BASE_CGROUP = /system.slice/condor.service
CGROUP_MEMORY_LIMIT_POLICY = soft

i.e., no container universe.

Cheers,
  Thomas

[1]
/sys/fs/cgroup/memory/system.slice/condor.service/condor_var_lib_condor_execute_slot1_1@xxxxxxxxxxxxxxxxx/memory.soft_limit_in_bytes
4294967296
/sys/fs/cgroup/memory/system.slice/condor.service/condor_var_lib_condor_execute_slot1_2@xxxxxxxxxxxxxxxxx/memory.soft_limit_in_bytes
4294967296
/sys/fs/cgroup/memory/system.slice/condor.service/condor_var_lib_condor_execute_slot1_3@xxxxxxxxxxxxxxxxx/memory.soft_limit_in_bytes
4294967296
/sys/fs/cgroup/memory/system.slice/condor.service/condor_var_lib_condor_execute_slot1_4@xxxxxxxxxxxxxxxxx/memory.soft_limit_in_bytes
4294967296
/sys/fs/cgroup/memory/system.slice/condor.service/condor_var_lib_condor_execute_slot1_5@xxxxxxxxxxxxxxxxx/memory.soft_limit_in_bytes
12616466432
/sys/fs/cgroup/memory/system.slice/condor.service/condor_var_lib_condor_execute_slot1_6@xxxxxxxxxxxxxxxxx/memory.soft_limit_in_bytes
4294967296

On 2017-10-24 11:07, andrew.lahiff@xxxxxxxxxx wrote:
> Hi Alessandra,
>
> There seems to have been a change in behavior with respect to how HTCondor configures cgroups. With older versions of HTCondor, it used to set memory.soft_limit_in_bytes when using soft memory limits (at least this is what I remember).
>
> However, now (e.g. in 8.6.6) memory.soft_limit_in_bytes seems to be set to the total memory of the machine, and memory.memsw.limit_in_bytes is set to the memory that the job requested. We use the Docker universe now, so in our case it's Docker that's creating the cgroups.
>
> Regards,
> Andrew.
>
> ________________________________
> From: HTCondor-users [htcondor-users-bounces@xxxxxxxxxxx] on behalf of Alessandra Forti [Alessandra.Forti@xxxxxxx]
> Sent: Tuesday, October 24, 2017 9:39 AM
> To: htcondor-users@xxxxxxxxxxx
> Subject: Re: [HTCondor-users] htcondor cgroups and memory limits on CentOS7
>
> Hi Thomas,
>
>
> On 24/10/2017 09:17, Thomas Hartmann wrote:
>
> Hi Todd, (sorry to fork in between)
>
> I am a bit confused regarding the soft limits.
>
> So far I had assumed that the kernel would allow a cgroup to exceed its
> soft limit usage as long as there is free memory available
>
> Do you set the limit, or does your htcondor? Because my htcondor doesn't set that limit. Maybe I'm doing something wrong.
>
> - and kill a
> group's processes if the system runs low on unwired memory (assuming a
> translation between limits in condor to cgroup limits).
>
>
> So, we have effectively not set a 'real' cgroup hard limit, assuming that
> the soft limit would be sufficient, e.g., would the kernel kill [1] when
> exceeding its 4GB soft limit and running low on system-wide memory?
>
> No, the kernel doesn't kill with the soft limit. This is why the system periodic remove is needed.
>
> (looking now at the values: would memsw, set to such a large value,
> actually send the job heavily swapping...?)
>
>
> In fact, memsw is the place where RAM+swap is limited. However, as pointed out in the thread, you may end up with a job which has 0 memory and 4GB of swap.
>
>
> Cheers,
> Thomas
>
>
>
> [1]
> /sys/fs/cgroup/memory/system.slice/condor.service/condor_var_lib_condor_execute_slot1_6@xxxxxxxxxxxxxxxxx/memory.limit_in_bytes
> 142668537856
> /sys/fs/cgroup/memory/system.slice/condor.service/condor_var_lib_condor_execute_slot1_6@xxxxxxxxxxxxxxxxx/memory.memsw.limit_in_bytes
> 142668541952
> /sys/fs/cgroup/memory/system.slice/condor.service/condor_var_lib_condor_execute_slot1_6@xxxxxxxxxxxxxxxxx/memory.soft_limit_in_bytes
> 4294967296
>
>
> On 2017-10-20 18:26, Todd Tannenbaum wrote:
>
>
> On 10/20/2017 9:44 AM, Alessandra Forti wrote:
>
>
> Hi,
>
> is more information needed?
>
>
>
> Hi Alessandra,
>
> The version of HTCondor you are using would be helpful :).
>
> But I have some answers/suggestions below that I hope will help...
>
>
>
> * On the head node
>
> RemoveMemoryUsage = ( ResidentSetSize_RAW > 2000*RequestMemory )
> SYSTEM_PERIODIC_REMOVE = $(RemoveMemoryUsage) || <OtherParameters>
>
> So the questions are two
>
> 1) Why SYSTEM_PERIODIC_REMOVE didn't work?
>
>
> Because the (system_)periodic_remove expressions are evaluated by the
> condor_shadow while the job is running, and the *_RAW attributes are
> only updated in the condor_schedd.
>
> A simple solution is to use attribute MemoryUsage instead of
> ResidentSetSize_RAW. So I think things will work as you want if you
> instead did:
>
> RemoveMemoryUsage = ( MemoryUsage > 2*RequestMemory )
> SYSTEM_PERIODIC_REMOVE = $(RemoveMemoryUsage) || <OtherParameters>
>
> Note that MemoryUsage is in the same units as RequestMemory, so only
> need to multiply by 2 instead of 2000.
>
> You are not the first person to be tripped up by this. :( I realize it
> is not at all intuitive. I think I will add a quick patch in the code to
> allow _RAW attributes to be referenced inside of job policy expressions
> to help prevent frustration by the next person.
>
> Also you may want to place your memory limit policy on the execute nodes
> via a startd policy expression, instead of having it enforced on the
> submit machine (what I think you are calling the head node). The reason
> is the execute node policy is evaluated every five seconds, while the
> submit machine policy is evaluated every several minutes. A runaway job
> could consume a lot of memory in a few minutes :).
>
>
>
> 2) Shouldn't htcondor set the job soft limit with this configuration?
> or is the site expected to set the soft limit separately?
>
>
>
> Personally, I think "soft" limits in cgroups are completely bogus. The
> way the Linux kernel treats soft limits does not do in practice what
> anyone (including htcondor itself) expects. I recommend setting
> CGROUP_MEMORY_LIMIT_POLICY to either none or hard; soft makes no sense imho.
>
> "CGROUP_MEMORY_LIMIT_POLICY=hard" is clear to understand: if the job uses more
> memory than it requested, it is __immediately__ kicked off and put on
> hold. This way users get a consistent experience.
>
> If you want jobs to be able to go over their requested memory so long as
> the machine isn't swapping, consider disabling swap on your execute
> nodes (not a bad idea for compute servers in general) and simply leaving
> "CGROUP_MEMORY_LIMIT_POLICY=none". What will happen is if the system is
> stressed, eventually the Linux OOM (out of memory) killer will kick in
> and pick a process to kill. HTCondor sets the OOM priority of job
> processes such that the OOM killer should always pick job processes ahead
> of other processes on the system. Furthermore, HTCondor "captures" the
> OOM request to kill a job and only allows it to continue if the job is
> indeed using more memory than requested (i.e. provisioned in the slot).
> This is probably what you wanted by setting the limit to soft in the
> first place.
>
> I am thinking we should remove the "soft" option to CGROUP_MEMORY_LIMIT_POLICY
> in future releases, it just causes confusion imho.
> Curious if others on the list disagree...
>
> Hope the above helps,
> regards,
> Todd
>
> _______________________________________________
> HTCondor-users mailing list
> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
> subject: Unsubscribe
> You can also unsubscribe by visiting
> https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users
>
> The archives can be found at:
> https://lists.cs.wisc.edu/archive/htcondor-users/
>
>
> --
> Respect is a rational process. \\//
> Fatti non foste a viver come bruti, ma per seguir virtute e canoscenza (Dante)
> For Ur-Fascism, disagreement is treason. (U. Eco)
> But but but her emails... covfefe!
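As a side note on Todd's point about units in the quoted thread: the two policy expressions are meant to express the same threshold, just in different units. Below is a minimal Python sketch of that arithmetic, not HTCondor code. It assumes ResidentSetSize_RAW is reported in KiB and that MemoryUsage follows the documented default of rounding the resident set size up to whole MiB; the function names are hypothetical and exist only for illustration.

```python
def memory_usage_mb(resident_set_size_kib):
    """Round a resident set size in KiB up to whole MiB.

    Assumption: this mirrors the default MemoryUsage expression,
    ((ResidentSetSize + 1023) / 1024) with integer division.
    """
    return (resident_set_size_kib + 1023) // 1024


def exceeds_by_raw_rss(resident_set_size_kib, request_memory_mb):
    """Original policy: ResidentSetSize_RAW > 2000*RequestMemory (KiB vs. MiB)."""
    return resident_set_size_kib > 2000 * request_memory_mb


def exceeds_by_memory_usage(usage_mb, request_memory_mb):
    """Suggested policy: MemoryUsage > 2*RequestMemory (both in MiB)."""
    return usage_mb > 2 * request_memory_mb


if __name__ == "__main__":
    request_mb = 4096            # a job requesting 4 GiB
    rss_kib = 9_000_000          # roughly 8.6 GiB resident
    print(exceeds_by_raw_rss(rss_kib, request_mb))
    print(exceeds_by_memory_usage(memory_usage_mb(rss_kib), request_mb))
```

Note that 2000 KiB per MiB is only an approximation of "twice the request" (the exact factor would be 2048), so the two expressions can disagree for jobs sitting very close to the threshold.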
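For anyone who wants to compare their own nodes with the values in [1], here is a minimal Python sketch that collects the per-slot limits. It assumes cgroup v1 with the memory controller mounted under /sys/fs/cgroup/memory and the BASE_CGROUP quoted above; read_slot_limits and to_gib are hypothetical helper names, not part of HTCondor.

```python
from pathlib import Path

# cgroup v1 memory controller files discussed in this thread.
LIMIT_FILES = (
    "memory.soft_limit_in_bytes",
    "memory.limit_in_bytes",
    "memory.memsw.limit_in_bytes",
)


def read_slot_limits(base):
    """Return {cgroup_dir_name: {limit_file: bytes}} for each child cgroup of base."""
    limits = {}
    for slot_dir in sorted(Path(base).iterdir()):
        if not slot_dir.is_dir():
            continue
        values = {}
        for name in LIMIT_FILES:
            limit_file = slot_dir / name
            # memsw files are absent when swap accounting is disabled.
            if limit_file.is_file():
                values[name] = int(limit_file.read_text().strip())
        if values:
            limits[slot_dir.name] = values
    return limits


def to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / 2**30


if __name__ == "__main__":
    # Assumed location: BASE_CGROUP = /system.slice/condor.service
    base = "/sys/fs/cgroup/memory/system.slice/condor.service"
    if Path(base).is_dir():
        for slot, values in read_slot_limits(base).items():
            for name, value in values.items():
                print(f"{slot} {name} {value} ({to_gib(value):.2f} GiB)")
```

With the numbers from [1], to_gib(4294967296) gives 4.0 and to_gib(12616466432) gives 11.75, so the "12GB" job has a soft limit slightly below 12 GiB.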