Hi all,
This ticket might interest some of you:
Basically, it uses cgroups to enable a "soft limit" for
memory. This allows a job to exceed the memory available to the
slot (as designated by the MEMORY attribute in the ClassAd),
but, under memory pressure, the kernel will preferentially swap out
jobs that are over their limit.
Example: Suppose each slot has 2GB, job 1 uses 3GB, and
job 2 uses 1GB. Then, if the node starts to hit memory
pressure, job 1 will see about 1GB of its memory swapped out,
while job 2 won't see any swapping.
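
For concreteness, here's a minimal sketch (not the actual patch) of how
a starter-like process could apply such a soft limit with the cgroup-v1
memory controller. The mount point and per-job group name
(/sys/fs/cgroup/memory/condor/job_1) are made up for the example; the
real patch picks its own layout:

    // Apply a cgroup-v1 memory soft limit for one job (sketch only).
    #include <cstdint>
    #include <fstream>
    #include <iostream>
    #include <string>

    // Write a byte value into one of the memory controller's files.
    static bool write_cgroup_value(const std::string &path, uint64_t bytes)
    {
        std::ofstream f(path);
        if (!f) return false;
        f << bytes << "\n";
        return static_cast<bool>(f);
    }

    int main()
    {
        // Slot size as it would appear in the slot ClassAd's MEMORY
        // attribute (MB); 2GB here to match the example above.
        const uint64_t slot_memory_bytes = 2048ULL * 1024 * 1024;

        // Hypothetical per-job cgroup created by the starter.
        const std::string cgroup = "/sys/fs/cgroup/memory/condor/job_1";

        // Soft limit: the job may exceed this, but under memory pressure
        // the kernel reclaims (swaps out) pages from groups over their
        // soft limit first.
        if (!write_cgroup_value(cgroup + "/memory.soft_limit_in_bytes",
                                slot_memory_bytes)) {
            std::cerr << "failed to set soft limit\n";
            return 1;
        }
        return 0;
    }
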
The patch also allows for a "hard limit" on memory, where the
job is forced into swap as soon as it hits the limit. However,
the "soft limit" approach feels "more Condor-like", as it
allows better utilization of the available resources.
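
For contrast, the hard-limit variant of the same sketch would write the
slot size into memory.limit_in_bytes instead (same hypothetical cgroup
layout as above), so the kernel starts pushing the job's pages to swap
as soon as it reaches 2GB:

    // Hard limit: cap the group at the slot size (sketch only).
    #include <fstream>

    int main()
    {
        std::ofstream f("/sys/fs/cgroup/memory/condor/job_1"
                        "/memory.limit_in_bytes");
        f << (2048ULL * 1024 * 1024) << "\n";   // 2GB slot from the example
        return f ? 0 : 1;
    }
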
[Additionally, this (and a follow-up ticket, to be filed) will
increase the likelihood that the cgroup gets cleaned up
properly even if the starter segfaults.]
I'm interested in getting feedback on the approach (and
guidance on getting it committed) - feel free to subscribe
to the ticket if you find it interesting.
Brian