On Sep 4, 2013, at 8:54 AM, Joan J. Piles <jpiles@xxxxxxxxx> wrote:
> Hi all,
>
> We have recently started using cgroups to limit the RAM usage of our users. We want to avoid the situation in which a job with badly predicted requirements brings down the whole machine it is running on, impacting other users' jobs.
>
> HTCondor does a great job using cgroups to achieve this, but I have the feeling that this can be improved.
>
> Right now, RAM usage is limited whilst swap is not. I am aware that you can tune this using swappiness parameters, but that is neither straightforward nor optimal, and furthermore it is difficult to do on a per-job basis.
>
> Right now HTCondor tunes the memory.limit_in_bytes or memory.soft_limit_in_bytes files within the cgroup to limit the RAM usage.
>
> I think HTCondor could provide a "request_swap" parameter in the submit file (and an associated RequestSwap job ClassAd attribute) that would be used to compute the value for memory.memsw.limit_in_bytes (which would of course be RequestMemory + RequestSwap).
>
> There would also be the associated MODIFY_REQUEST_EXPR_REQUESTSWAP, which could be used (for instance) to limit the amount of swap reserved to a percentage of the RAM or to provide a sensible (or even unlimited) default.
>
> What do you think about this idea? I think it could easily piggyback on the existing cgroup infrastructure without too much hassle.
>
Hi Joan,
I'm not too hot on this idea - how does the user know what value to provide for RequestSwap? Determining a working set size for an application is a black art; knowing the memory requirements is hard enough for most users!
Brian
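
For reference, the limit arithmetic in Joan's proposal can be sketched as below. This is only an illustration, not HTCondor code: RequestSwap does not exist today, the function name cgroup_limits is hypothetical, and the sketch assumes RequestSwap would be given in MiB like RequestMemory. It only computes the values and does not write to any cgroup files.

```python
MIB = 1024 * 1024  # RequestMemory is expressed in MiB; cgroup limits are in bytes


def cgroup_limits(request_memory_mib, request_swap_mib=None):
    """Return the cgroup v1 memory limits (in bytes) for one job.

    request_swap_mib is the hypothetical RequestSwap value, assumed to be
    in MiB. None means "no swap limit requested", in which case
    memory.memsw.limit_in_bytes is left out entirely.
    """
    if request_memory_mib < 0:
        raise ValueError("request_memory_mib must be non-negative")

    limits = {"memory.limit_in_bytes": request_memory_mib * MIB}

    if request_swap_mib is not None:
        if request_swap_mib < 0:
            raise ValueError("request_swap_mib must be non-negative")
        # memsw limits RAM and swap together, hence RequestMemory + RequestSwap.
        limits["memory.memsw.limit_in_bytes"] = (
            request_memory_mib + request_swap_mib
        ) * MIB

    return limits


if __name__ == "__main__":
    # Example: a job asking for 2048 MiB of RAM and 512 MiB of swap.
    for name, value in cgroup_limits(2048, 512).items():
        print(name, value)
```

With no RequestSwap given, only memory.limit_in_bytes is returned, which corresponds to the current behaviour described in the quoted message (RAM limited, swap not).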