On 18/06/14 12:13, Dave Macey wrote:
Hi,
I'm testing the use of Cgroups in HTCondor version 8.0.7,
following the manual instructions as per section 3.12.12, but
according to the ProcLog this is failing with:
06/17/14 15:52:40 : Setting cgroup to "htcondor"/condor_home_condor_execute_slot1_1@xxxxx for ProcFamily 3853.
06/17/14 15:52:40 : Cannot attach pid 3853 to cgroup "htcondor"/condor_home_condor_execute_slot1_1@xxxxx for ProcFamily 3853: 50016 No space left on device
Since I'm testing under Debian 7.5, which by default sets up all
subsystems in the same directory, my /etc/cgconfig.conf file
looks like:
mount {
    cpu = /sys/fs/cgroup;
    cpuset = /sys/fs/cgroup;
    cpuacct = /sys/fs/cgroup;
    memory = /sys/fs/cgroup;
    freezer = /sys/fs/cgroup;
    blkio = /sys/fs/cgroup;
}
group htcondor {
    cpu {}
    cpuacct {}
    memory {}
    freezer {}
    blkio {}
    cpuset {
        cpuset.cpus = 0-3; # It's a four core machine
        cpuset.mems = 0; # Recommended by numerous online posts
    }
}
Another Debian vagary is that the memory subsystem is not
available by default, which requires it to be loaded via a
kernel boot option, but that all works and I can see the
directory /sys/fs/cgroup/htcondor suitably populated. Trawling
the internet would suggest that the problem is usually due to an
empty cpuset.mems field, but I've covered that, so I'd be
grateful for any ideas where the problem might be.
Dave,
The problem is that although you are setting cpuset.cpus and
cpuset.mems at the /sys/fs/cgroup/htcondor level, these values are
not propagated to the subdirectories that HTCondor creates for each
slot. Those child cgroups therefore start with empty cpuset.cpus and
cpuset.mems, and attaching a process to such a cpuset fails with
"No space left on device". I got round this by using the
cgroup.clone_children property, e.g. in my /etc/rc.local I have:
/usr/bin/cgconfigparser -l /etc/cgconfig.conf
/bin/echo 1 > /sys/fs/cgroup/htcondor/cgroup.clone_children
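
If you want to check whether the settings really reach the per-slot
cgroups, something like the sketch below may help. It is only an
illustration: find_empty_cpusets is a made-up helper name, not part of
HTCondor or libcgroup, and it assumes the cpuset files appear as
cpuset.cpus and cpuset.mems under the directory you pass in, as they do
with the single-mount layout described above.

```shell
#!/bin/sh
# Hypothetical helper (not part of HTCondor or libcgroup):
# print every cpuset.cpus / cpuset.mems file under the given root
# that exists but contains nothing except whitespace.
find_empty_cpusets() {
    root="$1"
    find "$root" -type d | while IFS= read -r dir; do
        for f in cpuset.cpus cpuset.mems; do
            # An unset cpuset file reads back as just a newline,
            # so strip whitespace before testing for emptiness.
            if [ -f "$dir/$f" ] && [ -z "$(tr -d '[:space:]' < "$dir/$f")" ]; then
                printf '%s\n' "$dir/$f"
            fi
        done
    done
}
```

Running find_empty_cpusets /sys/fs/cgroup/htcondor while a job is on
the slot should print nothing if the child cgroups inherited their
values; any path it prints is a cgroup that would still refuse new
processes.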
Hope that helps,
Mark