Hi all,

I just noticed that a few of our nodes do not have their jobs confined in cgroups, i.e., there is no condor slice at all [1]. These nodes are set up the same way and run the same release [2] as the majority of our nodes, where the jobs are properly cgrouped.

We are going to drain and reboot these nodes, but maybe somebody has an idea what might have gone wrong here?

Cheers,
  Thomas

[1]
[root@batch0202 ~]# ls /sys/fs/cgroup/cpu,cpuacct/system.slice/condor.service/condor_var_lib_condor_execute_slot1_*
ls: cannot access /sys/fs/cgroup/cpu,cpuacct/system.slice/condor.service/condor_var_lib_condor_execute_slot1_*: No such file or directory

[root@batch0203 ~]# ls /sys/fs/cgroup/cpu,cpuacct/system.slice/condor.service/condor_var_lib_condor_execute_slot1_*
/sys/fs/cgroup/cpu,cpuacct/system.slice/condor.service/condor_var_lib_condor_execute_slot1_10@xxxxxxxxxxxxxxxxx:
cgroup.clone_children
...

[2]
condor-classads-8.6.11-1.el7.x86_64
condor-8.6.11-1.el7.x86_64
condor-python-8.6.11-1.el7.x86_64
condor-procd-8.6.11-1.el7.x86_64
condor-external-libs-8.6.11-1.el7.x86_64
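
To find other affected nodes, the check in [1] could be scripted. The following is only a rough sketch: the function name has_slot_cgroups is made up for illustration, the default path is the one from [1], and the glob condor_*_slot* assumes the slot cgroup directories are named as on batch0203. A node with no running jobs would presumably also report no slot cgroups, so the result is only meaningful on busy nodes.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if at least one per-slot cgroup
# directory exists below the given base directory.
# The default base is the cgroup v1 path shown in [1].
has_slot_cgroups() {
    base="${1:-/sys/fs/cgroup/cpu,cpuacct/system.slice/condor.service}"
    for d in "$base"/condor_*_slot*; do
        # If nothing matches, the glob stays literal and -d is false.
        [ -d "$d" ] && return 0
    done
    return 1
}

if has_slot_cgroups; then
    echo "$(hostname): slot cgroups present"
else
    echo "$(hostname): no slot cgroups found"
fi
```

Run on each worker node (e.g. via ssh or a configuration management tool), this prints one line per host that can be grepped for nodes without slot cgroups.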