Hi Greg,

many thanks - turning on cgroup delegation seems to do the trick!! :)

With the Delegate option on for all controllers in the [Service] section of the Condor unit [1], the job slices survived all starts and restarts of other units!

I will prepare and test some more nodes over the weekend, but I have my fingers crossed that my issue is fixed ;)

Cheers and thanks,
  Thomas

[1]
[Service]
...
Delegate=true

[2]
There are a few guides out there that also mention that the resource-control options have to go into their own [Slice] section. But at least on CentOS 7.5/3.10.0 with libcgroup-0.41-15, a separate [Slice] section is ignored and the resource-control settings only get picked up when they are in the [Service] section. Unfortunately, systemd does not seem to be very verbose when ignoring options, so better re-check the unit's values to be sure when on other versions...

> systemctl show --all condor.service | grep Delegate
Delegate=yes

https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html

On 2018-06-28 19:14, Greg Thain wrote:
> On 06/25/2018 06:59 AM, Thomas Hartmann wrote:
>>
>> So far, I have not found a way to debug/understand what systemd or
>> condor are doing in detail during the service start - especially since I
>> do not see how the condor unit and its cgroups are related to another
>> service unit.
>
> I must admit, I am not a systemd expert, but does adding "Delegate =
> true" to the condor unit help with the problem?
>
> -greg
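
PS: since systemd stays quiet about options it ignores, the re-check above could be wrapped in a small helper when preparing several nodes. This is only a minimal sketch: the function name delegate_enabled is made up for illustration, and it assumes the property is reported as "Delegate=yes", as it was on the node above; other systemd versions may print it differently.

```shell
#!/bin/sh
# Sketch (hypothetical helper, not part of HTCondor or systemd):
# reads "systemctl show"-style KEY=VALUE lines on stdin and succeeds
# only if one line is exactly "Delegate=yes".
delegate_enabled() {
    # -q: no output, -x: the whole line must match
    grep -qx 'Delegate=yes'
}

# Intended use on a node (unit name as in this thread):
#   systemctl show --all condor.service | delegate_enabled \
#       && echo "delegation on" || echo "delegation NOT picked up"
```

Reading from stdin keeps the check independent of the unit name, so the same helper can be pointed at any unit's "systemctl show" output.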