
Re: [HTCondor-users] condor_ssh_to_job/interactive jobs with apptainer



Hi Joachim,

> - condor_ssh_to_job leads to cgroup errors - which allows anything done here to escape the restrictions (e.g. I can see all GPUs with nvidia-smi here..) - I haven't found a difference here whether I used apptainer-suid or not.

in principle, cgroups are not necessarily handled by apptainer/singularity, which deal primarily with namespaces.

where do you restrict cgroups with respect to GPU(?) resources, i.e., which controller do you use? If you use drop-ins for the condor systemd unit, these are not necessarily propagated to the job cgroup if you keep the two separate. I.e., drop-ins affecting cgroup resources work on the condor.service slice, but depending on your `BASE_CGROUP` setting in the Condor config, the job cgroup is a separate slice that does not inherit from the systemd service unit's slice.
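
One way to check where a process actually ends up is to read /proc/<pid>/cgroup. Below is a minimal Python sketch of that idea; the helper names (parse_cgroup, cgroup_of, under) are made up for illustration and are not part of HTCondor, and the slice path you compare against depends on your own setup.

```python
from pathlib import Path


def parse_cgroup(text):
    """Parse the content of /proc/<pid>/cgroup into {controllers: path}."""
    entries = {}
    for line in text.strip().splitlines():
        # Each line has the form "hierarchy-ID:controller-list:cgroup-path".
        _hid, controllers, path = line.split(":", 2)
        # On cgroup v2 the controller list is empty; label it "unified".
        entries[controllers or "unified"] = path
    return entries


def cgroup_of(pid="self"):
    """Return the cgroup membership of a process (default: this one)."""
    return parse_cgroup(Path(f"/proc/{pid}/cgroup").read_text())


def under(path, parent):
    """True if cgroup `path` equals `parent` or is nested below it."""
    parent = parent.rstrip("/")
    return path == parent or path.startswith(parent + "/")


if __name__ == "__main__":
    # Print the cgroup path per controller for the current process.
    for ctrl, path in cgroup_of().items():
        print(f"{ctrl}: {path}")
```

Running this once inside the job and once inside the condor_ssh_to_job shell, and comparing the printed paths, should show whether both sit in the same cgroup and whether that cgroup is below the condor.service slice at all.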

Cheers,
  Thomas
