
[HTCondor-users] Re: Re: condor_ssh_to_job/interactive jobs with apptainer



Hi Thomas,


Well, yes, that is what I would hope for, but it's not really working with apptainer - it works nicely with Docker, for example.


The processes I spawn inside the job are subject to the restrictions - as expected.

They are killed if they exceed the memory limits.

Running nvidia-smi inside the normal job script will only show the GPUs that are actually assigned to the job.


When I use condor_ssh_to_job (or interactive jobs), none of these restrictions apply to the shell spawned by the sshd.

I can run the same command that was previously killed for exceeding the memory limit, and it is not killed.

nvidia-smi shows all 8 GPUs instead of just the one that I asked for, ...
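
For what it's worth, this is roughly how I compare the two environments (assuming cgroup v2) - once from inside the job script and once from inside the condor_ssh_to_job shell:

    # which cgroup is this shell/process in?
    cat /proc/self/cgroup
    # which GPUs are visible from here?
    nvidia-smi -L

If the ssh shell were properly confined, I would expect both to report the same slot cgroup and the same single GPU.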


So no, it seems the sshd slice is not a sub-slice of the job slice - as I understand the StarterLog, this is due to errors while moving the process into the cgroup.

I have little knowledge of how this actually works and what the potential reasons for the failure might be...

So I'd be happy about any hints on how to debug this!
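
In case that helps, here is a rough sketch of what one can check on the execute node as root (cgroup v2 assumed; the slot cgroup path is abbreviated here, the full one is in the StarterLog excerpt quoted further down):

    # the job's cgroup as reported in the StarterLog (path abbreviated)
    JOB_CG=/sys/fs/cgroup/system.slice/htcondor/condor_..._slot1_1@...
    cat "$JOB_CG/cgroup.controllers"      # controllers available to the job slice
    cat "$JOB_CG/cgroup.subtree_control"  # controllers delegated to child cgroups (the sshd cgroup would need these)
    ls -d "$JOB_CG/sshd"                  # does the sshd child cgroup actually exist?
    cat /proc/204072/cgroup               # where the sshd pid from the log really ended up
    # As far as I understand cgroup v2, writing to cgroup.subtree_control fails with
    # "Device or resource busy" while processes sit directly in that cgroup, which
    # looks a lot like the first error in the quoted log below.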


Best,

- Joachim


On Tuesday, 27 May 2025 at 14:54:13 Central European Summer Time, Thomas Hartmann wrote:

> Hi Joachim,

>

> ah, that's what you mean - that should not be an issue, I guess.

>

> The cgroup slice for the ssh_to_job is a child of the job slice within

> the cgroup hierarchy.

> In the cgroup logic, such a child slice is limited to the resources its

> parent has and can only get resources up to its parent's total, not more.

> E.g., the relative CPU share of such an ssh slice is only a fraction of

> the job slice's. Let's say the job slice nominally has 6% of the node's

> overall CPU time. If we then give the ssh child slice 50% of the CPU

> share, that is just 50% of its parent's 6% ~~> at most 3% of the

> node's total (assuming that there are no idle cycles etc.).

>

> Cheers,

>    Thomas
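
(That matches my mental model - roughly, on a cgroup v2 host the nesting would look something like this; the paths below are made up purely for illustration:)

    # <job-slice> is a placeholder for the job's cgroup directory
    JOB_SLICE=/sys/fs/cgroup/system.slice/htcondor/<job-slice>
    cat "$JOB_SLICE/cpu.weight"       # the job's CPU weight, relative to its sibling slices
    cat "$JOB_SLICE/ssh/cpu.weight"   # the ssh child's weight, only meaningful *within* the job slice
    # whatever the child is given is evaluated inside the parent's budget:
    # ~6% of the node for the job slice, 50% of that for the ssh child ~~> at most ~3% of the node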

>

>

>

> On 27/05/2025 10.32, Joachim Meyer wrote:

> > Hi Thomas,

> >

> > thanks for reaching out!

> >

> >

> > I meant the cgroup restrictions that HTCondor itself imposes - CPU/

> > memory limits - which usually also include restrictions on the devices

> > cgroup (https://htcondor.readthedocs.io/en/latest/admin-manual/configuration-macros.html#STARTER_HIDE_GPU_DEVICES)

> >

> >

> > HTCondor seems to fail at moving the sshd process into the job's cgroup

> > slice and thus these restrictions don't apply:

> >

> >> 05/21/25 14:30:21 About to exec /usr/sbin/sshd -i -e -f /raid/condor/lib/condor/execute/dir_203901/.condor_ssh_to_job_1/sshd_config

> >> 05/21/25 14:30:21 ProcFamilyDirectCgroupV2::track_family_via_cgroup error writing to /sys/fs/cgroup/system.slice/htcondor/condor_raid_condor_lib_condor_execute_slot1_1@xxxxxxxxxxxxxxxxxxxxxxxxx/cgroup.subtree_control: Device or resource busy

> >> 05/21/25 14:30:21 Creating cgroup system.slice/htcondor/condor_raid_condor_lib_condor_execute_slot1_1@xxxxxxxxxxxxxxxxxxxxxxxxx/sshd for pid 204072

> >> 05/21/25 14:30:21 Successfully moved procid 204072 to cgroup /sys/fs/cgroup/system.slice/htcondor/condor_raid_condor_lib_condor_execute_slot1_1@xxxxxxxxxxxxxxxxxxxxxxxxx/sshd/cgroup.procs

> >> 05/21/25 14:30:21 Error setting cgroup memory limit of 107374182400 in cgroup /sys/fs/cgroup/system.slice/htcondor/condor_raid_condor_lib_condor_execute_slot1_1@xxxxxxxxxxxxxxxxxxxxxxxxx/sshd: No such file or directory

> >> 05/21/25 14:30:21 Error setting cgroup swap limit of 107374182400 in cgroup /sys/fs/cgroup/system.slice/htcondor/condor_raid_condor_lib_condor_execute_slot1_1@xxxxxxxxxxxxxxxxxxxxxxxxx/sshd: No such file or directory

> >> 05/21/25 14:30:21 Error setting cgroup cpu weight of 1200 in cgroup /sys/fs/cgroup/system.slice/htcondor/condor_raid_condor_lib_condor_execute_slot1_1@xxxxxxxxxxxxxxxxxxxxxxxxx/sshd: No such file or directory

> >> 05/21/25 14:30:21 Error enabling per-cgroup oom killing: 2 (No such file or directory)

> >> 05/21/25 14:30:21 cgroup v2 could not attach gpu device limiter to cgroup: Operation not permitted

> >

> > Any ideas what might be causing this?

> >

> > Thanks!

> >

> > - Joachim

> >

> >

> >

> > On Tuesday, 27 May 2025 at 09:47:04 Central European Summer Time, Thomas Hartmann wrote:

> >

> >  > Hi Joachim,

> >

> >  >

> >

> >  > > - condor_ssh_to_job leads to cgroup errors - which allows anything done here to escape the restrictions (e.g. I can see all GPUs with nvidia-smi here..) - I haven't found a difference here whether I used apptainer-suid or not.

> >

> >  >

> >

> >  > In principle, cgroups are not necessarily handled by

> >  > apptainer/singularity, which deal primarily with the namespaces.

> >  >

> >  > Where do you restrict cgroups with respect to GPU(?) resources, i.e.,

> >  > which controller do you use?

> >

> >  > If you use drop-ins for the condor systemd unit, these do not

> >  > necessarily seem to be propagated to the job cgroup if you keep them separated.

> >  > I.e., drop-ins affecting cgroup resources work on the condor.service

> >  > slice, but depending on your `BASE_CGROUP` setting in the Condor config,

> >  > this is a separate slice that does not inherit from the systemd service

> >  > unit's slice.
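
(For reference, assuming systemd and cgroup v2, this is roughly how one can see that the two live in different places - the htcondor path is the one that shows up in my StarterLog:)

    systemctl show condor.service --property=ControlGroup   # the cgroup the condor daemons run in
    ls /sys/fs/cgroup/system.slice/htcondor/                 # where the per-slot job cgroups end up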

> >

> >  >

> >

> >  > Cheers,

> >

> >  >    Thomas

> >

> >  >


>

>
--

Joachim Meyer

HPC Coordination & Support


Universität des Saarlandes

FR Informatik | HPC


Postal address: Postfach 15 11 50 | 66041 Saarbrücken


Visitor address: Campus E1 3 | Room 4.03

66123 Saarbrücken


T: +49 681 302-57522

jmeyer@xxxxxxxxxxxxxxxxxx

www.uni-saarland.de