Re: [HTCondor-users] Advertised Memory of an execute point running in Apptainer.
- Date: Fri, 21 Mar 2025 20:53:46 +0000
- From: Jaime Frey <jfrey@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] Advertised Memory of an execute point running in Apptainer.
How did you attempt to set the Memory advertised by the startd?
Setting MEMORY in the config file should work (without quotes).
For example, to advertise 4GB you would use this:
MEMORY = 4096
- Jaime
> On Mar 21, 2025, at 10:35 AM, Matthias Schnepf <matthias.schnepf@xxxxxxx> wrote:
>
> Hi all,
>
> We are trying to run an HTCondor execute point (master, startd) in an Apptainer container within a SLURM job.
> HTCondor starts and is able to run jobs. However, the advertised memory is always the memory of the host system, even when I limit the memory via Apptainer (apptainer run --memory ....).
> I tried setting the "Memory" ClassAd attribute in the config of the execute point, but it has no effect.
> I also tried it with Memory = "foo". The execute point still advertised the memory of the host system and accepted jobs. The job then failed to start because it tried to evaluate Memory="foo" and complained that it is not an integer.
>
> Do I need to set something?
> The Slurm worker node (WN) runs Rocky Linux 9.5. I tried HTCondor versions 23.10.18 and 24.0.5. The Slurm cluster uses the cgroup v2 plugin (slurmstepd).
> Could it be a problem with how the cgroups are set up?
> The memory limit of the job itself is set to what the job requests:
>
> cat /sys/fs/cgroup/system.slice/slurmstepd.scope/job_3679/memory.max
> 536870912000
>
> But the HTCondor processes run in a sub-cgroup where the limit is set to max:
>
> cat /sys/fs/cgroup/system.slice/slurmstepd.scope/job_3679/step_batch/user/task_0/memory.max
> max
>
> Best regards,
>
> Matthias
>
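
The cgroup output quoted above shows why reading only the innermost memory.max is misleading: the task-level cgroup reports "max" while the job-level ancestor carries the real limit. The following is a minimal Python sketch, not something HTCondor itself is known to do, that walks from a cgroup directory up to the cgroup root, takes the smallest numeric memory.max it finds, and prints a MEMORY line in MB as in the reply. The function names effective_memory_limit and memory_knob_line are hypothetical, and it assumes a cgroup v2 layout like the one in the original message.

```python
from pathlib import Path
from typing import Optional


def effective_memory_limit(cgroup_dir: Path, root: Path) -> Optional[int]:
    """Return the smallest memory.max (in bytes) between cgroup_dir and root.

    Returns None when every level is unlimited ("max") or has no memory.max.
    """
    limits = []
    current = cgroup_dir.resolve()
    root = root.resolve()
    while True:
        limit_file = current / "memory.max"
        if limit_file.is_file():
            text = limit_file.read_text().strip()
            if text != "max":  # "max" means no limit at this level
                limits.append(int(text))
        # Stop at the given root, or at the filesystem root as a safety net.
        if current == root or current.parent == current:
            break
        current = current.parent
    return min(limits) if limits else None


def memory_knob_line(limit_bytes: int) -> str:
    """Format a byte limit as an HTCondor MEMORY setting (value in MB)."""
    return f"MEMORY = {limit_bytes // (1024 * 1024)}"


if __name__ == "__main__":
    # Path taken from the original message; adjust for the actual job.
    start = Path("/sys/fs/cgroup/system.slice/slurmstepd.scope/"
                 "job_3679/step_batch/user/task_0")
    limit = effective_memory_limit(start, Path("/sys/fs/cgroup"))
    if limit is None:
        print("no cgroup memory limit found")
    else:
        print(memory_knob_line(limit))
```

With the job-level limit from the message (536870912000 bytes), memory_knob_line yields "MEMORY = 512000". Whether to advertise the full limit or leave headroom for the HTCondor daemons themselves is a site decision that this sketch does not address.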