Re: [HTCondor-users] Run Slurm as "guest" on a HTCondor pool?
- Date: Tue, 25 Nov 2025 10:21:43 +0100
- From: Steffen Grunewald <steffen.grunewald@xxxxxxxxxx>
- Subject: Re: [HTCondor-users] Run Slurm as "guest" on a HTCondor pool?
Thanks Greg, for your quick response - which is a bit disillusioning, I admit,
but I'm not willing to give up yet ;)
On Mon, 2025-11-24 at 11:28:40 -0600, HTCondor Users Mailinglist wrote:
> There are several different ways to share resources between slurm and
> htcondor, working with disparate systems is one of the challenges of the
> distributed world.
Tell me about it - this setup is already inhomogeneous, and it will become
even more so with the next extension :)
> Looks to me like you are suggesting what we would call
> "gliding in" a slurm over HTCondor.
Call it that, indeed. But nobody seems to run HPC on top of HTC, while the
other way around is rather common.
> I'm not aware of anyone doing this.
I haven't found any hint on the 'net yet, which could have any number of reasons :)
> My
> understanding is that slurm really wants to run as root. We in HTCondor try
> to prevent jobs from running as root, even when running in docker
> containers.
Now that is a weighty argument. Since I'm more into apptainer/singularity than
Docker, I wonder whether there might be a way around it nonetheless (and I've
recently seen user software run as root inside a Docker container - not in an
HTCondor context, though - so it must be possible somehow; time to ask the
responsible programmer to share his secrets!).
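(Side note, untested: apptainer's --fakeroot option might already get close -
it maps the calling user to uid 0 inside the container via user namespaces,
without granting real root on the host. Something like

    # "slurm-node.sif" is a made-up image name; --fakeroot maps my
    # unprivileged uid to 0 inside the container via user namespaces:
    apptainer exec --fakeroot slurm-node.sif whoami   # should print "root"

Whether slurmd would be satisfied with a user-namespace root is of course a
different question.)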
> The more common way is to run HTCondor and slurm "next to" each
> other,
This is my current approach ...
> where perhaps both have root and are started by systemd/init, and one
> disables the other when it has work to do.
In practice, HTCondor starts up with the machine, controlled by a systemd unit,
while the node needs to be "drained" of HTCondor work (by setting START=False
and IS_OWNER=True), possibly defragmented ... and then slurmd is fired up.
(I could even keep slurmd running all the time, but compared to the condor_*
daemons it consumes more memory ... which is almost always a bottleneck ... hm.)
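For reference, the hand-off condenses to roughly the following - an untested
sketch, not the actual scripts, and it assumes ENABLE_RUNTIME_CONFIG is set so
that -rset is allowed:

    # Sketch of the HTCondor -> Slurm hand-off, run on the node itself.
    # Stop matching new HTCondor jobs:
    condor_config_val -startd -rset "START = False"
    condor_config_val -startd -rset "IS_OWNER = True"
    condor_reconfig -daemon startd

    # Wait until no slot on this node is busy any more:
    while condor_status "$(hostname -f)" -af Activity | grep -q Busy; do
        sleep 60
    done

    # Hand the cores over to Slurm:
    systemctl start slurmd
    scontrol update NodeName="$(hostname -s)" State=RESUME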
I've come up with a semi-automatic mechanism to "convert" nodes back and
forth between the two schedulers, triggered by a certain pattern in the reason
given for the state change. That reason currently has to be set by hand.
(Sometimes I wish HTCondor had such a central "memory" of which nodes are to
be used.)
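For the flavour of it, a stripped-down sketch of the idea (not the actual
scripts; the "to-htcondor" marker string is made up for this mail):

    # Periodically convert drained Slurm nodes back to HTCondor when the
    # drain reason carries the magic marker, set beforehand via
    #   scontrol update NodeName=<node> State=DRAIN Reason="to-htcondor"
    for node in $(sinfo -N -h -t drain -o "%N"); do
        reason=$(scontrol show node "$node" | sed -n 's/.*Reason=\([^ ]*\).*/\1/p')
        case "$reason" in
            to-htcondor*)
                ssh "$node" 'systemctl stop slurmd &&
                             condor_config_val -startd -rset "START = True" &&
                             condor_config_val -startd -rset "IS_OWNER = False" &&
                             condor_reconfig -daemon startd' ;;
        esac
    done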
> I believe we have seen users do
> this with slurm prescripts and postscripts.
You're right, that would be a convenient place to do this! Investigating...
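If it works the way I hope, it should be little more than a pair of entries in
slurm.conf plus two tiny scripts - roughly like this (paths invented, and the
refcounting needed when several Slurm jobs share one node glossed over):

    # slurm.conf (sketch): scripts run as root on each allocated node
    # at job start / job end:
    Prolog=/etc/slurm/pause-condor.sh
    Epilog=/etc/slurm/resume-condor.sh

    #!/bin/sh
    # pause-condor.sh: peacefully stop the startd while Slurm has work
    condor_off -startd -peaceful

    #!/bin/sh
    # resume-condor.sh: bring the startd back once the Slurm job is gone
    condor_on -startd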
On the HTCondor side, perhaps STARTD_CRON tasks could do something similar,
although that's basically an alternative to Hibernation (since those hand-overs
should only happen while a node is Unclaimed) ... there are too many knobs and
states!
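For the archives, the STARTD_CRON variant I have in mind would look roughly
like this (attribute name, script path, and the flag file are all invented
for this mail):

    # Publish a SlurmWanted attribute into the slot ads every few minutes,
    # and only START HTCondor jobs while it isn't set:
    STARTD_CRON_JOBLIST = $(STARTD_CRON_JOBLIST) SLURMWATCH
    STARTD_CRON_SLURMWATCH_EXECUTABLE = /usr/local/libexec/slurmwatch.sh
    STARTD_CRON_SLURMWATCH_PERIOD = 5m
    STARTD_CRON_SLURMWATCH_MODE = Periodic
    START = ($(START)) && (SlurmWanted =!= True)

where slurmwatch.sh just prints a ClassAd attribute to stdout:

    #!/bin/sh
    # Sketch: derive "does Slurm want this node?" from a local flag file.
    if [ -e /etc/condor/slurm-wanted ]; then
        echo "SlurmWanted = True"
    else
        echo "SlurmWanted = False"
    fi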
> Let us know what you learn and how you decide to go!
Learning is slow these days, but I'll certainly share what I find ;)
Thanks,
Steffen