Memory is the amount of memory provisioned to the slot for static and
dynamic slots. For partitionable slots it is the amount of memory that
has not yet been carved off into a dynamic slot under that partitionable
slot, i.e. the free memory. For example, on a 512 GB partitionable slot
with one 128 GB dynamic slot carved out, Memory on the partitionable
slot evaluates to about 384 GB.
But there is a flaw in my second suggestion.
We want the start_cpu_jobs test to apply only to the partitionable
slot, not to the dynamic slots created under it; otherwise the dynamic
slots may not match the jobs they were just created for.
This is probably what you are seeing.
To add a test for the dynamic slot, you can do it inside start_cpu_jobs:

   start_cpu_jobs = ( DynamicSlot ?: (Memory - TARGET.RequestMemory) > 1024 )
Or it might be better to do it outside start_cpu_jobs:

   start_cpu_jobs = ((Memory - TARGET.RequestMemory) > 1024)
   START = $(START) && ( DynamicSlot ?: $(start_cpu_jobs) )
Written this way, start_cpu_jobs is not evaluated for dynamic slots,
only for partitionable slots. It
controls the creation of dynamic slots while looking at the free
resources of the partitionable slot.
And actually there is another refinement: since RequestMemory is rounded
up to the next multiple of 128 MB when it is used, you should really do this:

   start_cpu_jobs = ( (Memory - quantize(TARGET.RequestMemory, {128})) > 1024 )
Note that the way DynamicSlot is used above, the START expression
won't work if you are not using partitionable slots in your
configuration.
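Putting the pieces together, a partitionable-slot setup might look like
this (a sketch only; the 1024 MB of reserved headroom and the way I have
combined it with the GPU-job clause from my earlier mail are examples,
not a drop-in config):

   # admit a cpu job only while the partitionable slot would still have
   # more than 1024 MB free after the requested (quantized) memory is carved off
   start_cpu_jobs = ( (Memory - quantize(TARGET.RequestMemory, {128})) > 1024 )

   # GPU jobs always match; cpu jobs are gated by start_cpu_jobs, which is
   # evaluated only on the partitionable slot (DynamicSlot is undefined there)
   START = $(START) && ( (TARGET.RequestGPUs ?: 0) > 0 || ( DynamicSlot ?: $(start_cpu_jobs) ) )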
If you are using STATIC slots, you would be better off just refusing
to match CPU jobs on the slots that have GPUs.
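For example, something like this should work for static slots (a sketch,
assuming GPUs are assigned to slots via use feature : GPUs, so that slots
owning a GPU have a nonzero GPUs attribute):

   # slots that have a GPU refuse jobs that don't request one;
   # GPU-less slots accept whatever the rest of START allows
   START = $(START) && ( (GPUs ?: 0) == 0 || (TARGET.RequestGPUs ?: 0) > 0 )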
-tj
------------------------------------------------------------------------
*From:* K. Scott Rowe <krowe@xxxxxxxx>
*Sent:* Tuesday, August 26, 2025 2:56 PM
*To:* HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
*Cc:* John M Knoeller <johnkn@xxxxxxxxxxx>
*Subject:* Re: [HTCondor-users] How to reserve resources for GPU jobs
{External}
Thanks. Your first suggestion that blocks all non-gpu jobs works. Your
second suggestion to allow some non-gpu jobs doesn't work.
Is the Memory variable in your example the amount of available memory on
the node? Because it seems to act more like the amount of memory
requested by the job. For example, if I add these two lines, simplified
from your suggestion, to my config
   start_cpu_jobs = (Memory >= 1023)
   START = $(START) && $(start_cpu_jobs)
and submit a non-gpu job asking for 1 GB (request_memory = 1 G) of
memory, the job runs. But if I set
   start_cpu_jobs = (Memory >= 1025)
   START = $(START) && $(start_cpu_jobs)
and submit the same non-gpu job, it stays idle, even though "condor_q
-better" tells me there is 1 machine
able to run my job.
Thanks
I get just one return when there are no jobs running.
On 8/25/25 16:37, John M Knoeller via HTCondor-users wrote:
> If you want your machine that has GPUs to match only jobs that request
> GPUs, set
>
>
>    START = (TARGET.RequestGPUs ?: 0) > 0
>
> This simplifies to
>
>    START = (TARGET.RequestGPUs ?: 0)
>
> With the above START expression, only jobs that request at least 1 GPU
> will match. That's not quite what you asked for, but it shows the way.
> You just need the START expression to evaluate to false for cpu jobs
> while there is still memory and cpus available.
>
> I will show this using a temp variable to hold the CPU jobs expression.
>
>    start_cpu_jobs = (Cpus - TARGET.RequestCpus) >= 1 && (Memory - TARGET.RequestMemory) >= (128+1024)
>    START = IfThenElse(TARGET.RequestGPUs ?: 0, true, $(start_cpu_jobs) )
>
> This simplifies to
>
> Â Â START = (TARGET.RequestGPUs ?: 0) || $(start_cpu_jobs)
>
> Note that if you already have a START expression that is not just
> TRUE, this should be
>
> START = $(START) && ( (TARGET.RequestGPUs ?: 0) || $(start_cpu_jobs) )
>
> -tj
>
> ------------------------------------------------------------------------
> *From:* HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf
> of K. Scott Rowe <krowe@xxxxxxxx>
> *Sent:* Monday, August 25, 2025 4:30 PM
> *To:* htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
> *Subject:* [HTCondor-users] How to reserve resources for GPU jobs
>
> Hey there. Imagine I have an EP running HTCondor-23.0.17 with 24 cores,
> 512GB RAM, and one GPU. There are many CPU-only jobs running on this EP
> for weeks at a time, and there are usually one or two GPU jobs as well.
> The CPU-only jobs may take weeks to finish, so sadly a GPU job may have
> to wait weeks to start. I would like GPU jobs to not have to wait so
> long.
>
> Is there a way I could reserve say 1 core and 128GB of RAM for GPU jobs,
> and only GPU jobs, on this EP thus letting CPU-only jobs continue to run
> on the other 23 cores and 384GB of RAM?
>
> I have been trying to do this with static slots but have not figured out
> how to make a slot that has the GPU as a resource and will NOT run
> CPU-only jobs.
>
> I should also mention that we don't use preemption and really don't want
> to use it as it doesn't work well with our pipeline. I would also
> rather not ask our users to add a ClassAd to their submit scripts (e.g.
> +IsGPUJob), but if that is the only way, then so be it.
>
> Thanks
>
> --
>
> K. Scott Rowe -- Science Information Services
> Science Operations Center, National Radio Astronomy Observatory
> 1011 Lopezville Socorro, NM 87801
> krowe@xxxxxxxx -- 1.575.835.7193 --
>
> http://www.nrao.edu
>
> _______________________________________________
> HTCondor-users mailing list
> To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx
> with a
> subject: Unsubscribe
>
> The archives can be found at:
> https://www-auth.cs.wisc.edu/lists/htcondor-users/