If a GPU is not working, or you do not want it used for some other reason, add its id to the OFFLINE_GPUS configuration
knob.
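A minimal sketch of what that could look like, assuming the knob takes a list of GPU ids in the same short-UUID form that condor_gpu_discovery reports. The id below is a made-up placeholder, not one from this thread. As far as I know the manual documents the generic form of this knob as OFFLINE_MACHINE_RESOURCE_<name>, so if the short spelling has no effect on your version, OFFLINE_MACHINE_RESOURCE_GPUs is the one to try.

```
# Placeholder id: substitute the id of the GPU you want HTCondor to leave alone
OFFLINE_GPUS = GPU-01234567
```

Running condor_config_val OFFLINE_GPUS on the execute node afterwards shows whether the setting was picked up.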
If you want to control which GPUs are bound to which slots at configuration time, the manual describes how in the section on
configuring GPUs. For example, this binds one specific GPU to a single slot of type 2:
NUM_SLOTS_TYPE_2 = 1
SLOT_TYPE_2 @=slot
GPUs = 1 : "GPU-6a96bd13"
CPUs = 1
Memory = auto
@slot
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Steffen Grunewald <steffen.grunewald@xxxxxxxxxx>
Sent: Thursday, February 6, 2025 6:13 AM
To: HTCondor Users Mailinglist <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] GPU sort order, Re: Make only one GPU available to HTCondor?

Good morning/afternoon/...,

for actual reasons, I've got to dig this out - there had been no responses last year:

On Mon, 2024-07-29 at 16:22:42 +0200, Steffen Grunewald wrote:
> Hi all,
>
> the subject says it: We want to make a single GPU of a particular machine
> available to HTCondor. How do I select a specific ID, or just the last in
> the "DetectedGPUs" list created by condor_gpu_discovery?

It turned out that HTCondor doesn't seem to obey the order in DetectedGPUs
(which is in sync with the one returned by `nvidia-smi` which in turn seems
to be the same as `gpustat` output), it will instead order the GPUs by their
UUIDs (at least if the model is the same?).

This makes a huge difference when assigning e.g. 7 GPUs to a disabled slot
and the remaining one to an active one: in our case, it was GPU #4 (out of
#0..#7, not the one users would see as #7) that was used, much to the
surprise of both the non-HTCondor and the HTCondor user when the clash
occurred.

Can this be avoided, i.e., can I select a GPU (by its UUID, or bus ID, or
anything else) to "put into a slot"?
Is there a means to modify HTCondor's indexing of GPUs, e.g. to just follow
the order provided in DetectedGPUs?

Thanks,
Steffen

--
Steffen Grunewald, Cluster Administrator
Max Planck Institute for Gravitational Physics (Albert Einstein Institute)
Am Mühlenberg 1 * D-14476 Potsdam-Golm * Germany
~~~
Fon: +49-331-567 7274
Mail: steffen.grunewald(at)aei.mpg.de
~~~
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe

The archives can be found at: https://www-auth.cs.wisc.edu/lists/htcondor-users/