Dear all,
On our GPU machines running HTCondor 24.0.7, the GPUs are correctly recognized by HTCondor, but no NVIDIA devices are visible inside a job that requests GPUs once it starts.
I found that the configuration knob STARTER_HIDE_GPU_DEVICES has been set to True by default since version 23.5.2. When I changed this setting to False, the job was able to see and use the assigned GPU as expected (or at least in the same way as on HTCondor 23.0.22).
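
As a minimal sketch, a check like the following could be run as the job executable to compare what the job sees under each value of STARTER_HIDE_GPU_DEVICES. The function name visible_gpu_state is hypothetical, and the script assumes that NVIDIA device nodes appear as /dev/nvidia0, /dev/nvidia1, ... and that the assigned GPU is exported to the job through CUDA_VISIBLE_DEVICES.

```python
import glob
import os


def visible_gpu_state(dev_glob="/dev/nvidia[0-9]*", environ=None):
    """Report which NVIDIA device nodes and GPU variables this process sees.

    dev_glob is an assumed pattern for the per-GPU device nodes;
    environ defaults to the real process environment.
    """
    env = os.environ if environ is None else environ
    return {
        # Device nodes that are actually present from inside the job
        "device_nodes": sorted(glob.glob(dev_glob)),
        # Assumed to carry the GPU assigned to the job; None if unset
        "CUDA_VISIBLE_DEVICES": env.get("CUDA_VISIBLE_DEVICES"),
    }


if __name__ == "__main__":
    state = visible_gpu_state()
    print("device nodes:", state["device_nodes"] or "none")
    print("CUDA_VISIBLE_DEVICES:", state["CUDA_VISIBLE_DEVICES"])
```

An empty device list together with a non-empty CUDA_VISIBLE_DEVICES would suggest that a GPU was assigned but its device node was hidden from the job.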
According to the documentation, STARTER_HIDE_GPU_DEVICES is supposed to hide only the non-assigned GPUs from the job. Since the job is correctly submitted with request_gpus and HTCondor is assigning one GPU, I would expect the assigned GPU to remain visible even with this setting enabled.
Am I misunderstanding how STARTER_HIDE_GPU_DEVICES is supposed to work?
Best regards,
Carles
-- Carles Acosta i Silva
PIC (Port d'Informació Científica)
Campus UAB, Edifici D
E-08193 Bellaterra, Barcelona
Tel: +34 93 581 33 08
Fax: +34 93 581 41 10