Hello Experts,
condor_gpu_discovery is failing. Its shared-library dependencies all resolve:
# ldd /usr/libexec/condor/condor_gpu_discovery
    linux-vdso.so.1 (0x00007ffc7edd1000)
    libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007fb71ee00000)
    libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fb71f0be000)
    libc.so.6 => /lib64/libc.so.6 (0x00007fb71ea00000)
    libm.so.6 => /lib64/libm.so.6 (0x00007fb71ed25000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fb71f111000)
# /usr/libexec/condor/condor_gpu_discovery
Error: cuInit returned 802
DetectedGPUs=0
This machine has 8 H200 SXM GPUs, and PyTorch detects all of them:
>>> import torch; print(torch.cuda.device_count())
8
nvidia-smi works without any issue.
NVIDIA-SMI 570.124.06    Driver Version: 570.124.06    CUDA Version: 12.8
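To check whether the failure is specific to condor_gpu_discovery, the same driver call can be made directly. The ldd output above lists no CUDA library, so the tool presumably loads the driver library at runtime; that is an assumption, not something I have confirmed. Below is a minimal sketch that does the same through ctypes. It assumes the driver library is installed as libcuda.so.1, and the helper name probe_cuinit is made up for illustration.

```python
import ctypes


def probe_cuinit(libname="libcuda.so.1"):
    """Call cuInit(0) from the CUDA driver library and report the result.

    Returns None if the library cannot be loaded. Otherwise returns a tuple
    (code, name, device_count), where device_count is None unless cuInit
    succeeded. The default library name is an assumption about this system.
    """
    try:
        cuda = ctypes.CDLL(libname)  # load the driver library at runtime
    except OSError:
        return None

    # CUresult cuInit(unsigned int Flags); Flags must be 0.
    code = cuda.cuInit(ctypes.c_uint(0))

    # CUresult cuGetErrorName(CUresult error, const char **pStr)
    name_ptr = ctypes.c_char_p()
    name = None
    if cuda.cuGetErrorName(code, ctypes.byref(name_ptr)) == 0 and name_ptr.value:
        name = name_ptr.value.decode()

    # Only query the device count if initialization succeeded.
    count = None
    if code == 0:
        n = ctypes.c_int(0)
        if cuda.cuDeviceGetCount(ctypes.byref(n)) == 0:
            count = n.value

    return code, name, count
```

Running print(probe_cuinit()) as the same user and in the same environment that HTCondor uses for condor_gpu_discovery should show whether cuInit fails there too. The symbolic name from cuGetErrorName is easier to search for than the bare number.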
Any input is highly appreciated.
Thanks & Regards,
Vikrant Aggarwal