
Re: [HTCondor-users] condor_gpu_discovery failing




On 6/7/25 7:40 PM, Vikrant Aggarwal wrote:
Hello Experts,

condor_gpu_discovery is failing:

# ldd /usr/libexec/condor/condor_gpu_discovery
    linux-vdso.so.1 (0x00007ffc7edd1000)
    libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007fb71ee00000)
    libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fb71f0be000)
    libc.so.6 => /lib64/libc.so.6 (0x00007fb71ea00000)
    libm.so.6 => /lib64/libm.so.6 (0x00007fb71ed25000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fb71f111000)


# /usr/libexec/condor/condor_gpu_discovery
Error: cuInit returned 802
DetectedGPUs=0

A quick search suggests that the NVLink software may need to be installed on this machine. Do you know if it is?
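
To check whether the failure comes from the driver rather than from HTCondor, it may help to call cuInit directly. The following is only a rough sketch, not how condor_gpu_discovery itself works: it assumes the driver library is installed as libcuda.so.1, and the helper names (describe, cu_init_status) are made up for illustration. As far as I can tell from cuda.h, 802 corresponds to CUDA_ERROR_SYSTEM_NOT_READY.

```python
import ctypes

# Small subset of CUresult values as listed in the CUDA driver API header (cuda.h).
CU_RESULT_NAMES = {
    0: "CUDA_SUCCESS",
    100: "CUDA_ERROR_NO_DEVICE",
    802: "CUDA_ERROR_SYSTEM_NOT_READY",
    803: "CUDA_ERROR_SYSTEM_DRIVER_MISMATCH",
}


def describe(code):
    """Map a numeric CUresult to its symbolic name (hypothetical helper)."""
    return CU_RESULT_NAMES.get(code, f"unknown CUresult {code}")


def cu_init_status(libname="libcuda.so.1"):
    """Call cuInit(0) and return its CUresult, or None if the library cannot be loaded.

    The default library name is an assumption; adjust it for your installation.
    """
    try:
        lib = ctypes.CDLL(libname)
    except OSError:
        return None
    lib.cuInit.argtypes = [ctypes.c_uint]
    lib.cuInit.restype = ctypes.c_int
    return lib.cuInit(0)


if __name__ == "__main__":
    status = cu_init_status()
    if status is None:
        print("could not load the CUDA driver library")
    else:
        print(f"cuInit returned {status} ({describe(status)})")
```

If this also reports 802 outside of HTCondor, the problem is most likely in the driver or NVLink-related setup on the node rather than in condor_gpu_discovery.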

-greg