
Re: [HTCondor-users] condor_gpu_discovery failing



Hi Greg,

Thanks for your response.

Yes, I had already installed these packages after seeing some reports on the internet, but unfortunately no luck.

cuda-drivers-fabricmanager-570.124.06-1.x86_64
nvidia-fabric-manager-570.124.06-1.x86_64


Thanks & Regards,
Vikrant Aggarwal


On Mon, Jun 9, 2025 at 6:44 PM Greg Thain via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:

On 6/7/25 7:40 PM, Vikrant Aggarwal wrote:
> Hello Experts,
>
> condor_gpu_discovery failing:
>
> # ldd /usr/libexec/condor/condor_gpu_discovery
>     linux-vdso.so.1 (0x00007ffc7edd1000)
>     libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007fb71ee00000)
>     libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fb71f0be000)
>     libc.so.6 => /lib64/libc.so.6 (0x00007fb71ea00000)
>     libm.so.6 => /lib64/libm.so.6 (0x00007fb71ed25000)
>     /lib64/ld-linux-x86-64.so.2 (0x00007fb71f111000)
>
>
> # /usr/libexec/condor/condor_gpu_discovery
> Error: cuInit returned 802
> DetectedGPUs=0
>
A quick search suggests that the NVLink software may need to be
installed on this machine -- do you know if it is?

-greg
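
For reference, 802 in the CUDA driver API's CUresult enum is
CUDA_ERROR_SYSTEM_NOT_READY, which the driver documentation describes as
the system not yet being ready to start CUDA work (for example, a required
driver daemon is not running). The ldd output above shows no direct link
against libcuda, so the tool presumably loads the driver library at run
time. Below is a minimal sketch, assuming Python 3 and that the driver
library is installed as libcuda.so.1, that makes the same cuInit call
outside of condor_gpu_discovery. The helper names describe_cu_result and
probe_cuinit are hypothetical, and the code table is only a small subset
of cuda.h.

```python
import ctypes

# A few CUresult values from cuda.h (CUDA driver API); not an exhaustive list.
CU_RESULT_NAMES = {
    0: "CUDA_SUCCESS",
    3: "CUDA_ERROR_NOT_INITIALIZED",
    100: "CUDA_ERROR_NO_DEVICE",
    802: "CUDA_ERROR_SYSTEM_NOT_READY",
    803: "CUDA_ERROR_SYSTEM_DRIVER_MISMATCH",
    999: "CUDA_ERROR_UNKNOWN",
}


def describe_cu_result(code):
    """Map a numeric CUresult to its symbolic name (hypothetical helper)."""
    return CU_RESULT_NAMES.get(code, f"unrecognized CUresult {code}")


def probe_cuinit(libname="libcuda.so.1"):
    """Call cuInit(0) directly (hypothetical helper).

    Returns the integer CUresult, or None if the driver library
    cannot be loaded at all. The library name is an assumption.
    """
    try:
        libcuda = ctypes.CDLL(libname)
    except OSError:
        return None
    libcuda.cuInit.argtypes = [ctypes.c_uint]
    libcuda.cuInit.restype = ctypes.c_int
    return libcuda.cuInit(0)


if __name__ == "__main__":
    result = probe_cuinit()
    if result is None:
        print("libcuda.so.1 could not be loaded")
    else:
        print(f"cuInit returned {result} ({describe_cu_result(result)})")
```

If this also reports 802, the failure is in the driver stack rather than in
HTCondor. In that case, installing the fabric manager packages may not be
enough on its own. It is worth confirming that the service they provide is
actually running.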
