Re: [HTCondor-users] Translating GPU device assignments?
- Date: Thu, 06 Jul 2017 15:03:32 +0000
- From: Michael Pelletier <Michael.V.Pelletier@xxxxxxxxxxxx>
- Subject: Re: [HTCondor-users] Translating GPU device assignments?
A little bit of follow-up as I worked on this over the long weekend.
[Michael Pelletier]
So it turns out that setting CUDA_VISIBLE_DEVICES=2,3 causes the CUDA library to renumber those devices, so that inside the process they appear as ordinals 0 and 1.
Thus, in order to get the correct ordinals, you can't just pass along the value of CUDA_VISIBLE_DEVICES or GPU_DEVICE_ORDINAL.
So it seems that the GPU_DEVICE_ORDINAL variable is being set incorrectly: when used in combination with CUDA_VISIBLE_DEVICES, it should list 0 through one less than the number of GPUs requested.
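To make the renumbering concrete, here is a rough Python sketch. The function names are made up for illustration, and it only models the simple case of a comma-separated list of integer device indices; it doesn't call CUDA at all.

```python
def renumbered_ordinals(cuda_visible_devices):
    """Map each physical GPU index to the ordinal the process would see.

    Assumes a plain comma-separated list of integer indices, e.g. "2,3".
    """
    physical = [int(tok) for tok in cuda_visible_devices.split(",") if tok.strip()]
    # The first visible device becomes ordinal 0, the second ordinal 1, and so on.
    return {phys: ordinal for ordinal, phys in enumerate(physical)}


def ordinal_string(cuda_visible_devices):
    """Build the ordinal list that matches a given CUDA_VISIBLE_DEVICES value."""
    mapping = renumbered_ordinals(cuda_visible_devices)
    return ",".join(str(ordinal) for ordinal in mapping.values())
```

With CUDA_VISIBLE_DEVICES=2,3 this gives the mapping {2: 0, 3: 1}, so the matching ordinal list is "0,1" rather than "2,3".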
I've worked around it via:
GPU_ORDINAL = $CHOICE(REQGPU_INT, "error", "0", "0,1", "0,1,2", \
"0,1,2,3", "0,1,2,3,4", "0,1,2,3,4,5", "0,1,2,3,4,5,6", \
"0,1,2,3,4,5,6,7", "0,1,2,3,4,5,6,7,8", "too_many_gpus_requested")
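For anyone who would rather compute the list than look it up, the same table can be sketched in Python. This is only an approximation of the macro above: the function name gpu_ordinal and the max_gpus parameter are hypothetical, and the handling of negative or out-of-range counts is my own assumption rather than anything $CHOICE is documented to do.

```python
def gpu_ordinal(reqgpu_int, max_gpus=9):
    """Return the ordinal list for a requested GPU count.

    Mirrors the lookup table: 0 -> "error", 1 -> "0", 2 -> "0,1", ...
    Out-of-range handling here is an assumption made for this sketch.
    """
    if reqgpu_int <= 0:
        return "error"
    if reqgpu_int > max_gpus:
        return "too_many_gpus_requested"
    # Ordinals always start at 0 because of the renumbering.
    return ",".join(str(i) for i in range(reqgpu_int))
```

So gpu_ordinal(2) yields "0,1", the same string the table gives for REQGPU_INT = 2.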
And as I mentioned before, it'd be great to have this as a job attribute as well.
-Michael Pelletier.