Unfortunately, HTCondor does not currently translate TARGET.CUDAComputeUnits to CUDA1ComputeUnits when AssignedGPUs is "CUDA1". You must do that yourself using configuration in the STARTD: either STARTD_ATTRS or some sort of STARTD_CRON script.
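As a rough illustration of the mapping such a script would need to perform, here is a minimal Python sketch. It assumes the one-GPU-per-slot layout described in this thread, and it only shows the attribute translation; how the slot attributes are obtained and how the result is published back to the startd (for example from a STARTD_CRON job) is not shown. The function name flatten_assigned_gpu_attrs and the dict-based representation of the slot ad are hypothetical, not part of HTCondor.

```python
import re


def flatten_assigned_gpu_attrs(slot_ad):
    """Return unprefixed copies (CUDA<Attr>) of the CUDA<N><Attr> attributes
    belonging to the single GPU named in AssignedGPUs.

    slot_ad is a plain dict of attribute name -> value (a hypothetical
    stand-in for the slot ClassAd). Returns an empty dict when the slot
    does not have exactly one assigned GPU of the form CUDA<N>.
    """
    assigned = slot_ad.get("AssignedGPUs", "")
    ids = [g.strip() for g in assigned.split(",") if g.strip()]
    if len(ids) != 1:
        return {}  # only the one-GPU-per-slot case is handled here

    prefix = ids[0]  # e.g. "CUDA1"
    if not re.fullmatch(r"CUDA\d+", prefix):
        return {}

    flat = {}
    for name, value in slot_ad.items():
        rest = name[len(prefix):]
        # Require a non-digit after the prefix so that "CUDA1" does not
        # also match attributes of a device named "CUDA10".
        if name.startswith(prefix) and rest and not rest[0].isdigit():
            flat["CUDA" + rest] = value
    return flat


# Example with values taken from the slot ad quoted in this thread.
slot = {
    "AssignedGPUs": "CUDA1",
    "CUDA0ComputeUnits": 34,
    "CUDA1ComputeUnits": 34,
    "CUDA1GlobalMemoryMb": 8192,
}
print(flatten_assigned_gpu_attrs(slot))
```

With attributes like these advertised on the slot, a job rank could refer to TARGET.CUDAComputeUnits directly, regardless of which device the slot was assigned.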
This is an area of active work in HTCondor. See this ticket, which requests essentially the same thing that you just described. We plan to have a better solution for this problem in a future 9.1.x release.
-tj
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Martin Sajdl <masaj.xxx@xxxxxxxxx>
Sent: Tuesday, May 25, 2021 9:30 AM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] GPU benchmarking

Hi Michael,

thank you again! To be honest, our nodes are configured so that there are as many slots as there are GPUs plugged in; each slot has just one GPU. So I think the tweak you mentioned is not needed there. But I just wanted to make sure that I can use values like TARGET.CUDAComputeUnits in a job rank and that it will be correctly translated to e.g. CUDA1ComputeUnits on the slot where AssignedGPUs="CUDA1".

My classads for an example slot are below. Each slot has just one GPU assigned, but has CUDA* classads for both GPUs plugged into the node.

AssignedGPUs = "CUDA1"
CUDA0Capability = 7.5
CUDA0ClockMhz = 1695.0
CUDA0ComputeUnits = 34
CUDA0CoresPerCU = 64
CUDA0DeviceName = "GeForce RTX 2060 SUPER"
CUDA0DevicePciBusId = "0000:01:00.0"
CUDA0DeviceUuid = "5ffaf895-e943-8da2-23f4-d751418ba217"
CUDA0DriverVersion = 11.2
CUDA0ECCEnabled = false
CUDA0GlobalMemoryMb = 8192
CUDA0OpenCLVersion = 1.2
CUDA0RuntimeVersion = 10.2
CUDA1Capability = 7.5
CUDA1ClockMhz = 1695.0
CUDA1ComputeUnits = 34
CUDA1CoresPerCU = 64
CUDA1DeviceName = "GeForce RTX 2060 SUPER"
CUDA1DevicePciBusId = "0000:02:00.0"
CUDA1DeviceUuid = "d777aeb6-a721-c756-7075-9f19a3a54c2a"
CUDA1DriverVersion = 11.2
CUDA1ECCEnabled = false
CUDA1GlobalMemoryMb = 8192
CUDA1OpenCLVersion = 1.2
CUDA1RuntimeVersion = 10.2

Masaj

On 5/25/2021 3:47 PM, Michael Pelletier via HTCondor-users wrote: