[HTCondor-users] GPU selection in HTCondor 9.0.6 LT Release
- Date: Tue, 26 Apr 2022 01:02:12 +0000
- From: "Benjamin, Douglas" <dbenjamin@xxxxxxx>
- Subject: [HTCondor-users] GPU selection in HTCondor 9.0.6 LT Release
Hello,
We have several A100 GPUs that we have divided up using NVIDIA's MIG configuration. Each NVIDIA A100 80GB GPU is divided into three 19.955 GB partitions and one 9.721 GB partition.
Here is a snippet of the condor_gpu_discovery command output.
MIG_3f63dad5_849f_591e_9d4f_f7bacd6c2d97DeviceName="NVIDIA A100 80GB PCIe MIG 2g.20gb"
MIG_3f63dad5_849f_591e_9d4f_f7bacd6c2d97DeviceUuid="MIG-3f63dad5-849f-591e-9d4f-f7bacd6c2d97"
MIG_3f63dad5_849f_591e_9d4f_f7bacd6c2d97DriverVersion=11.60
MIG_3f63dad5_849f_591e_9d4f_f7bacd6c2d97GlobalMemoryMb=19955
MIG_3f63dad5_849f_591e_9d4f_f7bacd6c2d97MaxSupportedVersion=11060
MIG_56476b2d_78a8_5280_9fa9_02bf5b74dee1DeviceName="NVIDIA A100 80GB PCIe MIG 1g.10gb"
MIG_56476b2d_78a8_5280_9fa9_02bf5b74dee1DeviceUuid="MIG-56476b2d-78a8-5280-9fa9-02bf5b74dee1"
MIG_56476b2d_78a8_5280_9fa9_02bf5b74dee1DriverVersion=11.60
MIG_56476b2d_78a8_5280_9fa9_02bf5b74dee1GlobalMemoryMb=9721
MIG_56476b2d_78a8_5280_9fa9_02bf5b74dee1MaxSupportedVersion=11060
We are using partitionable slots.
$CondorVersion: 9.0.6 Sep 23 2021 BuildID: racf PackageID: 9.0.6 $
$CondorPlatform: X86_64-ScientificLinux_7.9 $
Is there an easy way to add GPU memory to the requirements for a job? We would like users who need more than 9.721 GB of GPU memory to be able to select a larger partition.
Is there a ClassAd shorthand that would allow us to use something like *GlobalMemoryMb > 10000 to differentiate between GPUs?
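To make the intent concrete, here is a minimal Python sketch of the selection we have in mind, applied offline to a few lines of the discovery output above. It is only an illustration: the function name `devices_with_min_memory` and the regular expression are our own and not part of HTCondor, and it does not show how to express this in a job's requirements.

```python
import re

# Sample lines copied from the condor_gpu_discovery output above.
DISCOVERY_OUTPUT = """\
MIG_3f63dad5_849f_591e_9d4f_f7bacd6c2d97DeviceName="NVIDIA A100 80GB PCIe MIG 2g.20gb"
MIG_3f63dad5_849f_591e_9d4f_f7bacd6c2d97GlobalMemoryMb=19955
MIG_56476b2d_78a8_5280_9fa9_02bf5b74dee1DeviceName="NVIDIA A100 80GB PCIe MIG 1g.10gb"
MIG_56476b2d_78a8_5280_9fa9_02bf5b74dee1GlobalMemoryMb=9721
"""

# Matches "<device prefix>GlobalMemoryMb=<integer>" (hypothetical helper pattern).
MEMORY_LINE = re.compile(r"^(?P<device>\w+?)GlobalMemoryMb=(?P<mb>\d+)$")


def devices_with_min_memory(text, min_mb):
    """Return {device_prefix: memory_mb} for devices with more than min_mb MB."""
    selected = {}
    for line in text.splitlines():
        match = MEMORY_LINE.match(line.strip())
        if match and int(match.group("mb")) > min_mb:
            selected[match.group("device")] = int(match.group("mb"))
    return selected


if __name__ == "__main__":
    # Keep only the partitions larger than 10000 MB.
    print(devices_with_min_memory(DISCOVERY_OUTPUT, 10000))
```

With a threshold of 10000 this keeps the 2g.20gb partition and drops the 1g.10gb one, which is the distinction we would like jobs to be able to request.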
Regards,
Doug Benjamin