Re: [HTCondor-users] Matching specific GPU model
- Date: Thu, 21 Mar 2019 16:43:39 +0000
- From: John M Knoeller <johnkn@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] Matching specific GPU model
The expression (CUDACapability >= 1.2) will only match if the slot ad has an attribute named CUDACapability.
It does NOT match CUDA[:digit:]Capability. Something else must be going on if the job is matching and running.
HTCondor will only create attributes with a CUDA[:digit:] prefix when the value for that attribute is not the same
for all GPUs. In particular, this looks very weird to me:
CUDA2DeviceName = "Tesla K10.G2.8GB"
CUDA3DeviceName = "Tesla K10.G2.8GB"
CUDA4DeviceName = "Tesla K10.G2.8GB"
CUDA5DeviceName = "Tesla K10.G2.8GB"
CUDA2DeviceName = "Tesla K10.G2.8GB"
These are all the same name, but CUDA1DeviceName is missing entirely!! What you should be getting is a single
attribute
CUDADeviceName = "Tesla K10.G2.8GB"
This indicates to me that something has gone badly wrong during GPU detection.
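
To make the naming rule concrete, here is a minimal Python sketch of the behavior described above: one shared CUDA<Attribute> when every GPU reports the same value, and per-device CUDA<N><Attribute> entries only when the values differ. This is an illustration of the rule as stated in this message, not HTCondor's actual detection code; the function name merge_gpu_attributes, the input layout, and the "Hypothetical GPU" device name are assumptions made for the example.

```python
def merge_gpu_attributes(gpus, prefix="CUDA"):
    """Collapse per-GPU properties into slot-ad style attributes.

    gpus is a list of dicts, one per detected device; the list index
    stands in for the device number. (Hypothetical helper, not an
    HTCondor API.)
    """
    ad = {}
    # Every property name reported by at least one device.
    names = sorted({name for props in gpus for name in props})
    for name in names:
        values = [props.get(name) for props in gpus]
        if all(value == values[0] for value in values):
            # Same value on every device: one unnumbered attribute.
            ad[f"{prefix}{name}"] = values[0]
        else:
            # Values differ: one numbered attribute per device.
            for index, value in enumerate(values):
                if value is not None:
                    ad[f"{prefix}{index}{name}"] = value
    return ad


# Four identical devices, as on a node with only K10 GPUs.
uniform = [{"DeviceName": "Tesla K10.G2.8GB", "Capability": 3.0}] * 4

# Two different devices; the second name is made up for illustration.
mixed = [
    {"DeviceName": "Tesla K10.G2.8GB", "Capability": 3.0},
    {"DeviceName": "Hypothetical GPU", "Capability": 5.0},
]

print(merge_gpu_attributes(uniform))
print(merge_gpu_attributes(mixed))
```

Under this rule the uniform node advertises CUDADeviceName, so a requirement written against TARGET.CUDADeviceName has something to evaluate. The mixed node advertises only numbered names, so the same requirement finds no such attribute.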
-tj
-----Original Message-----
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> On Behalf Of Stuart Anderson
Sent: Thursday, March 21, 2019 11:33 AM
To: condor-users@xxxxxxxxxxx
Subject: [HTCondor-users] Matching specific GPU model
How do I specify a specific GPU model in a Condor 8.8 submit file?
The CUDACapability requirement example in the manual works for me,
requirements = (CUDACapability >= 1.2) && $(requirements:True)
http://research.cs.wisc.edu/htcondor/manual/v8.8/SubmittingaJob.html#x17-510002.5.12
However, what am I doing wrong with,
requirements = regexp("K10", TARGET.CUDADeviceName)
Here is part of condor_status -long from a random GPU node,
CUDA2DeviceName = "Tesla K10.G2.8GB"
CUDA3DeviceName = "Tesla K10.G2.8GB"
CUDA4DeviceName = "Tesla K10.G2.8GB"
CUDA5DeviceName = "Tesla K10.G2.8GB"
CUDA2DeviceName = "Tesla K10.G2.8GB"
CUDA0Capability = 5.0
CUDA1Capability = 5.0
CUDA2Capability = 3.0
CUDA3Capability = 3.0
CUDA4Capability = 3.0
More generally, should all of the CUDA* attributes be able to match CUDA[:digit:]attribute (as works for CUDACapability)?
Thanks.
--
Stuart Anderson anderson@xxxxxxxxxxxxxxxx
http://www.ligo.caltech.edu/~anderson