Re: [HTCondor-users] cpu with possible gpu
- Date: Thu, 27 Oct 2016 09:27:52 -0500
- From: Todd Tannenbaum <tannenba@xxxxxxxxxxx>
On 10/27/2016 8:51 AM, Michael Di Domenico wrote:
> it's my understanding that if i declare
>   request_cpus=1
>   request_gpus=1
> that i will be granted a slot in my pool that has both a cpu and a gpu available
Correct, assuming of course that your execute nodes have

    use feature: GPUs

in their condor_config (telling HTCondor that it should detect and
manage any GPUs on that node).
> is there anyway to say, always give me a cpu, but if a gpu is
> available give me one of those also and if a gpu isn't still allocate
> the cpu slot
Yes. The key is that request_cpus, request_gpus, etc, are all just
ClassAd expressions.
So I believe you can do what you ask above as follows:

    executable = foo.exe
    request_cpus = 1
    # Give my job 1 GPU if available
    request_gpus = gpus > 0 ? 1 : 0
    # Prefer a machine with a GPU if one is available
    rank = gpus > 0

There is a HOWTO covering this concept (for CPUs rather than GPUs, but
the ideas are the same) at
https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToUseAllCpus
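
To make the intent of those two expressions concrete, here is a toy model in
plain Python. It is only a sketch of the logic, not how HTCondor evaluates
ClassAds or performs matchmaking; the machine names, GPU counts, and the
helper pick_machine are made up for illustration.

```python
# Toy model only: plain Python standing in for ClassAd evaluation.
# Machine names and GPU counts below are hypothetical.

def request_gpus(machine_gpus):
    """Mirror of the submit line: request_gpus = gpus > 0 ? 1 : 0"""
    return 1 if machine_gpus > 0 else 0

def rank(machine_gpus):
    """Mirror of the submit line: rank = gpus > 0 (true counts as 1, false as 0)"""
    return 1 if machine_gpus > 0 else 0

def pick_machine(machines):
    """Choose the highest-ranked machine; ties go to the first one listed."""
    return max(machines, key=lambda m: rank(m["gpus"]))

machines = [
    {"name": "cpu-only", "gpus": 0},
    {"name": "gpu-node", "gpus": 2},
]
chosen = pick_machine(machines)
granted = request_gpus(chosen["gpus"])
print(chosen["name"], granted)
```

In this model the job lands on the machine that has a GPU and asks for one
there, while a pool made up only of CPU machines would still match with a
request of zero GPUs.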
> also in a separate vein, is there a simple way to allocate jobs by
> host instead of by slot. meaning since we have dynamic slots,
> machines have multiple slots. if i run a submit with queue of 16, can
> i allocate one slot on sixteen nodes instead of potentially 16 slots
> on one node. i think i can do this using a faux resource/license
> manager, but i'm curious if there's another or better way
Is it a *requirement* that each job run on a different host, or is it
just a preference?
If it is a requirement, the only thing that comes immediately to mind is
to define a custom machine resource, i.e. put something like the
following in the condor_config on your execute nodes:

    MACHINE_RESOURCE_NAMES = $(MACHINE_RESOURCE_NAMES) OnlyOne
    MACHINE_RESOURCE_OnlyOne = 1

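Then your job submit file can look like

    executable = foo.exe
    request_cpus = 1
    request_gpus = 1
    request_onlyone = 1

Because each machine advertises only one unit of OnlyOne, no two of these
jobs can run on the same host at the same time.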
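
As a rough illustration of why a one-per-machine resource spreads jobs
across hosts, here is a toy greedy placement in plain Python. It is a sketch
under simplifying assumptions (it ignores GPUs, priorities, and everything
else the real negotiator considers), and the host names, CPU counts, and the
helper place_jobs are hypothetical.

```python
# Toy model only: greedy placement in plain Python, not HTCondor's matchmaking.
# Host names and CPU counts are hypothetical.

def place_jobs(num_jobs, hosts):
    """Place jobs that each request 1 cpu and 1 unit of OnlyOne.

    hosts maps host name -> number of CPUs; every host has exactly one
    unit of OnlyOne. Returns (placements, idle), where placements maps
    job id -> host name and idle lists the jobs that found no host.
    """
    free = {name: {"cpus": cpus, "onlyone": 1} for name, cpus in hosts.items()}
    placements = {}
    idle = []
    for job in range(num_jobs):
        for name, res in free.items():
            if res["cpus"] >= 1 and res["onlyone"] >= 1:
                res["cpus"] -= 1
                res["onlyone"] -= 1
                placements[job] = name
                break
        else:
            # No host had both a free CPU and its OnlyOne unit left.
            idle.append(job)
    return placements, idle

hosts = {f"node{i:02d}": 16 for i in range(16)}
placements, idle = place_jobs(16, hosts)
print(len(set(placements.values())), len(idle))
```

With sixteen hosts, each of the sixteen jobs ends up on a different host even
though every host has sixteen CPUs. With fewer hosts than jobs, the extra
jobs wait in this model rather than doubling up on a host.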
If it is just a preference, perhaps the wisdom listed here would help:
https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToFillPoolBreadthFirst
Hope the above helps,
Todd