Re: [Condor-users] Submitting a parallel job in the vanilla universe
- Date: Tue, 20 Jan 2009 07:56:19 -0600
- From: Matthew Farrellee <matt@xxxxxxxxxx>
- Subject: Re: [Condor-users] Submitting a parallel job in the vanilla universe
Brandon Leeds wrote:
Matthew Farrellee wrote:
Cargnelli Matthieu wrote:
Hi,
I'd like to know if it is possible to schedule a "vanilla" job which
is multithreaded, on a single multi-processor machine.
As I use the usual configuration (1 VM for 1 processor), I can use
the totalCpus parameter in the job description file
(Requirements=(totalCpus>=4) ) for instance. But then only one VM is
reserved for the job. I suppose this method will work if a single job
is submitted as in my tests, but what if 4 jobs are submitted ?
I couldn't find an answer aside from using the parallel universe. Is
it possible to reserve a full node with a vanilla job?
Best regards
Dan's HOWTO covers this under "How to allow some jobs to claim the
whole machine instead of one slot" - http://nmi.cs.wisc.edu/node/1482
Best,
matt
I tried accessing this document because I thought it might have some
bearing on whether it would be possible to use Condor to submit OpenMP
based programs, however it requires an NMI login which I don't have. Is
there another place I can access this document?
Thanks,
Brandon Leeds
Lehigh University
Reproduced below... (direct cut&paste from Dan's How-to Recipes)
--
How to allow some jobs to claim the whole machine instead of one slot
Known to work with Condor version: 7.2
The simplest way to achieve this is to set NUM_CPUS=1 so that
each machine just advertises a single slot. However, this prevents you
from supporting a mix of single-cpu and whole-machine jobs. The
following example achieves the goal of supporting both in all but one
respect: the Condor accountant does not charge the whole-machine user
for claiming all of the slots: it only charges the user for claiming one
slot.
First, you would have whole-machine jobs advertise themselves as such
with something like the following in the submit file:
+RequiresWholeMachine = True
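For context, a complete submit file carrying that attribute might look like the sketch below. The executable name, file names and thread count are placeholders that are not part of the original recipe; only the +RequiresWholeMachine line comes from it. The environment line is just one assumed way of telling an OpenMP program how many threads to start.

```
# Sketch of a submit file for a multithreaded vanilla job.
# "my_openmp_app" and the file names are hypothetical placeholders.
universe    = vanilla
executable  = my_openmp_app
output      = my_openmp_app.out
error       = my_openmp_app.err
log         = my_openmp_app.log

# Assumption: the program reads OMP_NUM_THREADS; 4 is an example value.
environment = "OMP_NUM_THREADS=4"

# Custom job attribute that the START expression in the startd config tests.
+RequiresWholeMachine = True

queue
```

Jobs that omit the attribute are treated as ordinary single-cpu jobs, because the configuration tests it with =!= and =?=, which handle the undefined case.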
Then put the following in your Condor configuration file. Make sure it
either comes after the other attributes that this appends to (such as
START) or that you merge the definitions together.
#require that whole-machine jobs only match to Slot1
START = ($(START)) && (TARGET.RequiresWholeMachine =!= TRUE || SlotID == 1)
# have the machine advertise when it is running a whole-machine job
STARTD_JOB_EXPRS = $(STARTD_JOB_EXPRS) RequiresWholeMachine
# Export the job expr to all other slots
STARTD_SLOT_EXPRS = RequiresWholeMachine
# require that no single-cpu jobs may start when a whole-machine job is running
START = ($(START)) && (SlotID == 1 || Slot1_RequiresWholeMachine =!= True)
# suspend existing single-cpu jobs when there is a whole-machine job
SUSPEND = ($(SUSPEND)) || (SlotID != 1 && Slot1_RequiresWholeMachine =?= True)
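If you would rather merge the two START clauses into a single definition, as the note about merging suggests, one possible form is sketched here. It is meant to replace both START lines above, not to be added alongside them, and it is derived only from the two clauses themselves: slot 1 may start anything, while any other slot may start a job only if the job is not a whole-machine job and slot 1 is not running one.

```
# Sketch: the two START clauses from the recipe folded into one expression.
# Use this in place of both START lines, not in addition to them.
START = ($(START)) && \
        ( (SlotID == 1) || \
          ( (TARGET.RequiresWholeMachine =!= True) && \
            (Slot1_RequiresWholeMachine =!= True) ) )
```

The STARTD_JOB_EXPRS, STARTD_SLOT_EXPRS and SUSPEND settings are still needed as given.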
Instead of suspending the single-cpu jobs while the whole-machine job
runs, you could suspend the whole-machine job while the single-cpu jobs
finish. Example:
# advertise the activity of each slot into the ads of the other slots,
# so the SUSPEND expression can see it
STARTD_SLOT_EXPRS = $(STARTD_SLOT_EXPRS) Activity
# Suspend the whole-machine job until the other slots are empty
SUSPEND = ($(SUSPEND)) || (SlotID == 1 && Slot1_RequiresWholeMachine =?= True && \
  (Slot2_Activity =?= "Busy" || Slot3_Activity =?= "Busy" || ... ) )
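To make the pattern concrete, here is how the expression could be written out for a hypothetical machine with exactly four slots. The slot count is an assumption for illustration; on a real machine you would list one SlotN_Activity term for every slot other than slot 1. The sketch also assumes that RequiresWholeMachine is already in STARTD_SLOT_EXPRS, as set up earlier, and that it is used instead of the SUSPEND expression that suspends the single-cpu jobs.

```
# Sketch for a hypothetical 4-slot machine; adjust the Slot<N> terms
# to match the number of slots your machine actually advertises.
STARTD_SLOT_EXPRS = $(STARTD_SLOT_EXPRS) Activity
SUSPEND = ($(SUSPEND)) || \
          (SlotID == 1 && Slot1_RequiresWholeMachine =?= True && \
           (Slot2_Activity =?= "Busy" || \
            Slot3_Activity =?= "Busy" || \
            Slot4_Activity =?= "Busy"))
```

When the suspended job resumes is governed by your CONTINUE expression, which this sketch does not touch.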
You might want to steer whole-machine jobs towards machines that are
completely vacant, especially in the slots that only accept single-cpu jobs.
Here's a simple example that just avoids machines with a high load:
NEGOTIATOR_PRE_JOB_RANK = -TARGET.LoadAvg*(MY.RequiresWholeMachine =?= True)
A more complicated expression would look at the attributes of the other
slots when forming the rank:
STARTD_SLOT_EXPRS = $(STARTD_SLOT_EXPRS) Activity
NEGOTIATOR_PRE_JOB_RANK = (MY.RequiresWholeMachine =?= True) * \
  ( (Slot2_Activity =!= "Busy") + (Slot3_Activity =!= "Busy") + ... )