Hi all,
I'm trying to set up Condor to submit MPI jobs. If I understand correctly, I first need to set up a dedicated scheduler.
I then checked the example "condor_config.local.dedicated.submit" file, but everything in it is commented out, so effectively the file contains nothing (see attachment).
I found this page (http://www.openems.org/display/CONDOR/Ask+Mike ---> Optena), which says I should add something like:
DedicatedScheduler = "DedicatedScheduler@xxxxxxxxxxxxxx"
STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler
in the local condor_config file.
So which of these solutions is the right one? A mix of both? And in that case, why is the example file empty?
For now, I haven't changed anything in the dedicated scheduler's config file. I only added to (and modified) the config file of one machine that I want to use as a dedicated resource:
#################################
# Start only as EXECUTE machine
DAEMON_LIST = MASTER, STARTD
##### Changes so that we don't care about KeyboardIdle
START = ( $(CPUIdle) || (State != "Unclaimed" && \
State !="Owner") )
WANT_SUSPEND = ( $(SmallJob) || $(IsPVM) || $(IsVanilla) )
SUSPEND = ( (CpuBusyTime > 2 * $(MINUTE)) \
&& $(ActivationTimer) > 90 )
CONTINUE = ( $(CPUIdle) && ($(ActivityTimer) > 10) )
## condor_config.local.dedicated.resource
DedicatedScheduler = "DedicatedScheduler@xxxxxxxxxxxxxxxx"
## 3) Always run dedicated jobs, but only allow non-dedicated jobs to
## run on an opportunistic basis.
SUSPEND = Scheduler =!= $(DedicatedScheduler) && ($(SUSPEND))
PREEMPT = Scheduler =!= $(DedicatedScheduler) && ($(PREEMPT))
#RANK_FACTOR = 1000000
RANK_FACTOR = 100
RANK = (Scheduler =?= $(DedicatedScheduler) * $(RANK_FACTOR)) + $(RANK)
START = (Scheduler =?= $(DedicatedScheduler)) || ($(START))
MPI_CONDOR_RSH_PATH = $(LIBEXEC)
CONDOR_SSHD = /usr/sbin/sshd
CONDOR_SSH_KEYGEN = /usr/bin/ssh-keygen
STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler
#################
If I run "ps ax|grep condor" on this dedicated resource, I don't see any startd running (which I usually see on execute machines):
$ ps ax|grep cond
24677 ? Ss 0:00 /nfs/opt/condor_x86_64/sbin/condor_master
24843 pts/0 S+ 0:00 grep cond
This "dedicated resource" has also disappeared from my "condor_status" list.
Any idea how to solve this?
I'm using Condor 6.8.3.
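To make the check above reproducible, here is a rough Python sketch of what I am doing by hand: compare the daemons named in DAEMON_LIST with what shows up in the "ps" output. The function names are made up for illustration, and it assumes that a daemon FOO in DAEMON_LIST runs as a process called condor_foo. It only reads a plain "DAEMON_LIST = A, B" line and does not expand macros such as $(DAEMON_LIST).

```python
# Sketch only: report which daemons from DAEMON_LIST are absent from `ps` output.
# Assumption: a daemon FOO in DAEMON_LIST runs as a process named condor_foo.
# Limitation: macros such as $(DAEMON_LIST) in the value are not expanded.

def parse_daemon_list(config_text):
    """Return the daemon names from the last DAEMON_LIST line in config_text."""
    daemons = []
    for line in config_text.splitlines():
        stripped = line.strip()
        # Skip comments and lines that are not assignments.
        if stripped.startswith("#") or "=" not in stripped:
            continue
        name, value = stripped.split("=", 1)
        if name.strip().upper() == "DAEMON_LIST":
            # A later definition overrides an earlier one.
            daemons = [d.strip().upper() for d in value.split(",") if d.strip()]
    return daemons


def missing_daemons(config_text, ps_output):
    """Return the daemons listed in the config that do not appear in ps_output."""
    wanted = parse_daemon_list(config_text)
    return [d for d in wanted if "condor_" + d.lower() not in ps_output]


if __name__ == "__main__":
    config = "DAEMON_LIST = MASTER, STARTD\n"
    ps = (
        "24677 ? Ss 0:00 /nfs/opt/condor_x86_64/sbin/condor_master\n"
        "24843 pts/0 S+ 0:00 grep cond\n"
    )
    print(missing_daemons(config, ps))  # prints ['STARTD']
```

With my config and the "ps" output shown above, this reports STARTD as the only missing daemon, which matches what I see.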
Thanks for your help
Nicolas
----------------------------------------------------
CNRS - UPR 9080 : Laboratoire de Biochimie Theorique
Institut de Biologie Physico-Chimique
13 rue Pierre et Marie Curie
75005 PARIS - FRANCE
Tel : +33 158 41 51 70
Fax : +33 158 41 50 26
----------------------------------------------------
Attachment:
condor_config.local.dedicated.submit
Description: Binary data