Hi all,

I'm trying to set up Condor to submit MPI jobs. If I understood correctly, I first need to set up a dedicated scheduler.

I checked the example "condor_config.local.dedicated.submit" file, but everything in it is commented out, so in effect the file contains "nothing" (see attachment).

I found this page (http://www.openems.org/display/CONDOR/Ask+Mike ---> Optena), which says I should add something like:

DedicatedScheduler = "DedicatedScheduler@xxxxxxxxxxxxxx"
STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler

in the local condor_config file.

So, which of these solutions is the right one? A mix of both? And why is the example file empty?

For now, I haven't changed anything in the dedicated scheduler's config file. I only added to (and modified) the config file of one machine that I want to use as a dedicated resource:

#################################
# Start only as EXECUTE machine
DAEMON_LIST = MASTER, STARTD

##### Changes so that we don't care of KeyboardIdle
START = ( $(CPUIdle) || (State != "Unclaimed" && \
          State !="Owner") )
WANT_SUSPEND = ( $(SmallJob) || $(IsPVM) || $(IsVanilla) )
SUSPEND = ( (CpuBusyTime > 2 * $(MINUTE)) \
            && $(ActivationTimer) > 90 )
CONTINUE = ( $(CPUIdle) && ($(ActivityTimer) > 10) )

## condor_config.local.dedicated.resource
DedicatedScheduler = "DedicatedScheduler@xxxxxxxxxxxxxxxx"

## 3) Always run dedicated jobs, but only allow non-dedicated jobs to
## run on an opportunistic basis.
SUSPEND = Scheduler =!= $(DedicatedScheduler) && ($(SUSPEND))
PREEMPT = Scheduler =!= $(DedicatedScheduler) && ($(PREEMPT))
#RANK_FACTOR = 1000000
RANK_FACTOR = 100
RANK = (Scheduler =?= $(DedicatedScheduler) * $(RANK_FACTOR)) + $(RANK)
START = (Scheduler =?= $(DedicatedScheduler)) || ($(START))

MPI_CONDOR_RSH_PATH = $(LIBEXEC)
CONDOR_SSHD = /usr/sbin/sshd
CONDOR_SSH_KEYGEN = /usr/bin/ssh-keygen
STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler
#################

If I run "ps ax|grep condor" on this dedicated resource, I don't see any startd running (which I usually see on execute machines):

$ ps ax|grep cond
24677 ?        Ss     0:00 /nfs/opt/condor_x86_64/sbin/condor_master
24843 pts/0    S+     0:00 grep cond

This "dedicated resource" has also disappeared from my "condor_status" list.

Any idea how to solve this? I'm using Condor 6.8.3.

Thanks for your help,
Nicolas

----------------------------------------------------
CNRS - UPR 9080 : Laboratoire de Biochimie Theorique
Institut de Biologie Physico-Chimique
13 rue Pierre et Marie Curie
75005 PARIS - FRANCE
Tel : +33 158 41 51 70
Fax : +33 158 41 50 26
----------------------------------------------------
Attachment: condor_config.local.dedicated.submit
Description: Binary data