Hi Nicolas,

I had this same problem some time ago, and I resolved it with the following configuration.

In the condor_config.local of your central manager (dedicated scheduler), write the following:
######################################################################
# DEDICATED SCHEDULER
######################################################################

######################################################################
######################################################################
## Settings you MUST customize!
######################################################################
######################################################################

## What is the name of the dedicated scheduler for this resource?
## You MUST fill in the correct full hostname where you're running
## the dedicated scheduler, and where users will submit their
## dedicated jobs. The "DedicatedScheduler@" part should not be
## changed, ONLY the hostname.
DedicatedScheduler = "DedicatedScheduler@xxxxxxxxxxx"
STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler

######################################################################
######################################################################
## Settings you should leave alone, but that must be defined
######################################################################
######################################################################

## Path to the special version of rsh that's required to spawn MPI
## jobs under Condor. WARNING: This is not a replacement for rsh,
## and does NOT work for interactive use. Do not use it directly!
MPI_CONDOR_RSH_PATH = $(LIBEXEC)

## Path to OpenSSH server binary
## Condor uses this to establish a private SSH connection between execute
## machines. It is usually in /usr/sbin, but may be in /usr/local/sbin
CONDOR_SSHD = /usr/sbin/sshd

## Path to OpenSSH keypair generator.
## Condor uses this to establish a private SSH connection between execute
## machines. It is usually in /usr/bin, but may be in /usr/local/bin
CONDOR_SSH_KEYGEN = /usr/bin/ssh-keygen

## This setting puts the DedicatedScheduler attribute, defined above,
## into your machine's classad. This way, the dedicated scheduler
## (and you) can identify which machines are configured as dedicated
## resources.
STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler

And on the execute nodes (dedicated resources), write this in condor_config.local:
######################################################################
# DEDICATED RESOURCE
######################################################################

######################################################################
######################################################################
## Settings you MUST customize!
######################################################################
######################################################################

## What is the name of the dedicated scheduler for this resource?
## You MUST fill in the correct full hostname where you're running
## the dedicated scheduler, and where users will submit their
## dedicated jobs. The "DedicatedScheduler@" part should not be
## changed, ONLY the hostname.
DedicatedScheduler = "DedicatedScheduler@xxxxxxxxxxx"
STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler

######################################################################
######################################################################
## Settings you should leave alone, but that must be defined
######################################################################
######################################################################

## Path to the special version of rsh that's required to spawn MPI
## jobs under Condor. WARNING: This is not a replacement for rsh,
## and does NOT work for interactive use. Do not use it directly!
MPI_CONDOR_RSH_PATH = $(LIBEXEC)

## Path to OpenSSH server binary
## Condor uses this to establish a private SSH connection between execute
## machines. It is usually in /usr/sbin, but may be in /usr/local/sbin
CONDOR_SSHD = /usr/sbin/sshd

## Path to OpenSSH keypair generator.
## Condor uses this to establish a private SSH connection between execute
## machines. It is usually in /usr/bin, but may be in /usr/local/bin
CONDOR_SSH_KEYGEN = /usr/bin/ssh-keygen

## This setting puts the DedicatedScheduler attribute, defined above,
## into your machine's classad. This way, the dedicated scheduler
## (and you) can identify which machines are configured as dedicated
## resources.
STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler

##--------------------------------------------------------------------
## 2) Always run jobs, but prefer dedicated ones
##--------------------------------------------------------------------
START = True
SUSPEND = False
CONTINUE = True
PREEMPT = False
KILL = False
WANT_SUSPEND = False
WANT_VACATE = False
RANK = Scheduler =?= $(DedicatedScheduler)

Next, you must restart the master daemon on all nodes with this command:

condor_restart -master
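Before restarting, it can help to sanity-check the local config file of an execute node against the settings listed above. What follows is only a minimal Python sketch and is not part of Condor: the function names parse_condor_config and check_dedicated_resource are made up for illustration, and the parsing is a rough approximation of Condor's macro handling (it assumes case-insensitive names, backslash line continuations, and expands only a setting's reference to its own earlier value).

```python
import re


def parse_condor_config(text):
    """Parse 'NAME = value' lines into a dict keyed by upper-cased name.

    Rough approximation only: comment lines start with '#', a trailing
    backslash continues a line, and $(NAME) inside NAME's own value is
    replaced by the value NAME had before this line.
    """
    settings = {}
    # Join lines that end with a backslash continuation.
    joined = re.sub(r"\\[ \t]*\n", " ", text)
    for raw in joined.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep:
            continue
        name, value = name.strip(), value.strip()
        previous = settings.get(name.upper(), "")
        pattern = r"\$\(" + re.escape(name) + r"\)"
        value = re.sub(pattern, lambda m: previous, value, flags=re.IGNORECASE)
        settings[name.upper()] = value
    return settings


def check_dedicated_resource(text):
    """Return a list of problems found in a dedicated-resource config."""
    s = parse_condor_config(text)
    problems = []
    if not s.get("DEDICATEDSCHEDULER"):
        problems.append("DedicatedScheduler is not set")
    exprs = [e.strip().lower() for e in s.get("STARTD_EXPRS", "").split(",")]
    if "dedicatedscheduler" not in exprs:
        problems.append("STARTD_EXPRS does not list DedicatedScheduler")
    # DAEMON_LIST may live in the global condor_config instead, so it is
    # only checked when this file actually sets it.
    daemons = [d.strip().upper() for d in s.get("DAEMON_LIST", "").split(",")]
    if "DAEMON_LIST" in s and "STARTD" not in daemons:
        problems.append("DAEMON_LIST does not include STARTD")
    return problems


sample = """
DedicatedScheduler = "DedicatedScheduler@xxxxxxxxxxx"
STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler
DAEMON_LIST = MASTER, STARTD, SCHEDD
"""
print(check_dedicated_resource(sample))
```

An empty list means the three settings were found. This does not prove that Condor itself accepts the file, only that these entries are present.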
Another thing: the daemon list on your execute nodes must be (note the SCHEDD entry):

DAEMON_LIST = MASTER, STARTD, SCHEDD

I hope this helps.

PS: sorry for my English.

Nicolas GUIOT wrote:
Hi all,

I'm trying to set up Condor to submit MPI jobs. If I understood correctly, I first need to set up a dedicated scheduler.

I checked the example "condor_config.local.dedicated.submit" file, but everything in it is commented out, so in effect I have "nothing" in this file (see attachment).

I found this page: (http://www.openems.org/display/CONDOR/Ask+Mike ---> Optena), which says I should add something like:

DedicatedScheduler = "DedicatedScheduler@xxxxxxxxxxxxxx"
STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler

to the local condor_config file.

So which of these solutions is the right one? A mix of both? And why is the example file empty?

For now, I haven't changed anything in the dedicated scheduler's config file. I only added to (and modified) the config file of one machine that I want to use as a dedicated resource:

################################## Start only as EXECUTE machine
DAEMON_LIST = MASTER, STARTD

##### Changes so that we don't care of KeyboardIdle
START = ( $(CPUIdle) || (State != "Unclaimed" && \
          State != "Owner") )
WANT_SUSPEND = ( $(SmallJob) || $(IsPVM) || $(IsVanilla) )
SUSPEND = ( (CpuBusyTime > 2 * $(MINUTE)) \
            && $(ActivationTimer) > 90 )
CONTINUE = ( $(CPUIdle) && ($(ActivityTimer) > 10) )

## condor_config.local.dedicated.resource
DedicatedScheduler = "DedicatedScheduler@xxxxxxxxxxxxxxxx"

## 3) Always run dedicated jobs, but only allow non-dedicated jobs to
## run on an opportunistic basis.
SUSPEND = Scheduler =!= $(DedicatedScheduler) && ($(SUSPEND))
PREEMPT = Scheduler =!= $(DedicatedScheduler) && ($(PREEMPT))
#RANK_FACTOR = 1000000
RANK_FACTOR = 100
RANK = (Scheduler =?= $(DedicatedScheduler) * $(RANK_FACTOR)) + $(RANK)
START = (Scheduler =?= $(DedicatedScheduler)) || ($(START))

MPI_CONDOR_RSH_PATH = $(LIBEXEC)
CONDOR_SSHD = /usr/sbin/sshd
CONDOR_SSH_KEYGEN = /usr/bin/ssh-keygen
STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler
#################

If I run "ps ax|grep condor" on this dedicated resource, I don't see any startd running (which I usually see on execute machines):

$ ps ax|grep cond
24677 ?        Ss     0:00 /nfs/opt/condor_x86_64/sbin/condor_master
24843 pts/0    S+     0:00 grep cond

And this "dedicated resource" has just disappeared from my "condor_status" list.

Any idea how to solve this? I'm using Condor 6.8.3.

Thanks for your help,
Nicolas

----------------------------------------------------
CNRS - UPR 9080 : Laboratoire de Biochimie Theorique
Institut de Biologie Physico-Chimique
13 rue Pierre et Marie Curie
75005 PARIS - FRANCE
Tel : +33 158 41 51 70
Fax : +33 158 41 50 26
----------------------------------------------------
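The symptom in the message above (a condor_master process but no startd) can be checked mechanically by comparing DAEMON_LIST with the process list. The following is a minimal Python sketch under stated assumptions: the helper names expected_daemons and missing_daemons are hypothetical, each DAEMON_LIST entry is assumed to run as a process named condor_ plus the lower-cased entry, and the input is assumed to be in the five-column "ps ax" layout quoted above.

```python
def expected_daemons(daemon_list):
    """Map a DAEMON_LIST value to the assumed process names.

    Assumption: an entry such as STARTD runs as 'condor_startd'.
    """
    return ["condor_" + d.strip().lower()
            for d in daemon_list.split(",") if d.strip()]


def missing_daemons(daemon_list, ps_output):
    """Return the expected daemons that do not appear in `ps ax` output."""
    running = set()
    for line in ps_output.splitlines():
        # Columns of `ps ax`: PID TTY STAT TIME COMMAND
        fields = line.split(None, 4)
        if len(fields) < 5:
            continue
        # Only the executable counts, not its arguments, so a line such
        # as "grep condor_startd" is not mistaken for a running daemon.
        executable = fields[4].split()[0]
        running.add(executable.rsplit("/", 1)[-1])
    return [d for d in expected_daemons(daemon_list) if d not in running]


# Sample data taken from the message above.
ps_output = """\
24677 ?        Ss     0:00 /nfs/opt/condor_x86_64/sbin/condor_master
24843 pts/0    S+     0:00 grep cond
"""
print(missing_daemons("MASTER, STARTD", ps_output))  # ['condor_startd']
```

When a daemon from DAEMON_LIST is missing like this, the MasterLog and StartLog files on that machine are the usual place to look for the reason it did not start or exited.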
--
Ana Silva Gallego
Systems
Centro Informático Científico de Andalucía (CICA)
Avda. Reina Mercedes s/n - 41012 - Sevilla (Spain)
Tel.: +34 955 056 600 / +34 955 056 632 / FAX: +34 955 056 650
Consejería de Innovación, Ciencia y Empresa
Junta de Andalucía
---------------------------------------------------
This message is digitally signed. To verify the signature, your mail client must have the root certificate of the CICA CA installed. You can download it from: http://pki.cica.es/cacert/
---------------------------------------------------