Hi,

I just started to use Condor for our cluster. The cluster has nodes with 8 cores and 4 GPUs each. I just managed to set up a policy such that 4 slots are used for CPU+GPU computing and 4 for CPU only, basically with a START statement like

    START = (((SlotId < 5) && (TARGET.NeedGpu =?= TRUE)) || ((SlotId > 4) && \
             (TARGET.NeedGpu =?= FALSE)))

But I would also like to let a job consume more than one CPU (and/or GPU) at once. For this I thought I would set up partitionable slots like this

    SLOT_TYPE_1 = cpus=4
    SLOT_TYPE_2 = cpus=4
    NUM_SLOT_TYPE_1 = 1
    NUM_SLOT_TYPE_2 = 1
    SLOT_TYPE_1_PARTITIONABLE = TRUE
    SLOT_TYPE_2_PARTITIONABLE = TRUE

But I cannot get this working: I always get 8 slots on each machine, no matter what I do, and each slot claims to have 1 CPU (according to condor_status -l). I also cannot find the attributes PartitionableSlot or DynamicSlot in the output of condor_status -l.

I am using Debian lenny with kernel

    Linux lattice01 2.6.26-2-amd64 #1 SMP Thu Sep 16 15:56:38 UTC 2010 x86_64 GNU/Linux

with Condor

    $CondorVersion: 7.4.4 Oct 13 2010 BuildID: 279383 $
    $CondorPlatform: X86_64-LINUX_DEBIAN50 $

which I installed using the .deb package from the Condor webpage.

Can someone tell me how to properly define the slots? Do I have to publish the types etc. somewhere, or should the lines quoted above be enough?

Thanks

Carsten

For reference, here is my full condor_config.local file; maybe the slots just don't go together with one of the other options:

    ## What machine is your central manager?
    CONDOR_HOST = $(FULL_HOSTNAME)

    ## Pool's short description
    COLLECTOR_NAME = Personal Condor at $(FULL_HOSTNAME)

    NUM_CPUS = 8

    SLOT_TYPE_1 = cpus=4
    SLOT_TYPE_2 = cpus=4
    NUM_SLOT_TYPE_1 = 1
    NUM_SLOT_TYPE_2 = 1
    SLOT_TYPE_1_PARTITIONABLE = TRUE
    SLOT_TYPE_2_PARTITIONABLE = TRUE

    START = (((SlotId =?= 1) && (TARGET.NeedGpu =?= TRUE)) || ((SlotId =?= 2) && (TARGET.NeedGpu =?= FALSE)))

    SUSPEND = False
    CONTINUE = True
    PREEMPT = False
    KILL = False
    WANT_SUSPEND = False
    WANT_VACATE = False

    #RANK = Scheduler =?= $(DedicatedScheduler)

    ## This macro determines what daemons the condor_master will start
    ## and keep its watchful eyes on.
    ## The list is a comma or space separated list of subsystem names
    DAEMON_LIST = COLLECTOR, MASTER, NEGOTIATOR, SCHEDD, STARTD

    ## Sets how often the condor_negotiator starts a negotiation cycle.
    ## It is defined in seconds and defaults to 60 (1 minute).
    NEGOTIATOR_INTERVAL = 20

    ## Disable UID_DOMAIN check when submit a job
    TRUST_UID_DOMAIN = TRUE

    STARTD_ATTRS = $(STARTD_ATTRS)

--
Carsten Urbach
e-mail: curbach@xxxxxx
        urbach@xxxxxxxxxxxxxxxxx
Fon   : +49 (0) 228 73 2379
skype : carsten.urbach
URL   : http://www.carsten-urbach.eu