Dear All,
We would like to configure a multi-core machine with several GPUs under
condor. The machine has 8 cores and 4 GPUs. We want slots 1-4 dedicated
to CPU jobs and slots 5-8 dedicated to GPU jobs. Further, when a user
specifies +RequiresWholeCPUs, the job should occupy all CPU slots, i.e.,
slots 1-4. When a user specifies +RequiresWholeGPUs, the job should
occupy all GPU slots, i.e., slots 5-8. In other words, a job may take
all CPU slots or all GPU slots, but we don't want a single job to occupy
the whole machine.
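To make the intended policy concrete, here is a minimal Python sketch of
it. This is only a model of the behaviour we want, not condor
configuration: the function name slots_for_job and the "cpu"/"gpu" job
kind are made up for illustration, and the rule that a whole-group job
waits until every slot in its group is free is our assumption.

```python
CPU_SLOTS = (1, 2, 3, 4)  # slots dedicated to CPU jobs
GPU_SLOTS = (5, 6, 7, 8)  # slots dedicated to GPU jobs


def slots_for_job(kind, whole=False, free=frozenset(range(1, 9))):
    """Return the slots a job would occupy, or () if it cannot start.

    kind  -- "cpu" or "gpu" (hypothetical label for the job type)
    whole -- True models +RequiresWholeCPUs / +RequiresWholeGPUs
    free  -- set of slot IDs that are currently unclaimed
    """
    if kind == "cpu":
        group = CPU_SLOTS
    elif kind == "gpu":
        group = GPU_SLOTS
    else:
        raise ValueError("kind must be 'cpu' or 'gpu'")

    free_in_group = [s for s in group if s in free]
    if whole:
        # A whole-group job takes every slot of its own group only,
        # so it can never occupy the whole machine.
        return group if len(free_in_group) == len(group) else ()
    # An ordinary job takes the first free slot in its group.
    return (free_in_group[0],) if free_in_group else ()
```

For example, slots_for_job("gpu", whole=True) yields slots 5-8 when all
of them are free, and never touches slots 1-4.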
Here is the local condor configuration we have tried (we installed
condor-7.9.2), but it does not work:
==========================================================================
SLOT5_HAS_GPU = TRUE
SLOT5_GPU_DEV = 0
SLOT6_HAS_GPU = TRUE
SLOT6_GPU_DEV = 1
SLOT7_HAS_GPU = TRUE
SLOT7_GPU_DEV = 2
SLOT8_HAS_GPU = TRUE
SLOT8_GPU_DEV = 3
START = ($(START)) && \
((SlotID == 1 || TARGET.RequiresWholeCPUs =!= True) && \
(SlotID == 1 || Slot1_RequiresWholeCPUs =!= True))
START = ($(START)) || \
((SlotID == 5 || TARGET.RequiresWholeGPUs =!= True) && \
(SlotID == 5 || Slot5_RequiresWholeGPUs =!= True))
STARTD_JOB_EXPRS = $(STARTD_JOB_EXPRS) RequiresWholeCPUs RequiresWholeGPUs
SLOT1_STARTD_EXPRS = RequiresWholeCPUs
SLOT2_STARTD_EXPRS = RequiresWholeCPUs
SLOT3_STARTD_EXPRS = RequiresWholeCPUs
SLOT4_STARTD_EXPRS = RequiresWholeCPUs
SLOT5_STARTD_EXPRS = RequiresWholeGPUs
SLOT6_STARTD_EXPRS = RequiresWholeGPUs
SLOT7_STARTD_EXPRS = RequiresWholeGPUs
SLOT8_STARTD_EXPRS = RequiresWholeGPUs
==========================================================================
May I ask what is the correct way to configure condor to achieve this?
Thanks in advance for your reply.
Sincerely,
T.H.Hsieh