[HTCondor-users] How to specify slot specific information into same machine class ad name?



Hi all,

for a new project, we are currently defining four slots per execute node to ensure GPU/CPU/NVMe usage does not cross inter-CPU boundaries, e.g.

ENFORCE_CPU_AFFINITY = True

NUM_SLOTS_TYPE_1                 = 1
SLOT_TYPE_1                      @=slot
 cpus=12
 ram=20%
 swap=0%
 GPUS = 1 : DevicePciBusId == "0000:2A:00.0"
@slot
SLOT_TYPE_1_PARTITIONABLE        = True
SLOT1_CPU_AFFINITY = 0,2,4,6,8,10,24,26,28,30,32,34
[...]

and we establish a fixed mapping between each such slot and a locally available data set, which we simply "number" from 00..58.

We now want to create and start a DAG which contains a single job for each of these possible numbers and should use `requirements` to match the proper slot on the proper target machine.

However, I'm not sure how to achieve this easily.

First stop was

IDX1 = 10
IDX2 = 45
SLOT1_STARTD_ATTRS = IDX1
SLOT2_STARTD_ATTRS = IDX2

but of course this would inject IDX1 into slot1 and IDX2 into slot2, which would make the requirement line somewhat lengthy (testing whether any of IDX1/2/3/4 matches the wanted number).
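
To illustrate, assuming IDX3 and IDX4 were defined analogously for slot3 and slot4 (they are not shown above), the submit file for the job wanting data set 10 would presumably need something like

# sketch only: IDX3/IDX4 are assumed to exist analogously to IDX1/IDX2
requirements = (TARGET.IDX1 =?= 10) || (TARGET.IDX2 =?= 10) || (TARGET.IDX3 =?= 10) || (TARGET.IDX4 =?= 10)

where `=?=` is used so that the attributes missing from a given slot's ad simply compare as false instead of evaluating to undefined.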

I've not yet tried to use some logical argument for IDX as I'm not sure in which context it would be evaluated, if at all.

I don't think I could inject per slot values via STARTD_CRON, so how could I approach this?

Cheers

Carsten

--
Dr. Carsten Aulbert, Max Planck Institute for Gravitational Physics,
Callinstraße 38, 30167 Hannover, Germany, Phone +49 511 762 17185