Dear Condor community,

We have recently acquired some worker nodes that will act as startds with many threads per host. Because of this large increase in threads per host, we’d like to create 2 dynamic partitions and, if possible, map the threads on CPU0 to partition 1 and the threads on CPU1 to partition 2. This also has the benefit that the x16 GMI links between the CPUs aren’t bottlenecked.

I currently have the following startd config:

    NUM_SLOTS = 2
    NUM_SLOTS_TYPE_1 = 1
    NUM_SLOTS_TYPE_2 = 1
    SLOT1_EXECUTE = /pool_1
    SLOT2_EXECUTE = /pool_2
    SLOT_TYPE_1 = cpus=193,mem=50%,auto
    SLOT_TYPE_1_PARTITIONABLE = TRUE
    SLOT_TYPE_2 = cpus=193,mem=50%,auto
    SLOT_TYPE_2_PARTITIONABLE = TRUE

With this config, I believe the host resources will be split between the two partitionable slots. However, it would be preferable to have some CPU affinity, so that SLOT1 uses threads on CPU0 and SLOT2 uses threads on CPU1. Looking through the
documentation I can see the configuration variable `SLOT<N>_CPU_AFFINITY`, which looks to be exactly what I want. However, I notice the line

“This configuration variable is replaced by ASSIGN_CPU_AFFINITY. Do not enable this configuration variable unless using glidein or another unusual setup.”

which makes me think this is not the optimal variable for this use case, and `ASSIGN_CPU_AFFINITY` appears to be a Boolean with no option to define thread mappings to partitions.

I imagine this is possible using an external script that does the mapping with a command such as `lscpu -p=NODE,CPU`, but I’m struggling to put all the pieces together. If anyone has any pointers or advice, it would be gratefully received.

Many thanks in advance,

Thomas Birkett
Senior Systems Administrator
Scientific Computing Department
Science and Technology Facilities Council (STFC)
Rutherford Appleton Laboratory, Chilton, Didcot
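
To make the external-script idea concrete, here is a rough sketch of how the output of `lscpu -p=NODE,CPU` might be turned into per-slot affinity lists. It is only an illustration under several assumptions: the function names `parse_lscpu` and `affinity_config` and the sample text are hypothetical; each NUMA node is assumed to correspond to one physical CPU (on hosts with several NUMA nodes per socket, the `SOCKET` column may be the better key); and `ENFORCE_CPU_AFFINITY = True` is included because, as I read the documentation, the per-slot list only takes effect with it. Whether these settings are honoured by dynamic slots carved from a partitionable slot is not something this sketch verifies.

```python
from collections import defaultdict

# Hypothetical sample of `lscpu -p=NODE,CPU` output for a tiny two-node host.
SAMPLE = """\
# The following is the parsable format of lscpu output.
# NODE,CPU
0,0
0,1
1,2
1,3
"""


def parse_lscpu(text):
    """Group logical CPU ids by NUMA node from `lscpu -p=NODE,CPU` text."""
    nodes = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the '#' header comments lscpu emits.
        if not line or line.startswith("#"):
            continue
        node, cpu = line.split(",")
        if node == "":
            # No NUMA information reported for this CPU; skip it.
            continue
        nodes[int(node)].append(int(cpu))
    # Return nodes in ascending order, each with a sorted CPU list.
    return {node: sorted(cpus) for node, cpus in sorted(nodes.items())}


def affinity_config(nodes):
    """Render one SLOT<N>_CPU_AFFINITY line per node (slots count from 1)."""
    lines = ["ENFORCE_CPU_AFFINITY = True"]
    for slot, cpus in enumerate(nodes.values(), start=1):
        cpu_list = ",".join(str(c) for c in cpus)
        lines.append(f"SLOT{slot}_CPU_AFFINITY = {cpu_list}")
    return "\n".join(lines)


# Demonstration on the sample text; on a real host the input would come from
# subprocess.run(["lscpu", "-p=NODE,CPU"], capture_output=True, text=True).stdout
print(affinity_config(parse_lscpu(SAMPLE)))
```

The generated lines could then be dropped into a local config file on each worker node, so that the mapping follows the actual topology of the host rather than being hard-coded.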