Hello,
Setup description:
Condor version 8.8.9
1.5k physical machines
18.9k cores
~200TB RAM
~50k jobs in queues
4 scheduler machines (16 cores, 128GB RAM each)
The Negotiator / Collector runs on a 4-core, 16GB RAM machine.
Condor configuration description:
Accounting group quotas
Pslot preemption enabled
All partitionable slots are running within the Docker universe
Issue description:
We are experiencing a slow job matching rate of ~15 jobs per second while the cluster is ~50% idle.
Can anyone share their tips on how to improve this rate?
Thanks,
Zohar