Tim, Dimitri, thanks a lot for the links, I think I found the answer - partitionable slots + condor_defrag. But I still have some follow-up questions.
Tim, I'll try to give you a specific example to describe the use case:
For example in my cluster I have a single partitionable slot with 10 CPUs.
Then I have 2 types of jobs - Job A requires 10 CPUs to run while Job B requires 1 CPU to run.
I submit several thousand of Job B and one Job A. There is a potential starvation problem for Job A, because all 10 CPUs may never be free at the same time to run it, even if Job A has higher priority.
HTCondor offers the condor_defrag daemon that periodically drains machines. What concerns me is the periodic nature of the draining and the unclear algorithm for choosing which machines to drain. What if I don't have any big jobs currently waiting for a multi-core slot, why would I need draining at all? Or what if the only machine that is suitable for a particular job never gets picked for draining?
So the algorithm behind how defrag works is not transparent, and it seems that the daemon won't solve the starvation problem in some cases...
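For reference, my current understanding of the relevant condor_defrag knobs is roughly this (the values are only illustrative guesses for my 10-CPU example, not a tested setup):

    DAEMON_LIST = $(DAEMON_LIST) DEFRAG
    # How often the defrag daemon wakes up and considers draining (seconds)
    DEFRAG_INTERVAL = 600
    # Limit how aggressively machines are taken out of service
    DEFRAG_DRAINING_MACHINES_PER_HOUR = 1.0
    DEFRAG_MAX_CONCURRENT_DRAINING = 1
    # Stop draining once this many "whole" machines are available
    DEFRAG_MAX_WHOLE_MACHINES = 1
    # What counts as a whole machine, i.e. a slot big enough for Job A
    DEFRAG_WHOLE_MACHINE_EXPR = Cpus == TotalCpus
    # Which machines are candidates for draining
    DEFRAG_REQUIREMENTS = PartitionableSlot

As far as I can tell, the choice among candidate machines is driven by a rank expression (DEFRAG_RANK), but it is not obvious to me how to tie that to the actual demand from queued multi-core jobs, which is exactly my worry above.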
Please advise,
Thank you,
Dmitry