Ian Stokes-Rees wrote:
On 4/14/10 11:48 AM, Dan Bradley wrote:
One likely source of trouble in this policy is that RANK is inherently a preemptive mechanism. RANK is only relevant when deciding whether to preempt an existing job with a new better-ranked one. This can lead to rapid cycles of preemption in some cases.
I can reform the question as follows: What needs to be done to make sure that in each matching cycle idle job slots are considered first?
If I read your ticket correctly, the policy should already ensure that jobs are sent to idle slots when they are available:
NEGOTIATOR_PRE_JOB_RANK = (RemoteOwner =?= UNDEFINED) * SlotID
So if it can be confirmed that there is an idle slot that matches the job but the negotiator is matching the job to some other slot that is claimed, then we'll need to examine that closely and understand why the pre job rank expression is not having the expected effect. If the negotiator had a stale view of the machine state (so it doesn't realize that a machine is claimed), that could lead to this sort of behavior. However, I see nothing in the configuration that would lead to that. Perhaps we'll need to look at the negotiator log to see what is going on.
What we think we see now is that matching is done against an arbitrary machine, whether it is idle or not, and the RANK expression means that no consideration is given to a running job, even when other idle nodes are available.
The machine RANK expression specifies which job the machine prefers. If the job matches to multiple machines, including some idle and some claimed, and NEGOTIATOR_PRE_JOB_RANK prefers to run the job on the idle machines, then the machine RANK expression should not matter.
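To make that reasoning concrete, here is a rough Python sketch of how the expression (RemoteOwner =?= UNDEFINED) * SlotID would order candidate slots. It is only a toy model under stated assumptions, not the negotiator's actual code: slots are plain dicts, a missing "RemoteOwner" key stands in for an UNDEFINED attribute, and the slot names, owners, and helper functions (pre_job_rank, pick_slot) are made up for illustration.

```python
# Toy model of NEGOTIATOR_PRE_JOB_RANK = (RemoteOwner =?= UNDEFINED) * SlotID.
# Assumption: a slot dict without a "RemoteOwner" key is unclaimed (idle).

def pre_job_rank(slot):
    """Return SlotID for an idle slot and 0 for a claimed one."""
    is_idle = "RemoteOwner" not in slot  # models RemoteOwner =?= UNDEFINED
    return int(is_idle) * slot["SlotID"]

def pick_slot(candidates):
    """Pick the matching slot with the highest pre-job rank."""
    return max(candidates, key=pre_job_rank)

# Hypothetical pool: two claimed slots and one idle slot.
slots = [
    {"Name": "slot1@node1", "SlotID": 1, "RemoteOwner": "alice@example.org"},
    {"Name": "slot2@node1", "SlotID": 2},
    {"Name": "slot3@node2", "SlotID": 3, "RemoteOwner": "bob@example.org"},
]

best = pick_slot(slots)
print(best["Name"], pre_job_rank(best))
```

In this model every claimed slot ranks 0 and every idle slot ranks at least 1, so an idle slot wins whenever one matches, regardless of what the machine RANK expression says about the jobs already running.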
--Dan