Hi,
With our multiple-negotiator setup, we're seeing odd cases where jobs that should not be scheduled to particular nodes are scheduled there anyway. It looks to be the autoclustering: bunches of jobs are being submitted whose only differences are the ClusterId and the values of a couple of Nikhef-added custom attributes. Although AutoCluster claims to be using these attributes, either it is not REALLY using them, or the algorithm looks for "close enough" instead of "identical" and so batches together some jobs that should not be scheduled to the same node set.
If I give one set of jobs a different request_memory from the other, they are scheduled correctly to the right nodes, RequestMemory being one of the "original" AutoCluster attributes.
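To make the expectation concrete, here is a minimal sketch of the "identical" grouping I would expect. This is not HTCondor code, only an illustration of exact-match grouping; the function name autocluster and the attribute HypotheticalNodeClass are made up for the example and do not come from our configuration.

```python
from collections import defaultdict


def autocluster(jobs, significant_attrs):
    """Group jobs whose significant attributes are ALL identical.

    Illustration only: this is an assumed model of exact-match grouping,
    not the algorithm HTCondor actually implements.
    """
    clusters = defaultdict(list)
    for job in jobs:
        # The grouping key is the exact tuple of significant attribute values.
        key = tuple(job.get(attr) for attr in significant_attrs)
        clusters[key].append(job["ClusterId"])
    return dict(clusters)


# Two jobs that differ only in ClusterId and one (hypothetical) custom attribute.
jobs = [
    {"ClusterId": 101, "RequestMemory": 2048, "HypotheticalNodeClass": "a"},
    {"ClusterId": 102, "RequestMemory": 2048, "HypotheticalNodeClass": "b"},
]

# Only an "original" attribute is significant: both jobs share one group.
only_original = autocluster(jobs, ["RequestMemory"])

# The custom attribute is significant too: the jobs must be kept apart.
with_custom = autocluster(jobs, ["RequestMemory", "HypotheticalNodeClass"])
```

With the custom attribute in the significant list, the two jobs land in separate groups, which is the behaviour I am not seeing in practice.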
There are variables SIGNIFICANT_ATTRIBUTES, ADD_SIGNIFICANT_ATTRIBUTES and REMOVE_SIGNIFICANT_ATTRIBUTES that appear to do something but never achieve the desired effect, namely to not batch jobs together if they have different values for these custom attributes. There are also weird things happening: e.g. if I redefine one of those variables in the config and then do a condor_reconfig, remnants of the previous definition are still hanging around, and some values never get removed, even if they are listed in the REMOVE variable.
How is this supposed to work?
JT