
Re: [HTCondor-users] How is autoclustering supposed to work and how to influence it?



Yo Brian,

On 11 Dec 2024, at 19:16, Bockelman, Brian <BBockelman@xxxxxxxxxxxxx> wrote:

Hi Jeff,

I think the "good news" is that the particular symptom you describe cannot come from any autoclustering issues.  So, we can definitely go down the route of debugging your autoclustering setup (but it's not going to fix the described issue).  FWIW -- other than the most extreme scales (>500k), I can't think of any reason why you'd want to manually tweak autoclusters.
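(For anyone who does want to look at, or influence, the autoclusters: a minimal sketch, assuming schedd-side access; MaxWallTime is just the attribute from your setup used as an example:

condor_q -autocluster                        # summary of the queue grouped by autocluster
ADD_SIGNIFICANT_ATTRIBUTES = MaxWallTime     # schedd config knob: add an attribute to the autocluster signature
)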

The reason I make the above statement is that, after the negotiator provides the match to the schedd process, both the schedd and startd will re-evaluate the requirements expressions in the job.  Hence, if the negotiator "gets it wrong", that's not sufficient to get the job started on the node.

I'd double-check the contents of the START expression in the machine ad and the Requirements for the job to look for typos or other logical mistakes.

(Here, I'm assuming that you are implementing "should not be scheduled" via requirements; if you're using some other scheduling mechanism, let us know!)
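For instance, something along these lines (a sketch; the machine constraint and the job ID are placeholders):

condor_status -af:r Name Start -constraint 'regexp("wn-lot", Machine)'   # raw START expression per slot
condor_q -af:r Requirements 1234.0                                       # raw Requirements of the job
condor_q -better-analyze 1234.0                                          # schedd's view of why it does/doesn't match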

The distribution to different nodes is done by the negotiator - there is no separate fence via requirements.

One negotiator says:

02-central-manager.config:NEGOTIATOR_DEPTH_FIRST = false

20-negotiator-constraint.config:NEGOTIATOR_SLOT_CONSTRAINT = ! ( regexp("wn-sate-079", Machine) || regexp("wn-lot", Machine) || regexp("wn-pijl", Machine) )

20-negotiator-constraint.config:NEGOTIATOR_JOB_CONSTRAINT = MaxWallTime <= 24*3600

The other says:

03-negotiator.config:DAEMON_LIST = MASTER NEGOTIATOR
03-negotiator.config:COLLECTOR_HOST_FOR_NEGOTIATOR = stbc-019.nikhef.nl
03-negotiator.config:NEGOTIATOR_DEPTH_FIRST = false
03-negotiator.config:NEGOTIATOR_INTERVAL = 179
03-negotiator.config:NEGOTIATOR_MIN_INTERVAL = 67
20-negotiator-constraint.config:NEGOTIATOR_SLOT_CONSTRAINT = regexp("wn-lot", Machine) || regexp("wn-pijl", Machine)
20-negotiator-constraint.config:NEGOTIATOR_JOB_CONSTRAINT = (MaxWallTime > 24*3600) || (time()-QDate > 90)


(There's a third negotiator for wn-sate-079, but no jobs seem to get mis-scheduled there.)
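(To double-check what each negotiator is actually enforcing, something like this should work, assuming condor_config_val can reach the remote daemons; the host names are placeholders:

condor_config_val -name <first-negotiator-host> -negotiator NEGOTIATOR_SLOT_CONSTRAINT NEGOTIATOR_JOB_CONSTRAINT
condor_config_val -name <second-negotiator-host> -negotiator NEGOTIATOR_SLOT_CONSTRAINT NEGOTIATOR_JOB_CONSTRAINT
)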

So the first negotiator should only accept jobs of less than 24 hours, and those can be scheduled on any machine that is not wn-sate-079 and does not belong to the "lot" or "pijl" classes.

The second negotiator will accept longer jobs unconditionally, and shorter jobs as long as they've been queued for more than 90 seconds (to give the other negotiator a chance to schedule them onto the "short" nodes).  Those jobs can go to any "lot" or "pijl" nodes, which were excluded from the first negotiator.
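A quick way to see where the long jobs actually ended up (a sketch; MaxWallTime is the attribute from the constraints above):

condor_history -af ClusterId MaxWallTime LastRemoteHost -constraint 'MaxWallTime > 24*3600'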

I have test jobs that are identical except for one parameter; from that parameter I know which of them will take longer than 24 hours, so I feed those a different wall time via '-append' on the condor_submit command.  That is the only difference between the jobs.  If the long and short jobs are submitted within seconds of each other, the long jobs wind up on the short node classes; adding a different memory request to short vs. long jobs makes things work as intended.  Hence my suspecting the autoclustering.  Regarding your statement:

The reason I make the above statement is that, after the negotiator provides the match to the schedd process, both the schedd and startd will re-evaluate the requirements expressions in the job.  Hence, if the negotiator "gets it wrong", that's not sufficient to get the job started on the node.

So there are no requirements on the nodes or in the jobs that would prevent a "got it wrong at the negotiator" job from running.  If the negotiator gets it wrong, the job goes to the wrong place.
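For reference, the long/short test submissions and the autocluster check look roughly like this (a sketch: the submit file name is a placeholder, I'm assuming MaxWallTime is injected as a custom job attribute via a leading '+', and AutoClusterId is the schedd-assigned grouping attribute):

condor_submit -append '+MaxWallTime = 14400' test.sub      # short variant (4 hours)
condor_submit -append '+MaxWallTime = 172800' test.sub     # long variant (48 hours)
condor_q -af ClusterId AutoClusterId MaxWallTime           # do long and short share an autocluster?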

HTH,

JT