[HTCondor-devel] new name for MaxJobRetirementTime


Date: Wed, 06 Jun 2018 11:48:45 -0500 (CDT)
From: Todd L Miller <tlmiller@xxxxxxxxxxx>
Subject: [HTCondor-devel] new name for MaxJobRetirementTime
We've come to the conclusion that every word in MaxJobRetirementTime is wrong, so we'd like to rename it. To be precise, we must observe that there's a job ad attribute and a startd configuration knob with the same name. The job ad attribute can only shorten the duration specified by the startd configuration knob. Therefore, since the job is not informed when it enters the vacating state, the only utility for the job ad attribute is match-making, meaning "don't bother to start me if you won't guarantee me enough time to make forward progress."
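
To make the two roles concrete, here is a minimal Python sketch of the relationship described above. It is only a model of the prose, not HTCondor code: the function names, the use of seconds, and the treatment of an unset job attribute as "no request" are all assumptions made for illustration.

```python
from typing import Optional


def effective_retirement(startd_limit: int, job_request: Optional[int]) -> int:
    """Seconds of retirement time the job would get.

    The job's value can only shorten the startd's value, never extend it.
    (Hypothetical helper; an unset job value is assumed to mean "no request".)
    """
    if job_request is None:
        return startd_limit
    return min(startd_limit, job_request)


def worth_matching(startd_limit: int, job_request: Optional[int]) -> bool:
    """Match-making reading of the job attribute: skip any machine that
    offers less time than the job says it needs to make forward progress.
    (Hypothetical helper, not an actual negotiator expression.)
    """
    return job_request is None or startd_limit >= job_request


# A job asking for 10 minutes on a machine offering an hour keeps its 10 minutes.
short_job = effective_retirement(3600, 600)
# A job asking for two hours cannot raise the machine's one-hour value...
long_job = effective_retirement(3600, 7200)
# ...so the only useful outcome is not to match that machine at all.
long_job_matches = worth_matching(3600, 7200)
```

Under this reading the job's value never changes what the machine is willing to offer; it only determines whether the match is worth making, which is the argument for a "request"-style name.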

I therefore propose that we call the job ad attribute "request_duration", since that's the only thing it can do.

The startd configuration knob should thus include the word duration. It can't be /just/ duration (unlike MEMORY), because (unlike MEMORY and NUM_CPUS), it's a conditional minimum, not a configured maximum. (The Miron directive for CHTC implies a knob called MAXIMUM_DURATION, but that's a different problem.) Maybe call it ADVERTISED_DURATION?

Things that ignore MJRT currently include condor_reassign_slot(s) (which may be renamed to condor_now) and one (or more?) of the shutdown styles. A job may also not run for the MJRT because it didn't need that much time or because its policy expressions indicated it shouldn't. We agree that a job should also not run for the MJRT if it abuses the system (e.g., uses more memory than requested), but that's not currently implementable if the machine is draining. Because of these conditions, I feel we shouldn't name the knob 'promised' or 'minimum'.

- ToddM