Re: [HTCondor-users] manipulate ranking/priority of very-short-jobs-users
- Date: Thu, 17 Aug 2023 15:29:49 +0000
- From: "Luehring, Frederick C" <luehring@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] manipulate ranking/priority of very-short-jobs-users
Hey Y'all,
Is there a built-in method for condor to apply fair-share scheduling?
https://en.wikipedia.org/wiki/Fair-share_scheduling
The ATLAS Panda implementation does something along the lines of a fair-share
algorithm using numbers of jobs submitted instead of CPU. When a user who has
not submitted a job in over a week starts submitting new jobs, his/her jobs get
the highest user priority of 10000. As the user submits additional jobs, they are
assigned lower and lower priority, and I have seen users who submit gazillions of
jobs get down to a negative priority below -5000. Eventually Panda will move the
user's jobs into a throttled state which is a sort of circuit breaker that
temporarily prevents the user's new jobs from starting. The user's submission
priority recovers because the incremental priority reduction caused by
previously submitted jobs is removed 7 days after the job submission. This sort
of approach seems like what is needed. The system could increase the priority of
a limited number of short jobs to allow users who are not abusing the queuing
system to quickly run a limited number of short test jobs when developing their code.
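
To make the scheme described above concrete, here is a rough Python sketch of a
sliding-window, job-count-based priority. It is not Panda's or HTCondor's actual
code. The starting priority of 10000 and the 7-day window come from the
description above; the class name, the per-job penalty of 1, and the throttle
threshold of -6000 are assumptions chosen purely for illustration.

```python
from collections import deque
from datetime import datetime, timedelta

MAX_PRIORITY = 10000          # priority of a user with no recent jobs (from the text)
WINDOW = timedelta(days=7)    # a job's penalty is removed after 7 days (from the text)
PENALTY_PER_JOB = 1           # ASSUMPTION: the real decrement is not stated
THROTTLE_BELOW = -6000        # ASSUMPTION: the real throttle point is not stated


class UserFairShare:
    """Hypothetical per-user tracker; submissions must arrive in time order."""

    def __init__(self):
        self._submissions = deque()  # submission timestamps, oldest first

    def _expire(self, now):
        # Drop submissions that are at least WINDOW old; their penalty is lifted.
        while self._submissions and now - self._submissions[0] >= WINDOW:
            self._submissions.popleft()

    def record_submission(self, now):
        self._expire(now)
        self._submissions.append(now)

    def priority(self, now):
        # Each job still inside the window lowers the priority by a fixed amount.
        self._expire(now)
        return MAX_PRIORITY - PENALTY_PER_JOB * len(self._submissions)

    def is_throttled(self, now):
        # The "circuit breaker": new jobs are held while priority is too low.
        return self.priority(now) < THROTTLE_BELOW
```

A user who submits three jobs drops from 10000 to 9997 under these assumed
numbers, and is back at 10000 once those submissions are seven days old.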
Fred
On 8/17/23 2:42 AM, Jeff Templon wrote:
> Thanks! I didn't know about this stuff.
>
>> On 16 Aug 2023, at 17:28, Todd L Miller via HTCondor-users <htcondor-users@xxxxxxxxxxx> wrote:
>>
>>> Another issue to take into account is that a high start rate can put pressure on other systems, like shared file systems.
>>
>> We already have a few throttles for high overall start rates.
>
> Usually the problem is not so much high overall start rates, here it's usually one user who generates 90% of the high start rate. I really don't like making everyone suffer because of one clumsy user. OTOH the other users might let this user know how clumsy he/she is - peer communication tends to be effective.
>
> JT
--
Frederick Luehring Indiana U luehring@xxxxxx +1 812 855 1025 IU
http://cern.ch/Fred.Luehring Fred.Luehring@xxxxxxx +41 22 767 11 66 CERN