
Re: [Condor-users] Avoiding preemption ping-pong



On 04/25/2012 05:04 PM, Erik Erlandson wrote:
On Wed, 2012-04-25 at 11:37 +0200, Martin Billinger wrote:
Hi there,

I have recently observed the jobs of two users keep preempting each
other every few hours.

We are running a small condor pool with some dedicated machines.
Two users have submitted a large number of long running jobs. Both users
have roughly the same priority.

Now, what happens is that first one of the users gets assigned all
available resources. Once EUPs differ by 20%, all of that user's jobs
are preempted and the other user gets all resources. This repeats every
few hours and causes many cycles to be lost.

You can change that '20%' threshold by editing this:
UWCS_PREEMPTION_REQUIREMENTS = ( $(StateTimer) > (1 * $(HOUR)) && \
    RemoteUserPrio > TARGET.SubmitterUserPrio * 1.2 ) || (MY.NiceUser == True)

So, you could increase '1.2' to something like 1.5, or 2.0, etc.
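
For instance, with the factor raised to 2.0 the setting would read roughly as below. This is only a sketch derived from the expression above; it assumes that PREEMPTION_REQUIREMENTS in your configuration still points at $(UWCS_PREEMPTION_REQUIREMENTS) and that the StateTimer and HOUR macros are defined as in the stock config file.

```
# Sketch: same expression as above, with only the priority factor changed.
# Preempt only if the running job has held the slot for more than an hour
# AND its user's priority value exceeds 2.0 times the submitter's
# (previously 1.2), or if the running job is a "nice user" job.
UWCS_PREEMPTION_REQUIREMENTS = ( $(StateTimer) > (1 * $(HOUR)) && \
    RemoteUserPrio > TARGET.SubmitterUserPrio * 2.0 ) || (MY.NiceUser == True)
```

The negotiator most likely needs a reconfig (for example via condor_reconfig) before the new value takes effect.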

Thanks for your answer, Erik!

I increased the threshold to 2.0 and observed the pool's behavior for several days. The time between the users preempting each other increased, so fewer cycles are lost, which is an improvement.

However, the core of my problem remains: once the preemption requirements are met, the user with the higher priority monopolizes the entire pool.

Is it possible to limit preemption such that each user gets his fair share of the pool?

Thanks in advance,
Martin