
Re: [HTCondor-users] Preemption



Hi!

At present it is working reasonably well with the latest stable version, 8.4.8.
Right now, jobs from X-users are never preempted, and jobs from  
non-X-users are preempted only when some activity is detected on that  
machine (ssh connection, mouse, keyboard, etc.). That more or less  
achieves the goal since X-users connect via ssh to this machine in  
order to submit their jobs, so at that moment ALL non-X-users jobs are  
killed.
Ideally, to maximize the use of this machine, we would like to preempt
non-X-user jobs ONLY when cores are needed by X-users. For instance,
if an X-user submits a job with request_cpus=4 and there are no free
cores, it would be great to preempt one or several non-X-user jobs to
get the required 4 cores... But I don't think that's easy to manage
with partitionable slots...
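
To make the wish above concrete, here is a rough Python sketch of the selection logic we have in mind. It is only a toy model, not how the HTCondor negotiator works: the names RunningJob and pick_victims, the usernames "a" and "b", and the "evict the youngest job first" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass

X_USERS = {"a", "b"}  # hypothetical usernames standing in for group X


@dataclass
class RunningJob:
    owner: str
    cpus: int
    runtime_s: int  # seconds the job has been running


def pick_victims(running, free_cpus, requested_cpus, x_users=X_USERS):
    """Return the non-X-user jobs to evict so that requested_cpus fit,
    or None if evicting every non-X-user job would still not be enough."""
    needed = requested_cpus - free_cpus
    if needed <= 0:
        return []  # enough free cores already, nothing to preempt
    # Only jobs owned by non-X-users are candidates; evict the youngest
    # first so that the least work is lost (an assumed tie-break rule).
    candidates = sorted(
        (j for j in running if j.owner not in x_users),
        key=lambda j: j.runtime_s,
    )
    victims = []
    for job in candidates:
        if needed <= 0:
            break
        victims.append(job)
        needed -= job.cpus
    return victims if needed <= 0 else None
```

With one free core and a request for 4 cores, this model evicts only as many non-X-user jobs as are needed to free the remaining 3 cores, and it never touches a job owned by an X-user.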
I've seen an example in the documentation of how to preempt jobs after
a defined running time when another job with better priority is
submitted, using RemoteUserPrio and SubmitterUserPrio (or
SubmittorPrio, I've seen different names for that):
PREEMPTION_REQUIREMENTS = $(StateTimer) > (1 * $(HOUR)) && RemoteUserPrio > SubmitterUserPrio * 1.2
but I don't know if it's possible to do something like that using
usernames instead of priorities, so that we can preempt jobs depending on
who the Owner of the currently running job is and who the Submitter is...
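
In other words, the condition we are after has the same shape as the documented expression, with the priority comparison replaced by a username check. The Python function below is just a model of that logic, not an HTCondor expression; the name may_preempt, the usernames and the one-hour threshold are illustrative assumptions, and whether the negotiator can actually evaluate something equivalent is exactly the open question.

```python
HOUR = 3600
X_USERS = {"a", "b"}  # hypothetical usernames standing in for group X


def may_preempt(running_owner, submitter, running_time_s,
                min_runtime_s=HOUR, x_users=X_USERS):
    """Model of the desired rule: a running job may be preempted only if
    its owner is NOT an X-user, the submitter of the waiting job IS an
    X-user, and the running job has exceeded the minimum running time."""
    protected = running_owner in x_users
    privileged = submitter in x_users
    return (not protected) and privileged and running_time_s > min_runtime_s
```

For instance, may_preempt("zed", "a", 2 * HOUR) is True, while any call where the running job belongs to "a" or "b" is False regardless of who submits.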
Thanks a lot for your help!

Best regards,





Quoting Brian Bockelman <bbockelm@xxxxxxxxxxx>:

Hi Antonio,

Nothing immediately jumps out, other than the fact that partitionable slots were not preemptible until a fairly recent release (you might have to dig through the version notes to figure out when this happened).
You may additionally want to provide a negotiator-based ranking
expression to avoid preemption unless there are no other available
machines.
What errors / behaviors are you seeing when running with this  
configuration?  Is the negotiator handing the schedd preempting  
matches for the otherwise-claimed slot?
Brian

On Aug 8, 2016, at 11:15 AM, Antonio Dorta <adorta@xxxxxx> wrote:

Hi!

In our HTCondor pool there is a special multi-core machine that belongs to a research group X. They want to run MPI parallel programs on that machine, and that currently works properly with partitionable slots and the vanilla universe: all jobs submitted by users belonging to X (X-users) run immediately as long as there are available cores (using request_cpus), and they are never preempted.
Since this group is not using that machine every day, it would be
useful if other users could use it for MPI or
sequential programs, as long as they don't disturb X-users. So the
idea is to run "non-X-user" jobs while there are available cores,
but to be able to preempt those jobs as soon as X-users submit
jobs and there are no available cores... Is that possible in
an easy way?
I've been reading the documentation and running some tests, but the
results are not quite right yet... So far I've been trying
config files like the following one, applied only on the special machine:
# Partitionable slot
SLOT_TYPE_1               = cpu=100%
SLOT_TYPE_1_PARTITIONABLE = TRUE
NUM_SLOTS_TYPE_1          = 1

X_USERS         = (Owner == "a" || Owner == "b" || ...)
RANK            = $(X_USERS) * 1000
START           = $(X_USERS) || $(START)
PREEMPT         = !$(X_USERS) && $(PREEMPT)
WANT_SUSPEND    = False

PREEMPTION_RANK = - (X_USERS * 10000) - TotalJobRunTime

Thanks a lot!!





--
Antonio Dorta
Servicios Informáticos Específicos (SIE)
Investigación y Enseñanza
Instituto de Astrofísica de Canarias (IAC)
C/ Vía Láctea, s/n. 38205 - La Laguna, Santa Cruz de Tenerife
Despacho: 1124. Tfno: 922 60 5278. email: adorta@xxxxxx
Supercomputing at IAC: http://www.iac.es/sieinvens/SINFIN/Main/supercomputing.php
----------------------------------------------------------------
ADVERTENCIA: Sobre la privacidad y cumplimiento de la Ley de Proteccion de Datos, acceda a http://www.iac.es/disclaimer.php WARNING: For more information on privacy and fulfilment of the Law concerning the Protection of Data, consult http://www.iac.es/disclaimer.php?lang=en
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/
