Hi again,

somewhat related to my earlier question about defrag: we currently face the problem that we have a large number of jobs, ranging from single-core to many cores, with runtimes from a few minutes to more than a week. Over time the pslots become so fragmented that even medium-sized job requests are starved for hours to days.

As most of our jobs have problems checkpointing (too-large memory footprints usually take too long to write to disk, ...), I would like to avoid full-scale preemption for now.

Looking through the FAQ and recipes, it seems I could (ab)use dynamic group quotas and just create a group per user with a guaranteed quota fraction of the pool, including overflow, so that only a small subset of jobs is evicted when a user wants to fill their quota. However, given that we have quite a range of possible slot weights, I'm not sure how Condor would attribute quotas to user groups [1].

Is this direction a possibility, or are there better methods to get users a foot in the door quickly?

Cheers

Carsten

[1] condor_status -af SlotWeight | sort | uniq -c | sort -g
      1 48
      1 64
      1 7
      3 12
      5 128
      5 15
      5 4
      6 1
     19 8
    238 32
    380 0
   2596 16

--
Dr. Carsten Aulbert, Max Planck Institute for Gravitational Physics,
Callinstraße 38, 30167 Hannover, Germany
Phone: +49 511 762 17185
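PS: for concreteness, the direction I have in mind would look roughly like the sketch below in the negotiator configuration. The group names and the 2% fraction are invented placeholders, not a tested setup:

```
# Hedged sketch of per-user dynamic group quotas (placeholder names/fractions).
GROUP_NAMES = group_carsten, group_alice

# Guaranteed fraction of the pool per user group:
GROUP_QUOTA_DYNAMIC_group_carsten = 0.02
GROUP_QUOTA_DYNAMIC_group_alice = 0.02

# Allow groups to overflow beyond their quota when the pool has surplus:
GROUP_ACCEPT_SURPLUS = True

# Optionally let group jobs also negotiate against the unallocated remainder:
GROUP_AUTOREGROUP = True
```

The open question remains how the quota fraction is accounted against our very heterogeneous SlotWeight values.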
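PPS: assuming (and this is exactly the assumption I'd like confirmed) that group quotas are accounted in units of SlotWeight, the back-of-the-envelope arithmetic from the histogram in [1] would be:

```python
# Hedged sketch: translate a fractional group quota into SlotWeight units,
# ASSUMING quotas are accounted against total pool SlotWeight (unverified).
# Pairs are (count, SlotWeight) from the condor_status output in footnote [1].
histogram = [(1, 48), (1, 64), (1, 7), (3, 12), (5, 128), (5, 15),
             (5, 4), (6, 1), (19, 8), (238, 32), (380, 0), (2596, 16)]

# Total pool weight is the count-weighted sum over the histogram.
total_weight = sum(count * weight for count, weight in histogram)
print(total_weight)  # 50200

# A hypothetical 2% per-user dynamic quota in SlotWeight units:
quota_fraction = 0.02
print(quota_fraction * total_weight)  # 1004.0
```

So a 2% quota would correspond to roughly 1000 units of SlotWeight here, but whether the negotiator actually divides the pool that way with mixed weights is what I'm unsure about.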