
Re: [HTCondor-users] Tracking available memory on a compute host



On Wed, Jan 31, 2018 at 12:43 PM, John M Knoeller <johnkn@xxxxxxxxxxx> wrote:
> What you can't do is tell HTCondor that it can have all of the memory and also let some other scheduler use all the memory
> and expect HTCondor to dynamically adjust its allocations to account for non-HTCondor memory usage.

I thought that was the whole point?

"All machines in the HTCondor pool advertise their resource
properties, both static and dynamic, such as *available RAM memory*,
CPU type, CPU speed, virtual memory size, physical location, and
current load average, in a resource offer ad." --
http://research.cs.wisc.edu/htcondor/manual/current/1_2HTCondor_s_Power.html
(emphasis mine)

Of course I could restrict the memory allowed for Condor, and with the
right settings I could probably restrict the memory available for
console (owner) usage so that Condor jobs always have resources.  But
just as a CPU core can be used by the owner and then freed for Condor
usage later, I would think RAM should be as well.

On Wed, Jan 31, 2018 at 3:29 PM, John M Knoeller <johnkn@xxxxxxxxxxx> wrote:
> You could probably do something using a startd cron script to push a value
> into the slot ads that represents the amount of non-HTCondor memory usage,
> and then have the START expression refer to that value in order to prevent
> matches.   There will be some delay between when the startd sees the updated
> value for non-HTCondor usage and when the Negotiator and Schedd see that
> value, so you will still probably get some jobs starting that then just get
> OOM killed a little while later, but it won't *keep* happening.

I suppose that's the route I'll have to take if this becomes
problematic enough.  So far it hasn't, and I've been running a
Condor scheduler here for just over 13 years, so it might not be worth
the hassle.  I was just confused that it didn't already exist, and
figured I was overlooking something simple.
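
For the archives, the startd cron idea above might look roughly like the
probe below.  This is only a sketch under assumptions, not a tested setup:
the attribute names HostMemUsedMB and HostMemAvailableMB are made up for
illustration, it reads Linux's /proc/meminfo, and it reports memory used on
the whole host rather than strictly non-HTCondor usage (separating out
HTCondor's own jobs is left out).  A startd cron job publishes whatever
"Attribute = value" lines the script prints on stdout.

```python
#!/usr/bin/env python3
"""Hypothetical startd cron probe: print host memory as ClassAd attributes."""


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; the ones used below all are.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def classad_lines(meminfo_text):
    """Return 'Attr = value' lines (attribute names are illustrative)."""
    info = parse_meminfo(meminfo_text)
    total_kb = info["MemTotal"]
    # MemAvailable is missing on older kernels; fall back to a rough estimate.
    fallback_kb = (info.get("MemFree", 0) + info.get("Buffers", 0)
                   + info.get("Cached", 0))
    avail_kb = info.get("MemAvailable", fallback_kb)
    used_mb = (total_kb - avail_kb) // 1024
    avail_mb = avail_kb // 1024
    return [
        "HostMemUsedMB = %d" % used_mb,
        "HostMemAvailableMB = %d" % avail_mb,
    ]


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            print("\n".join(classad_lines(f.read())))
    except OSError:
        pass  # no /proc/meminfo (non-Linux host): publish nothing
```

Hooking it up would mean listing the job in STARTD_CRON_JOBLIST and setting
the matching STARTD_CRON_<JobName>_EXECUTABLE and STARTD_CRON_<JobName>_PERIOD
knobs, then having START compare the job's RequestMemory against the
published value.  The exact expression is something to check against the
manual for your version, and as noted above the value will lag behind
reality by at least one update interval.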

Thanks all.

-- 
Steve Huston - W2SRH - Unix Sysadmin, PICSciE/CSES & Astrophysical Sci
  Princeton University  |    ICBM Address: 40.346344   -74.652242
    345 Lewis Library   |"On my ship, the Rocinante, wheeling through
  Princeton, NJ   08544 | the galaxies; headed for the heart of Cygnus,
    (267) 793-0852      | headlong into mystery."  -Rush, 'Cygnus X-1'