Re: [HTCondor-users] Tracking available memory on a compute host
- Date: Wed, 31 Jan 2018 16:37:28 -0500
- From: Steve Huston <huston@xxxxxxxxxxxxxxxxxxx>
- Subject: Re: [HTCondor-users] Tracking available memory on a compute host
On Wed, Jan 31, 2018 at 12:43 PM, John M Knoeller <johnkn@xxxxxxxxxxx> wrote:
> What you can't do is tell HTCondor that it can have all of the memory and also let some other scheduler use all the memory
> and expect HTCondor to dynamically adjust its allocations to account for non-HTCondor memory usage.
I thought that was the whole point?
"All machines in the HTCondor pool advertise their resource
properties, both static and dynamic, such as *available RAM memory*,
CPU type, CPU speed, virtual memory size, physical location, and
current load average, in a resource offer ad." --
http://research.cs.wisc.edu/htcondor/manual/current/1_2HTCondor_s_Power.html
(emphasis mine)
Of course I could restrict the memory allowed for Condor, and I could
probably with the right settings restrict the available memory for
console (owner) usage to something so that Condor jobs always have
resources. But just like a CPU core can be used by the owner and then
free for Condor usage later, I would think RAM should be as well.
On Wed, Jan 31, 2018 at 3:29 PM, John M Knoeller <johnkn@xxxxxxxxxxx> wrote:
> You could probably do something using a startd cron script to push a value
> into the slot ads that represents the amount of non-HTCondor memory usage,
> and then have the START expression refer to that value in order to prevent
> matches. There will be some delay between when the startd sees the updated
> value for non-HTCondor usage and when the Negotiator and Schedd see that
> value -- so you will still probably get some jobs starting that then just OOM
> killed a little while later, but it won't *keep* happening.
I suppose that's the route I'll have to take, if this becomes
problematic enough. So far it hasn't before, and I've been running a
Condor scheduler here for just over 13 years, so it might not be worth
the hassle. I was just confused that it didn't already exist, and
figured I was overlooking something simple.
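
For reference, a rough sketch of what the startd cron idea quoted above might look like. This is untested against a real pool and rests on several assumptions: the attribute name HostMemAvailableMB is made up for illustration, the host is Linux with /proc/meminfo, and it publishes the kernel's overall "available" estimate rather than strictly the non-HTCondor usage John describes. The configuration lines in the comments are likewise only an outline of the STARTD_CRON knobs, and MEMWATCH is a hypothetical job name.

```python
#!/usr/bin/env python3
"""Startd cron sketch: print the host's available memory as a ClassAd attribute.

Assumed (untested) configuration on the execute node, roughly:
    STARTD_CRON_JOBLIST = $(STARTD_CRON_JOBLIST) MEMWATCH
    STARTD_CRON_MEMWATCH_EXECUTABLE = /path/to/this/script
    STARTD_CRON_MEMWATCH_PERIOD = 60s
    START = ($(START)) && (TARGET.RequestMemory <= HostMemAvailableMB)
"""

# Hypothetical attribute name; HTCondor does not define it.
ATTR_NAME = "HostMemAvailableMB"


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def available_mb(fields):
    """Return available memory in MB, preferring the kernel's MemAvailable estimate."""
    if "MemAvailable" in fields:
        kb = fields["MemAvailable"]
    else:
        # Older kernels lack MemAvailable; fall back to a crude approximation.
        kb = fields.get("MemFree", 0) + fields.get("Buffers", 0) + fields.get("Cached", 0)
    return kb // 1024


def classad_line(fields):
    """Format the 'Attribute = value' line that the startd merges into the slot ad."""
    return f"{ATTR_NAME} = {available_mb(fields)}"


def main(path="/proc/meminfo"):
    try:
        with open(path) as f:
            text = f.read()
    except OSError:
        return  # publish nothing if the file is unreadable (e.g. non-Linux host)
    print(classad_line(parse_meminfo(text)))


if __name__ == "__main__":
    main()
```

As John notes, the value only reaches the Negotiator and Schedd after the next update, so a check like this narrows the window for bad matches rather than closing it.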
Thanks all.
--
Steve Huston - W2SRH - Unix Sysadmin, PICSciE/CSES & Astrophysical Sci
Princeton University | ICBM Address: 40.346344 -74.652242
345 Lewis Library |"On my ship, the Rocinante, wheeling through
Princeton, NJ 08544 | the galaxies; headed for the heart of Cygnus,
(267) 793-0852 | headlong into mystery." -Rush, 'Cygnus X-1'