Re: [HTCondor-users] Tracking available memory on a compute host
- Date: Tue, 30 Jan 2018 11:58:47 -0500
- From: Steve Huston <huston@xxxxxxxxxxxxxxxxxxx>
- Subject: Re: [HTCondor-users] Tracking available memory on a compute host
Is there really no way to have the HTCondor daemons monitor the memory
actually available on a host, and allow ClassAds to be matched against it,
so that jobs don't flock to a host without enough free RAM?
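
To make the request concrete, here is a rough sketch of the kind of probe I have in mind. It is only an illustration under assumptions, not something HTCondor ships: it reads the MemAvailable field from /proc/meminfo (Linux only, and only on kernels that expose that field) and prints a single ClassAd-style attribute line. The attribute name AvailableMemoryMB and the function names are made up for this example. Something like HTCondor's startd cron mechanism might be able to run a script of this sort periodically and publish its output in the machine ad, but I have not verified that setup.

```python
#!/usr/bin/env python3
"""Print the host's available memory as a ClassAd-style attribute line.

Assumptions: Linux with a /proc/meminfo that contains a MemAvailable field;
the attribute name "AvailableMemoryMB" is hypothetical.
"""


def parse_mem_available_mb(meminfo_text):
    """Return MemAvailable in MiB from /proc/meminfo text, or None if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            # Line format: "MemAvailable:   12345678 kB"
            kib = int(line.split()[1])
            return kib // 1024
    return None


def format_attribute(mb, name="AvailableMemoryMB"):
    """Format one 'Name = value' line (the name is a made-up example)."""
    return f"{name} = {mb}"


def main(path="/proc/meminfo"):
    with open(path) as f:
        mb = parse_mem_available_mb(f.read())
    # Print nothing when the field is missing, so no bogus value is published.
    if mb is not None:
        print(format_attribute(mb))


if __name__ == "__main__":
    main()
```

If such an attribute did end up in the machine ad, a job could presumably test it in its submit file with an expression along the lines of requirements = (TARGET.AvailableMemoryMB >= 4096), where the 4096 is just an example threshold.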
On Mon, Jan 22, 2018 at 1:43 PM, Steve Huston
<huston@xxxxxxxxxxxxxxxxxxx> wrote:
> I found a couple of old mentions that "VirtualMemory" and/or
> "TotalVirtualMemory" are updated as a machine runs, and that one might
> be able to use them to make sure there's enough memory available on a
> host to run jobs. However, in my experiments I found they were not
> updated nearly often enough to be useful: I gobbled up half the
> memory on a machine and the number hadn't changed even 15 minutes
> later, though updated ClassAds had been received from it (and I was
> querying it directly anyway).
>
> This comes up because I had a user who had queued jobs that kept
> flocking to another user's machine where there were available cores,
> but no available memory (local usage, outside of HTCondor). Those
> queued jobs kept getting killed by oom_killer shortly after starting,
> but then new jobs would flock there. Thus, I'm looking for some way
> to add a test to a job's requirements that the host in question has
> enough free virtual memory to run the job.
>
> --
> Steve Huston - W2SRH - Unix Sysadmin, PICSciE/CSES & Astrophysical Sci
> Princeton University | ICBM Address: 40.346344 -74.652242
> 345 Lewis Library |"On my ship, the Rocinante, wheeling through
> Princeton, NJ 08544 | the galaxies; headed for the heart of Cygnus,
> (267) 793-0852 | headlong into mystery." -Rush, 'Cygnus X-1'
--
Steve Huston - W2SRH - Unix Sysadmin, PICSciE/CSES & Astrophysical Sci
Princeton University | ICBM Address: 40.346344 -74.652242
345 Lewis Library |"On my ship, the Rocinante, wheeling through
Princeton, NJ 08544 | the galaxies; headed for the heart of Cygnus,
(267) 793-0852 | headlong into mystery." -Rush, 'Cygnus X-1'