Re: [HTCondor-users] What can hinder condor_startd to set DISK?
- Date: Fri, 21 Apr 2023 13:31:33 +0200
- From: Steffen Grunewald <steffen.grunewald@xxxxxxxxxx>
On Fri, 2023-04-21 at 11:18:17 +0200, Steffen Grunewald wrote:
> Hi,
>
> On Fri, 2023-04-21 at 11:10:20 +0200, Thomas Hartmann wrote:
> > Hi Steffen,
> >
> > I guess
> > RESERVED_DISK = 131072
> > might be the culprit. I just checked in the documentation and the ad is in
> > MB, i.e., the reservation for non-condor stuff of ~131GB would surpass the
> > available 124G on your /var (unfortunately, prefixes/sizes are sometimes a
> > bit inconsistent)
>
> Hm, my printed copy of the manual (10.0.0) must be wrong then. It has "(in kB)"
> for both DISK and RESERVED_DISK - while the unit for RESERVED_SWAP is "MiB".
>
> Also, matching is done by comparing TARGET.RequestDisk and DISK without any
> unit conversions, so the JOB_DEFAULT_REQUESTDISK would be affected as well?
>
> Previously I had a very low setting - which I'll restore now.
>
> >
> > Another thing I noticed - is the execute directory on a dedicated volume?
> > Else
> > STARTD_RECOMPUTE_DISK_FREE = false
> > might be a problem in cases, where /var get filled by other processes (like
> > logs) and the available disk space shrinks for jobs as well.
>
> Since my partitionable slot gets only 75% of the total disk I'm not worried
> about that, and there will be a watchdog checking for disk (partition)
> shortages.
>
> Thanks so far, I'll report about the outcome,
.... and here it is, from a different node though.
I have set
RESERVED_DISK = 128
and the /var filesystem reports 129177320 kB free.
From "condor_status -l ... | grep Disk" I get
TotalDisk = 129046416
Disk = 96784812
- the latter being exactly 75% of the total space, as configured.
The difference between the free capacity and the TotalDisk value is 130904 kB,
which is close (but not identical) to 128 * 1024 = 131072, meaning that
RESERVED_DISK is indeed given in MB and multiplied by 1024 to get kB (the same
as RESERVED_SWAP). The entry in my 10.0.0 manual (subsection 4.5.1, p.209) is
wrong, but it has been fixed in the online version for 10.0.3. Lesson learned...
(BTW, a negative value would have given me a hint to look closer...)
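For anyone who wants to redo the arithmetic, here is a minimal Python sketch. It only uses the numbers quoted in this mail; the helper name expected_total_disk_kb is made up for illustration, and "free space minus RESERVED_DISK in MB times 1024" is the reading inferred above, not something taken from the startd source.

```python
KIB_PER_MIB = 1024

def expected_total_disk_kb(free_kb: int, reserved_disk_mb: int) -> int:
    """Hypothetical helper: free space (kB) minus RESERVED_DISK read as MB."""
    return free_kb - reserved_disk_mb * KIB_PER_MIB

free_kb = 129177320        # free space reported for /var, in kB
total_disk_kb = 129046416  # TotalDisk from condor_status -l
disk_kb = 96784812         # Disk from condor_status -l

# Reservation actually observed on this node, in kB
observed_reserve_kb = free_kb - total_disk_kb
print(observed_reserve_kb)            # 130904
print(128 * KIB_PER_MIB)              # 131072, i.e. RESERVED_DISK = 128 read as MB

# The partitionable slot gets 75% of TotalDisk
print(total_disk_kb * 75 // 100 == disk_kb)   # True

# With the earlier RESERVED_DISK = 131072 the reservation would exceed
# the free space of this node, so the result goes negative
print(expected_total_disk_kb(free_kb, 131072))  # -5040408
```

The small gap between 130904 and 131072 (168 kB) may just be the free space changing between the two measurements, but that is a guess.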
I'm now trying to get a grip on =?=/=!= expressions and a means to extend
MOUNT_UNDER_SCRATCH (for the latter, "$(MOUNT_UNDER_SCRATCH),/something/else"
will produce unexpected results), but the major issue is fixed, it seems.
Thanks for your suggestions!
- Steffen
--
Steffen Grunewald, Cluster Administrator
Max Planck Institute for Gravitational Physics (Albert Einstein Institute)
Am Mühlenberg 1 * D-14476 Potsdam-Golm * Germany
~~~
Fon: +49-331-567 7274
Mail: steffen.grunewald(at)aei.mpg.de
~~~