
Re: [HTCondor-users] What can hinder condor_startd to set DISK?



Hi,

On Fri, 2023-04-21 at 11:10:20 +0200, Thomas Hartmann wrote:
> Hi Steffen,
> 
> I guess
>   RESERVED_DISK = 131072
> might be the culprit. I just checked in the documentation and the ad is in
> MB, i.e., the reservation for non-condor stuff of ~131GB would surpass the
> available 124G on your /var (unfortunately, prefixes/sizes are sometimes a
> bit inconsistent)

Hm, my printed copy of the manual (10.0.0) must be wrong then. It has "(in kB)"
for both DISK and RESERVED_DISK - while the unit for RESERVED_SWAP is "MiB".

Also, matching is done by comparing TARGET.RequestDisk and DISK without any
unit conversions, so the JOB_DEFAULT_REQUESTDISK would be affected as well?

Previously I had a very low setting - which I'll restore now.
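
To make the two unit readings concrete, here is a minimal sketch of the arithmetic with the numbers from this thread (124G available on /var, RESERVED_DISK = 131072, JOB_DEFAULT_REQUESTDISK = 131072). The helper name `advertised_disk_kib` and the clamp to zero are assumptions for illustration, not taken from the startd source, and the 75% share of the partitionable slot is ignored.

```python
KIB_PER_MIB = 1024
KIB_PER_GIB = 1024 * 1024

def advertised_disk_kib(avail_kib, reserved, reserved_unit_kib):
    """Free space minus the reservation, in KiB, never below zero (assumed)."""
    return max(0, avail_kib - reserved * reserved_unit_kib)

avail_kib = 124 * KIB_PER_GIB   # "Avail 124G" from df -h, read as GiB
reserved = 131072               # RESERVED_DISK from the config dump
request_disk_kib = 131072       # JOB_DEFAULT_REQUESTDISK, compared to Disk as-is

# Reading 1: RESERVED_DISK in MiB -> 128 GiB reserved, more than is available.
disk_if_mib = advertised_disk_kib(avail_kib, reserved, KIB_PER_MIB)
# Reading 2: RESERVED_DISK in KiB -> only 128 MiB reserved.
disk_if_kib = advertised_disk_kib(avail_kib, reserved, 1)

for label, disk in (("MiB", disk_if_mib), ("KiB", disk_if_kib)):
    fits = request_disk_kib <= disk
    print(f"RESERVED_DISK in {label}: Disk = {disk} KiB, default request fits: {fits}")
```

Under these assumptions only the MiB reading yields Disk = 0, which is what condor_status reports; the KiB reading would leave nearly the whole partition available.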

> 
> Another thing I noticed - is the execute directory on a dedicated volume?
> Else
>   STARTD_RECOMPUTE_DISK_FREE = false
> might be a problem in cases where /var gets filled by other processes (like
> logs) and the disk space available to jobs shrinks as well.

Since my partitionable slot gets only 75% of the total disk I'm not worried
about that, and there will be a watchdog checking for disk (partition)
shortages.
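
For the watchdog mentioned above, the check could look roughly like this. It is only a sketch: the 10% threshold and the function names are hypothetical, and using Python's `shutil.disk_usage` is my assumption about how one might implement it.

```python
import shutil

def is_short(total_bytes, free_bytes, min_free_fraction=0.10):
    """True when free space falls below the given fraction of the partition."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes < min_free_fraction

def check_partition(path, min_free_fraction=0.10):
    """Look up the filesystem holding `path` and report whether it is short on space."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return is_short(usage.total, usage.free, min_free_fraction)

# Example call on an execute node (path as configured in EXECUTE):
#   check_partition("/var/lib/condor/execute")
```

Keeping the threshold test separate from the filesystem query makes it easy to try out thresholds without touching a real partition.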

Thanks so far, I'll report on the outcome,
 Steffen

> 
> Cheers,
>   Thomas
> 
> On 21/04/2023 10.30, Steffen Grunewald wrote:
> > Good morning,
> > 
> > after setting up HTCondor 10.0.3 on our local cluster, I'm running into
> > issues related to disk space and requirements.
> > 
> > root@h0402:~# condor_config_val -dump -expand EXECUTE
> > # Configuration from machine: h0402.hypatia.local
> > 
> > # Parameters with names that match EXECUTE:
> > ENCRYPT_EXECUTE_DIRECTORY = false
> > ENCRYPT_EXECUTE_DIRECTORY_FILENAMES = false
> > EXECUTE = /var/lib/condor/execute
> > GANGLIAD_PER_EXECUTE_NODE_METRICS = true
> > LOCAL_UNIV_EXECUTE = /var/lib/condor/spool/local_univ_execute
> > # Contributing configuration file(s):
> > #       /etc/condor/condor_config
> > #       /etc/condor/condor_config_local|
> > root@h0402:~# df -h /var/lib/condor/execute
> > Filesystem      Size  Used Avail Use% Mounted on
> > /dev/sda5       125G  455M  124G   1% /var
> > root@h0402:~# condor_config_val -dump -expand DISK
> > # Configuration from machine: h0402.hypatia.local
> > 
> > # Parameters with names that match DISK:
> > CONSUMPTION_DISK = quantize(target.RequestDisk,{1024})
> > CREATE_LOCKS_ON_LOCAL_DISK = true
> > FILE_TRANSFER_DISK_LOAD_THROTTLE = 2.0
> > FILE_TRANSFER_DISK_LOAD_THROTTLE_LONG_HORIZON = 5m
> > FILE_TRANSFER_DISK_LOAD_THROTTLE_SHORT_HORIZON = 1m
> > FILE_TRANSFER_DISK_LOAD_THROTTLE_WAIT_BETWEEN_INCREMENTS = 60
> > JOB_DEFAULT_REQUESTDISK = 131072
> > LOCAL_DISK_LOCK_DIR =
> > MODIFY_REQUEST_EXPR_REQUESTDISK = quantize(RequestDisk,{1024})
> > RESERVED_DISK = 131072
> > SCHEDD_ROUND_ATTR_DiskUsage = 25%
> > STARTD_RECOMPUTE_DISK_FREE = false
> > # Contributing configuration file(s):
> > #       /etc/condor/condor_config
> > #       /etc/condor/condor_config_local|
> > root@h0402:~# condor_status -l `hostname`| grep ^Disk
> > Disk = 0
> > 
> > 
> > Since $(JOB_DEFAULT_REQUESTDISK) > $(DISK) there's no way to run vanilla
> > universe jobs.
> > 
> > The manual, under DISK and RESERVED_DISK, suggests that the startd would
> > determine the amount of available space (of which there's plenty), but
> > for me obviously it doesn't. Is there a means to find out why?
> > 
> > Thanks, Steffen
> > 





-- 
Steffen Grunewald, Cluster Administrator
Max Planck Institute for Gravitational Physics (Albert Einstein Institute)
Am Mühlenberg 1 * D-14476 Potsdam-Golm * Germany
~~~
Fon: +49-331-567 7274
Mail: steffen.grunewald(at)aei.mpg.de
~~~