
Re: [HTCondor-users] lvm metadata archiving



[root@node37 ~]# vgs
  VG     #PV #LV #SN Attr   VSize    VFree
  condor   1   1   0 wz--n- <353.38g <353.37g
[root@node37 ~]# lvs
  LV              VG     Attr       LSize Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  slot1_1+1840_25 condor -wi-ao---- 8.00m
[root@node37 ~]# condor_config_val -dump LVM
# Configuration from machine: node37.ldas.ligo-la.caltech.edu

# Parameters with names that match LVM:
LVM_BACKING_FILE = $(SPOOL)/startd_disk.img
LVM_BACKING_FILE_SIZE_MB = 10240
LVM_HIDE_MOUNT = auto
LVM_THIN_LV_EXTRA_SIZE_MB = 2000
LVM_USE_THIN_PROVISIONING = false
LVM_VOLUME_GROUP_NAME = condor
# Contributing configuration file(s):
#       /etc/condor/condor_config
#       /etc/condor/config.d/00-base
#       /etc/condor/config.d/00-htcondor-9.0.config
#       /etc/condor/config.d/00-logging
#       /etc/condor/config.d/10-cgroups
#       /etc/condor/config.d/10-jobmanagement
#       /etc/condor/config.d/10-quotas
#       /etc/condor/config.d/10-security
#       /etc/condor/config.d/10-stash-plugin.conf
#       /etc/condor/config.d/20-batchnode
#       /etc/condor/config.d/20-misc
#       /etc/condor/config.d/30-gpu-nongpunode
#       /etc/condor/config.d/90-secure-cvmfs-ligo-osg
#       /etc/condor/config.d/90-singularity-osg
#       /etc/condor/config.d/99-request-disk
#       /etc/condor/config.d/cpuinfo
#       /etc/condor/condor_config.local
#       /etc/condor/config.d/00-base
#       /etc/condor/config.d/00-htcondor-9.0.config
#       /etc/condor/config.d/00-logging
#       /etc/condor/config.d/10-cgroups
#       /etc/condor/config.d/10-jobmanagement
#       /etc/condor/config.d/10-quotas
#       /etc/condor/config.d/10-security
#       /etc/condor/config.d/10-stash-plugin.conf
#       /etc/condor/config.d/20-batchnode
#       /etc/condor/config.d/20-misc
#       /etc/condor/config.d/30-gpu-nongpunode
#       /etc/condor/config.d/90-secure-cvmfs-ligo-osg
#       /etc/condor/config.d/90-singularity-osg
#       /etc/condor/config.d/99-request-disk
#       /etc/condor/config.d/cpuinfo

[root@node37 ~]# grep 'StartD disk enforcement' /var/log/condor/StartLog
StartLog:02/19/25 14:12:58 StartD disk enforcement using Volume Group: condor
StartLog:02/24/25 09:03:30 StartD disk enforcement using Volume Group: condor

This entry appears twice because the EP was rebooted.

--Mike

On 2/24/25 11:54, Cole Bollig via HTCondor-users wrote:
Hi Michael,

That is interesting. Would you be willing to share the output of the following commands?

   * sudo vgs
   * sudo lvs
   * condor_config_val -dump LVM

Also, can you locate the message "StartD disk enforcement using Volume Group:" in the Startd Log?

Thanks,
Cole Bollig
________________________________
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Michael Thomas <wart@xxxxxxxxxxx>
Sent: Monday, February 24, 2025 9:38 AM
To: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] lvm metadata archiving

Hi Cole,

It looks like we have a case of the observer effect.  When I turn up the
logging with:

STARTD_DEBUG           = D_FULLDEBUG
STARTER_DEBUG          = D_FULLDEBUG

...then no new entries in /etc/lvm/archive are created, and I can see
that '--autobackup n' is being used:

StarterLog.slot1_1:02/24/25 09:23:45 Running: lvcreate --autobackup n -n slot1_1+3766_2 --addtag htcondor_lv --yes -L 2098176k condor

StarterLog.slot1_2:02/24/25 09:23:43 Running: lvcreate --autobackup n -n slot1_2+3766_3 --addtag htcondor_lv --yes -L 2098176k condor

StarterLog.slot1_3:02/24/25 09:23:44 Running: lvcreate --autobackup n -n slot1_3+3766_4 --addtag htcondor_lv --yes -L 2098176k condor


But when I reset them back to the original settings:

STARTD_DEBUG           = D_COMMAND
STARTER_DEBUG          = D_NODATE

...then the entries in /etc/lvm/archive are created again.

--Mike

On 2/21/25 16:06, Cole Bollig via HTCondor-users wrote:
Hi Michael,

If you turn on D_FULLDEBUG for the StartD and Starter you will see all of the commands with arguments that the HTCondor daemons (StartD and Starter) are running to manage LVM. You might be able to connect a command execution with a new archival file.
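
One way that correlation might be done is sketched below in Python. It is only an illustration, not an HTCondor tool: the function names are made up for this example, and the log line shape ("MM/DD/YY HH:MM:SS Running: <command>") is assumed from the D_FULLDEBUG excerpts quoted elsewhere in this thread.

```python
import re
from datetime import datetime, timedelta

# Assumed line shape, taken from the D_FULLDEBUG excerpts in this thread:
#   02/24/25 09:23:45 Running: lvcreate --autobackup n -n ...
# search() is used so a grep-style "StarterLog.slot1_1:" prefix is tolerated.
RUNNING_RE = re.compile(r"(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) Running: (.+)")


def parse_running_commands(log_lines):
    """Return (timestamp, command) pairs for every 'Running:' log line."""
    commands = []
    for line in log_lines:
        match = RUNNING_RE.search(line)
        if match is None:
            continue
        when = datetime.strptime(match.group(1), "%m/%d/%y %H:%M:%S")
        commands.append((when, match.group(2).strip()))
    return commands


def commands_near(archive_time, commands, window_seconds=2):
    """Return the commands logged within window_seconds of archive_time."""
    window = timedelta(seconds=window_seconds)
    return [(when, cmd) for when, cmd in commands
            if abs(when - archive_time) <= window]
```

To use it, read the Starter and StartD logs into parse_running_commands(), then call commands_near() with datetime.fromtimestamp(path.stat().st_mtime) for each file in /etc/lvm/archive. Any command that turns up next to a new archive file and lacks '--autobackup n' would be a candidate for the source of the archive.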

-Cole Bollig
________________________________
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Michael Thomas <wart@xxxxxxxxxxx>
Sent: Friday, February 21, 2025 3:49 PM
To: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] lvm metadata archiving

Hi Cole,

I guess I'm not seeing the desired behavior then, because
/etc/lvm/archive is still getting populated with new archive entries for
condor:

[root@node36 ~]# ls -l /etc/lvm/archive
total 60
-rw-------. 1 root root 1009 Feb 21 13:34 condor_00000-1638320359.vg
-rw-------. 1 root root 1009 Feb 21 13:34 condor_00001-460257242.vg
-rw-------. 1 root root  947 Feb 21 13:46 condor_00002-2021812796.vg
-rw-------. 1 root root 1889 Feb 21 13:46 condor_00003-698536540.vg
-rw-------. 1 root root 1864 Feb 21 14:08 condor_00004-1090285748.vg
-rw-------. 1 root root 2327 Feb 21 14:08 condor_00005-22124147.vg
-rw-------. 1 root root 2355 Feb 21 14:08 condor_00006-60270403.vg
-rw-------. 1 root root 2779 Feb 21 14:09 condor_00007-970492402.vg
-rw-------. 1 root root 2355 Feb 21 14:11 condor_00008-3312197.vg
-rw-------. 1 root root 2769 Feb 21 14:12 condor_00009-37119960.vg
-rw-------. 1 root root 2352 Feb 21 14:46 condor_00010-1036201810.vg
-rw-------. 1 root root 2354 Feb 21 14:46 condor_00011-1532942938.vg
-rw-------. 1 root root 2354 Feb 21 15:46 condor_00012-266977466.vg
-rw-------. 1 root root 2354 Feb 21 15:46 condor_00013-2078303505.vg
-rw-------. 1 root root 1750 Aug 22  2022 node36.1_00000-272466926.vg

[root@node36 ~]# rpm -q condor
condor-24.4.0-1.el8.x86_64

Is there a logging knob I can turn to help figure out why these are
still showing up?

--Mike

On 2/21/25 15:44, Cole Bollig via HTCondor-users wrote:
Hi Michael,

No configuration is needed to enable --autobackup n; it is hard-coded into many of the commands HTCondor uses to manage the ephemeral LVs. Note that archiving still occurs when an administrator manually runs certain LVM commands. HTCondor can only avoid flooding the host with archives caused by its own rapid LV creation and deletion.

Perhaps I should implement a Startd Cron to periodically remove all but the last N most recent volume group archives (of the VG associated with/used by HTCondor).
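
A minimal sketch of what such a cleanup could look like, in Python. This is not an existing HTCondor feature: prune_vg_archives and its parameters are hypothetical names, and the file name pattern (<vg>_<sequence>-<number>.vg) is assumed from the /etc/lvm/archive listing quoted in this thread. Wiring it into a Startd Cron job is not shown.

```python
import re
from pathlib import Path


def prune_vg_archives(archive_dir, vg_name, keep, dry_run=True):
    """Remove all but the `keep` newest archive files of one volume group.

    Returns the paths that were removed (or would be, when dry_run is True).
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")
    directory = Path(archive_dir)
    if not directory.is_dir():
        return []
    # Assumed name shape, e.g. condor_00013-2078303505.vg; the strict match
    # keeps archives of other volume groups out of the candidate list.
    pattern = re.compile(rf"{re.escape(vg_name)}_\d+-\d+\.vg")
    candidates = [p for p in directory.iterdir()
                  if p.is_file() and pattern.fullmatch(p.name)]
    # Newest first; the file name breaks ties between equal mtimes.
    candidates.sort(key=lambda p: (p.stat().st_mtime, p.name), reverse=True)
    stale = candidates[keep:]
    if not dry_run:
        for path in stale:
            path.unlink()
    return stale
```

Calling prune_vg_archives("/etc/lvm/archive", "condor", keep=5) only reports what would be deleted; pass dry_run=False to actually delete. Defaulting to a dry run is deliberate, since these files are LVM's own metadata history.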

Cheers,
Cole Bollig
________________________________
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Michael Thomas <wart@xxxxxxxxxxx>
Sent: Friday, February 21, 2025 2:15 PM
To: condor-users@xxxxxxxxxxx <condor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] lvm metadata archiving

The release notes for htcondor-24.4.0 reference a fix for HTCONDOR-2791:

"HTCondor should use the -A/--autobackup n option to prevent a new
backup being created and archived since HTCondor does lots of lvcreate
and lvremove."

But in my recently upgraded 24.4.0 cluster, this doesn't seem to be
having any effect.  New lvm archives are still being logged to
/etc/lvm/archive/

Is there some htcondor config setting I need to apply to disable this
metadata archiving, or is it supposed to be the default behavior?

--Mike
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe

Join us in June at Throughput Computing 25: https://osg-htc.org/htc25

The archives can be found at: https://www-auth.cs.wisc.edu/lists/htcondor-users/

