[HTCondor-users] Failed to run hibernation plugin
- Date: Wed, 8 Nov 2023 14:24:14 +0000
- From: Justin Killebrew <jk@xxxxxxx>
- Subject: [HTCondor-users] Failed to run hibernation plugin
Hello. I'm wrestling with power management on Ubuntu 22.04. The execution point StartLog shows this error:
11/08/23 08:30:44 ResMgr: This machine is about to enter hibernation
11/08/23 08:30:44 Failed to run hibernation plugin '/usr/libexec/condor/condor_power_state set S3': status = 63744
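If the status in that log line is a raw POSIX wait status (an assumption; the log does not say how the value is encoded), then 63744 is 0xF900: the low byte is 0, so the plugin was not killed by a signal, and the high byte gives an exit code of 249. A minimal Python sketch of that arithmetic, using a hypothetical helper name:

```python
def decode_wait_status(status: int) -> dict:
    """Split a raw POSIX wait status into its conventional fields.

    Assumes the traditional Linux encoding: bits 0-6 hold the terminating
    signal (0 if the process exited normally) and bits 8-15 hold the exit code.
    """
    signal_number = status & 0x7F
    exit_code = (status >> 8) & 0xFF
    return {
        "exited_normally": signal_number == 0,
        "signal": signal_number,
        "exit_code": exit_code,
    }


# Status value taken from the StartLog line above.
print(decode_wait_status(63744))
```

Under that assumption the plugin ran to completion and returned 249 (which is -7 when read as a signed byte), so the failure is inside condor_power_state rather than in launching it.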
Hibernation is supported:
/usr/libexec/condor$ sudo ./condor_power_state ad
HibernationMethod = "/sys"
HibernationRawMask = 28
HibernationSupportedStates = "S3,S4,S5"
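The raw mask and the state list agree if each sleep state is one bit (S1 = 1, S2 = 2, S3 = 4, S4 = 8, S5 = 16). That bit layout is an assumption inferred from the output above, not something confirmed in this thread; under it, 28 = 4 + 8 + 16 decodes to S3, S4 and S5:

```python
# Assumed bit layout for HibernationRawMask; inferred, not confirmed.
ASSUMED_STATE_BITS = {"S1": 1, "S2": 2, "S3": 4, "S4": 8, "S5": 16}


def decode_raw_mask(mask: int) -> str:
    """Return the comma-separated state names whose bit is set in mask."""
    names = [name for name, bit in ASSUMED_STATE_BITS.items() if mask & bit]
    return ",".join(names)


# Mask value taken from the 'condor_power_state ad' output above.
print(decode_raw_mask(28))
```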
I can suspend/hibernate from the command line using:
$ sudo systemctl hibernate
But condor_power_state fails for both S3 and S4:
/usr/libexec/condor$ sudo ./condor_power_state -d set s4
11/08/23 08:24:55 LinuxHibernator: Error writing 'disk' to '/sys/power/state': Input/output error
condor_power_state: failed to switch the machine's power state.
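Since the failure is a write to /sys/power/state, one place to start is what the kernel itself advertises there. The sketch below only reads the standard Linux sysfs files /sys/power/state, /sys/power/disk and /sys/power/mem_sleep. The function name and the sysfs_root parameter are made up for illustration, and it does not call any HTCondor code.

```python
from pathlib import Path


def read_power_options(sysfs_root: str = "/sys/power") -> dict:
    """Return the options listed in each sysfs power file.

    The kernel marks the currently selected option in 'disk' and
    'mem_sleep' with square brackets, e.g. "[platform] shutdown reboot".
    Files that are missing or unreadable map to None.
    """
    report = {}
    for name in ("state", "disk", "mem_sleep"):
        path = Path(sysfs_root) / name
        try:
            tokens = path.read_text().split()
        except OSError:
            report[name] = None
            continue
        # A bracketed token is the option currently in effect.
        selected = [t.strip("[]") for t in tokens
                    if t.startswith("[") and t.endswith("]")]
        report[name] = {
            "options": [t.strip("[]") for t in tokens],
            "selected": selected[0] if selected else None,
        }
    return report


if __name__ == "__main__":
    for name, info in read_power_options().items():
        print(name, info)
```

Comparing this output against what condor_power_state tries to write ('disk' for S4 in the log above) may show whether the kernel rejects the request itself or only the way it is being made.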
Here's the relevant portion of the EP config:
# Power management
WOL_SUPPORTED = TRUE
HIBERNATE_CHECK_INTERVAL = 20
TimeToWait = 120
ShouldHibernate = ( (State == "Unclaimed") \
&& ($(StateTimer) > $(TimeToWait)) \
&& ($(WOL_SUPPORTED)))
HibernateState = "RAM"
HIBERNATE = ifThenElse( $(ShouldHibernate), $(HibernateState), "NONE" )
The central manager seems to be configured correctly:
RoosterLog:
11/08/23 06:21:00 Will perform unhibernate checks every ROOSTER_INTERVAL=180 seconds.
And the relevant CM config:
# Rooster wakes nodes up
DAEMON_LIST = COLLECTOR, MASTER, NEGOTIATOR, SCHEDD, ROOSTER, SHARED_PORT
COLLECTOR_PERSISTENT_AD_LOG = /var/log/condor/PersistentAdLog
ABSENT_REQUIREMENTS = ( (HibernationLevel?:0) == 0 )
EXPIRE_INVALIDATED_ADS = True
CLASSAD_LIFETIME = 900
# 604800s is 7 days
ABSENT_EXPIRE_ADS_AFTER = 604800
OFFLINE_EXPIRE_ADS_AFTER = 604800
ROOSTER_INTERVAL = 180
ROOSTER_UNHIBERNATE = ( Offline && Unhibernate )
ROOSTER_UNHIBERNATE_RANK = buf_cpuindex_avg
How do I debug condor_power_state? Does condor_power_state support the "systemctl hibernate" method?
Thanks,
JK