Re: [HTCondor-users] Error with condor_power
- Date: Wed, 11 Mar 2026 16:07:42 +0100
- From: Valerio Bellizzomi <valerio@xxxxxxxxxx>
- Subject: Re: [HTCondor-users] Error with condor_power
On Wed, 2026-03-11 at 11:40 +0100, Beyer, Christoph wrote:
> Hi,
>
> The most likely problem here is that once the machine powers down, it sends a last ClassAd update that overwrites the previous offline state. That is a known issue but, to my knowledge, not yet fixed (?)
>
> Try setting the shutdown script on the worker to:
>
> [root@batch1064 ~]# grep -i kill /etc/systemd/system/condor.service.d/01-condor-basic-overwrites.conf
> # send sigkill instead of sigterm
> KillSignal=SIGKILL
>
> (SIGKILL instead of SIGTERM)
>
> (This is on RH-like systems; you will need to find the equivalent script on Ubuntu-like systems ...)
>
> Not pretty, but it will preserve the offline state ...
>
> Best
> christoph
>
Hi Christoph,
That worked. I have set KillSignal=SIGKILL, and the logs now show:
Matched 66.0 sel@xxxxxxxx <10.10.0.47:9618?addrs=10.10.0.47-9618&alias=t450.sel&noUDP&sock=schedd_997_7003> preempting none <10.10.0.49:9618?addrs=10.10.0.49-9618&alias=master03.sel&noUDP&sock=startd_1031_c23b> slot2@xxxxxxxxxxxx (offline)
Successfully matched with slot2@xxxxxxxxxxxx (offline)
Job 66.0 (delivered=1) matched to offline machine slot2@xxxxxxxxxxxxx
Got 1 startd ads matching ROOSTER_UNHIBERNATE=Offline && Unhibernate
Sending wakeup call to slot2@xxxxxxxxxxxxx
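
For anyone reproducing this, here is a minimal sketch of how such a drop-in could be created on the worker. It is an illustration, not necessarily how the files on either machine were written. The function name write_killsignal_dropin and the file name 90-killsignal.conf are hypothetical; the directory comes from the path in the quoted message, and the [Service] header is assumed because KillSignal is a [Service] option in systemd.

```shell
#!/bin/sh
# Sketch: write a systemd drop-in so that systemd stops HTCondor with SIGKILL.
# Assumptions: the unit is condor.service (taken from the drop-in path in the
# quoted message); the function name and the file name 90-killsignal.conf are
# hypothetical and chosen so that no existing override file is overwritten.

write_killsignal_dropin() {
    # Optional first argument: target directory (defaults to the unit's drop-in dir).
    dropin_dir="${1:-/etc/systemd/system/condor.service.d}"
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/90-killsignal.conf" <<'EOF'
[Service]
# send sigkill instead of sigterm
KillSignal=SIGKILL
EOF
}

# On the worker, as root (left commented so that sourcing this file changes nothing):
#   write_killsignal_dropin
#   systemctl daemon-reload
#   systemctl show condor.service -p KillSignal
```

The last command should report the effective KillSignal value, which is a quick way to check that systemd has picked up the override before testing a power-down.
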
Thank you, everyone. This is a big step forward for me.