Re: [HTCondor-users] How to set a worker node offline in HTCondor
- Date: Thu, 08 Apr 2021 18:01:01 +0000
- From: "Anderson, Stuart B." <sba@xxxxxxxxxxx>
- Subject: Re: [HTCondor-users] How to set a worker node offline in HTCondor
> On Apr 8, 2021, at 10:47 AM, John M Knoeller <johnkn@xxxxxxxxxxx> wrote:
>
> In general the mechanism that we use to avoid cancelling a drain that was not started by defrag is to look for the DrainReason attribute of the p-slot.
>
> Draining can be cancelled by the Defrag daemon if there is no DrainReason, or if the DrainReason is "defrag".
>
> There should always be a DrainReason attribute if draining was started by an 8.9.11 or later condor_drain command, or by an 8.9.11 or later DEFRAG daemon.
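The rule quoted above can be sketched in a few lines of Python. This is only an illustration of the described logic, not HTCondor source code: the function name defrag_may_cancel_drain and the use of a plain dict to stand in for the p-slot ad are assumptions, and the exact-match string comparison is a guess at how the value is compared.

```python
from typing import Mapping


def defrag_may_cancel_drain(slot_ad: Mapping[str, object]) -> bool:
    """Return True if, per the rule quoted above, Defrag may cancel the drain.

    slot_ad is a hypothetical stand-in for the p-slot ad, as a plain mapping.
    """
    reason = slot_ad.get("DrainReason")
    if reason is None:
        # No DrainReason at all: the drain may be cancelled by Defrag.
        return True
    # Otherwise only a drain whose reason is "defrag" may be cancelled.
    # Exact string match is an assumption of this sketch.
    return reason == "defrag"
```

Under this reading, a drain whose DrainReason is set to anything other than "defrag" would be left alone by the Defrag daemon.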
OK, then there appears to be a bug in 8.9.11 (or I need to enable another condor setting). In particular, I ran "condor_drain machine-name" with version 8.9.11, and DEFRAG restarted jobs on the machine after it was drained.
Note, I don't see a condor_drain option to specify DrainReason. If you agree the above is a bug, then once it is fixed, how should I specify DrainReason to indicate that a manual drain should be canceled by the Defrag daemon when it is done draining?
Thanks.
--
Stuart Anderson
sba@xxxxxxxxxxx