Re: [HTCondor-users] How to set a worker node offline in HTCondor

Mailing List Archives Authenticated access	UW Madison Computer Sciences Department Computer Systems Lab

Date: Thu, 8 Apr 2021 17:47:15 +0000

From: John M Knoeller <johnkn@xxxxxxxxxxx>

Subject: Re: [HTCondor-users] How to set a worker node offline in HTCondor

In general, the mechanism we use to avoid cancelling a drain that was not started by defrag is to check the DrainReason attribute of the p-slot.

Draining can be cancelled by the Defrag daemon if there is no DrainReason, or if the DrainReason is "defrag".

There should always be a DrainReason attribute if draining was started by an 8.9.11 or later condor_drain command, or by an 8.9.11 or later DEFRAG daemon.
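
For readers who want the rule above in concrete form, here is a minimal sketch in Python. It is not the DEFRAG daemon's actual code: the function name is hypothetical, the slot ad is modelled as a plain dict rather than a real ClassAd object, and the case-insensitive comparison against "defrag" is an assumption made for illustration.

```python
from typing import Mapping


def defrag_may_cancel_drain(slot_ad: Mapping[str, object]) -> bool:
    """Return True if, under the rule described above, defrag may cancel the drain.

    slot_ad is a plain dict standing in for the p-slot ClassAd (an assumption;
    this does not use the HTCondor Python bindings).
    """
    reason = slot_ad.get("DrainReason")

    # No DrainReason at all: the drain was presumably started by a pre-8.9.11
    # tool, so defrag is allowed to cancel it.
    if reason is None:
        return True

    # A DrainReason is present: only a drain started by defrag may be cancelled.
    # Comparing case-insensitively is an assumption of this sketch.
    return str(reason).strip().lower() == "defrag"


if __name__ == "__main__":
    print(defrag_may_cancel_drain({}))
    print(defrag_may_cancel_drain({"DrainReason": "defrag"}))
    print(defrag_may_cancel_drain({"DrainReason": "hardware maintenance"}))
```

The "hardware maintenance" value is only an illustrative stand-in for a reason an administrator might supply; any reason other than "defrag" would give the same result in this sketch.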

-tj

From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Anderson, Stuart B. <sba@xxxxxxxxxxx>
Sent: Tuesday, April 6, 2021 6:28 PM
To: HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] How to set a worker node offline in HTCondor

I just had an 8.9.11 startd resume running jobs after condor_drain completed, and StartLog recorded "Processing cancel drain request from <10.14.0.25:42650>", where 10.14.0.25 is an 8.9.11 CM running condor_defrag.

> On Apr 1, 2021, at 8:26 AM, Anderson, Stuart B. <sba@xxxxxxxxxxx> wrote:
>
> Excellent!
>
>> On Apr 1, 2021, at 7:16 AM, John M Knoeller <johnkn@xxxxxxxxxxx> wrote:
>>
>> You will be glad to hear that as of 8.9.11, DEFRAG and the condor_drain command set a DrainReason attribute in the machine ClassAd. DEFRAG will check this attribute and only resume running jobs on machines that it drained.

--
Stuart Anderson
sba@xxxxxxxxxxx

_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a
subject: Unsubscribe
You can also unsubscribe by visiting
https://lists.cs.wisc.edu/mailman/listinfo/htcondor-users

The archives can be found at:
https://lists.cs.wisc.edu/archive/htcondor-users/
