The condor_off command needs to go to a condor_master daemon, so
condor_off svc-jaws@xxxxxxxxxx
or
condor_off -master svc-jaws@xxxxxxxxxx
should work. The first command turns off all of the daemons other than the condor_master.
The second command turns off all daemons, including the condor_master.
If that doesn't work, try
condor_off -debug svc-jaws@xxxxxxxxxx
The -debug option will help to show why the condor_off command is not working. -tj
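
The steps above can be wrapped in a small script. This is only a sketch: `run`, `stop_worker` and `DRY_RUN` are hypothetical names that are not part of HTCondor, and combining `-debug` with `-master` on the retry is an assumption about how you would want to diagnose a failed shutdown. With `DRY_RUN=1` (the default here) it only prints the command it would run.

```shell
#!/bin/sh
# Sketch only: run, stop_worker and DRY_RUN are hypothetical helpers,
# not part of HTCondor.

# Print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Shut down all daemons on the target, including the condor_master.
# If condor_off reports failure, repeat with -debug to show why.
stop_worker() {
    target="$1"
    if ! run condor_off -master "$target"; then
        run condor_off -debug -master "$target"
    fi
}
```

Calling `stop_worker svc-jaws@xxxxxxxxxx` in dry-run mode prints `would run: condor_off -master svc-jaws@xxxxxxxxxx`; set `DRY_RUN=0` to actually send the command. The target should be the Name of the DaemonMaster ad for the worker, as reported by `condor_status -any`.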
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Seung-Jin Sul <ssul@xxxxxxx>
Sent: Thursday, November 2, 2023 5:12 PM
To: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
Subject: [HTCondor-users] Remove HTCondor worker node from a central Master

Hi,
We are setting up HTCondor as glide-ins on SLURM. I was wondering if there is any way I can remove the HTCondor worker processes running on a SLURM compute node. I've been testing `condor_off -name <machine_name>` and `condor_off -addr <IP:port>`, but neither has worked so far.
For example, we have a worker node like the one below:
```
$ condor_status -any
MyType        TargetType  Name
Collector     None        My Pool - ln010.xxx@xxxxxxxxx
Scheduler     None        svc-jaws@xxxxxxxxx
DaemonMaster  None        svc-jaws@xxxxxxxxx
Negotiator    None        svc-jaws@xxxxxxxxx
Machine       Job         slot1@xxxxxxxxxx
DaemonMaster  None        svc-jaws@xxxxxxxxxx
Accounting    none        <none>
```
I would then like to run a command from the central manager to:
- terminate HTCondor services on n0040.yyy0
- clean up `slot1@xxxxxxxxxx` from the Master's machine list
- terminate the SLURM job
Any help will be appreciated.
Thank you.
Best,
Seung