Hmm. That seems like a bug in the condor_restart tool. It doesn't know about the JOB_ROUTER apparently.
-tj
From: Stefano Belforte <stefano.belforte@xxxxxxx>
Sent: Monday, March 23, 2026 5:41 AM
To: John M Knoeller <johnkn@xxxxxxxxxxx>; HTCondor-Users Mail List <htcondor-users@xxxxxxxxxxx>
Cc: stefano.belforte@xxxxxxx <stefano.belforte@xxxxxxx>
Subject: Re: [HTCondor-users] how to restart JobRouter ?
Hi John, about this:
On 20/03/2026 21:50, John M Knoeller wrote:
I would be curious to know what was in the MasterLog after you ran "condor_restart -daemon JOB_ROUTER", although it would be more helpful if you first added
MASTER_DEBUG = $(MASTER_DEBUG) D_CAT D_COMMAND:1
It is as simple as this: nothing is logged by the master. It looks like the command
fails early and makes no attempt to talk to anybody:
[root@vocms0137 condor]# condor_restart -daemon JOB_ROUTER
Can't find address for local JOB_ROUTER
Perhaps you need to query another pool.
[root@vocms0137 condor]#
In contrast, condor_off JOB_ROUTER (which works) produced the following in the MasterLog:
03/23/26 10:37:42 (D_COMMAND) Calling Handler <SharedPortEndpoint::HandleListenerAccept> (0)
03/23/26 10:37:42 (D_COMMAND) Return from Handler <SharedPortEndpoint::HandleListenerAccept> 0.000080s
03/23/26 10:37:42 (D_COMMAND) Calling Handler <DaemonCommandProtocol::WaitForSocketData> (1)
03/23/26 10:37:42 (D_COMMAND) Return from Handler <DaemonCommandProtocol::WaitForSocketData> 0.000458s
03/23/26 10:37:42 (D_COMMAND) Calling Handler <DaemonCommandProtocol::WaitForSocketData> (1)
03/23/26 10:37:42 (D_COMMAND) Return from Handler <DaemonCommandProtocol::WaitForSocketData> 0.005092s
03/23/26 10:37:42 (D_COMMAND) Calling Handler <DaemonCommandProtocol::WaitForSocketData> (1)
03/23/26 10:37:42 (D_COMMAND) Return from Handler <DaemonCommandProtocol::WaitForSocketData> 0.000568s
03/23/26 10:37:42 (D_COMMAND) Calling Handler <DaemonCommandProtocol::WaitForSocketData> (1)
03/23/26 10:37:42 (D_COMMAND) Calling HandleReq <admin_command_handler> (0) for command 467 (DAEMON_OFF) from condor@cms <[2001:1458:d00:61::100:435]:23589>
03/23/26 10:37:42 (D_ALWAYS) Handling DAEMON_OFF command for JOB_ROUTER
03/23/26 10:37:42 (D_ALWAYS) Sent SIGTERM to JOB_ROUTER (pid 1630814)
03/23/26 10:37:42 (D_COMMAND) Return from HandleReq <admin_command_handler> (handler: 0.000453s, sec: 0.007s, payload: 0.000s)
03/23/26 10:37:42 (D_COMMAND) Return from Handler <DaemonCommandProtocol::WaitForSocketData> 0.001237s
03/23/26 10:37:43 (D_COMMAND) DaemonCore: pid 1630814 exited with status 0, invoking reaper 1 <Daemons::DefaultReaper()>
03/23/26 10:37:43 (D_ALWAYS) The JOB_ROUTER (pid 1630814) exited with status 0
03/23/26 10:37:43 (D_COMMAND) DaemonCore: return from reaper for pid 1630814
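For anyone who wants to check the same thing on their own host, here is a minimal sketch that scans MasterLog lines for the "Handling ... command for ..." entries seen above. It assumes the line layout matches this excerpt exactly (timestamp, category in parentheses, message); the names HANDLING_RE and admin_commands are made up for illustration and are not part of HTCondor. An empty result after running condor_restart would be consistent with the command never reaching the master.

```python
import re

# Assumed layout, taken from the excerpt above, e.g.:
#   03/23/26 10:37:42 (D_ALWAYS) Handling DAEMON_OFF command for JOB_ROUTER
HANDLING_RE = re.compile(
    r"^(?P<stamp>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"\((?P<category>[^)]+)\) "
    r"Handling (?P<command>\w+) command for (?P<daemon>\w+)$"
)


def admin_commands(lines):
    """Return (timestamp, command, daemon) for each admin command the master logged."""
    found = []
    for line in lines:
        match = HANDLING_RE.match(line.strip())
        if match:
            found.append(
                (match.group("stamp"), match.group("command"), match.group("daemon"))
            )
    return found


# A few lines copied from the log excerpt in this message.
sample = [
    "03/23/26 10:37:42 (D_COMMAND) Calling Handler <DaemonCommandProtocol::WaitForSocketData> (1)",
    "03/23/26 10:37:42 (D_ALWAYS) Handling DAEMON_OFF command for JOB_ROUTER",
    "03/23/26 10:37:42 (D_ALWAYS) Sent SIGTERM to JOB_ROUTER (pid 1630814)",
    "03/23/26 10:37:43 (D_ALWAYS) The JOB_ROUTER (pid 1630814) exited with status 0",
]

for stamp, command, daemon in admin_commands(sample):
    print(stamp, command, daemon)
```

To run it against a real log, pass an open file object for the MasterLog to admin_commands instead of the sample list; the file location depends on the local configuration.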