Re: [HTCondor-users] how to restart JobRouter ?



Hi John, about this:

On 20/03/2026 21:50, John M Knoeller wrote:


> I would be curious to know what was in the MasterLog after you ran "condor_restart -daemon JOB_ROUTER", although it would be more helpful if you first added
>
> MASTER_DEBUG = $(MASTER_DEBUG) D_CAT D_COMMAND:1

It is as simple as this: nothing is logged by the master. It looks like the
command fails early and makes no attempt to talk to anybody:

[root@vocms0137 condor]# condor_restart -daemon JOB_ROUTER
Can't find address for local JOB_ROUTER
Perhaps you need to query another pool.
[root@vocms0137 condor]#


In contrast, condor_off JOB_ROUTER (which works) produced this in the MasterLog:


03/23/26 10:37:42 (D_COMMAND) Calling Handler <SharedPortEndpoint::HandleListenerAccept> (0)
03/23/26 10:37:42 (D_COMMAND) Return from Handler <SharedPortEndpoint::HandleListenerAccept> 0.000080s
03/23/26 10:37:42 (D_COMMAND) Calling Handler <DaemonCommandProtocol::WaitForSocketData> (1)
03/23/26 10:37:42 (D_COMMAND) Return from Handler <DaemonCommandProtocol::WaitForSocketData> 0.000458s
03/23/26 10:37:42 (D_COMMAND) Calling Handler <DaemonCommandProtocol::WaitForSocketData> (1)
03/23/26 10:37:42 (D_COMMAND) Return from Handler <DaemonCommandProtocol::WaitForSocketData> 0.005092s
03/23/26 10:37:42 (D_COMMAND) Calling Handler <DaemonCommandProtocol::WaitForSocketData> (1)
03/23/26 10:37:42 (D_COMMAND) Return from Handler <DaemonCommandProtocol::WaitForSocketData> 0.000568s
03/23/26 10:37:42 (D_COMMAND) Calling Handler <DaemonCommandProtocol::WaitForSocketData> (1)
03/23/26 10:37:42 (D_COMMAND) Calling HandleReq <admin_command_handler> (0) for command 467 (DAEMON_OFF) from condor@cms <[2001:1458:d00:61::100:435]:23589>
03/23/26 10:37:42 (D_ALWAYS) Handling DAEMON_OFF command for JOB_ROUTER
03/23/26 10:37:42 (D_ALWAYS) Sent SIGTERM to JOB_ROUTER (pid 1630814)
03/23/26 10:37:42 (D_COMMAND) Return from HandleReq <admin_command_handler> (handler: 0.000453s, sec: 0.007s, payload: 0.000s)
03/23/26 10:37:42 (D_COMMAND) Return from Handler <DaemonCommandProtocol::WaitForSocketData> 0.001237s
03/23/26 10:37:43 (D_COMMAND) DaemonCore: pid 1630814 exited with status 0, invoking reaper 1 <Daemons::DefaultReaper()>
03/23/26 10:37:43 (D_ALWAYS) The JOB_ROUTER (pid 1630814) exited with status 0
03/23/26 10:37:43 (D_COMMAND) DaemonCore: return from reaper for pid 1630814
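
With D_COMMAND enabled, most of these lines are handler bookkeeping. As a rough aid for reading such a log, here is a minimal Python sketch that keeps only the D_ALWAYS entries and the HandleReq call that names the incoming command. The function name key_events and the regular expression are my own illustration, assuming the "date time (CATEGORY) message" layout seen in the excerpt above; this is not an HTCondor tool.

```python
import re

# Assumed layout, taken from the excerpt above:
#   MM/DD/YY HH:MM:SS (CATEGORY) message
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \(([^)]+)\) (.*)$")


def key_events(lines):
    """Return (timestamp, message) pairs for D_ALWAYS lines and HandleReq calls."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip anything that does not look like a log line
        timestamp, category, message = match.groups()
        if category == "D_ALWAYS" or message.startswith("Calling HandleReq"):
            events.append((timestamp, message))
    return events


# A few lines copied from the log above.
sample = [
    "03/23/26 10:37:42 (D_COMMAND) Calling Handler <DaemonCommandProtocol::WaitForSocketData> (1)",
    "03/23/26 10:37:42 (D_COMMAND) Calling HandleReq <admin_command_handler> (0) for command 467 (DAEMON_OFF) from condor@cms <[2001:1458:d00:61::100:435]:23589>",
    "03/23/26 10:37:42 (D_ALWAYS) Handling DAEMON_OFF command for JOB_ROUTER",
    "03/23/26 10:37:42 (D_ALWAYS) Sent SIGTERM to JOB_ROUTER (pid 1630814)",
    "03/23/26 10:37:43 (D_ALWAYS) The JOB_ROUTER (pid 1630814) exited with status 0",
]

for timestamp, message in key_events(sample):
    print(timestamp, message)
```

Run over the excerpt above, this leaves the DAEMON_OFF request, the SIGTERM, and the exit of the JOB_ROUTER, which is the sequence that never shows up for condor_restart.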