
Re: [HTCondor-users] Windows Claim Deactivation



Are these logs from Windows?

Does the job keep running?

What is the prior working version?

We don't think the "signal myself" patch is relevant.

How do you expect the Windows jobs to be killed?

We will take a look after you answer these questions.

...Tim

On 5/6/25 00:38, Thomas Rock wrote:

Hello!


With HTCondor v24.0.6, I am seeing unexpected behavior when attempting to deactivate a claim for a statically-provisioned, Windows Server 2019 execution point with ENABLE_STARTD_DAEMON_AD set to False. I have many platform-agnostic executables that specify kill_sig=SIGINT as part of their submission. When migrating to newer versions, removal of claims for Windows execution points stopped working despite the docs stating Windows does not consider kill_sig. Here is some example logging I see:


==> StarterLog.slot1 <==

(pid:1084) Got SIGTERM. Performing graceful shutdown.

(pid:1084) ShutdownGraceful all jobs.

(pid:1084) Send_Signal: ERROR Attempt to send signal 2 to pid 6064, but pid 6064 has no command socket # This is the job's PID

(pid:1084) Send (softkill) signal failed, retrying...


==> StartLog <==

slot1: State change: received VACATE_CLAIM command

slot1: Changing activity: Busy -> Retiring

slot1: State change: claim retirement ended/expired

slot1: Changing state and activity: Claimed/Retiring -> Preempting/Vacating


==> StarterLog.slot1 <==

(pid:1084) Send_Signal: ERROR Attempt to send signal 2 to pid 6064, but pid 6064 has no command socket

(pid:1084) Send (softkill) signal failed twice, hardkill will fire after timeout 
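
For readers following the log lines above, the escalation they describe (try a soft kill, retry once, then fall back to a hard kill after a timeout) can be sketched generically as below. This is only an illustrative Python sketch of that pattern, not HTCondor's implementation: the function name, the callback parameters, and the default values are all hypothetical and chosen just for this example.

```python
import time
from typing import Callable


def shutdown_with_escalation(
    soft_kill: Callable[[], bool],    # hypothetical: returns True if the signal was delivered
    hard_kill: Callable[[], None],    # hypothetical: forcibly terminates the job
    has_exited: Callable[[], bool],   # hypothetical: reports whether the job is gone
    soft_attempts: int = 2,           # assumed: one try plus one retry, as in the log
    hard_kill_timeout: float = 5.0,   # assumed value, in seconds
    poll_interval: float = 0.1,
) -> str:
    """Return "exited" if the job went away on its own, else "hard-killed"."""
    # Try the soft kill up to soft_attempts times, stopping at the first success.
    for _ in range(soft_attempts):
        if soft_kill():
            break

    # Whether or not the soft kill was delivered, give the job until the
    # timeout to exit before escalating.
    deadline = time.monotonic() + hard_kill_timeout
    while time.monotonic() < deadline:
        if has_exited():
            return "exited"
        time.sleep(poll_interval)

    # Timeout reached and the job is still running: escalate.
    hard_kill()
    return "hard-killed"
```

In the situation logged above, the soft-kill callback would fail both times, so under this sketch the job would only stop once the timeout expired and the hard kill ran.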


I believe this could be related to PR #665, but I am not sure if it is a misconfiguration. Any help would be greatly appreciated! 


Let me know if any other logging would be helpful with diagnosing this. 


Thanks,

T. Rock


-- 
Tim Theisen (he, him, his)
Release Manager
Center for High Throughput Computing
Department of Computer Sciences
University of Wisconsin - Madison
4261 Computer Sciences and Statistics
1210 W Dayton St
Madison, WI 53706-1685
+1 608 265 5736