Are these logs from Windows?
Does the job keep running?
What is the prior working version?
We don't think the signal myself patch is relevant.
How do you expect the Windows jobs to be killed?
We will take a look after you answer these questions.
...Tim
Hello!
With HTCondor v24.0.6, I am seeing unexpected behavior when attempting to deactivate a claim for a statically-provisioned, Windows Server 2019 execution point with ENABLE_STARTD_DAEMON_AD set to False. I have many platform-agnostic executables that specify kill_sig=SIGINT as part of their submission. When migrating to newer versions, removal of claims for Windows execution points stopped working despite the docs stating Windows does not consider kill_sig. Here is some example logging I see:
==> StarterLog.slot1 <==
(pid:1084) Got SIGTERM. Performing graceful shutdown.
(pid:1084) ShutdownGraceful all jobs.
(pid:1084) Send_Signal: ERROR Attempt to send signal 2 to pid 6064, but pid 6064 has no command socket # This is the job's PID
(pid:1084) Send (softkill) signal failed, retrying...
==> StartLog <==
slot1: State change: received VACATE_CLAIM command
slot1: Changing activity: Busy -> Retiring
slot1: State change: claim retirement ended/expired
slot1: Changing state and activity: Claimed/Retiring -> Preempting/Vacating
==> StarterLog.slot1 <==
(pid:1084) Send_Signal: ERROR Attempt to send signal 2 to pid 6064, but pid 6064 has no command socket
(pid:1084) Send (softkill) signal failed twice, hardkill will fire after timeout
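For anyone reading the excerpt above, here is a minimal Python sketch (not part of HTCondor) that pulls the failed Send_Signal lines out of a starter log and names the signal involved. It assumes the line format shown in this excerpt, which may differ in other versions, and it assumes the number in the log follows the conventional numbering in which 2 is SIGINT. The function name `find_signal_failures` is made up for illustration.

```python
import re
import signal

# Pattern for the Send_Signal error lines quoted above. The format is taken
# from this excerpt only and is an assumption for any other HTCondor version.
SEND_SIGNAL_ERROR = re.compile(
    r"\(pid:(?P<starter_pid>\d+)\) Send_Signal: ERROR Attempt to send "
    r"signal (?P<signo>\d+) to pid (?P<job_pid>\d+)"
)


def find_signal_failures(log_text):
    """Return (starter_pid, job_pid, signal_name) for each failed Send_Signal line."""
    failures = []
    for line in log_text.splitlines():
        match = SEND_SIGNAL_ERROR.search(line)
        if match is None:
            continue
        signo = int(match.group("signo"))
        try:
            # Assumption: the logged number uses the conventional numbering.
            name = signal.Signals(signo).name
        except ValueError:
            # Number not known to this platform; report it as-is.
            name = f"signal {signo}"
        failures.append(
            (int(match.group("starter_pid")), int(match.group("job_pid")), name)
        )
    return failures


excerpt = """\
(pid:1084) Got SIGTERM. Performing graceful shutdown.
(pid:1084) ShutdownGraceful all jobs.
(pid:1084) Send_Signal: ERROR Attempt to send signal 2 to pid 6064, but pid 6064 has no command socket
(pid:1084) Send (softkill) signal failed, retrying...
"""

print(find_signal_failures(excerpt))  # [(1084, 6064, 'SIGINT')]
```

If that reading is right, the signal the starter tries to deliver matches the kill_sig=SIGINT from the submission, which is why the behavior looks inconsistent with the documentation.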
I believe this could be related to PR #665, but I am not sure if it is a misconfiguration. Any help would be greatly appreciated!
Let me know if any other logging would be helpful with diagnosing this.
Thanks,
T. Rock
_______________________________________________
HTCondor-users mailing list
To unsubscribe, send a message to htcondor-users-request@xxxxxxxxxxx with a subject: Unsubscribe

Join us in June at Throughput Computing 25: https://osg-htc.org/htc25

The archives can be found at: https://www-auth.cs.wisc.edu/lists/htcondor-users/
--
Tim Theisen (he, him, his)
Release Manager
Center for High Throughput Computing
Department of Computer Sciences
University of Wisconsin - Madison
4261 Computer Sciences and Statistics
1210 W Dayton St
Madison, WI 53706-1685
+1 608 265 5736