
Re: [HTCondor-users] Issue HTCONDOR-769 observed in 24.0.3



Hello Vikrant,

There is not enough information to tell what is going on here. If you want us to look at this further, please provide (off list) the full SchedLog and NegotiatorLog around this time.
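
For anyone gathering the logs requested above, here is a minimal Python sketch that keeps only the lines around a given time. It assumes the timestamp format seen in the excerpt quoted below ("06/20/25 10:49:05"). The helper names `parse_ts` and `lines_in_window` and the 30-minute default are illustrative, not part of HTCondor.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout, e.g. "06/20/25 10:49:05" (17 characters).
TS_FORMAT = "%m/%d/%y %H:%M:%S"

def parse_ts(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    try:
        return datetime.strptime(line[:17], TS_FORMAT)
    except ValueError:
        return None

def lines_in_window(lines, center, minutes=30):
    """Yield lines stamped within +/- `minutes` of `center`.

    Lines without a timestamp (continuations) follow the decision made
    for the most recent timestamped line.
    """
    lo = center - timedelta(minutes=minutes)
    hi = center + timedelta(minutes=minutes)
    keep = False
    for line in lines:
        ts = parse_ts(line)
        if ts is not None:
            keep = lo <= ts <= hi
        if keep:
            yield line
```

To use it, open the log file and pass the file object as `lines`, with `center` set to something like `datetime(2025, 6, 20, 10, 52, 13)`. The log locations differ per installation. `condor_config_val SCHEDD_LOG` and `condor_config_val NEGOTIATOR_LOG` should print them on the respective hosts.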

...Tim

On 6/24/25 08:53, Vikrant Aggarwal wrote:
Hello Experts,

The match records for job 97792.0 on two machines, test and test1, were deleted. After that, the job waited approximately 20 minutes to be matched even though cores were available in the cluster. The symptoms match the bug mentioned in the subject. We previously upgraded from 8.8.15 to 9.0.17 and never saw this issue. We recently started upgrading to 24.0.3, and it has now happened a couple of times.

06/20/25 10:49:05 (pid:7473) job_transforms for 97792.0: 1 considered, 1 applied (SetTeam)
06/20/25 10:49:07 (pid:7473) Request was NOT accepted for claim slot1@xxxxxxxxxxxxxxxx <xx.xx.80.20:9618?addrs=xx.xx.80.20-9618&alias=test.example.com&noUDP&sock=startd_95100_44b5> for testuser1 97792.0
06/20/25 10:49:07 (pid:7473) Match record (slot1@xxxxxxxxxxxxxxxx <xx.xx.80.20:9618?addrs=xx.xx.80.20-9618&alias=test.example.com&noUDP&sock=startd_95100_44b5> for testuser1, 97792.0) deleted

06/20/25 10:52:13 (pid:7473) Match record (slot1@xxxxxxxxxxxxxxxxx <xx.xx.80.56:9618?addrs=xx.xx.80.56-9618&alias=test1.example.com&noUDP&sock=startd_12028_c939> for testuser1, 97792.0) deleted

06/20/25 11:11:25 (pid:7473) Starting add_shadow_birthdate(97792.0)
06/20/25 11:11:25 (pid:7473) Started shadow for job 97792.0 on slot1test2.example.com<xx.xx.xx.14:9618?addrs=xx.xx.xx.14-9618&alias=test2.example.com&noUDP&sock=startd_1592727_d2b8> for testuser1, (shadow pid = 3496536)
06/20/25 11:11:33 (pid:7473) Shadow pid 3496536 for job 97792.0 reports job exit reason 100.
06/20/25 11:11:33 (pid:7473) Match record (slot1@xxxxxxxxxxxxxxxxx <xx.xx.xx.14:9618?addrs=xx.xx.xx.14-9618&alias=test2.example.com&noUDP&sock=startd_1592727_d2b8> for testuser1, 97792.0) deleted
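
The delay described above can be measured directly from the SchedLog. Here is a minimal Python sketch, assuming the line formats shown in this excerpt. The function name `match_to_start_gap` is hypothetical, and the message patterns are taken only from the lines quoted here, so other HTCondor versions may word them differently.

```python
import re
from datetime import datetime

# Assumed timestamp layout, e.g. "06/20/25 10:49:05" (17 characters).
TS_FORMAT = "%m/%d/%y %H:%M:%S"

def match_to_start_gap(lines, job_id):
    """Return the time between the last 'Match record ... deleted' line
    for `job_id` and the first later 'Started shadow for job' line,
    or None if that sequence is not found."""
    deleted = re.compile(r"Match record \(.*, " + re.escape(job_id) + r"\) deleted")
    started = "Started shadow for job " + job_id + " "
    last_deleted = None
    for line in lines:
        try:
            ts = datetime.strptime(line[:17], TS_FORMAT)
        except ValueError:
            continue  # skip blank or continuation lines
        if deleted.search(line):
            last_deleted = ts
        elif started in line and last_deleted is not None:
            return ts - last_deleted
    return None
```

Applied to the lines above with `job_id="97792.0"`, the gap runs from 10:52:13 to 11:11:25, which is 19 minutes 12 seconds.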


Thanks & Regards,
Vikrant Aggarwal

-- 
Tim Theisen (he, him, his)
Release Manager
Center for High Throughput Computing
Department of Computer Sciences
University of Wisconsin - Madison
4261 Computer Sciences and Statistics
1210 W Dayton St
Madison, WI 53706-1685
+1 608 265 5736