The job match records for two machines, test and test1, were deleted. After that, the job waited approximately 20 minutes to match, despite cores being available in the cluster. The symptoms match the bug mentioned in the subject. As I recall, we previously upgraded from 8.8.15 to 9.0.17 and this issue never happened. We recently started upgrading to 24.0.3, and the issue has occurred a couple of times since.
06/20/25 10:49:05 (pid:7473) job_transforms for 97792.0: 1 considered, 1 applied (SetTeam)
06/20/25 10:49:07 (pid:7473) Request was NOT accepted for claim slot1@xxxxxxxxxxxxxxxx <xx.xx.80.20:9618?addrs=xx.xx.80.20-9618&alias=test.example.com&noUDP&sock=startd_95100_44b5> for testuser1 97792.0
06/20/25 10:49:07 (pid:7473) Match record (slot1@xxxxxxxxxxxxxxxx <xx.xx.80.20:9618?addrs=xx.xx.80.20-9618&alias=test.example.com&noUDP&sock=startd_95100_44b5> for testuser1, 97792.0) deleted
06/20/25 11:11:25 (pid:7473) Starting add_shadow_birthdate(97792.0)
06/20/25 11:11:25 (pid:7473) Started shadow for job 97792.0 on slot1test2.example.com<xx.xx.xx.14:9618?addrs=xx.xx.xx.14-9618&alias=test2.example.com&noUDP&sock=startd_1592727_d2b8> for testuser1, (shadow pid = 3496536)
06/20/25 11:11:33 (pid:7473) Shadow pid 3496536 for job 97792.0 reports job exit reason 100.
06/20/25 11:11:33 (pid:7473) Match record (slot1@xxxxxxxxxxxxxxxxx <xx.xx.xx.14:9618?addrs=xx.xx.xx.14-9618&alias=test2.example.com&noUDP&sock=startd_1592727_d2b8> for testuser1, 97792.0) deleted
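
For reference, the delay between the first "Match record ... deleted" line and the next "Started shadow" line can be measured with a small script. This is only a rough sketch: it assumes the "MM/DD/YY HH:MM:SS (pid:N) message" line layout seen in the excerpt above, the function name match_gap_seconds is made up for illustration and is not part of HTCondor, and the sample lines are abbreviated (the slot and address fields are placeholders).

```python
import re
from datetime import datetime

# Assumed line layout, taken from the excerpt: "MM/DD/YY HH:MM:SS (pid:N) message"
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \(pid:\d+\) (.*)$")


def match_gap_seconds(lines, job_id):
    """Return the seconds between the first 'Match record ... deleted' line
    for job_id and the next 'Started shadow for job' line, or None."""
    deleted_at = None
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not a timestamped log line
        ts = datetime.strptime(m.group(1), "%m/%d/%y %H:%M:%S")
        msg = m.group(2)
        if deleted_at is None:
            # First deletion of a match record that names this job
            if msg.startswith("Match record") and msg.endswith(f", {job_id}) deleted"):
                deleted_at = ts
        elif msg.startswith(f"Started shadow for job {job_id} "):
            return (ts - deleted_at).total_seconds()
    return None


# Abbreviated copies of the lines above; slot/address fields are placeholders.
sample = [
    "06/20/25 10:49:07 (pid:7473) Match record (slot1@host <addr> for testuser1, 97792.0) deleted",
    "06/20/25 11:11:25 (pid:7473) Starting add_shadow_birthdate(97792.0)",
    "06/20/25 11:11:25 (pid:7473) Started shadow for job 97792.0 on slot1@host2 <addr> for testuser1, (shadow pid = 3496536)",
]

gap = match_gap_seconds(sample, "97792.0")
print(gap)  # 1338.0 seconds, i.e. 22 minutes 18 seconds
```

For the excerpt above this gives a gap of about 22 minutes between 10:49:07 and 11:11:25.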
Thanks & Regards,
Vikrant Aggarwal