On 4/29/25 4:18 PM, Zach McGrew wrote:
Thanks for the help and feedback Todd (and Team),

Here are some additional details that may be helpful:

All of the systems (production and my test VMs) are running Rocky Linux 9.5 on x86_64. The AP and EPs have been upgraded to HTCondor 24.0.7.

The production EPs are dual-CPU Xeon Gold 6130s (64 threads that get treated as 64 CPUs for HTCondor). The production AP is an older system, a Xeon E5-2620.

I'm not seeing any ECC errors or hardware check exceptions on the EPs or the AP.
Hi Zach:

Thanks for the details -- we've pushed some fixes that will address some of these problems, but perhaps not all of them. These changes should be in the next release of HTCondor, or if you like, we could get you a pre-release to test with.
Sorry for the headaches,

-greg