Hi,
After upgrading to HTCondor 25.0.9, we are observing intermittent crashes of condor_startd on worker nodes.
The crashes appear to occur in Docker-related code, with stack traces including:
...
free()
DockerAPI::imageCacheUsed()
MachAttributes::compute_for_update()
This behavior was not seen prior to the upgrade.
Additionally, we are seeing duplicate --volume entries for the execute directory in the generated Docker commands. The duplicates are present even without our wrapper, which suggests a recent change in the Docker integration.
The crashes are intermittent but have been observed on multiple upgraded nodes.
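For the duplicate --volume entries mentioned above, here is a minimal sketch of how a captured Docker command line could be checked for repeated volume specs. The function name `duplicate_volumes`, the example command line, and the paths in it are all hypothetical placeholders, not output from our nodes. The sketch only handles the `--volume SPEC`, `-v SPEC`, and `--volume=SPEC` forms.

```python
import shlex
from collections import Counter


def duplicate_volumes(command_line: str) -> dict[str, int]:
    """Return volume specs that occur more than once in a docker command line."""
    tokens = shlex.split(command_line)
    specs = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in ("--volume", "-v") and i + 1 < len(tokens):
            # Separate-argument form: --volume SPEC or -v SPEC
            specs.append(tokens[i + 1])
            i += 2
            continue
        if tok.startswith("--volume="):
            # Attached form: --volume=SPEC
            specs.append(tok[len("--volume="):])
        i += 1
    counts = Counter(specs)
    return {spec: n for spec, n in counts.items() if n > 1}


# Hypothetical command line for illustration only (paths are placeholders).
example = (
    "docker create --name job1 "
    "--volume /path/to/execute/dir_1234:/scratch "
    "--volume /path/to/execute/dir_1234:/scratch "
    "--volume /etc/passwd:/etc/passwd:ro "
    "busybox sleep 60"
)

print(duplicate_volumes(example))
```

Running this against command lines captured with and without the wrapper would show whether the same spec is repeated in both cases.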
Please let me know if you’d like full logs, core dumps, or additional debugging enabled.
Thanks, Arshad