
[HTCondor-users] Upgrading the cluster piecemeal instead of big bang



Hi,

We are still running 24.x on our Grid cluster and are planning to upgrade to 25.0.

Is it feasible to upgrade the cluster bit by bit instead of doing a Big Bang upgrade?

My idea is to upgrade the worker nodes first. If there is no change in the communication protocol, they should still be able to accept work from the schedds and exchange ClassAds with the central manager.

The four schedds could be done one by one, so we do not have to drain the entire system.

Only the central manager is a single machine, but it can go down for a few minutes without affecting running jobs, so impact should be minimal (no new jobs will be negotiated during that upgrade).

Is this a bad idea? Is there a better order for doing the upgrade? How do people generally approach these upgrades?

I did look for some guidance in the documentation but found nothing about running a pool in which versions are mixed.
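
To keep track of which machines are still on the old version during such a piecemeal upgrade, something like the following might help. It is only a rough sketch: it assumes one line per machine in the form "<machine> <CondorVersion string>" (roughly what I would expect from `condor_status -af Machine CondorVersion`), and it assumes the version string looks like "$CondorVersion: 24.0.9 ... $". The function names and the sample hostnames are made up for illustration.

```python
import re
from collections import defaultdict

# Pulls major.minor.patch out of a CondorVersion string.
# The "$CondorVersion: X.Y.Z ... $" layout is an assumption of this sketch.
VERSION_RE = re.compile(r"\$CondorVersion:\s+(\d+)\.(\d+)\.(\d+)")


def group_by_version(lines):
    """Map 'major.minor.patch' -> sorted list of machine names.

    Each non-empty line is assumed to be '<machine> <CondorVersion string>'.
    Machines whose version cannot be parsed are collected under 'unknown'.
    """
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        machine, _, rest = line.partition(" ")
        match = VERSION_RE.search(rest)
        key = ".".join(match.groups()) if match else "unknown"
        groups[key].append(machine)
    return {version: sorted(machines) for version, machines in groups.items()}


def not_yet_upgraded(groups, target_major):
    """Return machines below target_major, plus any with an unknown version."""
    pending = []
    for version, machines in groups.items():
        if version == "unknown" or int(version.split(".")[0]) < target_major:
            pending.extend(machines)
    return sorted(pending)


if __name__ == "__main__":
    # Made-up sample input; real input would come from the pool.
    sample = [
        "wn01.example.org $CondorVersion: 25.0.1 2025-08-01 BuildID: 1 $",
        "wn02.example.org $CondorVersion: 24.0.9 2025-06-01 BuildID: 2 $",
    ]
    groups = group_by_version(sample)
    print(groups)
    print("still to upgrade:", not_yet_upgraded(groups, 25))
```

The same grouping could be applied to the schedds and the central manager if their ads expose the version in the same way.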