Hi Greg,
Indeed, during the Docker restart the Docker CLI is unavailable and gives the standard warning ‘Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?’
However, the cgroups and processes of the containers continue to run. Once the Docker daemon starts again, it re-establishes the ‘docker run’ processes.
For my use case, it is just a matter of restarting the Docker daemon to apply small incremental patches to Docker, without having to drain the entire node.
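To illustrate the restart window described above, here is a minimal sketch of how one might check whether anything is accepting connections on the Docker socket. It is not part of HTCondor or Docker; the function name daemon_reachable is hypothetical, and the default path is simply the one from the warning message. Note that on hosts using systemd socket activation the socket may stay open while the daemon is down, so a successful connect only shows that something is listening.

```python
import socket


def daemon_reachable(sock_path="/var/run/docker.sock", timeout=1.0):
    """Return True if something accepts connections on the Unix socket.

    Hypothetical helper for illustration only. A False result corresponds
    to the 'Cannot connect to the Docker daemon' state seen by the CLI.
    """
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect(sock_path)
        return True
    except OSError:
        # Covers a missing socket file, connection refused, and timeouts.
        return False
    finally:
        s.close()
```

Polling something like this during a patch would show the daemon going away and coming back, while the container processes themselves keep running.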
Many thanks,
Tom
From: HTCondor-users <htcondor-users-bounces@xxxxxxxxxxx> on behalf of Greg Thain via HTCondor-users <htcondor-users@xxxxxxxxxxx>
Date: Monday, 11 December 2023 at 23:02
To: htcondor-users@xxxxxxxxxxx <htcondor-users@xxxxxxxxxxx>
Cc: Greg Thain <gthain@xxxxxxxxxxx>
Subject: Re: [HTCondor-users] Condor and Docker Live-Restore
On 12/11/23 03:33, Thomas Birkett - STFC UKRI via HTCondor-users wrote:
Hi all,
Apologies for the follow-up. Does anyone have any experience with the aforementioned use case?
Hi Thomas:
This is very interesting, thanks for pointing it out. Presumably the 'docker run' process that starts the container exits in this case, when the Docker daemon goes away. It would be very useful for HTCondor to be able to detect that the 'docker run' has gone away and is reconnectable. Do you know if that is the case?
I assume that for your usage you just want to restart the Docker daemon, and not any of the rest of HTCondor?
-greg