Disaster recovery

If the content of the swarm directory is lost or corrupted on a manager, remove that manager from the cluster immediately using the docker node remove nodeID command (add --force if the removal gets stuck).
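The removal step can be sketched as a small shell helper, run from a healthy manager. This is only an illustration: the function names run and remove_manager, the DRY_RUN variable, and the node ID are hypothetical and not part of Docker. It uses docker node rm, the short form of the command named above, and by default it only prints the commands instead of executing them.

```shell
#!/bin/sh
# Hypothetical helper, not part of Docker.
# DRY_RUN=1 (the default in this sketch) prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Remove a node from the swarm; retry with --force if the plain removal fails.
remove_manager() {
    node_id="$1"
    if [ -z "$node_id" ]; then
        echo "usage: remove_manager NODE_ID" >&2
        return 1
    fi
    run docker node rm "$node_id" || run docker node rm --force "$node_id"
}
```

Calling remove_manager with a node ID while DRY_RUN=1 shows the command that would be issued. Setting DRY_RUN=0 runs it for real, so check the node ID with docker node ls first.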

The cluster administrator must not start a manager, or join it to the cluster, with an out-of-date swarm directory. Doing so puts the cluster in an inconsistent state, because the managers will try to synchronize against the stale data.

After bringing down the manager with the corrupted directory, delete the /var/lib/docker/swarm/raft/wal and /var/lib/docker/swarm/raft/snap directories. Only then can the manager safely rejoin the cluster.
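The cleanup step can be sketched as follows. It is a minimal illustration under stated assumptions: the function name clear_raft_state is hypothetical, the default path is the one given above, and the Docker daemon on the affected manager is assumed to be stopped already (how to stop it depends on the host's init system).

```shell
#!/bin/sh
# Hypothetical helper, not part of Docker.
# Deletes the raft/wal and raft/snap directories under a swarm directory.
# The first argument overrides the default swarm directory.
clear_raft_state() {
    swarm_dir="${1:-/var/lib/docker/swarm}"
    for sub in wal snap; do
        target="$swarm_dir/raft/$sub"
        if [ -d "$target" ]; then
            rm -rf -- "$target"
            echo "removed $target"
        else
            echo "not found: $target"
        fi
    done
}
```

Called without arguments on the affected manager (as root), clear_raft_state targets the default path. Passing a different directory makes it possible to try the helper on a copy first. Other content of the swarm directory is left untouched.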
