To illustrate cluster operations, let's take a look at an example made up of three managers and ten workers. The first basic operation is listing the nodes, with the docker node ls
command:
You can reference nodes either by their hostname (manager1) or by their ID (ctv03nq6cjmbkc4v1tc644fsi). The other columns in this listing describe the properties of the cluster nodes.
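As a quick sketch (assuming a cluster like the one in this example), docker node ls also accepts filters and Go-template formatting, which is handy on a thirteen-node cluster:

```shell
# List only the manager nodes
docker node ls --filter "role=manager"

# Print a compact hostname/status/availability view using a Go template
docker node ls --format "{{.Hostname}}: {{.Status}} ({{.Availability}})"
```

Filtering by role and formatting the output avoids scrolling through all thirteen rows when you only care about the control plane.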
The docker node
command comes with a few possible options.
As you can see, all the commands needed for node management are available, except create
. We are often asked when a create option will be added to the node
command, but there is still no answer.
So far, creating new nodes is a manual operation and the responsibility of cluster operators.
Promotion is possible for worker nodes (transforming them into managers), while demotion is possible for manager nodes (transforming them into workers).
Always remember the table that guarantees high availability when managing many managers and workers: keep an odd number of managers, greater than or equal to three.
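The reason for the odd-number rule is Raft quorum: a cluster of N managers needs N/2+1 of them reachable, so it tolerates the loss of at most (N-1)/2. A minimal sketch of the arithmetic in plain shell (the numbers, not the CLI, are the point here):

```shell
# For N managers, quorum is N/2+1 and fault tolerance is (N-1)/2
for n in 1 3 5 7; do
  quorum=$(( n / 2 + 1 ))
  tolerance=$(( (n - 1) / 2 ))
  echo "$n managers: quorum=$quorum, tolerates $tolerance failures"
done
```

Note that going from 3 to 4 managers does not improve fault tolerance (both tolerate one failure), which is why even counts buy nothing.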
Use the following syntax to promote worker0
and worker1
to managers:
docker node promote worker0
docker node promote worker1
There is nothing magical behind the curtain: Swarm simply attempts to change the node role on the fly.
Demotion works the same way (docker node demote worker1). But be careful not to accidentally demote the node you are working from, otherwise you will be locked out.
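Before demoting anything, it is worth checking which node you are currently connected to. A hedged sketch (docker node inspect self only works when the client talks to a manager; manager2 is just an example name):

```shell
# Show the hostname and manager reachability of the current node
docker node inspect self --format "{{.Description.Hostname}} ({{.ManagerStatus.Reachability}})"

# Once you know you are NOT on manager2, it is safe to demote it
docker node demote manager2
```

If the hostname printed is the one you were about to demote, pick a different target or switch your client to another manager first.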
And finally, what happens if you try to demote a Leader manager? In this case, the Raft algorithm will start an election and a new leader will be selected among the active managers.
You may have noticed, in the preceding screenshot, that worker9 is in Drain availability. This means that the node is in the process of evacuating its tasks (if any), which will be rescheduled somewhere else on the cluster.
You can change node availability by updating its status, using the docker node update
command:
The availability option can be either active
, pause
, or drain
. Here we just restored worker9 to the active state.

- The active state means that the node is running and ready to accept tasks
- The pause state means that the node is running, but not accepting tasks
- The drain state means that the node is running and not accepting tasks, and it is currently draining its existing tasks, which get rescheduled somewhere else

Another powerful update argument concerns labels. The --label-add
and --label-rm
options allow us to add labels to and remove labels from Swarm nodes, respectively.
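As a sketch of how an operator might cycle through the three availability states and confirm the result (worker9 as in the example above):

```shell
# Pause: running tasks keep running, no new tasks are scheduled
docker node update --availability pause worker9

# Drain: existing tasks are rescheduled elsewhere
docker node update --availability drain worker9

# Back to active, then verify the availability stored in the node spec
docker node update --availability active worker9
docker node inspect worker9 --format "{{.Spec.Availability}}"
```

The inspect step is useful in scripts, since docker node update itself only echoes the node name.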
Docker Swarm labels do not affect the Engine labels. It's possible to specify labels when starting the Docker Engine (dockerd [...] --label "staging" --label "dev" [...]
), but Swarm has no power to edit or change them. The labels we discuss here only affect Swarm behavior.
Labels are useful for categorizing nodes. When you start services, you can use labels to filter and decide where to physically spawn containers. For instance, if you want to dedicate a bunch of SSD-backed nodes to hosting MySQL, you can label them accordingly:
docker node update --label-add disk=ssd --label-add type=mysql worker1
docker node update --label-add disk=ssd --label-add type=mysql worker2
docker node update --label-add disk=ssd --label-add type=mysql worker3

Note that labels are a key=value map, so adding the same key twice would simply overwrite the first value; that is why the SSD and MySQL labels use two distinct keys here.
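To sketch the round trip on one of these nodes (names as in the example; note that --label-rm takes only the key, not key=value):

```shell
# Inspect the labels currently stored in the node's Swarm spec
docker node inspect worker1 --format "{{.Spec.Labels}}"

# Remove a label by its key
docker node update --label-rm type worker1
```

Inspecting .Spec.Labels is the quickest way to confirm what the scheduler will actually see when it evaluates constraints.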
Later, when you start a service with a replica factor of, say, three, you can be sure that it will start the MySQL containers exactly on worker1, worker2, and worker3, if you filter by the node.labels.type
constraint:
docker service create --replicas 3 --constraint 'node.labels.type == mysql' --name mysql-service mysql:5.5
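Once the service is up, a quick way to verify where the replicas actually landed (assuming the mysql-service name from above):

```shell
# Show each task together with the node it was scheduled on
docker service ps mysql-service

# Or a compact task -> node view, via a Go template
docker service ps mysql-service --format "{{.Name}} -> {{.Node}}"
```

If a constraint cannot be satisfied (for example, no node carries the label), the tasks stay in the Pending state, which this command also makes visible.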
Node removal is a delicate operation. It's not just about excluding a node from the Swarm, but also about its role and the tasks it's running.
If a worker has the status as Down (for example, because it was physically shut down), then it's currently running nothing, so it can be safely removed:
docker node rm worker9
If a worker has the status Ready, instead, the previous command will raise an error and refuse to remove it. The node's availability (Active, Pause, or Drain) doesn't really matter, because the node could still be running tasks at the moment, or start running them again when resumed.
So, in this case an operator must manually drain the node. This means forcing it to release its tasks that will be rescheduled and moved to other workers:
docker node update --availability drain worker9
Once drained, the node can be shut down and then removed when its status is Down.
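The whole worker-removal procedure, sketched end to end (worker9 as above; the shutdown step happens on the node itself, outside the Swarm CLI):

```shell
# 1. Drain: the node's tasks are rescheduled to other workers
docker node update --availability drain worker9

# 2. Shut the node down (run on worker9 itself, or via your provisioning tool)
#    e.g. systemctl stop docker, or power the machine off

# 3. Once 'docker node ls' shows the node as Down, remove it from the Swarm
docker node rm worker9
```

Draining first is what makes the shutdown safe: by the time the node goes Down, it is no longer responsible for any tasks.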
Managers can't be removed. Before removing a manager node, it must be properly demoted to worker, eventually drained, and then shut down:
docker node demote manager3
docker node update --availability drain manager3
# Node shutdown
docker node rm manager3
When a manager has to be removed, another worker node should be identified and promoted in its place, in order to maintain an odd number of managers.