Cluster management

To better illustrate cluster operations, let's take a look at an example made up of three managers and ten workers. The first basic operation is listing nodes, with the docker node ls command:

docker node ls

You can refer to nodes either by their hostname (manager1) or by their ID (ctv03nq6cjmbkc4v1tc644fsi). The other columns in this listing describe the properties of the cluster nodes.

  • STATUS reflects the physical reachability of the node. If the node is up, it's Ready; otherwise it's Down.
  • AVAILABILITY is the node availability. A node can be either Active (participating in cluster operations), Pause (in standby, suspended, not accepting tasks), or Drain (being evacuated, with its tasks rescheduled elsewhere).
  • MANAGER STATUS is the current status of a manager. If a node is not a manager, this field is empty. If a node is a manager, it can be either Reachable (one of the managers present to guarantee high availability) or Leader (the host leading all operations).

Nodes operations

The docker node command comes with a few possible options.

docker node --help

As you can see, every command needed for node management is available, except create. We are often asked when a create option will be added to the node command, but there is still no answer.

So far, creating new nodes is a manual operation and the responsibility of cluster operators.

Demotion and promotion

Promotion is possible for worker nodes (transforming them into managers), while demotion is possible for manager nodes (transforming them into workers).

Always remember the rule that guarantees high availability when managing managers and workers: keep an odd number of managers, greater than or equal to three.
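The arithmetic behind this rule comes from Raft's majority requirement: a cluster of N managers stays operational only while a majority, (N/2)+1, is reachable. A quick sketch (the manager counts here are illustrative):

```shell
# Raft quorum arithmetic: with N managers, a majority of (N/2)+1 must be
# reachable, so the cluster tolerates the loss of (N-1)/2 managers.
for managers in 3 5 7; do
  quorum=$(( managers / 2 + 1 ))
  tolerated=$(( (managers - 1) / 2 ))
  echo "$managers managers: quorum of $quorum, tolerates $tolerated failures"
done
```

This is also why an even manager count buys nothing: four managers tolerate one failure, the same as three.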

Use the following syntax to promote worker0 and worker1 to managers:

docker node promote worker0
docker node promote worker1

There is nothing magical behind the curtain: Swarm simply attempts to change the node's role with on-the-fly instructions.


Demotion works the same way (docker node demote worker1). But be careful not to accidentally demote the node you're working from, otherwise you'll be locked out.

And finally, what happens if you try to demote a Leader manager? In this case, the Raft algorithm will start an election and a new leader will be selected among the active managers.

Tagging nodes

You may have noticed, in the preceding screenshot, that worker9 is in Drain availability. This means that the node is in the process of evacuating its tasks (if any), which will be rescheduled somewhere else on the cluster.

You can change node availability by updating its status, using the docker node update command:

docker node update --availability active worker9

The availability option can be either active, pause, or drain. Here we just restored worker9 to the active state.

  • The active state means that the node is running and ready to accept tasks
  • The pause state means that the node is running, but not accepting tasks
  • The drain state means that the node is running and not accepting tasks, and it's currently evicting its running tasks, which are being rescheduled somewhere else
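
For the scheduler, these three states reduce to one question: does the node accept new tasks? A minimal sketch of that mapping (accepts_tasks is a hypothetical helper of ours, not a Docker command):

```shell
# accepts_tasks: hypothetical helper encoding whether each availability
# state accepts newly scheduled tasks (it does not query Docker).
accepts_tasks() {
  case "$1" in
    active) echo "yes" ;;     # running and schedulable
    pause)  echo "no"  ;;     # running; existing tasks keep running
    drain)  echo "no"  ;;     # running; existing tasks are evicted
    *)      echo "unknown" ;;
  esac
}

accepts_tasks active   # yes
accepts_tasks pause    # no
accepts_tasks drain    # no
```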

Another powerful update argument concerns labels: --label-add and --label-rm allow us to add and remove labels on Swarm nodes, respectively.

Docker Swarm labels do not affect the Engine labels. It's possible to specify labels when starting the Docker Engine (dockerd [...] --label "staging" --label "dev" [...]). But Swarm has no power to edit or change them. Labels we see here only affect the Swarm behavior.
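For reference, Engine labels can also be set in the daemon configuration file instead of on the dockerd command line; a minimal /etc/docker/daemon.json fragment (the label values are illustrative):

```json
{
  "labels": ["staging", "dev"]
}
```

These show up as Engine labels, which Swarm can match in constraints (engine.labels) but, as noted, cannot edit.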

Labels are useful for categorizing nodes. When you start services, you can use labels to filter and decide where to physically spawn containers. For instance, if you want to dedicate a bunch of SSD-backed nodes to hosting MySQL, you can label them:

docker node update --label-add disk=ssd --label-add type=mysql worker1
docker node update --label-add disk=ssd --label-add type=mysql worker2
docker node update --label-add disk=ssd --label-add type=mysql worker3

Later, when you start a service with a replica factor of, say, three, you can be sure that it will start the MySQL containers exactly on worker1, worker2, and worker3, if you filter by the node.labels.type constraint:

docker service create --replicas 3 \
    --constraint 'node.labels.type == mysql' \
    --name mysql-service mysql:5.5

Remove nodes

Node removal is a delicate operation. It's not just about excluding a node from the Swarm, but also about its role and the tasks it's running.

Remove workers

If a worker's status is Down (for example, because it was physically shut down), then it's currently running nothing, so it can be safely removed:

docker node rm worker9

If a worker's status is Ready, instead, the previous command will raise an error and refuse to remove it. The node's availability (Active, Pause, or Drain) doesn't really matter, because the node could still be running tasks at the moment, or start running them when resumed.

So, in this case an operator must manually drain the node, forcing it to release its tasks, which will be rescheduled and moved to other workers:

docker node update --availability drain worker9

Once drained, the node can be shut down and then removed when its status is Down.

Remove managers

Managers can't be removed directly. Before removing a manager node, it must first be demoted to a worker, drained if necessary, and then shut down:

docker node demote manager3
docker node update --availability drain manager3
# Node shutdown
docker node rm manager3

When a manager has to be removed, another worker node should be identified and promoted to manager in its place, in order to maintain an odd number of managers.

Tip

Remove with docker node rm --force

The --force flag removes a node, no matter what. This option must be used very carefully and it's usually the last resort in the presence of stuck nodes.
