Docker Swarm Mode

In this section, we will continue exploring Swarm Mode commands for managing a cluster.

Manually adding nodes

You can create new Swarm nodes (that is, Docker hosts) in whichever way you prefer.

If you use Docker Machine, you will reach its limits very soon. Listing machines requires patience: you have to wait several seconds while Machine collects the information for all of them and prints it as a whole.

One way to add nodes manually is to use Machine with the generic driver: delegate host provisioning (operating system installation, network and security group configuration, and so on) to something else, such as Ansible, and then use Machine only to install Docker properly. This is how it can be done:

  1. Manually configure the cloud environment (security groups, networks, and so on).
  2. Provision Ubuntu hosts with a third-party tool.
  3. Run Machine with the generic driver on these hosts, with the sole goal of installing Docker properly.
  4. Manage the hosts with the tool from step 2, or with any other tool.
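
Step 3 can be sketched as follows. This is only an illustration: the host names, the IP addresses, the SSH user ubuntu, and the key path are assumptions, not values from this chapter. The function prints the docker-machine command instead of running it, so you can review it first.

```shell
#!/bin/sh
# Sketch: print the docker-machine command that installs Docker on an
# already provisioned host through the generic driver.
# Assumptions (not from the text): SSH user "ubuntu", key ~/.ssh/id_rsa,
# and the host names and addresses used in the calls below.

generic_create_cmd() {
    # $1 = machine name, $2 = IP address of the provisioned host
    printf 'docker-machine create -d generic --generic-ip-address %s --generic-ssh-user %s --generic-ssh-key %s %s\n' \
        "$2" "${SSH_USER:-ubuntu}" "${SSH_KEY:-$HOME/.ssh/id_rsa}" "$1"
}

# Print one command per host (hypothetical names and addresses).
generic_create_cmd swarm-node-0 192.0.2.10
generic_create_cmd swarm-node-1 192.0.2.11
```

Once the printed commands look right, you can pipe the output to sh to execute them.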

Machine's generic driver installs the latest stable Docker binaries by default. While working on this book, we needed Docker 1.12, so we sometimes worked around this by pointing Machine at the latest unstable version of Docker with the --engine-install-url option:

docker-machine create -d DRIVER --engine-install-url mymachine

By the time you read this book, Docker 1.12 will be stable for a production Swarm (mode), so this trick will no longer be necessary, unless you need some of the very latest Docker features.


While planning a Swarm, keep in mind some considerations regarding the number of managers, as we saw in Chapter 4, Creating a Production-Grade Swarm. High-availability theory suggests that the number of managers should be odd and at least 3. Reaching a quorum means that the majority of the manager nodes agree on which node is leading the operations.

If there are two managers and one goes down and then comes back, it's possible that both will consider themselves leaders. This causes a logical failure in the cluster organization, known as split brain.

The more managers you have, the more failures the cluster can tolerate. Take a look at the following table:

Number of managers    Quorum (majority)    Maximum possible failures
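
The three columns above follow from simple majority arithmetic: the quorum is more than half of the managers, and the tolerated failures are whatever remains. A minimal sketch that prints these values for 1 to 7 managers (the function names are ours, chosen for illustration):

```shell
#!/bin/sh
# Majority arithmetic for a cluster with N managers.

quorum() {
    # Smallest number of managers that forms a majority of $1.
    echo $(( $1 / 2 + 1 ))
}

max_failures() {
    # Managers that can be lost while a majority is still available.
    echo $(( $1 - ($1 / 2 + 1) ))
}

printf '%-10s %-8s %s\n' managers quorum failures
for n in 1 2 3 4 5 6 7; do
    printf '%-10s %-8s %s\n' "$n" "$(quorum "$n")" "$(max_failures "$n")"
done
```

Note that an even number of managers adds nothing: 4 managers tolerate one failure, exactly like 3, which is why odd numbers are recommended.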

Also, in Swarm Mode, an ingress overlay network is created automatically and associated with the nodes to carry ingress traffic. Its purpose is to be used with containers.


You will want your containers to be attached to an internal overlay (VXLAN meshed) network to communicate with each other, rather than using public or other external networks. Swarm creates this network for you, and it is ready to use.
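
Besides the network that Swarm creates automatically, you can also define overlay networks of your own and attach services to them. The following is a hedged sketch: the network name appnet, the service name web, and the nginx image are hypothetical, and the function only prints the commands rather than running them.

```shell
#!/bin/sh
# Sketch: print the commands that create an additional overlay network
# and start a service attached to it.
# "appnet", "web" and "nginx" are hypothetical names for illustration.

overlay_cmds() {
    # $1 = network name, $2 = service name, $3 = image
    printf 'docker network create --driver overlay %s\n' "$1"
    printf 'docker service create --name %s --network %s %s\n' "$2" "$1" "$3"
}

overlay_cmds appnet web nginx
```

Both commands are meant to be run against a manager node.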

Workers number

You can add an arbitrary number of workers. This is the elastic part of the Swarm. It's totally fine to have 5, 15, 200, 2300, or 4700 running workers. This is also the easiest part to handle: you can add and remove workers at any time and at any scale, with very little overhead.
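
Removing a worker can be sketched as three steps: drain it, make it leave, and delete it from the node list. The example below only prints the commands. It assumes the node was created with Machine and that its Swarm node name matches its Machine name; swarm-worker-3 is a hypothetical name.

```shell
#!/bin/sh
# Sketch: print the commands for retiring one worker.
# Assumption: the Swarm node name equals the Docker Machine name.

remove_worker_cmds() {
    # $1 = node name as shown by `docker node ls`
    printf 'docker node update --availability drain %s\n' "$1"      # on a manager: move tasks away
    printf 'docker-machine ssh %s sudo docker swarm leave\n' "$1"   # on the worker: leave the Swarm
    printf 'docker node rm %s\n' "$1"                               # on a manager: drop the entry
}

remove_worker_cmds swarm-worker-3
```

The last command is expected to succeed only once the manager lists the node as down.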

Scripted nodes addition

If you do not plan to go beyond 100 nodes in total, the easiest way to add nodes is basic scripting.

When you run docker swarm init, just copy and paste the lines printed in its output.

Then, create a batch of workers with a loop:

for i in `seq 0 9`; do
docker-machine create -d amazonec2 --engine-install-url --amazonec2-instance-type "t2.large" swarm-

After this, all that remains is to go through the list of machines, SSH into each one, and join it to the Swarm:

for machine in `docker-machine ls --format {{.Name}} | grep 
docker-machine ssh $machine sudo docker swarm join --token SWMTKN-

This script goes through the machines and, for each one whose name starts with swarm-worker-, connects over SSH and joins the node to the existing Swarm through the leader manager, which is
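
The selection logic of such a join loop can be sketched on its own. In the example below, JOIN_TOKEN and MANAGER_ADDR are placeholders for the values printed by docker swarm init, and the sample machine names stand in for the output of docker-machine ls --format '{{.Name}}'. The commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch: pick the machines whose names start with a prefix and print
# the join command for each of them.
# JOIN_TOKEN and MANAGER_ADDR are placeholders, not real values.

JOIN_TOKEN=${JOIN_TOKEN:-SWMTKN-example}
MANAGER_ADDR=${MANAGER_ADDR:-192.0.2.1:2377}

join_cmds() {
    # $1 = name prefix; machine names are read from stdin, one per line.
    grep "^$1" | while read -r machine; do
        printf 'docker-machine ssh %s sudo docker swarm join --token %s %s\n' \
            "$machine" "$JOIN_TOKEN" "$MANAGER_ADDR"
    done
}

# Sample input standing in for: docker-machine ls --format '{{.Name}}'
printf '%s\n' swarm-manager-0 swarm-worker-0 swarm-worker-1 | join_cmds swarm-worker-
```

Anchoring the pattern with ^ keeps machines that merely contain the prefix somewhere in their name from being joined by mistake.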


See for further details or to download the one liners.


Belt is another option for provisioning Docker Engines at scale. It is basically an SSH wrapper on steroids, and it requires you to prepare provider-specific images as well as provisioning templates before you provision at scale. In this section, we'll learn how to do so.

You can compile Belt yourself by getting its source from GitHub:

# Set $GOPATH here
go get

Currently, Belt supports only the DigitalOcean driver. We can prepare our provisioning template inside config.yml:

      image: "docker-1.12-rc4"
      region: nyc3
      ssh_key_fingerprint: "your SSH ID"
      ssh_user: root

Then, we can create hundreds of nodes with a couple of commands.

First, we create three manager hosts of 16 GB each, namely mg0, mg1, and mg2.

$ belt create 16gb mg[0:2]
NAME   IPv4   MEMORY   REGION   IMAGE                    STATUS
mg2           16384    nyc3     Ubuntu docker-1.12-rc4   active
mg1           16384    nyc3     Ubuntu docker-1.12-rc4   active
mg0           16384    nyc3     Ubuntu docker-1.12-rc4   active

Then we can use the status command to wait until all nodes are active:

$ belt status --wait active=3
active      3   mg2, mg1, mg0

We'll do this again for 10 worker nodes:

$ belt create 512mb node[1:10]
$ belt status --wait active=13
active      3   node10, node9, node8, node7
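
The bracket notation in the create commands names a range of hosts. The sketch below mimics that expansion in plain shell, assuming both bounds are inclusive, as the host names in this section suggest (mg[0:2] gives mg0, mg1, and mg2). It is an illustration only, not Belt's own code.

```shell
#!/bin/sh
# Sketch: expand a pattern such as "mg[0:2]" into individual host names.
# Assumption: both bounds are inclusive.

expand_range() (
    prefix=${1%%\[*}      # text before "[", for example "mg"
    range=${1#*\[}        # text after "[", for example "0:2]"
    range=${range%\]}     # drop the trailing "]"
    i=${range%%:*}        # lower bound
    last=${range#*:}      # upper bound
    while [ "$i" -le "$last" ]; do
        printf '%s%s\n' "$prefix" "$i"
        i=$((i + 1))
    done
)

expand_range 'mg[0:2]'
expand_range 'node[1:10]' | wc -l
```

The first call lists the three manager names; the second counts the worker names produced by node[1:10].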

Use Ansible

You can alternatively use Ansible (as I like to do, and it's becoming very popular) to make things more repeatable. We have created some Ansible modules to work with Machine and Swarm (Mode) directly; they are also compatible with Docker 1.12. They require Ansible 2.2+, the first version of Ansible that is compatible with binary modules.

You will need to compile the modules (written in Go) and then pass them to ansible-playbook with the -M parameter.

git clone
cd ansible-swarm/library
go build docker-machine.go
go build docker_swarm.go
cd ..

There are some example plays in playbooks. Ansible's play syntax is easy enough to follow that a detailed explanation is unnecessary.

I used this play to join 10 workers to the Swarm2k experiment:

- name: Join the Swarm2k project
  hosts: localhost
  connection: local
  gather_facts: False
  tasks:
    - name: Load shell variables
      shell: >
        eval $(docker-machine env "{{ machine_name }}")
        echo $DOCKER_TLS_VERIFY &&
        echo $DOCKER_HOST &&
        echo $DOCKER_CERT_PATH &&
        echo $DOCKER_MACHINE_NAME
      register: worker
    - name: Set facts
      set_fact:
        whost: "{{ worker.stdout_lines[0] }}"
        wcert: "{{ worker.stdout_lines[1] }}"
    - name: Join a worker to Swarm2k
      docker_swarm:
        role: "worker"
        operation: "join"
        join_url: ["tcp://"]
        secret: "d0cker_swarm_2k"
        docker_url: "{{ whost }}"
        tls_path: "{{ wcert }}"
      register: swarm_result
    - name: Print final msg
      debug: msg="{{ swarm_result.msg }}"

Basically, it invokes the docker_swarm module after loading some host facts from Machine:

  • The operation done is join
  • The role of the new node is worker
  • The new node joins tcp://, which was the leader manager at the moment of joining. This argument takes an array of managers, such as [tcp://,, tcp://]
  • It passes the password (secret)
  • It specifies some basic engine connection facts, and the module connects to docker_url using the certificates at tls_path.

Once docker_swarm.go has been compiled in the library directory, joining the workers to the Swarm is as easy as:

for machine in `docker-machine ls --format {{.Name}} | grep 
ansible-playbook -M library --extra-vars "{machine_name: $machine}" 
