In this section, we will continue exploring Swarm Mode commands for managing a cluster.
You can create new Swarm nodes (that is, Docker hosts) in whichever way you prefer.
If you use Docker Machine, you will reach its limits very soon: listing machines becomes slow, and you will have to wait several seconds for Machine to collect and print the information for all of them.
One way to add nodes manually is to use Machine with the generic driver: delegate host provisioning (operating system installation, network and security group configuration, and so on) to something else, such as Ansible, and then use Machine to install Docker properly on the resulting hosts.
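As a minimal sketch of the generic-driver approach, the following script builds the docker-machine command for a host that something else has already provisioned. The machine name, IP address, SSH user, and key path are placeholder assumptions, not values from this chapter. The script only prints the command unless RUN=1 is set.

```shell
#!/bin/bash
# Sketch: register a pre-provisioned host with Machine's generic driver.
# All concrete values below (name, IP, user, key path) are hypothetical.

# Build the docker-machine command line for one host.
# $1 = machine name, $2 = IP address, $3 = SSH user, $4 = SSH private key
generic_create_cmd() {
  echo docker-machine create -d generic \
    --generic-ip-address "$2" \
    --generic-ssh-user "$3" \
    --generic-ssh-key "$4" \
    "$1"
}

cmd="$(generic_create_cmd swarm-node-1 203.0.113.10 ubuntu "$HOME/.ssh/id_rsa")"

# Dry run by default: print the command; execute it only when RUN=1.
if [ "${RUN:-0}" = "1" ]; then
  eval "$cmd"
else
  echo "$cmd"
fi
```

Keeping the command construction in a function makes it easy to loop over an inventory of hosts produced by the provisioning tool.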
If you use Machine's generic driver, it will select the latest stable Docker binaries. While working on this book, in order to use Docker 1.12, we sometimes worked around this by pointing Machine at the latest unstable version of Docker with the --engine-install-url option:
docker-machine create -d DRIVER --engine-install-url https://test.docker.com mymachine
By the time you read this book, Docker 1.12 will be stable for a production Swarm (Mode), so this trick will no longer be necessary unless you need some of the latest Docker features.
While planning a Swarm, keep in mind some considerations regarding the number of managers, as we saw in Chapter 4, Creating a Production-Grade Swarm. High-availability theory suggests that the number of managers should be odd and at least 3. Reaching a quorum means that a majority of the managers agree on which node is leading the operations.
If there are two managers and one goes down and comes back, it's possible that both will be considered leaders. This causes a logical crash in the cluster organization, which is called a split brain.
The more managers you have, the more failures the cluster can tolerate. Take a look at the following table.
Number of managers | Quorum (majority) | Maximum possible failures
3 | 2 | 1
5 | 3 | 2
7 | 4 | 3
9 | 5 | 4
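The rows of the table follow from simple majority arithmetic. As a small sketch (the function names are illustrative, not part of any Docker tooling), the quorum for n managers is floor(n/2) + 1, and the number of tolerated failures is n minus the quorum:

```shell
#!/bin/bash
# Majority quorum for n managers: floor(n/2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Managers that can fail while a majority still remains: n - quorum.
max_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 5 7 9; do
  echo "$n managers: quorum $(quorum "$n"), tolerates $(max_failures "$n") failures"
done
```

With 4 managers the same functions give a quorum of 3 and 1 tolerated failure, which is no better than 3 managers; this is one reason an odd number is suggested.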
Also, in Swarm Mode, an ingress overlay network is created automatically and attached to the nodes to carry ingress traffic. It is meant to be used by containers:
You will want your containers to be attached to an internal overlay (VXLAN mesh) network to communicate with each other, rather than using public or other external networks. Swarm therefore creates this network for you, ready to use.
You can add an arbitrary number of workers. This is the elastic part of the Swarm. It's totally fine to have 5, 15, 200, 2300, or 4700 running workers. This is the easiest part to handle; you can add and remove workers with no burdens, at any time, at any size.
The easiest way to add nodes, if you do not plan to exceed 100 nodes in total, is to use basic scripting.
When executing docker swarm init, just copy and paste the lines printed as the output.
Then, create a batch of workers with a loop:
#!/bin/bash
for i in $(seq 0 9); do
  docker-machine create -d amazonec2 \
    --engine-install-url https://test.docker.com \
    --amazonec2-instance-type "t2.large" \
    swarm-worker-$i
done
After this, it will only be necessary to go through the list of machines, ssh into them, and join the nodes:
#!/bin/bash
SWARMWORKER="swarm-worker-"
for machine in $(docker-machine ls --format "{{.Name}}" | grep "$SWARMWORKER"); do
  docker-machine ssh $machine sudo docker swarm join \
    --token SWMTKN-1-5c3mlb7rqytm0nk795th0z0eocmcmt7i743ybsffad5e04yvxt-9m54q8xx8m1wa1g68im8srcme \
    172.31.10.250:2377
done
This script iterates over the machines and, for each one whose name starts with swarm-worker-, connects to it over ssh and joins the node to the existing Swarm through the leader manager, which is 172.31.10.250.
See https://github.com/swarm2k/swarm2k/tree/master/amazonec2 for further details or to download the one-liners.
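Instead of pasting the token by hand, the join command can be assembled from values read on a manager. The following is only a sketch under some assumptions: the Machine name swarm-manager-0 is hypothetical, the manager address defaults to the one used in this section, and docker swarm join-token requires the final Docker 1.12 release or later. Nothing is executed unless RUN=1 is set.

```shell
#!/bin/bash
# Sketch: join all workers using a token fetched from a manager.
# MANAGER is a hypothetical Machine name; ADDR defaults to the address used in this section.
MANAGER="${MANAGER:-swarm-manager-0}"
ADDR="${ADDR:-172.31.10.250:2377}"
SWARMWORKER="swarm-worker-"

# Build the join command run on each worker.
# $1 = worker join token, $2 = manager address (host:port)
join_cmd() {
  echo "sudo docker swarm join --token $1 $2"
}

if [ "${RUN:-0}" = "1" ]; then
  # Ask the manager for the worker token (-q prints only the token).
  TOKEN="$(docker-machine ssh "$MANAGER" sudo docker swarm join-token -q worker)"
  for machine in $(docker-machine ls --format "{{.Name}}" | grep "^$SWARMWORKER"); do
    docker-machine ssh "$machine" "$(join_cmd "$TOKEN" "$ADDR")"
  done
fi
```

Reading the token at run time avoids stale values if the token is rotated between runs.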
Belt is another option for provisioning Docker Engines at scale. It is basically an SSH wrapper on steroids, and it requires you to prepare provider-specific images as well as provisioning templates before you go massive. In this section, we'll learn how to do so.
You can compile Belt yourself by getting its source from GitHub:
# Set $GOPATH here
go get github.com/chanwit/belt
Currently, Belt supports only the DigitalOcean driver. We can prepare our template for provisioning inside config.yml.
digitalocean:
  image: "docker-1.12-rc4"
  region: nyc3
  ssh_key_fingerprint: "your SSH ID"
  ssh_user: root
Then, we can create hundreds of nodes with a couple of commands.
First, we create three manager hosts of 16 GB each, namely mg0, mg1, and mg2.
$ belt create 16gb mg[0:2]
NAME   IPv4              MEMORY   REGION   IMAGE                    STATUS
mg2    104.236.231.136   16384    nyc3     Ubuntu docker-1.12-rc4   active
mg1    45.55.136.207     16384    nyc3     Ubuntu docker-1.12-rc4   active
mg0    45.55.145.205     16384    nyc3     Ubuntu docker-1.12-rc4   active
Then we can use the status command to wait until all nodes are active:
$ belt status --wait active=3
STATUS   #NODES   NAMES
active   3        mg2, mg1, mg0
We'll do this again for 10 worker nodes:
$ belt create 512mb node[1:10]
$ belt status --wait active=13
STATUS   #NODES   NAMES
active   3        node10, node9, node8, node7
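The bracket patterns can be read as inclusive ranges, as the mg[0:2] output above suggests (mg0, mg1, mg2). The following sketch mimics that expansion in plain shell to check the node count passed to --wait. It is an assumption about the pattern's meaning based on that output, not Belt's actual implementation, and expand_range is a hypothetical helper.

```shell
#!/bin/bash
# Expand "prefix[first:last]" into "prefixFIRST ... prefixLAST" (inclusive bounds assumed).
expand_range() {
  case "$1" in
    *"["*":"*"]")
      prefix="${1%%\[*}"                 # text before the opening bracket
      range="${1#*\[}"                   # drop everything up to "["
      range="${range%\]}"                # drop the closing "]"
      first="${range%%:*}"
      last="${range##*:}"
      out=""
      i="$first"
      while [ "$i" -le "$last" ]; do
        out="$out${out:+ }$prefix$i"     # space-separated list of names
        i=$((i + 1))
      done
      echo "$out"
      ;;
    *)
      echo "$1"                          # no range: return the name unchanged
      ;;
  esac
}

managers="$(expand_range "mg[0:2]")"
workers="$(expand_range "node[1:10]")"
total=$(echo "$managers $workers" | wc -w)
echo "managers: $managers"
echo "total nodes: $((total))"
```

Three managers plus ten workers give 13 nodes, which matches the active=13 value used with belt status.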
Alternatively, you can use Ansible (my preference, and a tool that is becoming very popular) to make things more repeatable. We have created some Ansible modules that work with Machine and Swarm (Mode) directly and are compatible with Docker 1.12 (https://github.com/fsoppelsa/ansible-swarm). They require Ansible 2.2+, the first version of Ansible that supports binary modules.
You will need to compile the modules (written in Go) and then pass them to ansible-playbook through the -M parameter.
git clone https://github.com/fsoppelsa/ansible-swarm
cd ansible-swarm/library
go build docker-machine.go
go build docker_swarm.go
cd ..
There are some example plays in playbooks. Ansible's play syntax is easy enough to understand that a detailed explanation is unnecessary.
I used this play to join 10 workers to the Swarm2k experiment:
---
- name: Join the Swarm2k project
  hosts: localhost
  connection: local
  gather_facts: False
  #mg0 104.236.18.183
  #mg1 104.236.78.154
  #mg2 104.236.87.10
  tasks:
    - name: Load shell variables
      shell: >
        eval $(docker-machine env "{{ machine_name }}")
        echo $DOCKER_TLS_VERIFY &&
        echo $DOCKER_HOST &&
        echo $DOCKER_CERT_PATH &&
        echo $DOCKER_MACHINE_NAME
      register: worker
    - name: Set facts
      set_fact:
        whost: "{{ worker.stdout_lines[0] }}"
        wcert: "{{ worker.stdout_lines[1] }}"
    - name: Join a worker to Swarm2k
      docker_swarm:
        role: "worker"
        operation: "join"
        join_url: ["tcp://104.236.78.154:2377"]
        secret: "d0cker_swarm_2k"
        docker_url: "{{ whost }}"
        tls_path: "{{ wcert }}"
      register: swarm_result
    - name: Print final msg
      debug: msg="{{ swarm_result.msg }}"
Basically, it invokes the docker_swarm module after loading some host facts from Machine:
- The operation is join.
- The role of the new node is worker.
- The new node joins tcp://104.236.78.154:2377, which was the leader manager at the moment of joining. This argument takes an array of managers, such as [tcp://104.236.78.154:2377, tcp://104.236.18.183:2377, tcp://104.236.87.10:2377].
- It passes the password (secret).
- It specifies the Engine connection details: the module connects to the Engine at docker_url using the certificates at tls_path.

After docker_swarm.go is compiled in the library directory, joining the workers to the Swarm is as easy as:
#!/bin/bash
SWARMWORKER="swarm-worker-"
for machine in $(docker-machine ls --format "{{.Name}}" | grep "$SWARMWORKER"); do
  ansible-playbook -M library --extra-vars "{machine_name: $machine}" playbook.yaml
done