I (Fabrizio) was following a thread on Reddit (https://www.reddit.com/r/docker/comments/4zous1/monitoring_containers_under_112_swarm/) in August 2016, in which users complained that the new Swarm Mode was harder to monitor.
While there are, for now, no official Swarm monitoring solutions, one of the most popular combinations of emerging technologies is: Google's cAdvisor to collect data, Prometheus to store it as time series, and Grafana to show graphs.
The team at Prometheus describes the product as follows: "Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud."
Prometheus's main features are presented in the overview at https://prometheus.io/docs/introduction/overview/, which we will not repeat here. The top feature of Prometheus is, in our opinion, the ease of installation and usage. Prometheus itself consists of just a single binary built from Go code, plus a configuration file.
Things are probably going to change very soon, so we just sketch a way to set up a monitoring system for Swarm, tested on Docker version 1.12.3.
First, we create a new overlay network called monitoring, so as not to interfere with the ingress or spark networks:
aws-101$ docker network create --driver overlay monitoring
Then, we start a cAdvisor service in global mode, meaning that a cAdvisor container will run on each Swarm node. We mount some system paths inside the container so that they can be accessed by cAdvisor:
aws-101$ docker service create --mode global --name cadvisor --network monitoring --mount type=bind,src=/var/lib/docker/,dst=/var/lib/docker --mount type=bind,src=/,dst=/rootfs --mount type=bind,src=/var/run,dst=/var/run --publish 8080 google/cadvisor
Then we use basi/prometheus-swarm to set up Prometheus:
aws-101$ docker service create --name prometheus --network monitoring --replicas 1 --publish 9090:9090 basi/prometheus-swarm
And we add the node-exporter service (again global, since it must run on each node):
aws-101$ docker service create --mode global --name node-exporter --network monitoring --publish 9100 prom/node-exporter
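Prometheus learns what to scrape from its configuration file, and the image used above presumably bundles one suited to Swarm. As a rough sketch only, a hand-written equivalent might look like the following. The job names, the scrape interval, the temporary directory, and the reliance on Swarm's tasks.<service> DNS names (which resolve to every task of a service on the shared overlay network) are our assumptions, not the contents of that image's file.

```shell
# Hypothetical scrape configuration written to a scratch directory.
config_dir="$(mktemp -d)"

cat > "$config_dir/prometheus.yml" <<'EOF'
global:
  scrape_interval: 30s          # assumed value, tune as needed

scrape_configs:
  # One target per cAdvisor task, found through Swarm's internal DNS.
  - job_name: 'cadvisor'
    dns_sd_configs:
      - names: ['tasks.cadvisor']
        type: 'A'
        port: 8080
  # One target per node-exporter task.
  - job_name: 'node-exporter'
    dns_sd_configs:
      - names: ['tasks.node-exporter']
        type: 'A'
        port: 9100
EOF

echo "wrote $config_dir/prometheus.yml"
```

Because both exporters run on the monitoring network, Prometheus can reach them on their container ports (8080 and 9100) without going through the published ports.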
Finally, we start Grafana with one replica:
aws-101$ docker service create --name grafana --network monitoring --publish 3000:3000 --replicas 1 -e "GF_SECURITY_ADMIN_PASSWORD=password" -e "PROMETHEUS_ENDPOINT=http://prometheus:9090" grafana/grafana
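Before opening Grafana, it can help to check that all four services have converged. The helpers below are a minimal sketch with hypothetical names (replicas_ready, pending_services). They assume that docker service ls prints NAME and REPLICAS as its second and third columns, and that global services show the word global in the REPLICAS column, as we observed on 1.12; other Docker versions may lay the output out differently.

```shell
# Succeeds when a REPLICAS value such as "1/1" reports every task running.
# The literal "global" (assumed 1.12 output for global services) is accepted.
replicas_ready() {
  case "$1" in
    global) return 0 ;;
    */*)    [ "${1%/*}" = "${1#*/}" ] ;;   # running count equals desired count
    *)      return 1 ;;
  esac
}

# Print the services that are not ready yet (requires a Swarm manager).
pending_services() {
  docker service ls | awk 'NR > 1 { print $2, $3 }' |
    while read -r name replicas; do
      replicas_ready "$replicas" || echo "$name $replicas"
    done
}

# Example, to be run on a manager node:
# pending_services
```

An empty output from pending_services would suggest that cadvisor, prometheus, node-exporter, and grafana are all up.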
When Grafana is available, to get impressive graphs of the Swarm health, we log in on port 3000 of the node where Grafana runs, with these credentials:
"admin":"password"
As admins, we click on the Grafana logo, go to Data Sources, and add Prometheus.
Some options will appear, but the mapping is already present, so it's sufficient to click Save & Test.
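The same data source can also be registered without the web interface, through Grafana's HTTP API (POST /api/datasources). The following is a sketch under assumptions: the helper names datasource_payload and add_datasource are ours, the credentials are the ones set when the service was created, and http://prometheus:9090 is the address of the Prometheus service as seen from inside the monitoring network.

```shell
# Build the JSON body for Grafana's POST /api/datasources endpoint.
# $1 = data source name, $2 = Prometheus URL as seen from the Grafana container
datasource_payload() {
  printf '{"name":"%s","type":"prometheus","url":"%s","access":"proxy","isDefault":true}' "$1" "$2"
}

# Send the payload to Grafana using basic authentication.
# $1 = Grafana base URL (hypothetical helper, not invoked here)
add_datasource() {
  datasource_payload "Prometheus" "http://prometheus:9090" |
    curl -s -u admin:password -H "Content-Type: application/json" \
         -X POST --data @- "$1/api/datasources"
}

# Example, replacing <node> with the address of the node running Grafana:
# add_datasource "http://<node>:3000"
```

With access set to proxy, Grafana itself queries Prometheus, so the browser does not need to reach port 9090 directly.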
Now we can go back to the Dashboard and click on Prometheus, and we will be presented with the Grafana main panel.
Once again, we took advantage of what the open source community released, and glued different opinionated technologies with just some simple commands, to get the desired result. Monitoring Docker Swarm and its applications is a field of research that is completely open now, so we can expect an amazing evolution there too.