Puppet and Docker

With the popularization of containerization technologies, new approaches to service provisioning have emerged. These technologies rely on operating system features that allow processes to run on the same kernel, but with isolated resources.

Compared with virtualization, virtual machines generally boot as full operating systems with access to an emulated hardware stack. This emulation introduces performance penalties, because operations in the virtual machine must be translated before they reach the physical hardware. These penalties do not exist in containerization: containers run directly on the host kernel and physical hardware, and isolation happens at the level of operating system resources.

Before talking about the implications containers have for system provisioning, let's look at some of the isolation technologies the Linux kernel offers to containers:

  • Control groups (better known as cgroups) are used to create groups of processes with quotas on resources such as CPU time and memory usage.
  • Capabilities are the set of privileges a traditional full-privileged root user would have. Each process has bitmasks that indicate which of these privileges it holds. By default, a process started by the root user has all privileges, and a process started by any other user has none. Capabilities give us finer-grained control: we can drop specific privileges from root processes or grant privileges to processes started by normal users. In containers, this lets processes believe they are running as root while actually holding only a limited set of privileges. This is a great tool for security, as it helps reduce the attack surface.
  • Namespaces are used to create independent collections of resources or elements in the state of the kernel; after defining them, processes can be attached to these namespaces so they can only access the resources inside them. For example:
    • Network namespaces can have different sets of physical or virtual network interfaces; a process in a network namespace can only operate with the interfaces available there. A process in a namespace containing only the loopback interface would have no network access, even if the host does.
    • Process (PID) namespaces create hierarchical views of processes: a process started in a namespace can only see descendants of the first process in that namespace.
    • There are also other namespaces for inter-process communication, mount points, users, system identifiers (hostname and domain), and control groups.
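
On a Linux host, these isolation primitives are visible for any process under /proc. The following read-only sketch (assuming a Linux system; the paths are standard procfs entries) shows which namespaces the current process belongs to and its capability bitmasks:

```python
import os

# Each symlink in /proc/self/ns represents one namespace this process
# belongs to; a containerized process would point to different namespace
# inodes than processes on the host.
namespaces = sorted(os.listdir("/proc/self/ns"))
print(namespaces)

# The capability bitmasks of the current process, as hexadecimal values
# (CapInh, CapPrm, CapEff, ...), read from /proc/self/status.
with open("/proc/self/status") as status:
    caps = [line.strip() for line in status if line.startswith("Cap")]
print("\n".join(caps))
```

Running the same inspection inside a container shows the same namespace types but different namespace identifiers, and usually a reduced effective capability set.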

Notice that all of these technologies can be (and indeed are) used by normal processes, but they are what makes containers possible.

In general terms, starting a container follows these steps:

  1. An image is obtained and copied if needed. An image is basically a tarred filesystem that contains the executable file we want to run, along with all its dependencies.
  2. A set of system resources, such as cgroups, namespaces, network configurations, or mount points, is set up for the container's use.
  3. A process is started by running an executable file located in the image, with the resources assigned in the previous step.
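
These steps can be illustrated with a toy sketch that performs only the image-handling parts (steps 1 and 3); setting up real namespaces and cgroups for step 2 requires privileged kernel calls, so this sketch provides no actual isolation, and the file names in it are made up for illustration:

```python
import os
import subprocess
import tarfile
import tempfile

workdir = tempfile.mkdtemp()

# 1. "Obtain an image": build a tarred filesystem containing a single
#    executable (a real image would also bundle its dependencies).
script_path = os.path.join(workdir, "hello.sh")
with open(script_path, "w") as f:
    f.write("#!/bin/sh\necho hello from the image\n")
image_path = os.path.join(workdir, "image.tar")
with tarfile.open(image_path, "w") as tar:
    tar.add(script_path, arcname="bin/hello.sh")

# 2. Set up the container's resources: here, just an empty root directory.
#    A real runtime would create namespaces, cgroups, mounts, and network
#    configuration at this point.
rootfs = os.path.join(workdir, "rootfs")
os.makedirs(rootfs)
with tarfile.open(image_path) as tar:
    tar.extractall(rootfs)

# 3. Start a process from the executable shipped in the image.
executable = os.path.join(rootfs, "bin", "hello.sh")
os.chmod(executable, 0o755)
result = subprocess.run([executable], capture_output=True, text=True)
print(result.stdout.strip())
```

A real runtime such as Docker additionally chroots (or pivot_roots) the process into the extracted filesystem and attaches it to the namespaces and cgroups created in step 2.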

Docker (https://www.docker.com/), probably the containerization toolbox that has contributed most to popularizing these technologies, offers a simple way to package a service in an image with all its dependencies and deploy it the same way across different infrastructures, from developer laptops and the cloud to production.

Docker creates containers following the principle of immutability: once built, they can only be used as they were intended at build time, and any change implies building a new image and replacing the running containers with new instances.

It also helps to think in terms of services rather than machines. Ideally, container hosts should be as simple as possible, doing little more than running containers, and every deployed service should be fully containerized with its dependencies in an immutable image.

These principles of simple provisioning and immutable services seem opposed to the strengths of configuration management tools such as Puppet, which excel at propagating changes across even complex deployments. So, what space is left for these tools in container-based deployments?

There are some areas where Puppet can still shine in these deployments:

  • Host provisioning: Containers need to run somewhere. When deploying container hosts, Puppet can be a good option to install and configure the container runtime. For example, there are modules to install and configure Docker, such as https://github.com/garethr/garethr-docker. With this module, we can prepare a host to run Docker containers with a single line of Puppet code: include 'docker'. The module can also be used to retrieve images and start containers from them:
    docker::image { 'ubuntu':
      image_tag => '16.04',
    }

    docker::run { 'helloworld':
      image   => 'ubuntu:16.04',
      command => '/bin/sh -c "while true; do echo hello world; sleep 1; done"',
    }
    
  • Container builds: Even though Docker features its own tool to build container images, the image format is open, and other tools can be used to generate these artifacts. This is the case with HashiCorp's Packer (https://www.packer.io), a tool that automates the creation of machine images and containers, and whose list of supported provisioners includes Puppet in two flavors: server and masterless. Server mode provisions the container by running the Puppet agent against an existing Puppet server, while masterless mode applies Puppet code directly inside the container.

    Note

    There are multiple formats for container images; the Docker format (implemented by Packer) is one of them, but there are also efforts to create an open specification that is independent of the tool used to create or run images. This and other related specifications are maintained by the Open Container Initiative (https://www.opencontainers.org/), which includes the main actors in this space.
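
    As an illustration, a minimal Packer template combining the Docker builder with the masterless Puppet provisioner could look like the following sketch (the base image, manifest path, and repository name are assumptions for the example, and the base image must already have Puppet installed for the provisioner to run):

```json
{
  "builders": [
    {
      "type": "docker",
      "image": "ubuntu:16.04",
      "commit": true
    }
  ],
  "provisioners": [
    {
      "type": "puppet-masterless",
      "manifest_file": "manifests/site.pp"
    }
  ],
  "post-processors": [
    {
      "type": "docker-tag",
      "repository": "myorg/myservice",
      "tag": "latest"
    }
  ]
}
```

    Running packer build on such a template starts a container from the base image, applies the manifest inside it, and commits the result as a new image.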

  • Other system integrations: Depending on our deployment, we may require specific configuration around containerized services, such as cloud resources, DNS, networking, or remote volumes. As we have seen in this book, Puppet can be used to configure all of these.