Chapter 6: Meet Buildah – Building Containers from Scratch

The great appeal of containers is that they allow us to package applications inside immutable images that can be deployed on systems and run seamlessly. In this chapter, we will learn how to create images using different techniques and tools. This includes learning how an image build works under the hood and how to create images from scratch.

In this chapter, we're going to cover the following main topics:

  • Basic image building with Podman
  • Meet Buildah, Podman's companion tool for builds
  • Preparing our environment
  • Choosing our build strategy
  • Building images from scratch
  • Building images from a Dockerfile

Technical requirements

Before proceeding with this chapter, a machine with a working Podman installation is required. As stated in Chapter 3, Running the First Container, all the examples in the book are executed on a Fedora 34 system or later but can be reproduced on the reader's choice of OS.

A good understanding of the topics covered in Chapter 4, Managing Running Containers, is useful to easily grasp concepts regarding Open Container Initiative (OCI) images.

Basic image building with Podman

A container's OCI image is a set of immutable layers stacked together with a copy-on-write logic. When an image is built, all the layers are created in a precise order and then pushed to the container registry, which stores our layers as tar-based archives along with additional image metadata.

As we learned in the OCI Images section of Chapter 2, Comparing Podman and Docker, this metadata is necessary to correctly reassemble the image layers (the image manifest and the image index) and to pass runtime configurations to the container engine (the image configuration).

Before proceeding with the basic examples of image builds with Podman, we need to understand how image builds generally work to grasp the simple but very smart key concepts that lie beneath.

Builds under the hood

Container images can be built in different ways, but the most common approach, probably one of the keys to the huge success of containers, is based on Dockerfiles.

A Dockerfile, as the name suggests, is the main configuration file for Docker builds and is a plain list of actions to be executed in the build process.

Over time, Dockerfiles became a standard in OCI image builds and today are adopted in many use cases.

Important Note

To standardize and remove the association with the brand, Containerfiles were also introduced; they have the very same syntax as Dockerfiles and are supported natively by Podman. In this book, we will use the two terms Dockerfile and Containerfile interchangeably.

We will look at the Dockerfile syntax in detail in the next subsection. For now, let's just focus on one concept – a Dockerfile is a set of build instructions that the build tool executes sequentially. Let's look at this example:

FROM docker.io/library/fedora

RUN dnf install -y httpd && dnf clean all -y

COPY index.html /var/www/html

CMD ["/usr/sbin/httpd", "-DFOREGROUND"]

This basic example of a Dockerfile holds only four instructions:

  • The FROM instruction, which defines the base image that will be used
  • The RUN instruction, which executes some actions during the build (in this example, installing packages with the dnf package manager)
  • The COPY instruction, which copies files or directories from the build working directory to the image
  • The CMD instruction, which defines the command to be executed when the container starts

When the RUN and COPY instructions of the example are executed, the changes they produce are stored in intermediate layers, each one created by running a temporary container. This caching mechanism is native to Docker and has the advantage of reusing cached layers in later builds when the instruction that produced a layer has not changed. All the intermediate containers produce read-only layers that are merged by the overlay graph driver.

Users don't need to manually manage the cached layers – the engine automatically implements the necessary actions by creating the temporary containers, executing the actions defined by the Dockerfile instructions, and then committing. By repeating the same logic for all the necessary instructions, Podman creates a new image with additional layers on top of the ones of the base image.

It is possible to squash the image layers into a single one to avoid a negative impact on overlay performance. Podman offers the same features as Docker and lets you choose whether or not to cache intermediate layers.

Not all Dockerfile instructions change the filesystem, and only the ones that do it will create a new image layer; all the other instructions, such as the CMD instruction in the preceding example, produce an empty layer with metadata only and no changes in the overlay filesystem.

In general, the only instructions that create new layers by effectively changing the filesystem are the RUN, COPY, and ADD instructions. All the other instructions in a Dockerfile or Containerfile just create temporary intermediate images and do not impact the final image filesystem.

This is also a good reason to limit the number of RUN, COPY, and ADD instructions in a Dockerfile, since cluttering an image with too many layers is not a good pattern and degrades the performance of the graph driver.
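As a rough way to estimate how many filesystem layers a Containerfile is likely to add on top of its base image, the following sketch counts the RUN, COPY, and ADD instructions of the sample file shown earlier. The count_layer_instructions helper and the temporary file are illustrative assumptions, and the heuristic deliberately ignores multi-stage builds, squashing, and lowercase instruction names.

```shell
#!/bin/sh
# Hypothetical helper: count the instructions that usually create a
# filesystem layer (RUN, COPY, ADD) in a Dockerfile/Containerfile.
count_layer_instructions() {
    grep -c -E '^[[:space:]]*(RUN|COPY|ADD)[[:space:]]' "$1"
}

# Recreate the sample Dockerfile from this section in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
FROM docker.io/library/fedora
RUN dnf install -y httpd && dnf clean all -y
COPY index.html /var/www/html
CMD ["/usr/sbin/httpd", "-DFOREGROUND"]
EOF

layers=$(count_layer_instructions "$sample")
echo "Layer-creating instructions: $layers"
rm -f "$sample"
```

For the sample file, only the RUN and COPY lines are counted; FROM and CMD contribute no filesystem changes of their own.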

We can inspect an image's history and the actions that have been applied to every layer. The following example shows an excerpt of the output from the podman inspect command, with the target image being a potential one created from the previous sample Dockerfile:

$ podman inspect myhttpd

[...omitted output]

        "History": [

            {

                "created": "2021-04-01T17:59:37.09884046Z",

                "created_by": "/bin/sh -c #(nop)  LABEL maintainer=Clement Verna [email protected]",

                "empty_layer": true

            },

            {

                "created": "2021-04-01T18:00:19.741002882Z",

                "created_by": "/bin/sh -c #(nop)  ENV DISTTAG=f34container FGC=f34 FBR=f34",

                "empty_layer": true

            },

            {

                "created": "2021-07-23T11:16:05.060688497Z",

                "created_by": "/bin/sh -c #(nop) ADD file:85d7f2d8e4f31d81b27b8e18dfc5687b5dabfaafdb2408a3059e120e4c15307b in / "

            },

            {

                "created": "2021-07-23T11:16:05.833115975Z",

                "created_by": "/bin/sh -c #(nop)  CMD [\"/bin/bash\"]",

                "empty_layer": true

            },

            {

                "created": "2021-10-24T21:27:18.783034844Z",

                "created_by": "/bin/sh -c dnf install -y httpd \u0026\u0026 dnf clean all -y  ",

                "comment": "FROM docker.io/library/fedora:latest"

            },

            {

                "created": "2021-10-24T21:27:21.095937071Z",

                "created_by": "/bin/sh -c #(nop) COPY file:78c6e1dcd6f819581b54094fd38a3fd8f170a2cb768101e533c964e04aacab2e in /var/www/html "

            },

            {

                "created": "2021-10-24T21:27:21.182063974Z",

                "created_by": "/bin/sh -c #(nop) CMD [\"/usr/sbin/httpd\", \"-DFOREGROUND\"]",

                "empty_layer": true

            }

        ]

[...omitted output]

Looking at the last three items of the image history, we can note the exact instructions defined in the Dockerfile, including the last CMD instruction that does not create any new layer but instead metadata that will persist in the image config.
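To make the distinction between metadata-only entries and real layers more tangible, the following sketch counts both kinds in an abbreviated copy of the History excerpt. The shortened created_by values and the variable names are assumptions made for illustration; on a real system, the same grep filters could be applied to the output of podman inspect.

```shell
#!/bin/sh
# Abbreviated copy of the History entries shown above: only the fields
# needed for counting are kept, and the created_by values are shortened.
history_json=$(cat <<'EOF'
{"created_by": "LABEL maintainer=...", "empty_layer": true},
{"created_by": "ENV DISTTAG=f34container ...", "empty_layer": true},
{"created_by": "ADD file:... in /"},
{"created_by": "CMD [\"/bin/bash\"]", "empty_layer": true},
{"created_by": "dnf install -y httpd ..."},
{"created_by": "COPY file:... in /var/www/html"},
{"created_by": "CMD [\"/usr/sbin/httpd\", \"-DFOREGROUND\"]", "empty_layer": true}
EOF
)

# One line per history entry, so counting lines counts entries.
total=$(printf '%s\n' "$history_json" | grep -c '"created_by"')
empty=$(printf '%s\n' "$history_json" | grep -c '"empty_layer": true')
with_changes=$((total - empty))
echo "entries: $total, metadata-only: $empty, with filesystem changes: $with_changes"
```

The three entries with filesystem changes correspond to the base image ADD, the RUN instruction, and the COPY instruction.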

With this deeper awareness of the image build logic in mind, let's now explore the most common Dockerfile instructions before proceeding with the Podman build examples.

Dockerfile and Containerfile instructions

As stated before, Dockerfiles and Containerfiles share the same syntax. The instructions in those files should be seen as (and truly are) commands passed to the container engine or build tool. This subsection provides an overview of the most frequently used instructions.

All Dockerfile/Containerfile instructions follow the same pattern:

# Comment

INSTRUCTION arguments

The following non-exhaustive list describes the most common instructions:

  • FROM: This is the first instruction of a build stage and defines the base image used as the starting point of the build. It follows the FROM <image>[:<tag>] syntax to identify the correct image to use.
  • RUN: This instruction tells the engine to execute the commands passed as arguments inside a temporary container. It follows the RUN <command> syntax. The invoked binary or script must exist in the base image or a previous layer.

As stated before, the RUN instruction creates a new image layer; therefore, it is a frequent practice to concatenate commands into the same RUN instruction to avoid cluttering too many layers.

This example compacts three commands inside the same RUN instruction:

RUN dnf upgrade -y && \

    dnf install httpd -y && \

    dnf clean all -y

  • COPY: This instruction copies files and folders from the build working directory to the build sandbox. Copied resources are persisted in the final image. It follows the COPY <src>… <dest> syntax, and it has a very useful option that lets us define the destination user and group instead of manually changing ownership later – --chown=<user>:<group>.
  • ADD: This instruction copies files, folders, and remote URLs to the build destination target. It follows the ADD <src>… <dest> syntax. This instruction also supports the automatic extraction of tar files from a source directly into the target path.
  • ENTRYPOINT: The executed command in the container. It receives arguments from the command line (in the form of podman run <image> <arguments>) or from the CMD instruction.

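The ENTRYPOINT of an image is not replaced by the arguments passed on the command line after the image name; those arguments are appended to it instead. It can only be replaced explicitly, using the --entrypoint option of podman run. The supported forms are the following: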

  • ENTRYPOINT ["command", "param1", "paramN"] (also known as the exec form)
  • ENTRYPOINT command param1 paramN (the shell form)

If ENTRYPOINT is not set, the behavior depends on the form of the command. A command in exec form, coming from CMD or from the command line, is executed directly, while a CMD written in shell form is passed as an argument to /bin/sh -c. For example, with the shell form CMD ps aux, the container will execute /bin/sh -c "ps aux".

A frequent practice is to replace the default ENTRYPOINT command with a custom script that behaves in the same way and offers more granular control of the runtime execution.

  • CMD: The default argument(s) passed to the ENTRYPOINT instruction. It can be a full command or a set of plain arguments to be passed to a custom script or binary set as ENTRYPOINT. Its supported forms are the following:
    • CMD ["command", "param1", "paramN"] (the exec form)
    • CMD ["param1", "paramN"] (the parameter form, used to pass arguments to a custom ENTRYPOINT)
    • CMD command param1 paramN (the shell form)
  • LABEL: This instruction is used to apply custom labels to the image. Labels are used as metadata at build time or runtime. It follows the LABEL <key1>=<value1> … <keyN>=<valueN> syntax.
  • EXPOSE: This sets metadata about listening ports exposed by the processes running in the container. It supports the EXPOSE <port>/<protocol> format.
  • ENV: This configures environment variables that will be available to the next build commands and at runtime when the container is executed. This instruction supports the ENV <key1>=<value1>… <keyN>=<valueN> format.

Environment variables can also be set inside a RUN instruction with a scope limited to the instruction itself.

  • VOLUME: This sets a volume that will be created at runtime during container execution. The volume will be automatically mapped by Podman inside the default volume storage directory. The supported formats are the following:
    • VOLUME ["/path/to/dir"]
    • VOLUME /path/to/dir

See also the Attaching host storage to a container section in Chapter 5, Implementing Storage for the Containers' Data, for more details about volumes.

  • USER: This instruction defines the username and user group for the next RUN, CMD, and ENTRYPOINT instructions. The GID value is not mandatory.

The supported formats are the following:

  • USER <username>[:<groupname>]
  • USER <UID>[:<GID>]
  • WORKDIR: This sets the working directory during the build process. This value is retained during container execution. It supports the WORKDIR /path/to/workdir format.
  • ONBUILD: This instruction registers a trigger instruction that is not executed during the current build; it is executed later, when the resulting image is used as the parent of a new build through the FROM instruction. Its purpose is to allow the execution of some final commands in the build of a child image.

It follows the ONBUILD <instruction> syntax, as in the following examples:

  • ONBUILD ADD . /opt/app
  • ONBUILD RUN /opt/bin/custom-build /opt/app/src
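The interaction between ENTRYPOINT, CMD, and runtime arguments described above can be modeled in a few lines of shell. The compose_command function below is a hypothetical helper, not a Podman API, and it flattens commands into plain strings, whereas the engine works with argument arrays; treat it as a simplified mental model only.

```shell
#!/bin/sh
# Simplified model of how the container command is assembled:
#   final command = ENTRYPOINT + (runtime arguments if given, else CMD)
compose_command() {
    entrypoint=$1   # ENTRYPOINT from the image (may be empty)
    cmd=$2          # CMD from the image
    runtime=$3      # arguments passed after the image name in "podman run"
    if [ -n "$runtime" ]; then
        args=$runtime
    else
        args=$cmd
    fi
    # Avoid a leading space when ENTRYPOINT is empty.
    echo "${entrypoint:+$entrypoint }$args"
}

compose_command "/entrypoint.sh" "httpd" ""           # CMD is used as default
compose_command "/entrypoint.sh" "httpd" "/bin/bash"  # runtime args replace CMD
compose_command "" "/usr/sbin/httpd -DFOREGROUND" ""  # no ENTRYPOINT set
```

In all three cases the ENTRYPOINT stays in place; only the arguments that follow it change.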

Now that we have learned the most common instructions, let's dive into our first build examples with Podman.

Running builds with Podman

Good news – Podman provides the same build commands and syntax as Docker. If you are switching from Docker, there will be no learning curve to start building your images with it. Under the hood, there is a notable advantage in choosing Podman as a build tool – Podman can build containers in rootless mode, using a fork/exec model.

This is a step forward compared to Docker builds, where communication with the daemon listening on the Unix socket is necessary to run the build.

Let's start by running a simple build based on the httpd Dockerfile illustrated in the first Builds under the hood subsection. We will use the following podman build command:

$ podman build -t myhttpd .

STEP 1/4: FROM docker.io/library/fedora

STEP 2/4: RUN dnf install -y httpd && dnf clean all -y  

[...omitted output]

--> 50a981094eb

STEP 3/4: COPY index.html /var/www/html

--> 73f8702c5e0

STEP 4/4: CMD ["/usr/sbin/httpd", "-DFOREGROUND"]

COMMIT myhttpd

--> e773bfee6f2

Successfully tagged localhost/myhttpd:latest

e773bfee6f289012b37285a9e559bc44962de3aeed001455231b5a8f2721b8f9

In the preceding example, the output of the dnf install command was omitted for the sake of clarity and space.

The command runs the instructions sequentially and persists the intermediate layers until the final image is committed and tagged. The build steps are numbered (1/4 to 4/4) and some of them (RUN and COPY here) produce non-empty layers, forming part of the image lowerDirs.

The first FROM instruction defines the base image, which is pulled automatically if not present in the host.

The second instruction is RUN, which executes the dnf command to install the httpd package and clean up the system upon completion. Under the hood, this line is executed as "/bin/sh -c 'dnf install -y httpd && dnf clean all -y'".

The third COPY instruction simply copies the index.html file in the default httpd document root.

Finally, the fourth step defines the default container CMD instruction. Since no ENTRYPOINT instruction was set and CMD is written in exec form, the container will execute the following command directly, without a wrapping shell:

/usr/sbin/httpd -DFOREGROUND

The next example is a custom Dockerfile/Containerfile where a custom web server is built:

FROM docker.io/library/fedora

# Install required packages

RUN set -euo pipefail; \

    dnf upgrade -y; \

    dnf install httpd -y; \

    dnf clean all -y; \

    rm -rf /var/cache/dnf/*

# Custom webserver configs for rootless execution

RUN set -euo pipefail; \

    sed -i 's|Listen 80|Listen 8080|' \

           /etc/httpd/conf/httpd.conf; \

    sed -i 's|ErrorLog "logs/error_log"|ErrorLog /dev/stderr|' \

           /etc/httpd/conf/httpd.conf; \

    sed -i 's|CustomLog "logs/access_log" combined|CustomLog /dev/stdout combined|' \

           /etc/httpd/conf/httpd.conf; \

    chown 1001 /var/run/httpd

                    

# Copy web content

COPY index.html /var/www/html

# Define content volume

VOLUME /var/www/html

# Copy container entrypoint.sh script

COPY entrypoint.sh /entrypoint.sh

# Declare exposed ports

EXPOSE 8080

# Declare default user

USER 1001

ENTRYPOINT ["/entrypoint.sh"]

CMD ["httpd"]

This example was designed for the purpose of this book to illustrate some peculiar elements:

  • Packages installed with a package manager should be kept at a minimum. After installing the httpd package, necessary to run the web server, the cache is cleaned to save layer space.
  • Multiple commands can be grouped together in a single RUN instruction. However, we don't want to continue the build if a single command fails. To provide a failsafe shell execution, the set -euo pipefail command was prepended. Also, to improve readability, the single commands were split across multiple lines using the \ character, which works as a line continuation (escape) character.
  • To avoid running the isolated processes as the root user, a series of workarounds were implemented in order to have the httpd process running as the generic 1001 user. Those workarounds included moving the listening port to the unprivileged port 8080 and updating the ownership of directories that are expected to be accessed by the non-root user. This is a security best practice that reduces the attack surface of the container.
  • A common pattern in containers is the redirection of application logs to the container's stdout and stderr. The default httpd log destinations have been modified for this purpose by applying sed substitutions to the /etc/httpd/conf/httpd.conf file.
  • The web server ports are declared as exposed with the EXPOSE instruction.
  • The CMD instruction is a simple httpd command without any other argument. This was done to illustrate how the ENTRYPOINT can interact with the CMD arguments.
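The effect of the three sed substitutions can be checked outside of a build. The sketch below applies the same expressions to a throwaway file that contains only the three directives of interest; the temporary file and the result variable are assumptions for illustration, and the real target is /etc/httpd/conf/httpd.conf inside the image.

```shell
#!/bin/sh
set -eu

# Throwaway stand-in for httpd.conf with only the relevant directives.
conf=$(mktemp)
cat > "$conf" <<'EOF'
Listen 80
ErrorLog "logs/error_log"
CustomLog "logs/access_log" combined
EOF

# Same expressions as in the RUN instruction, chained with -e in one sed
# call. Output is captured instead of editing in place with -i.
result=$(sed \
    -e 's|Listen 80|Listen 8080|' \
    -e 's|ErrorLog "logs/error_log"|ErrorLog /dev/stderr|' \
    -e 's|CustomLog "logs/access_log" combined|CustomLog /dev/stdout combined|' \
    "$conf")
rm -f "$conf"

echo "$result"
```

Using | as the sed delimiter avoids having to escape the slashes contained in the log paths.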

The container ENTRYPOINT instruction is modified with a custom script that brings more flexibility to the way the CMD instruction is managed. The entrypoint.sh file tests whether the container is executed as root and checks the first CMD argument – if the argument is httpd, it executes the httpd -DFOREGROUND command; otherwise, it lets you execute any other command (a shell, for example). The following code is the content of the entrypoint.sh script:

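#!/bin/bash

set -euo pipefail

# Report the effective user when not running as root

if [ "$UID" != "0" ]; then

    echo "Running as user $UID"

fi

# "${1:-}" avoids an unbound variable error when no argument is passed

if [ "${1:-}" = "httpd" ]; then

    echo "Starting custom httpd server"

    exec "$1" -DFOREGROUND

else

    echo "Starting container with custom arguments"

    exec "$@"

fi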
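To see how the script reacts to different arguments without building the image, its branching can be reproduced in a dry-run form. The dispatch function below is a hypothetical stand-in that prints the command the script would pass to exec instead of executing it.

```shell
#!/bin/sh
# Dry-run model of the entrypoint.sh dispatch logic: print the command
# that would be handed to exec rather than running it.
dispatch() {
    if [ "${1:-}" = "httpd" ]; then
        echo "exec httpd -DFOREGROUND"
    else
        echo "exec $*"
    fi
}

dispatch httpd                 # default CMD of the image
dispatch /bin/bash             # for example, an interactive shell
dispatch ls -l /var/www/html   # any other command and its arguments
```

Only the httpd keyword receives special treatment; every other argument list is executed as given.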

Let's now build the image with the podman build command:

$ podman build -t myhttpd .

The newly built image will be available in the local host cache:

$ podman images | grep myhttpd

localhost/myhttpd latest 6dc90348520c 2 minutes ago   248 MB

After building, we can tag the image with the target registry name. The following example applies the v1.0 tag:

$ podman tag localhost/myhttpd quay.io/<username>/myhttpd:v1.0

After tagging, the image will be ready to be pushed to the remote registry. We will cover the interaction with registries in greater detail in Chapter 9, Pushing Images to a Container Registry.

The example image will be composed of five layers, including the base Fedora image layer. We can verify the number of layers by running the podman inspect command against the new image:

$ podman inspect myhttpd --format '{{ .RootFS.Layers }}'

[sha256:b6d0e02fe431db7d64d996f3dbf903153152a8f8b857cb4829ab3c4a3e484a72

sha256:f41274a78d9917b0412d99c8b698b0094aa0de74ec8995c88e5dbf1131494912

sha256:e57dde895085c50ea57db021bffce776ee33253b4b8cb0fe909bbbac45af0e8c

sha256:9989ee85603f534e7648c74c75aaca5981186b787d26e0cae0bc7ee9eb54d40d

sha256:ca402716d23bd39f52d040a39d3aee242bf235f626258958b889b40cdec88b43]

It is possible to squash the current build layers into a single layer using the --layers=false option. The resulting image will have only two layers – the base Fedora layer and the squashed one. The following example rebuilds the image without caching the intermediate layers:

$ podman build -t myhttpd --layers=false .

Let's inspect the output image again:

$ podman inspect myhttpd --format '{{ .RootFS.Layers }}'

[sha256:b6d0e02fe431db7d64d996f3dbf903153152a8f8b857cb4829ab3c4a3e484a72

sha256:6c279ab14837b30af9360bf337c7f9b967676a61831eee91012fa67083f5dcf1]

This time, the final image has the two expected layers only.

Reducing the number of layers can be useful to keep the image minimal in terms of overlays. The downside of this approach is that we will have to rebuild the whole image for every configuration change without taking advantage of cached layers.
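Counting the digests by eye is error-prone, so a small helper can do it. The sketch below is an illustration only: count_layers is a hypothetical function, and the two sample strings are the layer lists from the inspect outputs above, with each digest written on a single line.

```shell
#!/bin/sh
# Hypothetical helper: count the layer digests in the string printed by
#   podman inspect <image> --format '{{ .RootFS.Layers }}'
count_layers() {
    # One field per line, then count the lines holding a digest.
    printf '%s\n' "$1" | tr ' ' '\n' | grep -c 'sha256:'
}

# Layer list of the default build (sample data from this section).
default_build='[sha256:b6d0e02fe431db7d64d996f3dbf903153152a8f8b857cb4829ab3c4a3e484a72
sha256:f41274a78d9917b0412d99c8b698b0094aa0de74ec8995c88e5dbf1131494912
sha256:e57dde895085c50ea57db021bffce776ee33253b4b8cb0fe909bbbac45af0e8c
sha256:9989ee85603f534e7648c74c75aaca5981186b787d26e0cae0bc7ee9eb54d40d
sha256:ca402716d23bd39f52d040a39d3aee242bf235f626258958b889b40cdec88b43]'

# Layer list of the build executed with --layers=false.
uncached_build='[sha256:b6d0e02fe431db7d64d996f3dbf903153152a8f8b857cb4829ab3c4a3e484a72
sha256:6c279ab14837b30af9360bf337c7f9b967676a61831eee91012fa67083f5dcf1]'

echo "default build:  $(count_layers "$default_build") layers"
echo "--layers=false: $(count_layers "$uncached_build") layers"
```

On a host with Podman installed, the helper could presumably be fed directly with the output of the podman inspect command shown earlier instead of the sample strings.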

In terms of isolation, Podman can safely build images in rootless mode. Indeed, this is considered an advantage, since there should be no need to run builds with a privileged user such as root. If rootful builds are necessary, they are fully functional and supported. The following example runs a build as the root user:

# podman build -t myhttpd .

The resulting image will be available only in the system image cache and its layers stored under /var/lib/containers/storage/.

The flexible nature of Podman builds is strongly related to its companion tool, Buildah, a specialized tool to build OCI images that provides greater flexibility in builds. In the next section, we will describe Buildah's features and how it manages image builds.

Meet Buildah, Podman's companion tool for builds

Podman does an excellent job in plain builds with Dockerfiles/Containerfiles and helps teams to preserve their previously implemented build pipelines without the need for new investments.

However, when it comes to more specialized build tasks, or when users need more control over the build workflow, with the option of including scripting logic, the Dockerfile/Containerfile approach shows its limitations. Communities have struggled to find alternative building approaches that can overcome the rigid, workflow-based logic of Dockerfiles/Containerfiles.

The same community that develops Podman brought to life the Buildah (pronounced build-ah) project, a tool to manage OCI builds with support for multiple building strategies. Images created with Buildah are fully portable and compatible with Docker and with all engines that are compliant with the OCI image and runtime specs.

Buildah is an open source project released under the Apache 2.0 license. Sources are available on GitHub at the following URL: https://github.com/containers/buildah.

Buildah is complementary to Podman, which borrows its build logic by vendoring its libraries to implement basic build functionalities against Dockerfiles and Containerfiles. The final Podman binary, which is compiled in Go as a statically linked single file, embeds Buildah packages to manage the build steps.

Buildah uses the containers/image project (https://github.com/containers/image) to manage an image's life cycle and its interaction with registries, and the containers/storage project (https://github.com/containers/storage) to manage images and containers' filesystem layers.

The advanced build strategy of Buildah is based on the parallel support for traditional Dockerfile/Containerfile-based builds, and for builds driven by native Buildah commands that replicate the Dockerfile instructions.

By replicating Dockerfile instructions in standard commands, Buildah becomes a scriptable tool that can be interpolated with custom logic and native shell constructs such as conditionals, loops, or environment variables. For example, the RUN instruction in a Dockerfile can be replaced with a buildah run command.
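The following sketch shows what such a scripted build might look like, combining a loop and a conditional with Buildah commands. Several parts are assumptions made for illustration: the bld wrapper, the DRY_RUN switch (enabled by default so that the commands are printed rather than executed), the package list, and the WITH_DEBUG_TOOLS flag.

```shell
#!/bin/sh
# With DRY_RUN=1 (the default here) the buildah commands are only printed,
# so the script can be studied without Buildah installed; DRY_RUN=0 runs them.
DRY_RUN=${DRY_RUN:-1}

bld() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "buildah $*"
    else
        buildah "$@"
    fi
}

# Hypothetical inputs: packages to install and an optional debug flag.
PACKAGES="httpd mod_ssl"
WITH_DEBUG_TOOLS=${WITH_DEBUG_TOOLS:-no}

if [ "$DRY_RUN" = "1" ]; then
    ctr=fedora-working-container   # placeholder name for the dry run
else
    ctr=$(buildah from fedora)
fi

# Loop: one "buildah run" invocation per package.
for pkg in $PACKAGES; do
    bld run "$ctr" -- dnf install -y "$pkg"
done

# Conditional: add extra tooling only when requested.
if [ "$WITH_DEBUG_TOOLS" = "yes" ]; then
    bld run "$ctr" -- dnf install -y strace
fi

bld commit "$ctr" myhttpd
```

Expressing the same variability in a Dockerfile would typically require build arguments or separate files, while here it is ordinary shell logic.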

If teams need to preserve the build logic implemented in previous Dockerfiles, Buildah offers the buildah build (or its alias, buildah bud) command, which builds the image reading from the provided Dockerfile/Containerfile.

Buildah can smoothly run in rootless mode to build images; this is a valuable, highly demanded feature from a security point of view. No Unix sockets are necessary to run a build. At the beginning of this chapter, we explained how builds are always based on containers; Buildah is not exempt from this behavior, and all its builds are executed inside working containers, starting on top of the base image.

The following list provides a non-exhaustive description of the most frequently used commands in Buildah:

  • buildah from: Initializes a new working container on top of a base image. It accepts the buildah from [options] <image> syntax. An example of this command is $ buildah from fedora.
  • buildah run: This is equivalent to the RUN instruction of a Dockerfile; it runs a command inside a working container. This command accepts the buildah run [options] [--] <container> <command> syntax. The -- (double dash) option is necessary to separate potential options from the effective container command. An example of this command is buildah run <containerID> -- dnf install -y nginx.
  • buildah config: This command configures image metadata. It accepts the buildah config [options] <container> format. The options available for this command are associated with the various Dockerfile instructions that do not modify filesystem layers but set some container metadata – for instance, the setup of the container entrypoint. An example of this command is buildah config --entrypoint /entrypoint.sh <containerID>.
  • buildah add: This is equivalent to the ADD instruction of the Dockerfile; it adds files, directories, and even URLs to the container. It supports the buildah add [options] <container> <src> [[<src> …] <dst>] syntax and allows you to copy multiple files in one single command. An example of this command is buildah add <containerID> index.php /var/www/html.
  • buildah copy: This is the same as the Dockerfile COPY instruction; it adds files, URLs, and directories to the container. It supports the buildah copy [options] <container> <src> [[<src> …] <dst>] syntax. An example of this command is buildah copy <containerID> entrypoint.sh /.
  • buildah commit: This commits a final image out of a working container. This command is usually the last executed one. It supports the buildah commit [options] <container> <image_name> syntax. The container image created from this command can be later tagged and pushed to a registry. An example of this command is buildah commit <containerID> myhttpd.
  • buildah build: The equivalent command of the classic Podman build. This command takes Dockerfiles or Containerfiles as arguments, along with the build directory path. It accepts the buildah build [options] [context] syntax and the buildah bud command alias. An example of this command is buildah build -t <imageName> . (the trailing dot sets the current directory as the build context).
  • buildah containers: This lists the active working containers involved in Buildah builds, along with the base image used as starting point. Equivalent commands are buildah ls and buildah ps. The supported syntax is buildah containers [options]. An example of this command is buildah containers.
  • buildah rm: This is used to remove working containers. The buildah delete command is equivalent. The supported syntax is buildah rm <container>. This command has only one option, --all (-a), to remove all the working containers. An example of this command is buildah rm <containerID>.
  • buildah mount: This command can be used to mount a working container root filesystem. The accepted syntax is buildah mount [containerID … ]. When no argument is passed, the command only shows the currently mounted containers. An example of this command is buildah mount <containerID>.
  • buildah images: This lists all the available images in the local host cache. The accepted syntax is buildah images [options] [image]. Custom output formats such as JSON are available. An example of this command is buildah images --json.
  • buildah tag: This applies a custom name and tags to an image in the local store. The syntax follows the buildah tag <name> <new-name> format. An example of this command is buildah tag myapp quay.io/packt/myapp:latest.
  • buildah push: This pushes a local image to a remote private or public registry, or local directories in Docker or OCI format. This command offers greater flexibility when compared to equivalents in Podman or Docker. The command syntax is buildah push [options] <image> [destination]. Examples of this command include buildah push quay.io/packt/myapp:latest, buildah push <imageID> docker://<URL>/repository:tag, and buildah push <imageID> oci:</path/to/dir>:image:tag.
  • buildah pull: This pulls an image from a registry, an OCI archive, or directory. Syntax includes buildah pull [options] <image>. Examples of this command include buildah pull <imageName>, buildah pull docker://<URL>/repository:tag, and buildah pull dir:</path/to/dir>.

All the commands described previously have their corresponding man page, with the man buildah-<command> pattern. For example, to read documentation details about the buildah run command, just type man buildah-run on the terminal.
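As a quick reference when porting a Dockerfile to a script, the correspondence between instructions and Buildah commands can be captured in a lookup function. The buildah_equivalent helper is hypothetical and covers only a subset of instructions; the buildah config option names should be double-checked against man buildah-config for the installed version.

```shell
#!/bin/sh
# Hypothetical lookup: map a Dockerfile instruction to the Buildah command
# (or buildah config option) that plays the same role.
buildah_equivalent() {
    case "$1" in
        FROM)       echo "buildah from" ;;
        RUN)        echo "buildah run" ;;
        COPY)       echo "buildah copy" ;;
        ADD)        echo "buildah add" ;;
        CMD)        echo "buildah config --cmd" ;;
        ENTRYPOINT) echo "buildah config --entrypoint" ;;
        ENV)        echo "buildah config --env" ;;
        LABEL)      echo "buildah config --label" ;;
        EXPOSE)     echo "buildah config --port" ;;
        USER)       echo "buildah config --user" ;;
        WORKDIR)    echo "buildah config --workingdir" ;;
        VOLUME)     echo "buildah config --volume" ;;
        *)          echo "not covered by this helper: $1"; return 1 ;;
    esac
}

for instruction in FROM RUN COPY ENTRYPOINT CMD EXPOSE; do
    printf '%-10s -> %s\n' "$instruction" "$(buildah_equivalent "$instruction")"
done
```

Instructions that change the filesystem map to dedicated commands, while metadata-only instructions map to options of buildah config.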

The next example shows basic Buildah capabilities. A Fedora base image is customized to run an httpd process:

$ container=$(buildah from fedora)

$ buildah run $container -- sh -c 'dnf install -y httpd && dnf clean all'

$ buildah config --cmd "httpd -DFOREGROUND" $container

$ buildah config --port 80 $container

$ buildah commit $container myhttpd

$ buildah tag myhttpd registry.example.com/myhttpd:v0.0.1

The preceding commands will produce an OCI-compliant, portable image with the same features of an image built from a Dockerfile, all in a few lines that can be included in a simple script.
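When a single buildah run step needs to chain several commands, it helps to remember that the host shell interprets ; and && before Buildah ever sees them. The sketch below makes this visible with fake_run, a hypothetical stand-in for buildah run that only reports the command it received.

```shell
#!/bin/sh
# Stand-in for "buildah run": it reports what it was asked to execute, so
# the effect of host-shell parsing is visible without Buildah installed.
fake_run() {
    shift 2                      # drop the container name and the "--"
    echo "in container: $*"
}

# The host shell splits at ";": only the first command reaches fake_run,
# and the second one is executed directly on the host.
split=$(fake_run ctr -- echo install; echo clean)

# Wrapping the chain in "sh -c" hands the whole string to the container.
wrapped=$(fake_run ctr -- sh -c 'echo install && echo clean')

echo "$split"
echo "$wrapped"
```

In the first case the word clean is printed by the host shell itself, which is why chained commands are usually wrapped in sh -c or split into separate buildah run invocations.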

We will now focus on the first command:

$ container=$(buildah from fedora)

The buildah from command pulls a Fedora image from one of the allowed registries and spins up a working container from it, returning the container name. Instead of simply having it printed on standard output, we capture the name with shell command substitution. From now on, we can pass the $container variable, which holds the name of the generated container, to the subsequent commands. Therefore, the build commands will be executed inside this working container. This is quite a common pattern and is especially useful to automate Buildah commands in scripts.

Important Note

There is a subtle difference between the concept of container in Buildah and Podman. Both adopt the same technology to create containers, but Buildah containers are short-lived entities that are created to be modified and committed, while Podman containers are supposed to run long-living workloads.

The flexible and embeddable nature of this approach is remarkable – Buildah commands can be included anywhere, and users can choose between a fully automated build process and a more interactive one.

For example, Buildah can be easily integrated with Ansible, the open source automation engine, to provide automated builds using native connection plugins that enable communication with working containers.

You can choose to include Buildah inside a CI pipeline (such as Jenkins, Tekton, or GitLab CI/CD) to gain full control of the build and integration tasks.

Buildah is also included in larger projects of the cloud-native community, such as the Shipwright project (https://github.com/shipwright-io/build).

Shipwright is an extensible build framework for Kubernetes that provides the flexibility of customizing image builds using custom resource definitions and different build tools. Buildah is one of the available solutions that you can choose when designing your build processes with it.

We will see more detailed and richer examples in the next subsections. Now that we have seen an overview of Buildah's capabilities and use cases, let's dive into the installation and environment preparation steps.

Preparing our environment

Buildah is available on different distributions and can be installed using the respective package managers. This section provides a non-exhaustive list of installation examples on the major distributions. For the sake of clarity, it is important to reiterate that the book lab environments were all based on Fedora 34:

  • Fedora: To install Buildah on Fedora, run the following dnf command:

    $ sudo dnf -y install buildah

  • Debian: To install Buildah on Debian Bullseye or later, run the following apt-get commands:

    $ sudo apt-get update

    $ sudo apt-get -y install buildah

  • CentOS: To install Buildah on CentOS, run the following yum command:

    $ sudo yum install -y buildah

  • RHEL8: To install Buildah on RHEL8, run the following yum module commands:

    $ sudo yum module enable -y container-tools:1.0

    $ sudo yum module install -y buildah

  • RHEL7: To install Buildah on RHEL7, enable the rhel-7-server-extras-rpms repository and install with yum:

    $ sudo subscription-manager repos --enable=rhel-7-server-extras-rpms

    $ sudo yum -y install buildah

  • Arch Linux: To install Buildah on Arch Linux, run the following pacman command:

    $ sudo pacman -S buildah

  • Ubuntu: To install Buildah on Ubuntu 20.10 or later, run the following apt-get commands:

    $ sudo apt-get -y update

    $ sudo apt-get -y install buildah

  • Gentoo: To install Buildah on Gentoo, run the following emerge command:

    $ sudo emerge app-emulation/libpod

  • Build from source: Buildah can also be built from the source. For the purpose of this book, we will keep the focus on simple deployment methods, but if you're curious, you will find the following guide useful to try out your own builds: https://github.com/containers/buildah/blob/main/install.md#building-from-scratch.
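When the same setup script has to run on several distributions, the list above can be turned into a small lookup. The following is only a sketch: the function name buildah_install_hint is hypothetical, it reads the standard ID field of an os-release file, and it covers only the distributions whose installation is a plain package manager call (RHEL and Gentoo are left out on purpose):

```shell
# Print the install command for the distribution described in an os-release file.
# Hypothetical helper: the mapping mirrors part of the list above.
buildah_install_hint() {
    local os_release="${1:-/etc/os-release}"
    local distro_id

    # Source the file in a subshell so that its variables do not leak.
    distro_id=$(. "$os_release" && printf '%s' "$ID")

    case "$distro_id" in
        fedora)        echo "sudo dnf -y install buildah" ;;
        debian|ubuntu) echo "sudo apt-get update && sudo apt-get -y install buildah" ;;
        centos)        echo "sudo yum install -y buildah" ;;
        arch)          echo "sudo pacman -S buildah" ;;
        *)             echo "no install hint for '$distro_id'" >&2; return 1 ;;
    esac
}
```

Calling buildah_install_hint without arguments inspects /etc/os-release on the current host; the command is printed rather than executed so that it can be reviewed first.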

Finally, Buildah can be deployed as a container, and builds can be executed inside it with a nested approach. This process will be covered in greater detail in Chapter 7, Integrating with Existing Application Build Processes.

After installing Buildah on our host, we can move on to verifying our installation.

Verifying the installation

After installing Buildah, we can now run some basic test commands to verify the installation.

To see all the available images in the host local store, use the following commands:

$ buildah images

# buildah images

The image list will be the same as the one printed by the podman images command since they share the same local store.

Also note that the two commands are executed as an unprivileged user and as root, pointing respectively to the user rootless local store and the system-wide local store.
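To make the distinction between the two stores concrete, the following sketch prints the default store location for a given user. It assumes the stock storage.conf defaults (a customized storage.conf overrides them), and the function name default_graphroot is hypothetical:

```shell
# Print the default local store location for a numeric user ID.
# Assumption: stock defaults, that is, no custom graphroot in storage.conf.
default_graphroot() {
    local uid="$1" home="$2"

    if [ "$uid" -eq 0 ]; then
        # System-wide store used when running as root.
        echo "/var/lib/containers/storage"
    else
        # Rootless store, located under the user's data directory.
        echo "${XDG_DATA_HOME:-$home/.local/share}/containers/storage"
    fi
}
```

For example, default_graphroot "$(id -u)" "$HOME" shows where the images listed by buildah images are expected to live for the current user.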

We can run a simple test build to verify the installation. This is a good chance to test a basic build script whose only purpose is to verify whether Buildah is able to fully run a complete build.

For the purpose of this book (and for fun), we have created the following simple test script that creates a minimal Python 3 image:

#!/bin/bash

# Minimal Buildah smoke test: every failing step aborts the script,

# so the final success message is printed only when all steps worked.

BASE_IMAGE=alpine

TARGET_IMAGE=python3-minimal

if [ "$UID" != 0 ]; then

    echo "### Running build test as unprivileged user"

else

    echo "### Running build test as root"

fi

echo "### Testing container creation"

container=$(buildah from "$BASE_IMAGE")

if [ $? -ne 0 ]; then

    echo "Error initializing working container"

    exit 1

fi

echo "### Testing run command"

buildah run "$container" apk add --update python3 py3-pip

if [ $? -ne 0 ]; then

    echo "Error on run build action"

    exit 1

fi

echo "### Testing image commit"

buildah commit "$container" "$TARGET_IMAGE"

if [ $? -ne 0 ]; then

    echo "Error committing final image"

    exit 1

fi

echo "### Removing working container"

buildah rm "$container"

if [ $? -ne 0 ]; then

    echo "Error removing working container"

    exit 1

fi

echo "### Build test completed successfully!"

exit 0

The same test script can be executed by a non-privileged user and by root.

We can verify the newly built image by running a simple container that executes a Python shell:

$ podman run -it python3-minimal /usr/bin/python3

Python 3.9.5 (default, May 12 2021, 20:44:22)

[GCC 10.3.1 20210424] on linux

Type "help", "copyright", "credits" or "license" for more information.

>>>

After successfully testing our new Buildah installation, let's inspect the main configuration files used by Buildah.

Buildah configuration files

The main Buildah configuration files are the same ones used by Podman. They can be leveraged to customize the behavior of the working containers executed in builds.

On Fedora, these config files are installed by the containers-common package, and we already covered them in the Prepare your environment section in Chapter 3, Running the First Container.

The main config files used by Buildah are as follows:

  • /usr/share/containers/mounts.conf: This config file defines the files and directories that are automatically mounted inside a Buildah working container.
  • /etc/containers/registries.conf: This config file has the role of managing registries allowed to be accessed for image searches, pulls, and pushes.
  • /etc/containers/policy.json: This JSON config file defines image signature verification behavior.
  • /usr/share/containers/seccomp.json: This JSON config file defines the allowed and prohibited syscalls to a containerized process.
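A quick way to confirm that these files are present on a host is to loop over their paths. The helper below is a minimal sketch with a hypothetical name (check_config_files); it only tests whether each path is readable and makes no claim about the validity of the file contents:

```shell
# Report which of the given configuration files are readable on this host.
# Returns a non-zero status when at least one of them is missing.
check_config_files() {
    local path missing=0

    for path in "$@"; do
        if [ -r "$path" ]; then
            echo "found:   $path"
        else
            echo "missing: $path"
            missing=$((missing + 1))
        fi
    done

    [ "$missing" -eq 0 ]
}
```

For example, check_config_files /etc/containers/registries.conf /usr/share/containers/seccomp.json reports on two of the files from the list.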

In this section, we have learned how to prepare the host environment to run Buildah. In the next section, we are going to identify the possible build strategies that can be implemented with Buildah.

Choosing our build strategy

There are basically three types of build strategies that we can use with Buildah:

  • Building a container image starting from an existing base image
  • Building a container image starting from scratch
  • Building a container image starting from a Dockerfile

We have already provided an example of the build strategy from an existing base image in the Meet Buildah, Podman's companion tool for builds section. Since this strategy is pretty similar, from a workflow point of view, to building from scratch, we will focus our practical examples on the latter, which provides great flexibility to create small-footprint, secure images.

Before going through the various technical details in the next section, let's start exploring all these strategies at a high level.

Even though we can find a lot of prebuilt container images available on the most popular public container registries, sometimes we might not be able to find a particular configuration, setup, or bundle of tools and services for our containers; that is why container image creation becomes a really important step that we need to practice.

Also, security constraints often require us to implement images with reduced attack surfaces, and therefore, DevOps teams must know how to customize every step of the build process to achieve this result.

With this awareness in mind, let's start with the first build strategy.

Building a container image starting from an existing base image

Let's imagine finding a well-done prebuilt container image for our favorite application server that our company is widely using. All the configurations for this container image are okay, and we can attach storage to the right mount points to persist the data and so on, but sooner or later, we may realize that some particular tools that we use for troubleshooting are missing in the container image, or that some libraries are missing that should be included!

In another scenario, we could be happy with the prebuilt image but still need to add custom contents to it – for example, the customer application.

What would be the solution in those cases?

In this first use case, we can extend the existing container image, adding stuff and editing the existing files to suit our purposes. In the previous basic examples, Fedora and Alpine images were customized to serve different purposes. Those images were generic OS filesystems with no specific purpose, but the same concept can be applied to a more complex image.

In the second use case, we can customize a prebuilt application image – for example, the official httpd image. We can install PHP modules and then add our application's PHP files, producing a new image with our custom contents already built in.
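Both use cases follow the same pattern: create a working container from the existing image, optionally run a command inside it, copy in the custom content, and commit. The function below is a hedged sketch of that pattern; extend_image and its parameters are hypothetical names, and the optional command must match the package manager available in the chosen base image:

```shell
# Hypothetical helper: extend an existing image with extra packages and content.
# Usage: extend_image BASE TARGET CONTENT_DIR DEST_DIR [COMMAND...]
extend_image() {
    local base_image="$1" target_image="$2" content_dir="$3" dest_dir="$4"
    shift 4
    local container status=0

    container=$(buildah from "$base_image") || return 1

    # Optional step: remaining arguments are executed inside the working
    # container, for example a package manager call that adds missing tools.
    if [ "$#" -gt 0 ]; then
        buildah run "$container" -- "$@" || status=1
    fi

    # Add the custom content, then commit the result as a new image.
    if [ "$status" -eq 0 ]; then
        buildah copy "$container" "$content_dir" "$dest_dir" &&
            buildah commit "$container" "$target_image" || status=1
    fi

    # The working container is no longer needed, whatever the outcome.
    buildah rm "$container" >/dev/null
    return "$status"
}
```

A call such as extend_image fedora my-app ./app /var/www/html dnf -y install httpd php would install the packages, copy the application files, and commit the my-app image; the image names and paths in this call are illustrative only.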

We will see in the next sections how we can extend an existing container image.

Let's move on to the second strategy.

Building a container image starting from scratch

The previous strategy would be enough for many common situations, where we can find a prebuilt image to start working with, but sometimes it may be that the particular use case, application, or service that we want to containerize is not so common or widely used.

Imagine having a custom legacy application that requires some old libraries and tools that are no longer included on the latest Linux distribution or that may have been replaced by more recent ones. In this scenario, you might need to start from an empty container image and add piece by piece all the necessary stuff for your legacy application.

We have learned in this chapter that, actually, we will always start from a sort of initial container image, so this strategy and the previous one are pretty much the same.

Let's move on to the third and final strategy.

Building a container image starting from a Dockerfile

In Chapter 1, Introduction to Container Technology, we talked about container technology history and how Docker gained momentum in that context. Podman was born as an alternative that builds on the concepts Docker helped to establish. One of the great innovations that Docker introduced is, for sure, the Dockerfile.

Looking at this strategy at a high level, we can affirm that even when using a Dockerfile, we end up with one of the previous build strategies. This is not far from reality because, under the hood, Buildah parses the Dockerfile and builds the image through the same working containers that we briefly introduced for the previous build strategies.

So, in summary, are there any differences or advantages we need to consider when choosing our default build strategy? Obviously, there is no ultimate answer to this question. First of all, we should always look into the container communities, searching for a prebuilt image that could help our build process; failing that, we can always fall back on the build from scratch process. Last but not least, we can consider a Dockerfile for easily distributing and sharing our build steps with our development group or the wider container communities.

This concludes our quick high-level introduction; we can now move on to the practical examples!

Building images from scratch

Before going into the details of this section and learning how to build a container image from scratch, let's make some tests to verify that the installed Buildah is working properly.

First of all, let's check whether our Buildah image cache is empty:

# buildah images

REPOSITORY   TAG   IMAGE ID   CREATED   SIZE

# buildah containers -a

CONTAINER ID  BUILDER  IMAGE ID     IMAGE NAME                       CONTAINER NAME

Important Note

Podman and Buildah share the same container storage; for this reason, if you previously ran any other example shown in this chapter or book, you will find that your container storage cache is not that empty!

As we learned in the previous section, Buildah prints the name of the just-created working container, so a script can easily store it in a shell variable and reuse it when needed. In this walkthrough, we will simply refer to the container by the name that Buildah assigns to it, working-container. Let's create a brand-new container from scratch:

# buildah from scratch

# buildah images

REPOSITORY   TAG   IMAGE ID   CREATED   SIZE

# buildah containers

CONTAINER ID  BUILDER  IMAGE ID     IMAGE NAME                       CONTAINER NAME

af69b9547db9     *                  scratch                          working-container

As you can see, we used the special from scratch keywords, which tell Buildah to create an empty container with no data inside it. If we run the buildah images command, we will note that this special image is not listed.

Let's check whether the container really is empty:

# buildah run working-container bash

2021-10-26T20:15:49.000397390Z: executable file 'bash' not found in $PATH: No such file or directory

error running container: error from crun creating container for [bash]: : exit status 1

error while running runtime: exit status 1

No executable was found in our empty container – what a surprise! The reason is that the working container has been created on an empty filesystem.

Let's see how we can easily fill this empty container. In the following example, we will interact directly with the underlying storage, using the package manager of our host system to install the binaries and the libraries needed for running a bash shell in our container image.

First of all, let's instruct Buildah to mount the container storage and check where it resides:

# buildah mount working-container

/var/lib/containers/storage/overlay/b5034cc80252b6f4af2155f9e0a2a7e65b77dadec7217bd2442084b1f4449c1a/merged

Good to Know

If you start the build in rootless mode, Buildah will run the mount in a different namespace, and for this reason, the mounted volume might not be accessible from the host when using a driver different than vfs.
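For rootless builds, the buildah unshare command starts a process inside the user namespace used by rootless Buildah, so a mount performed there is visible to that process. The following is a minimal sketch of this approach; the function name rootless_ls is hypothetical:

```shell
# List a path inside a working container's filesystem from a rootless session.
# The mount, the listing, and the unmount all happen inside the same
# user namespace, entered through buildah unshare.
rootless_ls() {
    local container="$1" path="${2:-/}"

    buildah unshare sh -c '
        mnt=$(buildah mount "$1") || exit 1
        ls "$mnt$2"
        buildah unmount "$1" >/dev/null
    ' sh "$container" "$path"
}
```

For example, rootless_ls working-container /etc lists the /etc directory of the working container when run as an unprivileged user.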

Great! Now that we have found it, we can leverage the host package manager to install all the needed packages in this root folder, which will be the root path of our container image:

# scratchmount=$(buildah mount working-container)

# dnf install --installroot $scratchmount --releasever 34 bash coreutils --setopt install_weak_deps=false -y

Important Note

If you are running the previous command on a Fedora release other than version 34 (for example, version 35), then you need to import the GPG public keys of Fedora 34 or use the --nogpgcheck option.

First of all, we save the very long directory path in a shell variable. We then execute the dnf package manager, passing the just-obtained directory path as the install root directory, setting the release version of our Fedora OS, specifying the packages that we want to install (bash and coreutils), disabling weak dependencies, and finally, answering yes to all the prompts with the -y option.

The command should end up with a Complete! statement; once done, let's try again with the same command that we saw failing earlier in this section:

# buildah run working-container bash

bash-5.1# cat /etc/fedora-release

Fedora release 34 (Thirty Four)

It worked! We just installed a Bash shell in our empty container. Let's see now how to finish our image creation with some other configuration steps. First of all, we need to add to our final container image a command to be run once it is up and running. For this reason, we will create a Bash script file with some basic commands:

# cat command.sh

#!/bin/bash

cat /etc/fedora-release

/usr/bin/date

We have created a Bash script file that prints the Fedora release of the container and the system date. The file must have execute permissions before being copied:

# chmod +x command.sh

Now that we have filled up our underlying container storage with all the needed base packages, we can unmount the working-container storage and use the buildah copy command to inject files from the host to the container:

# buildah unmount working-container

af69b9547db93a7dc09b96a39bf5f7bc614a7ebd29435205d358e09ac99857bc

# buildah copy working-container ./command.sh /usr/bin

659a229354bdef3f9104208d5812c51a77b2377afa5ac819e3c3a1a2887eb9f7

The buildah copy command gives us the ability to work with the underlying storage without worrying about mounting it or handling it under the hood.

We are now ready to complete our container image by adding some metadata to it:

# buildah config --cmd /usr/bin/command.sh working-container

# buildah config --created-by "podman book example" working-container

# buildah config --label name=fedora-date working-container

We started with the cmd option, and after that, we added some descriptive metadata. We can finally commit our working-container into an image!

# buildah commit working-container fedora-date

Getting image source signatures

Copying blob 939ac17066d4 done  

Copying config e24a2fafde done  

Writing manifest to image destination

Storing signatures

e24a2fafdeb5658992dcea9903f0640631ac444271ed716d7f749eea7a651487

Let's clean up the environment by removing the working container:

# buildah rm working-container

af69b9547db93a7dc09b96a39bf5f7bc614a7ebd29435205d358e09ac99857bc

We can now list the images available on the host and inspect the details of the just-created container image:

# podman images

REPOSITORY             TAG         IMAGE ID      CREATED             SIZE

localhost/fedora-date  latest      e24a2fafdeb5  About a minute ago  366 MB

# podman inspect localhost/fedora-date:latest

[...omitted output]        "Labels": {

            "io.buildah.version": "1.23.1",

            "name": "fedora-date"

        },

        "Annotations": {

            "org.opencontainers.image.base.digest": "",

            "org.opencontainers.image.base.name": ""

        },

        "ManifestType": "application/vnd.oci.image.manifest.v1+json",

        "User": "",

        "History": [

            {

                "created": "2021-10-26T21:16:48.777712056Z",

                "created_by": "podman book example"

            }

        ],

        "NamesHistory": [

            "localhost/fedora-date:latest"

        ]

    }

]

As we can see from the previous output, the container image has a lot of metadata that can tell us many details. We set some of these fields through the previous commands, such as created_by, name, and Cmd; the other fields are populated automatically by Buildah.
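When only one field is needed, the inspect output can be filtered with a Go template instead of reading the whole JSON document. This is a small sketch under the assumption that the label is exposed through the top-level Labels map, as in the output above; image_label is a hypothetical helper name:

```shell
# Print a single label of a local image using a Go template.
image_label() {
    local image="$1" label="$2"

    podman image inspect --format "{{ index .Labels \"$label\" }}" "$image"
}
```

For example, image_label localhost/fedora-date name queries the name label that we set earlier with buildah config --label.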

Finally, let's run our brand-new container image with Podman!

# podman run -ti localhost/fedora-date:latest

Fedora release 34 (Thirty Four)

Tue Oct 26 21:18:29 UTC 2021

This ends our journey in creating a container image from scratch. As we saw, this is not a typical method for creating a container image; in many scenarios and for various use cases, it can be enough to start with an OS base image, such as from fedora or from alpine, and then add the required packages, using the respective package managers available in those images.
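The steps walked through in this section can be collected into a single function for reuse. This is only a sketch, not a hardened build script: the function name build_fedora_date is hypothetical, it assumes a root session on a host that provides dnf, and on failure it leaves the working container in place for inspection:

```shell
# Sketch of the from-scratch workflow: empty container, host-side package
# installation, entry script, metadata, commit, and cleanup.
build_fedora_date() {
    local releasever="${1:-34}" script="${2:-./command.sh}"
    local container mountpoint

    container=$(buildah from scratch) || return 1
    mountpoint=$(buildah mount "$container") || return 1

    # Populate the empty root filesystem with the host package manager.
    dnf install -y --installroot "$mountpoint" --releasever "$releasever" \
        --setopt install_weak_deps=false bash coreutils || return 1
    buildah unmount "$container" >/dev/null

    # Inject the entry script, set the image metadata, and commit.
    buildah copy "$container" "$script" /usr/bin &&
        buildah config --cmd "/usr/bin/$(basename "$script")" \
            --label name=fedora-date "$container" &&
        buildah commit "$container" fedora-date || return 1

    buildah rm "$container" >/dev/null
}
```

Running build_fedora_date 34 ./command.sh reproduces the image built step by step in this section, provided that command.sh is executable.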

Good to Know

Some Linux distributions also provide base container images in a minimal flavor (for example, fedora-minimal) that reduce the number of packages installed, as well as the size of the target container image. For more information, refer to https://www.docker.com/ and https://quay.io/!

Let's now inspect how to build images from Dockerfiles with Buildah.

Building images from a Dockerfile

As we described earlier in this chapter, the Dockerfile can be an easy option to create and share the build steps for creating a container image, and for this reason, it is really easy to find a lot of source Dockerfiles on the net.

The first step of this activity is to build a simple Dockerfile to work with. Let's create a Dockerfile for creating a containerized web server:

# mkdir webserver

# cd webserver/

[webserver]# vi Dockerfile

[webserver]# cat Dockerfile

# Start from latest fedora container base image

FROM fedora:latest

MAINTAINER podman-book  # this should be an email

# Update the container base image

RUN echo "Updating all fedora packages"; dnf -y update; dnf -y clean all

# Install the httpd package

RUN echo "Installing httpd"; dnf -y install httpd

# Expose the http port 80

EXPOSE 80

# Set the default command to run once the container will be started

CMD ["/usr/sbin/httpd", "-DFOREGROUND"]

Looking at the previous output, we first created a new directory, and inside, we created a text file named Dockerfile. After that, we inserted the various keywords and steps commonly used in the definition of a brand-new Dockerfile; every step and keyword has a dedicated description comment on top, so the file should be easy to read.

Just to recap, these are the steps contained in our brand-new Dockerfile:

  1. Start from the latest Fedora container base image.
  2. Update all the packages for the container base image.
  3. Install the httpd package.
  4. Expose HTTP port 80.
  5. Set the default command to run once the container is started.
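To see how these steps relate to the strategies described earlier, the same recipe can be written with native Buildah commands. The sketch below is an illustration of the equivalence, not a description of how buildah build is implemented internally; build_webserver is a hypothetical name, and the author value is taken from the MAINTAINER line without its trailing comment:

```shell
# Native Buildah counterpart of the Dockerfile steps listed above.
build_webserver() {
    local target_image="${1:-myhttpdservice}"
    local container status=0

    # FROM fedora:latest
    container=$(buildah from fedora:latest) || return 1

    # MAINTAINER, RUN, EXPOSE, and CMD, in the same order as the Dockerfile.
    buildah config --author "podman-book" "$container" &&
        buildah run "$container" -- dnf -y update &&
        buildah run "$container" -- dnf -y clean all &&
        buildah run "$container" -- dnf -y install httpd &&
        buildah config --port 80 "$container" &&
        buildah config --cmd "/usr/sbin/httpd -DFOREGROUND" "$container" &&
        buildah commit "$container" "$target_image" || status=1

    # Remove the working container whether or not the build succeeded.
    buildah rm "$container" >/dev/null
    return "$status"
}
```

The Dockerfile remains the more portable way to share these steps; the function form is mainly useful when the build needs shell logic between steps.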

As seen previously in this chapter, Buildah provides a dedicated buildah build command to start a build from a Dockerfile.

Let's see how it works:

[webserver]# buildah build -f Dockerfile -t myhttpdservice .

STEP 1/6: FROM fedora:latest

Resolved "fedora" as an alias (/etc/containers/registries.conf.d/000-shortnames.conf)

Trying to pull registry.fedoraproject.org/fedora:latest...

Getting image source signatures

Copying blob 944c4b241113 done  

Copying config 191682d672 done  

Writing manifest to image destination

Storing signatures

STEP 2/6: MAINTAINER podman-book  # this should be an email

STEP 3/6: RUN echo "Updating all fedora packages"; dnf -y update; dnf -y clean all

Updating all fedora packages

Fedora 34 - x86_64                               16 MB/s |  74 MB     00:04  

...

STEP 4/6: RUN echo "Installing httpd"; dnf -y install httpd

Installing httpd

Fedora 34 - x86_64                               20 MB/s |  74 MB     00:03    

...

STEP 5/6: EXPOSE 80

STEP 6/6: CMD ["/usr/sbin/httpd", "-DFOREGROUND"]

COMMIT myhttpdservice

Getting image source signatures

Copying blob 7500ce202ad6 skipped: already exists  

Copying blob 51b52d291273 done  

Copying config 14a2226710 done  

Writing manifest to image destination

Storing signatures

--> 14a2226710e

Successfully tagged localhost/myhttpdservice:latest

14a2226710e7e18d2e4b6478e09a9f55e60e0666dd8243322402ecf6fd1eaa0d

As we can see from the previous output, we pass the following options to the buildah build command:

  • -f: To define the name of the Dockerfile. The default filename is Dockerfile, so in our case, we can omit this option because we named the file as the default one.
  • -t: To define the name and the tag of the image we are building. In our case, we are only defining the name. The image will be tagged latest by default.
  • Finally, as the last argument, we need to set the directory where Buildah needs to work and search for the Dockerfile. In our case, we are passing the current directory (.).
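The default tagging rule described for the -t option can be modeled in a few lines of shell. The helper below is a hypothetical illustration (with_default_tag is not a Buildah command) and covers only the tag, not the localhost/ prefix that Buildah adds to unqualified names:

```shell
# Append ":latest" when an image reference carries neither a tag nor a digest.
with_default_tag() {
    local ref="$1"
    # Only the last path component can hold a tag; a colon earlier in the
    # reference is a registry port (for example, localhost:5000/app).
    local last="${ref##*/}"

    case "$last" in
        *:*|*@*) echo "$ref" ;;
        *)       echo "$ref:latest" ;;
    esac
}
```

With this helper, with_default_tag myhttpdservice prints myhttpdservice:latest, while with_default_tag myhttpdservice:1.0 prints the reference unchanged.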

Of course, these are not the only options that Buildah gives us to configure the build; we will see some of them later in this section.

Coming back to the command we just executed, as we can see from the output, all the steps defined in the Dockerfile have been executed in the exact written order and printed with a given fractional number to show the intermediate steps against the total number. In total, six steps were executed.

We can check the result of our command by listing the images with the buildah images command:

[webserver]# buildah images

REPOSITORY                                  TAG      IMAGE ID       CREATED          SIZE

localhost/myhttpdservice                    latest   14a2226710e7   2 minutes ago   497 MB

As we can see, our container image has just been created with the latest tag; let's try to run it:

# podman run -d localhost/myhttpdservice:latest

133584ab526faaf7af958da590e14dd533256b60c10f08acba6c1209ca05a885

# podman logs 133584ab526faaf7af958da590e14dd533256b60c10f08acba6c1209ca05a885

AH00558: httpd: Could not reliably determine the server's fully qualified domain name, using 10.88.0.4. Set the 'ServerName' directive globally to suppress this message

# curl 10.88.0.4

<!doctype html>

<html>

  <head>

    <meta charset='utf-8'>

    <meta name='viewport' content='width=device-width, initial-scale=1'>

    <title>Test Page for the HTTP Server on Fedora</title>

    <style type="text/css">

...

Looking at the output, we just ran our container in detached mode; after that, we inspected the logs to find out the IP address that we need to pass as an argument for the curl test command.
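Reading the address from the httpd warning works here, but it depends on that log message being printed. As an alternative sketch, the address can be queried from the container metadata; this assumes a rootful container attached to the default network, and container_ip is a hypothetical helper name:

```shell
# Print the IP address assigned to a container on the default network.
container_ip() {
    podman inspect --format '{{.NetworkSettings.IPAddress}}' "$1"
}
```

The result can then be passed straight to the test command, for example curl "http://$(container_ip 133584ab526f)", using the ID (or a unique prefix of it) returned by podman run.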

We just ran the container as the root user on our workstation, and the container received an internal IP address on Podman's container network interface. We can check that the IP address is part of that network by running the following commands:

# ip a show dev cni-podman0

14: cni-podman0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000

    link/ether c6:bc:ba:7c:d3:0c brd ff:ff:ff:ff:ff:ff

    inet 10.88.0.1/16 brd 10.88.255.255 scope global cni-podman0

       valid_lft forever preferred_lft forever

    inet6 fe80::c4bc:baff:fe7c:d30c/64 scope link

       valid_lft forever preferred_lft forever

As we can see, the container's IP address, 10.88.0.4, belongs to the 10.88.0.0/16 network assigned to the cni-podman0 interface in the previous output.
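The membership check itself is simple arithmetic: both addresses are converted to 32-bit integers and compared after applying the network mask. The following sketch shows the calculation in plain shell; the function names are hypothetical, and the code assumes well-formed IPv4 input without leading zeros:

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<EOF
$1
EOF
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeed when the address belongs to the network given in CIDR notation.
ip_in_cidr() {
    local ip="$1" network="${2%/*}" prefix="${2#*/}"
    # A /16 prefix keeps the upper 16 bits and clears the lower 16.
    local mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))

    [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$network") & mask )) ]
}
```

With the values from the output, ip_in_cidr 10.88.0.4 10.88.0.1/16 succeeds because both addresses share the 10.88 prefix once the last 16 bits are masked out.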

As we anticipated, the buildah build command has a lot of other options that can be useful while developing and creating brand-new container images. Let's explore one of them that is worth mentioning – --layers.

We already learned how to use this option with Podman earlier in this chapter. Starting from version 1.2 of Buildah, the development team added this great option that gives us the ability to enable or disable the layers' caching mechanism. The default configuration sets the --layers option to false, which means that Buildah will not keep intermediate layers, resulting in a build that squashes all the changes in a single layer.

It is also possible to set the management of the layers with an environment variable – for example, to enable layer caching, run export BUILDAH_LAYERS=true.

Obviously, the downside of this option is that the retained layers actually use storage space on the system host, but on the other hand, we can save computational power if we need to rebuild a given image, changing only the latest layers and without rebuilding the whole image!
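As a short sketch of how this looks in practice, the option can be wrapped in a helper so that cached builds are used consistently during development. The function name build_cached is hypothetical:

```shell
# Build an image with intermediate layer caching enabled for this invocation.
build_cached() {
    local tag="$1" context="${2:-.}"

    buildah build --layers -t "$tag" "$context"
}
```

Running build_cached myhttpdservice twice in the webserver directory should let the second run reuse the cached layers for the steps that did not change, at the cost of the extra storage discussed above.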

Summary

In this chapter, we explored a fundamental topic of container management – their creation. This step is mandatory if we want to customize, keep updated, and manage our container infrastructure correctly. We learned that Podman is often partnered with another tool called Buildah that can help us in the process of container image building. This tool has a lot of options, like Podman, and shares a lot of them with it (storage included!). Finally, we went through the different strategies that Buildah offers us to build new container images, and one of them is actually inherited from the Docker ecosystem – the Dockerfile.

This chapter is only an introduction to the topic of container image building; we will discover more advanced techniques in the next chapter!

Further reading
