Mature deployment practices are crucial to building reliable and stable microservices. Unlike a monolithic application, where you can optimize deployment for a single use case, microservice deployment practices need to scale to multiple services, written in different languages, each with their own dependencies. You need to be able to trust your deployment process to push out new features — and new services — without harming overall availability or introducing critical defects.
As a microservice application evolves at the level of deployable units, the cost of deploying new services must be negligible to enable engineers to rapidly innovate and deliver value to users. The added development speed you gain from microservices will be wasted if you can’t get them to production rapidly and reliably. Automated deployments are essential to developing microservices at scale.
In this chapter, we’ll explore the components of a microservice production environment. Following that, we’ll look at some deployment building blocks — such as artifacts and rolling updates — and how they apply to microservices. Throughout the chapter, we’ll work with a simple service — market-data — to try out different approaches to packaging and deployment using a well-known cloud service, Google Cloud Platform. You can find a starting point for this service in the book’s repository on Github (https://github.com/morganjbruce/microservices-in-action).
Deployment is the riskiest moment in the lifecycle of a software system. The closest real-world equivalent would be changing a tire — except the car is still moving at 100 miles an hour. No company is immune to this risk: for example, Google’s site reliability team identified that roughly 70% of outages are due to changes in a live system (https://landing.google.com/sre/book/chapters/introduction.html).
Microservices drastically increase the number of moving parts in a system, which increases the complexity of deployment. You’ll face four challenges when deploying microservices (figure 8.1):
When you do them well, deployments are based on simplicity and predictability. A consistent build pipeline produces predictable artifacts, which you can apply atomically to a production environment.
In an ideal world, deployment is "boring": not unexciting, but incident-free. We've seen too many teams, working on monoliths and microservices alike, that find deploying software incredibly stressful. But if working with microservices means you're releasing more components more frequently, doesn't that mean you're introducing more risk and instability into a system?
Traditional change management methodologies attempt to reduce deployment risk by introducing governance and ceremony. Changes must go through numerous quality gates and formal approvals, usually human-driven. Although this is intended to ensure that only working code reaches production, this approach is costly to apply and doesn’t scale well to multiple services.
The larger a release, the higher the risk of introducing defects. Naturally, microservice releases are smaller because the codebases are smaller. And that’s the trick — by releasing smaller changes more often, you reduce the total impact of any single change. Rather than stopping everything for a deployment, you can design your services and deployment approaches with the expectation that they’ll face continuous change. Reducing the surface area of possible change leads to releases that are quicker, easier to monitor, and less disruptive to the smooth functioning of an application.
Even if your releases are smaller, you still need to make sure your change sets are as free from defects as possible. You can achieve this by automating the process of commit validation — unit tests, integration tests, linting, and so on — and the process of rollout — applying those changes in the production environment. This helps you to build systematic confidence in the code changes you’re making and apply consistent practices across multiple services.
Deployment is a combination of process and architecture:
Production environments for running microservices vary widely, as do monolith production environments. What’s appropriate for your application may depend on your organization’s existing infrastructure, technical capabilities, and attitude toward risk, as well as regulatory requirements.
The production environment for a microservice application needs to provide several capabilities to support the smooth operation of multiple services. Figure 8.2 gives a high-level view of the capabilities of the production environment.
A microservice production environment has six fundamental capabilities:
These components are part of the platform layer of the microservice architecture stack.
Along with the six fundamental features, two factors are key in assessing the suitability of a deployment platform for a microservice application:
Although you may not always have the luxury of choosing your deployment environment, it’s important to appreciate how different platforms might affect these characteristics and how you develop your microservice application. I once worked for a company that took six weeks to provision each new server. Suffice it to say that taking new services into production was an exhausting endeavor!
It’s not coincidental that the popularity of microservice architecture coincides with the wider adoption of DevOps practices, such as infrastructure as code, and the increasing use of cloud providers to run applications. These practices enable rapid iteration and deployment of services, which in turn makes a microservice architecture a scalable and feasible approach.
When possible, you should aim to use a public infrastructure-as-a-service (IaaS) cloud, such as Google Cloud Platform (GCP), AWS, or Microsoft Azure, for deploying any nontrivial microservice application. These providers offer a wide range of features and tools that ease the development of a robust microservice platform, and because they operate at a lower level of abstraction than a platform-as-a-service offering such as Heroku, they give you more flexibility. In the next section, we'll show you how to use GCP to deploy, access, and scale a microservice.
It’s time to get your hands dirty and deploy a service. You need to take your code, get it running on a virtual machine, and make it accessible from the outside world — as figure 8.3 illustrates.
You’ll use Google Compute Engine (GCE) as a production environment. This is a service on GCP that you can use to run virtual machines. You can sign up for a free trial GCP subscription, which will have enough credit for this chapter’s examples. Although the operations you’ll perform are specific to this platform, all major cloud providers, such as AWS and Azure, provide similar abstractions.
To interact with GCE, you'll use the gcloud command-line tool. This tool interacts with the GCE API to perform operations on your cloud account. You can find install instructions in the GCP documentation (https://cloud.google.com/sdk/docs/quickstarts). It's not the only option — you could use third-party tools like Ansible or Terraform instead.
Assuming you've followed the install instructions and logged in with gcloud init, you can create a new project:
gcloud projects create <project-id> --set-as-default --enable-cloud-apis
This project will contain the resources that'll run your service.
To run your service, you’ll use a startup script, which will be executed at startup time when Google Cloud provisions your machine. We’ve written this for you already — you can find it at chapter-8/market-data/startup-script.sh.
Take your time to read through the script, which performs four key tasks:
You can provision a virtual machine from the command line. Change to the chapter-8/market-data directory and run the following command:
gcloud compute instances create market-data-service
--image-family=debian-9
--image-project=debian-cloud
--machine-type=g1-small
--scopes userinfo-email,cloud-platform
--metadata-from-file startup-script=startup-script.sh
--tags api-service
--zone=europe-west1-b
This will create a machine and return the machine’s external IP address — something like figure 8.4.
This approach to startup does take a while. If you want to watch the progress of the startup process, you can tail the output of the virtual machine’s serial port:
gcloud compute instances tail-serial-port-output market-data-service
Once the startup process has completed, you should see a message in the log, similar to this example:
Mar 16 12:17:14 market-data-service-1 systemd[1]: Startup finished in 1.880s (kernel) + 1min 52.486s (userspace) = 1min 54.367s.
Great! You've got a running service — although you can't call it yet. You'll need to open the firewall to make an external call to this service. Running the following command will open up public access to port 8080 for all services with the tag api-service:
gcloud compute firewall-rules create default-allow-http-8080
--allow tcp:8080
--source-ranges 0.0.0.0/0
--target-tags api-service
--description "Allow port 8080 access to api-service"
You can test your service by curling the external IP of the virtual machine. The external IP was returned when you created the instance (figure 8.4). If you didn't note it, you can retrieve all instances by running gcloud compute instances list. Here's the curl:
curl http://<EXTERNAL-IP>:8080/ping
If all is going well, the response you get will be the name of the virtual machine — market-data-service.
It’s unlikely you’ll ever run a single instance of a microservice:
Figure 8.5 illustrates a service group. Requests made to the logical service, market-data, are load balanced to underlying market-data instances. This is a typical production configuration for a stateless microservice.
You can try this out. On GCE, a group of virtual machines is called an instance group (on AWS, the equivalent is an auto-scaling group). To create a group, you first need to create an instance template:
gcloud compute instance-templates create market-data-service-template
--machine-type g1-small
--image-family debian-9
--image-project debian-cloud
--metadata-from-file startup-script=startup-script.sh
--tags api-service
--scopes userinfo-email,cloud-platform
Running this code will create a template to build multiple market-data-service instances like the one you built earlier. Once the template has been set up, create a group:
gcloud compute instance-groups managed create market-data-service-group
--base-instance-name market-data-service
--size 3
--template market-data-service-template
--region europe-west1
This will spin up three instances of your market-data service. If you open the Google Cloud console and navigate to Compute Engine > Instance Groups, you should see a list like the one in figure 8.6.
Using an instance template to build a group gives you some interesting capabilities out of the box: failure zones and self-healing. These two features are crucial to operating a resilient microservice.
First, note the zone column in figure 8.6. It lists three distinct values: europe-west1-d, europe-west1-c, and europe-west1-b. Each of these zones represents a distinct data center. If one of those data centers fails, that failure will be isolated and will only affect 33% of your service capacity.
If you select one of those instances, you’ll see the option to delete that instance (figure 8.7). Give it a shot!
Deleting an instance will cause the instance group to spin up a replacement instance, ensuring that capacity is maintained. If you look at the operation history of the project (Compute Engine > Operations), you’ll see that the delete operation results in GCE automatically recreating the instance (figure 8.8).
The instance group will attempt to self-heal in response to any event that results in an instance falling out of service, such as underlying machine failure. You can improve this by adding a health check that also targets your application:
gcloud compute health-checks create http api-health-check
--port=8080
--request-path="/ping"
gcloud beta compute instance-groups managed set-autohealing
market-data-service-group
--region=europe-west1
--http-health-check=api-health-check
Now, whenever the application fails to respond to the health check, the virtual machine will be recycled.
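On the application side, all the autohealer needs is a cheap endpoint that returns 200 when the instance can serve traffic. Here's a minimal, standalone sketch in Python that mirrors the market-data service's /ping behavior of returning the host name (the real service runs under gunicorn; this version uses only the standard library and is purely illustrative):

```python
import socket
from http.server import BaseHTTPRequestHandler, HTTPServer

class PingHandler(BaseHTTPRequestHandler):
    """Answers the load balancer's health check on /ping."""

    def do_GET(self):
        if self.path == "/ping":
            # 200 plus the host name, so callers can tell instances apart.
            body = socket.gethostname().encode()
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep the example quiet

def make_ping_server(port=8080):
    """Bind the health-check endpoint; call .serve_forever() to run it."""
    return HTTPServer(("0.0.0.0", port), PingHandler)
```

Calling make_ping_server(8080).serve_forever() would serve the endpoint; anything other than a timely 200 causes the instance group to recycle the machine.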
As your service is now deployed from a template, it’s trivial to add more capacity. You can resize the group from the command line:
gcloud compute instance-groups managed resize market-data-service-group
--size=6
--region=europe-west1
You also can add autoscaling rules to automatically add more capacity if metrics you observe from your group, such as average CPU utilization, pass a given threshold.
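For example, the following set-autoscaling command (the thresholds here are illustrative, not recommendations) keeps average CPU utilization around 60%, scaling the group between 3 and 10 instances:

```shell
gcloud compute instance-groups managed set-autoscaling \
    market-data-service-group \
    --region europe-west1 \
    --min-num-replicas 3 \
    --max-num-replicas 10 \
    --target-cpu-utilization 0.6 \
    --cool-down-period 90
```

The cool-down period stops the autoscaler from reacting to metrics from instances that are still starting up.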
In all that excitement, you forgot to expose your service group to the wild! In this case, GCE will provide your load balancer, which consists of a few interconnected components, as outlined in figure 8.9. The load balancer uses these routing rules, proxies, and maps to forward requests from the outside world to a set of healthy service instances.
First, you’ll want to add a backend service, which is the most important component of your load balancer because it’s responsible for directing traffic optimally to underlying instances:
gcloud compute instance-groups managed set-named-ports
market-data-service-group
--named-ports http:8080
--region europe-west1
gcloud compute backend-services create market-data-service
--protocol HTTP
--health-checks api-health-check
--global
This code creates two entities: a named port, identifying the port your service exposes, and a backend service, which uses the HTTP health check you created earlier to test the health of your service.
Next, you need a URL map and a proxy:
gcloud compute url-maps create api-map
--default-service market-data-service
gcloud compute target-http-proxies create api-proxy
--url-map api-map
If you had more than one service, you could use the map to route different subdomains to different backends. In this case, the URL map will direct all requests, regardless of URL, to the market-data-service you created earlier.
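If you did add a second service, a path matcher on the same URL map could split traffic by path. A sketch, in which order-service is a hypothetical backend service you'd need to create first:

```shell
gcloud compute url-maps add-path-matcher api-map \
    --path-matcher-name api-matcher \
    --default-service market-data-service \
    --path-rules "/orders/*=order-service"
```

Requests matching /orders/* would go to order-service; everything else would fall through to market-data-service.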
Finally, you need to create a static IP address for your service and a forwarding rule that connects that IP to the HTTP proxy you’ve created:
gcloud compute addresses create market-data-service-ip
--ip-version=IPV4
--global
export IP=`gcloud compute addresses describe market-data-service-ip --global --format json | jq --raw-output '.address'`
gcloud compute forwarding-rules create api-forwarding-rule
--address $IP
--global
--target-http-proxy api-proxy
--ports 80
printenv IP
This code creates a public IP address and configures requests to that IP to be forwarded to your HTTP proxy and on to your backend service. Once run, these rules take several minutes to propagate. After a wait, try to curl the service — curl "http://$IP/ping?[1-100]".
That command fires off 100 requests. If you see the names of different market-data nodes being output to your terminal — terrific — you've deployed a load-balanced microservice!
In these examples, you’ve built some of the key elements of a microservice deployment process:
But a few things are missing. Your releases weren’t predictable, because you pulled your latest code and compiled it on the machine. A new code commit could cause different service instances to be running inconsistent versions of the code (figure 8.10). Without any explicit versioning or packaging, there would be no easy way to roll your code forward or back.
The process of starting machines was slow because you made pulling dependencies part of startup, rather than baking them into your instance template. This arrangement also meant that the dependencies could become inconsistent across different instances.
Lastly, you didn’t automate anything. Not only will a manual process not scale to multiple microservices, but it’s likely to be error prone. Over the next few sections and chapters, you can make this much better.
In the earlier deployment example, you didn’t package your code for deployment. The startup script that you ran on each node pulled code from a Git repository, installed some dependencies, and started your application. That worked, but it was flawed:
This made your deployment unpredictable — and fragile. To get the benefits you want, you need to build a service artifact. A service artifact is an immutable and deterministic package for your service. If you run the build process again for the same commit, it should result in an equivalent artifact.
Most technology stacks offer some sort of deployment artifact (for example, JAR files in Java, DLLs in .NET, gems in Ruby, and packages in Python). The runtime characteristics of these artifacts differ: you need to run .NET web services under an IIS server, whereas JARs may be self-executing, embedding a server process like Tomcat.
Figure 8.11 illustrates the artifact construction, storage, and deployment process. Typically, a build automation tool (such as Jenkins or CircleCI) builds a service artifact and pushes it to an artifact repository. An artifact repository might be a dedicated tool — for example, Docker provides a registry for storing images — or a generic file storage tool, such as Amazon S3.
A microservice isn’t only code; it’ll have many constituent parts:
Some of these dependencies, such as application libraries, are explicitly defined. Others may be implicit; for example, language-specific package managers are often ignorant of binary dependencies. Figure 8.12 illustrates these different parts.
An ideal deployment artifact for a microservice would allow you to package up a specific version of your compiled code, specifying any binary dependencies, and provide a standard operational abstraction for starting and stopping that service. This should be environment-agnostic: you should be able to run the same artifact locally, in test, and in production. By abstracting out differences between languages at runtime, you both reduce cognitive load and provide common abstractions for managing those services.
We’ve touched on immutability a few times so far — let’s take a moment to look at why it matters. An immutable artifact, encapsulating as many dependencies of your service as feasible, gives you the highest possible confidence that the package you tested throughout your deployment pipeline will be the same as what is deployed in production. Immutability also allows you to treat your service instances as disposable — if a service develops a problem, you can easily replace it with a new instance of the last known good state. On GCE, this autohealing process was automated by the instance group you created.
If a build of the same code can result in a different artifact being created — for example, pulling different versions of dependencies — you increase the risk in deployment and the fragility of your code because unintentional changes can be included in a release. Immutability increases the predictability of your system, as it’s easier to reason through a system’s state and recreate a historic state of your application — crucial for rollback.
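One practical consequence of immutability: because an artifact never changes after it's built, you can fingerprint it once at build time and verify that fingerprint at deploy or rollback time. A small sketch in Python (the helper name and file layout are illustrative, not part of the book's codebase):

```python
import hashlib
from pathlib import Path

def artifact_digest(path: Path) -> str:
    """Return a SHA-256 fingerprint of an artifact file.

    Two builds are only interchangeable if their digests match;
    recording the digest alongside the commit ID makes rollback
    to a known-good artifact verifiable rather than hopeful.
    """
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Read in chunks so large images don't load into memory at once.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()
```

A deploy script can then refuse to roll out any artifact whose digest doesn't match the one the build pipeline recorded.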
Many languages have their own packaging mechanism, and this heterogeneity makes deployment more complex when you're working with services written in different languages. Your deployment tooling has to handle a different interface for each package format to get a service running on a server (or to stop it).
Better tooling can reduce these differences, but technology-specific artifacts tend to work at too low an abstraction level. They primarily focus on packaging code, rather than the broader nature of application requirements:
Luckily, you’ve got a few options: operating system packages, server images, or containers (figure 8.13).
You could use the packaging format of your target operating system — for example, deb or rpm packages, installed with apt or yum on Linux. This approach standardizes the installation of an artifact, regardless of contents, as you can use standard operating system tools to automate the installation process. When you start a new host, you can pull the appropriate version of your service package. In addition, packages can specify dependencies on other packages — for example, a Rails application might specify dependencies on common Linux packages, such as libxml, libmagic, or libssl.
The OS package approach has three weaknesses:
In typical virtualized environments, each server you run is built from an image, or template. The instance template you built in section 8.3 is an example of a server image.
You can use this image itself as a deployment artifact. Rather than pulling a package onto a generic machine, you could instead bake a new image for each version of your service that you want to deploy. A typical bake process has four steps:
You can try that out using Packer.
First, save the following configuration file as instance-template.json.
Listing 8.1 The instance-template.json file
{
  "variables": {
    "commit": "{{env `COMMIT`}}"
  },
  "builders": [
    {
      "type": "googlecompute",
      "project_id": "market-data-1",
      "source_image_family": "debian-9",
      "zone": "europe-west1-b",
      "image_name": "market-data-service-{{user `commit`}}",
      "image_description": "image built for market-data-service {{user `commit`}}",
      "instance_name": "market-data-service-{{uuid}}",
      "machine_type": "n1-standard-1",
      "disk_type": "pd-ssd",
      "ssh_username": "debian",
      "startup_script_file": "startup-script.sh"
    }
  ]
}
Now, run the packer build command from within the chapter-8/market-data directory:
packer build -var "commit=`git rev-parse HEAD`" instance-template.json
If you watch the console output, it’ll reflect the four steps I outlined above: using the GCE API, Packer will start an instance, run the startup script, and save the instance as a new template image, tagged with the source Git commit. You can use the Git commit to explicitly distinguish different versions of your code.
This approach builds an immutable, predictable, and self-contained artifact. This immutable server pattern, combined with a configuration tool like Packer, allows you to store a reproducible base state as code.
It has a few limitations:
Instead of distributing entire machines, containerization tools, such as Docker or rkt, provide a more lightweight approach to encapsulating an application and its dependencies. You can run multiple containers on one machine, isolated from each other but with lower resource overhead than a virtual machine because they share the kernel of one operating system. They avoid the overhead of virtualizing the disk and guest operating system of each virtual machine.
Try a quick example using Docker. (You can find instructions for installing Docker on the Docker website: https://docs.docker.com/install/.) You build a Docker image from a Dockerfile. Add the following file to the chapter-8/market-data folder.
Listing 8.2 Dockerfile for market-data service
FROM python:3.6
ADD . /app
WORKDIR /app
RUN pip install -r requirements.txt
CMD ["gunicorn", "-c", "config.py", "app:app", "--bind", "0.0.0.0:8080"]
EXPOSE 8080
Then, use the docker command-line tool to build the container:
$ docker build -t market-data:`git rev-parse HEAD` .
Sending build context to Docker daemon 71.17 kB
Step 1/3 : FROM python:3.6
---> 74145628c331
Step 2/3 : ADD . /app
---> bb3608d5143f
Removing intermediate container 74c250f83f8c
Step 3/3 : WORKDIR /app
---> 7a595179cc39
Removing intermediate container 19d3bffa4d2a
Successfully built 7a595179cc39
This will build a container image and tag it with the name market-data:<commit ID>.
Now that you’ve built an image for the application, you can run it locally. Try it out:
$ docker run -d -p 8080:8080 market-data:`git rev-parse HEAD`
Because you passed -d, the container runs detached; the command prints the new container's ID, and docker logs <container-id> will show gunicorn's startup logs. If you like, try to curl the service on port 8080. You probably noticed that build and startup times for the container were significantly faster than for the virtual machines on GCE. This is one of the key benefits of using containers.
In a few short steps, you can run this container image on GCE. First, you need to push the image to a container registry. Luckily, GCE already provides one:
TAG="market-data:$(git rev-parse HEAD)"
PROJECT_ID=<your-project-id>
docker tag $TAG eu.gcr.io/$PROJECT_ID/$TAG
gcloud docker -- push eu.gcr.io/$PROJECT_ID/$TAG
This registry acts as an artifact repository where you can store your Docker images for later use. After the push has completed, start an instance running this container:
gcloud beta compute instances create-with-container
market-data-service-c
--container-image eu.gcr.io/$PROJECT_ID/$TAG
--tags api-service
Success! You’ve deployed a container, and you’ve seen firsthand that it provides a more flexible — and easy-to-use — abstraction than a VM image.
As well as acting as a packaging mechanism, a container provides a runtime environment that isolates execution, effectively easing the operation of diverse containers on a single machine. This is compelling because it provides sane abstractions above individual hosts.
Unlike virtual machine images, container images are portable; you can run the same container on any infrastructure that supports the container runtime. This eases deployment in scenarios where multiple deployment targets are required, such as companies that run workloads in both cloud and on-premise environments. It also simplifies local development; running multiple containers on a typical developer machine is much more manageable than building and managing multiple virtual machines.
The service’s configuration is likely to differ based on deployment environment (staging, dev, production, and so on). For that and other reasons, you can’t represent all elements of a service within an artifact:
The third principle of The Twelve-Factor App manifesto (12factor.net) states that you should strictly separate deployment configuration from code and provide it as environment variables (figure 8.14). In practice, the deployment mechanism you choose will define how you store and provide environment-specific configuration. We recommend storing configuration in two places:
The process that starts a service artifact should pull this configuration and inject it into the application’s environment.
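In a Python service such as market-data, that injection can be as simple as reading the environment at startup. A hedged sketch (the variable names and defaults here are hypothetical, not taken from the book's code):

```python
import os

def load_config() -> dict:
    """Build runtime configuration from the environment.

    The artifact stays identical across environments; only the
    variables injected by the deployment process differ.
    """
    return {
        # Safe defaults baked into the artifact...
        "port": int(os.environ.get("PORT", "8080")),
        "log_level": os.environ.get("LOG_LEVEL", "INFO"),
        # ...and environment-specific values supplied at deploy time.
        "db_url": os.environ.get("DATABASE_URL", "sqlite:///dev.db"),
    }
```

The same image then runs unmodified in staging and production; only the injected variables change.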
Unfortunately, managing configuration separately can increase risk, as people may make changes to production outside of your immutable artifacts, affecting the predictability of your deployments. You should err on the side of restraint and attempt to include as much configuration as possible within your artifacts and rely on the speed and robustness of your deployment pipeline for rapidly changing configuration.
In this section, we’ll review three common models for deploying services to underlying hosts: single service to host, multiple services to host, and container scheduling.
In earlier examples, we’ve used a one-to-one relationship between service and underlying host. This approach is easy to understand and provides a clear and explicit isolation between the resource needs and runtime of multiple services. Figure 8.15 illustrates this approach. Although the analogy is somewhat cruel, using this model lets you treat servers as cattle: indistinguishable units that you can start, stop, and destroy on command.
This model isn’t perfect. Sizing virtual instances appropriately for the needs of each service requires ongoing effort and evaluation. If you’re not running in the cloud, you may run into the limits of your data center or virtualization solution. And as we touched on earlier, virtual machine startup time is comparatively slow, often taking several minutes.
It’s possible to run multiple services per host (figure 8.16). In the static variant of this model, the allocation of services to hosts is manual and static; the service owner makes a conscious choice, predeployment, about where each service should be run.
At first glance, this approach might seem desirable. If obtaining new hosts is costly or hosts are scarce, then the easiest route to production would be to maximize usage of your existing, limited number of hosts.
But this approach has several weaknesses. It increases coupling between services: deploying multiple services to a host ties their fates together, undermining your ability to release services independently. It also increases the complexity of dependency management: if one service needs package v1.1, but another needs v2.0, the difference is difficult to reconcile. It becomes unclear which service owns the deployment environment — and therefore which team has responsibility for managing that configuration.
This approach also leads to challenges in monitoring and scaling services independently. One noisy service on a box might adversely impact other services, and it can be difficult to monitor the resource usage (CPU, memory) of services independently.
It'd be even simpler if you could avoid thinking about the underlying hosts that run your services altogether and focus entirely on the unique runtime environment of each application. This was the initial promise of platform-as-a-service (PaaS) solutions, such as Heroku. A PaaS provides tools for deploying and running services with minimal operational configuration or exposure to underlying infrastructural resources. Although these platforms are easy to use, they often strike a difficult balance between automation and control — simplifying deployment but removing customization from the developer's hands — as well as being highly vendor-specific.
Containers provide a more elegant abstraction:
These three facets enable scheduling, or orchestration, of containers. A container scheduler is a software tool that abstracts away from underlying hosts by managing the execution of atomic, containerized applications across a shared pool of resources. Typically, a scheduler consists of a master node that distributes application workloads to a cluster of worker nodes. Developers, or a deployment automation tool, send instructions to this master node to perform container deployments. Figure 8.17 illustrates this setup.
Unlike the multiple static services per host model, the allocation of services in a scheduler model is dynamic and depends on the resources (CPU, disk, or memory needs) defined for each application. This avoids the pitfalls of the static model, as the scheduler aims to continually optimize resource usage within the cluster of nodes, while the container model preserves service independence.
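To make that dynamic allocation concrete, here's a toy first-fit placement loop of the kind a scheduler runs continuously. This is a deliberate simplification: real schedulers also weigh affinity, spread across failure zones, and rescheduling on node failure:

```python
def place(containers, nodes):
    """Assign each container to the first node with spare capacity.

    containers: list of (name, cpu, mem) resource requests.
    nodes: dict of node name -> {"cpu": free_cpu, "mem": free_mem}.
    Returns a mapping of container name -> node name; raises if the
    cluster can't fit a workload (a real scheduler would queue it).
    """
    placement = {}
    for name, cpu, mem in containers:
        for node, free in nodes.items():
            if free["cpu"] >= cpu and free["mem"] >= mem:
                # Reserve the resources and record the decision.
                free["cpu"] -= cpu
                free["mem"] -= mem
                placement[name] = node
                break
        else:
            raise RuntimeError(f"no capacity for {name}")
    return placement
```

The key point is that nobody decides up front which host runs which service: placement falls out of the declared resource requests and whatever capacity the cluster has at that moment.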
By using a scheduler as a deployment platform, a service developer can focus on the environment of their service in isolation from the underlying needs of machine configuration. Operations engineers can focus on running the underlying scheduler platform and defining common operational standards for running services.
Container schedulers such as Kubernetes are complex pieces of software and require significant expertise to operate, especially because the tools themselves are relatively new. We strongly recommend them as the ideal deployment platform for microservices, but only if you can use a managed scheduler (such as Google’s Kubernetes Engine) or have the operational resources to run it in-house. If not, the single service per host model, combined with container artifacts, is a great and flexible fallback.
So far, you've only deployed market-data once. But in a real application, you'll be deploying services often. You need to be able to deploy new versions without downtime to maintain overall application stability. Every service will rely on others to be up and running, so you also need to maximize the availability of every service.
Three common deployment patterns are available for zero-downtime deployments:
All of these patterns are built on a single primitive operation. You’re taking an instance, moving it to a running state in an environment, and directing traffic toward it.
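That primitive, applied repeatedly while enough healthy instances stay in service, is all a rolling update is. The following Python sketch simulates the loop (a real platform would also wait for health checks to pass between batches):

```python
def rolling_update(instances, new_version, max_unavailable=1):
    """Replace each instance's version in place, taking at most
    max_unavailable instances out of service at a time, and yield
    the fleet state after every batch so capacity can be observed."""
    fleet = list(instances)
    for start in range(0, len(fleet), max_unavailable):
        batch = range(start, min(start + max_unavailable, len(fleet)))
        for i in batch:
            # Stop old instance, start replacement, direct traffic to it.
            fleet[i] = new_version
        yield list(fleet)
```

With max_unavailable=1 on a fleet of three, the fleet passes through mixed states — one new instance, then two, then three — while two-thirds of capacity always stays in service.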
It’s always better when you can see things in action. You can deploy a new version of market-data to GCE. First, you’ll want to create a new instance template. You can use the container you built and pushed in section 8.4.3:
gcloud beta compute instance-templates create-with-container
market-data-service-template-2
--container-image eu.gcr.io/$PROJECT_ID/$TAG
--tags=api-service
Then, initiate a canary update:
gcloud beta compute instance-groups managed rolling-action start-update
market-data-service-group
--version template=market-data-service-template
--canary-version template=market-data-service-template-2,target-size=1
--region europe-west1
GCE will add the canary instance to the group and the backend service to begin receiving requests (figure 8.18). It’ll take a few minutes to come up. You also can see this on the GCE console (figure 8.19; Compute Engine > Instance Groups).
If you’re happy, you can proceed with the rolling update:
gcloud beta compute instance-groups managed rolling-action start-update
market-data-service-group
--version template=market-data-service-template-2
--region europe-west1
The speed at which this update occurs depends on how much capacity you want to maintain during the rollout. You also can elect to surge beyond your current capacity during rollout to ensure the target number of instances is always maintained. Figure 8.20 illustrates the stages of a rollout across three instances.
If you were unhappy, you could roll back the canary:
gcloud beta compute instance-groups managed rolling-action start-update
market-data-service-group
--version template=market-data-service-template
--region europe-west1
The command for a rollback is identical to a rollout, but it goes to a previous version. In the real world, rollback may not be atomic. For example, the incorrect operation of new instances may have left data in an inconsistent state, requiring manual intervention and reconciliation. Releasing small change sets and actively monitoring release behavior will limit the occurrence and extent of these scenarios.
We’ve covered a lot of ground in this chapter: you’ve deployed manually to a cloud provider, packaged a service as a container and a virtual machine, and practiced safe rollout patterns. By building immutable service artifacts and performing safe, downtime-free deployments, you’re well on your way to building a deployment process that works reliably across multiple services. Ultimately, the more stable, reliable, and seamless your deployment process, the easier it is to standardize services, release new services more rapidly, and deliver valuable new features without friction or risk.