Mature deployment practices are crucial to building reliable and stable microservices. Unlike a monolithic application, where you can optimize deployment for a single use case, microservice deployment practices need to scale to multiple services, written in different languages, each with their own dependencies. You need to be able to trust your deployment process to push out new features — and new services — without harming overall availability or introducing critical defects.
As a microservice application evolves at the level of deployable units, the cost of deploying new services must be negligible to enable engineers to rapidly innovate and deliver value to users. The added development speed you gain from microservices will be wasted if you can’t get them to production rapidly and reliably. Automated deployments are essential to developing microservices at scale.
In this chapter, we’ll explore the components of a microservice production environment. Following that, we’ll look at some deployment building blocks — such as artifacts and rolling updates — and how they apply to microservices. Throughout the chapter, we’ll work with a simple service — market-data — to try out different approaches to packaging and deployment using a well-known cloud service, Google Cloud Platform. You can find a starting point for this service in the book’s repository on Github (https://github.com/morganjbruce/microservices-in-action).
Deployment is the riskiest moment in the lifecycle of a software system. The closest real-world equivalent would be changing a tire — except the car is still moving at 100 miles an hour. No company is immune to this risk: for example, Google’s site reliability team identified that roughly 70% of outages are due to changes in a live system (https://landing.google.com/sre/book/chapters/introduction.html).
Microservices drastically increase the number of moving parts in a system, which increases the complexity of deployment. You’ll face four challenges when deploying microservices (figure 8.1):
When you do them well, deployments are based on simplicity and predictability. A consistent build pipeline produces predictable artifacts, which you can apply atomically to a production environment.
In an ideal world, deployment is "boring": not unexciting, but incident-free. We've seen too many teams, working on monoliths and microservices alike, that find deploying software incredibly stressful. But if working with microservices means you're releasing more components more frequently, doesn't that mean you're introducing more risk and instability into a system?
Traditional change management methodologies attempt to reduce deployment risk by introducing governance and ceremony. Changes must go through numerous quality gates and formal approvals, usually human-driven. Although this is intended to ensure that only working code reaches production, this approach is costly to apply and doesn’t scale well to multiple services.
The larger a release, the higher the risk of introducing defects. Naturally, microservice releases are smaller because the codebases are smaller. And that’s the trick — by releasing smaller changes more often, you reduce the total impact of any single change. Rather than stopping everything for a deployment, you can design your services and deployment approaches with the expectation that they’ll face continuous change. Reducing the surface area of possible change leads to releases that are quicker, easier to monitor, and less disruptive to the smooth functioning of an application.
Even if your releases are smaller, you still need to make sure your change sets are as free from defects as possible. You can achieve this by automating the process of commit validation — unit tests, integration tests, linting, and so on — and the process of rollout — applying those changes in the production environment. This helps you to build systematic confidence in the code changes you’re making and apply consistent practices across multiple services.
Deployment is a combination of process and architecture:
Production environments for running microservices vary widely, as do monolith production environments. What’s appropriate for your application may depend on your organization’s existing infrastructure, technical capabilities, and attitude toward risk, as well as regulatory requirements.
The production environment for a microservice application needs to provide several capabilities to support the smooth operation of multiple services. Figure 8.2 gives a high-level view of the capabilities of the production environment.
A microservice production environment has six fundamental capabilities:
These components are part of the platform layer of the microservice architecture stack.
Along with the six fundamental features, two factors are key in assessing the suitability of a deployment platform for a microservice application:
Although you may not always have the luxury of choosing your deployment environment, it’s important to appreciate how different platforms might affect these characteristics and how you develop your microservice application. I once worked for a company that took six weeks to provision each new server. Suffice it to say that taking new services into production was an exhausting endeavor!
It’s not coincidental that the popularity of microservice architecture coincides with the wider adoption of DevOps practices, such as infrastructure as code, and the increasing use of cloud providers to run applications. These practices enable rapid iteration and deployment of services, which in turn makes a microservice architecture a scalable and feasible approach.
When possible, you should aim to use a public infrastructure-as-a-service (IaaS) cloud, such as Google Cloud Platform (GCP), AWS, or Microsoft Azure, for deploying any nontrivial microservice application. These providers offer a wide range of features and tools that ease the development of a robust microservice platform, and because they operate at a lower level of abstraction than a platform-as-a-service offering such as Heroku, they give you more flexibility. In the next section, we'll show you how to use GCP to deploy, access, and scale a microservice.
It’s time to get your hands dirty and deploy a service. You need to take your code, get it running on a virtual machine, and make it accessible from the outside world — as figure 8.3 illustrates.
You’ll use Google Compute Engine (GCE) as a production environment. This is a service on GCP that you can use to run virtual machines. You can sign up for a free trial GCP subscription, which will have enough credit for this chapter’s examples. Although the operations you’ll perform are specific to this platform, all major cloud providers, such as AWS and Azure, provide similar abstractions.
To interact with GCE, you'll use the gcloud command-line tool. This tool interacts with the GCE API to perform operations on your cloud account. You can find install instructions in the GCP documentation (https://cloud.google.com/sdk/docs/quickstarts). It's not the only option — you could use third-party tools like Ansible or Terraform instead.
Assuming you've followed the install instructions and logged in with gcloud init, you can create a new project:
gcloud projects create <project-id> --set-as-default --enable-cloud-apis
This project will contain the resources that'll run your service.
To run your service, you’ll use a startup script, which will be executed at startup time when Google Cloud provisions your machine. We’ve written this for you already — you can find it at chapter-8/market-data/startup-script.sh.
Take your time to read through the script, which performs four key tasks:
You can provision a virtual machine from the command line. Change to the chapter-8/market-data directory and run the following command:
gcloud compute instances create market-data-service
--image-family=debian-9
--image-project=debian-cloud
--machine-type=g1-small
--scopes userinfo-email,cloud-platform
--metadata-from-file startup-script=startup-script.sh
--tags api-service
--zone=europe-west1-b
This will create a machine and return the machine’s external IP address — something like figure 8.4.
This approach to startup does take a while. If you want to watch the progress of the startup process, you can tail the output of the virtual machine’s serial port:
gcloud compute instances tail-serial-port-output market-data-service
Once the startup process has completed, you should see a message in the log, similar to this example:
Mar 16 12:17:14 market-data-service-1 systemd[1]: Startup finished in 1.880s (kernel) + 1min 52.486s (userspace) = 1min 54.367s.
Great! You've got a running service — although you can't call it yet. You'll need to open the firewall to make an external call to this service. Running the following command will open up public access to port 8080 for all services with the tag api-service:
gcloud compute firewall-rules create default-allow-http-8080
--allow tcp:8080
--source-ranges 0.0.0.0/0
--target-tags api-service
--description "Allow port 8080 access to api-service"
You can test your service by curling the external IP of the virtual machine. The external IP was returned when you created the instance (figure 8.4). If you didn't note it, you can retrieve all instances by running gcloud compute instances list. Here's the curl:
curl http://<EXTERNAL-IP>:8080/ping
If all is going well, the response you get will be the name of the virtual machine — market-data-service.
It’s unlikely you’ll ever run a single instance of a microservice:
Figure 8.5 illustrates a service group. Requests made to the logical service, market-data, are load balanced to underlying market-data instances. This is a typical production configuration for a stateless microservice.
You can try this out. On GCE, a group of virtual machines is called an instance group (on AWS, the equivalent is an auto-scaling group). To create a group, you first need to create an instance template:
gcloud compute instance-templates create market-data-service-template
--machine-type g1-small
--image-family debian-9
--image-project debian-cloud
--metadata-from-file startup-script=startup-script.sh
--tags api-service
--scopes userinfo-email,cloud-platform
Running this code will create a template to build multiple market-data-service instances like the one you built earlier. Once the template has been set up, create a group:
gcloud compute instance-groups managed create market-data-service-group
--base-instance-name market-data-service
--size 3
--template market-data-service-template
--region europe-west1
This will spin up three instances of your market-data service. If you open the Google Cloud console and navigate to Compute Engine > Instance Groups, you should see a list like the one in figure 8.6.
Using an instance template to build a group gives you some interesting capabilities out of the box: failure zones and self-healing. These two features are crucial to operating a resilient microservice.
First, note the zone column in figure 8.6. It lists three distinct values: europe-west1-d, europe-west1-c, and europe-west1-b. Each of these zones represents a distinct data center. If one of those data centers fails, that failure will be isolated and will only affect 33% of your service capacity.
If you select one of those instances, you’ll see the option to delete that instance (figure 8.7). Give it a shot!
Deleting an instance will cause the instance group to spin up a replacement instance, ensuring that capacity is maintained. If you look at the operation history of the project (Compute Engine > Operations), you’ll see that the delete operation results in GCE automatically recreating the instance (figure 8.8).
The instance group will attempt to self-heal in response to any event that results in an instance falling out of service, such as underlying machine failure. You can improve this by adding a health check that also targets your application:
gcloud compute health-checks create http api-health-check
--port=8080
--request-path="/ping"
gcloud beta compute instance-groups managed set-autohealing
market-data-service-group
--region=europe-west1
--http-health-check=api-health-check
Now, whenever the application fails to respond to the health check, the virtual machine will be recycled.
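On the application side, all the autohealer needs is a cheap endpoint that returns 200 when the instance can serve traffic. Here's a minimal, standalone sketch in Python that mirrors the market-data service's /ping behavior of returning the host name (the real service runs under gunicorn; this version uses only the standard library and is purely illustrative):

```python
import socket
from http.server import BaseHTTPRequestHandler, HTTPServer

class PingHandler(BaseHTTPRequestHandler):
    """Answers the load balancer's health check on /ping."""

    def do_GET(self):
        if self.path == "/ping":
            # 200 plus the host name, so callers can tell instances apart.
            body = socket.gethostname().encode()
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep the example quiet

def make_ping_server(port=8080):
    """Bind the health-check endpoint; call .serve_forever() to run it."""
    return HTTPServer(("0.0.0.0", port), PingHandler)
```

Calling make_ping_server(8080).serve_forever() would serve the endpoint; anything other than a timely 200 causes the instance group to recycle the machine.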
As your service is now deployed from a template, it’s trivial to add more capacity. You can resize the group from the command line:
gcloud compute instance-groups managed resize market-data-service-group
--size=6
--region=europe-west1
You also can add autoscaling rules to automatically add more capacity if metrics you observe from your group, such as average CPU utilization, pass a given threshold.
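For example, the following set-autoscaling command (the thresholds here are illustrative, not recommendations) keeps average CPU utilization around 60%, scaling the group between 3 and 10 instances:

```shell
gcloud compute instance-groups managed set-autoscaling \
    market-data-service-group \
    --region europe-west1 \
    --min-num-replicas 3 \
    --max-num-replicas 10 \
    --target-cpu-utilization 0.6 \
    --cool-down-period 90
```

The cool-down period stops the autoscaler from reacting to metrics from instances that are still starting up.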
In all that excitement, you forgot to expose your service group to the wild! In this case, GCE will provide your load balancer, which consists of a few interconnected components, as outlined in figure 8.9. The load balancer uses these routing rules, proxies, and maps to forward requests from the outside world to a set of healthy service instances.
First, you’ll want to add a backend service, which is the most important component of your load balancer because it’s responsible for directing traffic optimally to underlying instances:
gcloud compute instance-groups managed set-named-ports
market-data-service-group
--named-ports http:8080
--region europe-west1
gcloud compute backend-services create market-data-service
--protocol HTTP
--health-checks api-health-check
--global
This code creates two entities: a named port, identifying the port your service exposes, and a backend service, which uses the HTTP health check you created earlier to test the health of your service.
Next, you need a URL map and a proxy:
gcloud compute url-maps create api-map
--default-service market-data-service
gcloud compute target-http-proxies create api-proxy
--url-map api-map
If you had more than one service, you could use the map to route different subdomains to different backends. In this case, the URL map will direct all requests, regardless of URL, to the market-data-service you created earlier.
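If you did add a second service, a path matcher on the same URL map could split traffic by path. A sketch, in which order-service is a hypothetical backend service you'd need to create first:

```shell
gcloud compute url-maps add-path-matcher api-map \
    --path-matcher-name api-matcher \
    --default-service market-data-service \
    --path-rules "/orders/*=order-service"
```

Requests matching /orders/* would go to order-service; everything else would fall through to market-data-service.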
Finally, you need to create a static IP address for your service and a forwarding rule that connects that IP to the HTTP proxy you’ve created:
gcloud compute addresses create market-data-service-ip
--ip-version=IPV4
--global
export IP=`gcloud compute addresses describe market-data-service-ip --global --format json | jq --raw-output '.address'`
gcloud compute forwarding-rules create api-forwarding-rule
--address $IP
--global
--target-http-proxy api-proxy
--ports 80
printenv IP
This code creates a public IP address and configures requests to that IP to be forwarded to your HTTP proxy and on to your backend service. Once run, these rules take several minutes to propagate. After a wait, try to curl the service — curl "http://$IP/ping?[1-100]".
That command fires off 100 requests. If you see the names of different market-data nodes being output to your terminal — terrific — you've deployed a load-balanced microservice!
In these examples, you’ve built some of the key elements of a microservice deployment process:
But a few things are missing. Your releases weren’t predictable, because you pulled your latest code and compiled it on the machine. A new code commit could cause different service instances to be running inconsistent versions of the code (figure 8.10). Without any explicit versioning or packaging, there would be no easy way to roll your code forward or back.
The process of starting machines was slow because you made pulling dependencies part of startup, rather than baking them into your instance template. This arrangement also meant that the dependencies could become inconsistent across different instances.
Lastly, you didn’t automate anything. Not only will a manual process not scale to multiple microservices, but it’s likely to be error prone. Over the next few sections and chapters, you can make this much better.
In the earlier deployment example, you didn’t package your code for deployment. The startup script that you ran on each node pulled code from a Git repository, installed some dependencies, and started your application. That worked, but it was flawed:
This made your deployment unpredictable — and fragile. To get the benefits you want, you need to build a service artifact. A service artifact is an immutable and deterministic package for your service. If you run the build process again for the same commit, it should result in an equivalent artifact.
Most technology stacks offer some sort of deployment artifact (for example, JAR files in Java, DLLs in .NET, gems in Ruby, and packages in Python). The runtime characteristics of these artifacts differ: you need to run .NET web services under an IIS server, whereas JARs may be self-executing, embedding a server process like Tomcat.
Figure 8.11 illustrates the artifact construction, storage, and deployment process. Typically, a build automation tool (such as Jenkins or CircleCI) builds a service artifact and pushes it to an artifact repository. An artifact repository might be a dedicated tool — for example, Docker provides a registry for storing images — or a generic file storage tool, such as Amazon S3.
A microservice isn’t only code; it’ll have many constituent parts:
Some of these dependencies, such as application libraries, are explicitly defined. Others may be implicit; for example, language-specific package managers are often ignorant of binary dependencies. Figure 8.12 illustrates these different parts.
An ideal deployment artifact for a microservice would allow you to package up a specific version of your compiled code, specifying any binary dependencies, and provide a standard operational abstraction for starting and stopping that service. This should be environment-agnostic: you should be able to run the same artifact locally, in test, and in production. By abstracting out differences between languages at runtime, you both reduce cognitive load and provide common abstractions for managing those services.
We’ve touched on immutability a few times so far — let’s take a moment to look at why it matters. An immutable artifact, encapsulating as many dependencies of your service as feasible, gives you the highest possible confidence that the package you tested throughout your deployment pipeline will be the same as what is deployed in production. Immutability also allows you to treat your service instances as disposable — if a service develops a problem, you can easily replace it with a new instance of the last known good state. On GCE, this autohealing process was automated by the instance group you created.
If a build of the same code can result in a different artifact being created — for example, pulling different versions of dependencies — you increase the risk in deployment and the fragility of your code because unintentional changes can be included in a release. Immutability increases the predictability of your system, as it’s easier to reason through a system’s state and recreate a historic state of your application — crucial for rollback.
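One practical consequence of immutability: because an artifact never changes after it's built, you can fingerprint it once at build time and verify that fingerprint at deploy or rollback time. A small sketch in Python (the helper name and file layout are illustrative, not part of the book's codebase):

```python
import hashlib
from pathlib import Path

def artifact_digest(path: Path) -> str:
    """Return a SHA-256 fingerprint of an artifact file.

    Two builds are only interchangeable if their digests match;
    recording the digest alongside the commit ID makes rollback
    to a known-good artifact verifiable rather than hopeful.
    """
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Read in chunks so large images don't load into memory at once.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()
```

A deploy script can then refuse to roll out any artifact whose digest doesn't match the one the build pipeline recorded.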
Many languages have their own packaging mechanism, and this heterogeneity makes deployment more complex when you're working with services written in different languages. Your deployment tooling has to handle a different interface for each package format to get a service running on a server (or to stop it).
Better tooling can reduce these differences, but technology-specific artifacts tend to work at too low an abstraction level. They primarily focus on packaging code, rather than the broader nature of application requirements:
Luckily, you’ve got a few options: operating system packages, server images, or containers (figure 8.13).
You could use the packaging format of your target operating system — for example, deb or rpm packages, installed with apt or yum on Linux. This approach standardizes the installation of an artifact, regardless of contents, as you can use standard operating system tools to automate the installation process. When you start a new host, you can pull the appropriate version of your service package. In addition, packages can specify dependencies on other packages — for example, a Rails application might specify dependencies on common Linux packages, such as libxml, libmagic, or libssl.
The OS package approach has three weaknesses:
In typical virtualized environments, each server you run is built from an image, or template. The instance template you built in section 8.3 is an example of a server image.
You can use this image itself as a deployment artifact. Rather than pulling a package onto a generic machine, you could instead bake a new image for each version of your service that you want to deploy. A typical bake process has four steps:
You can try that out using Packer.
First, save the following configuration file as instance-template.json.
Listing 8.1 The instance-template.json file
{
  "variables": {
    "commit": "{{env `COMMIT`}}"
  },
  "builders": [
    {
      "type": "googlecompute",
      "project_id": "market-data-1",
      "source_image_family": "debian-9",
      "zone": "europe-west1-b",
      "image_name": "market-data-service-{{user `commit`}}",
      "image_description": "image built for market-data-service {{user `commit`}}",
      "instance_name": "market-data-service-{{uuid}}",
      "machine_type": "n1-standard-1",
      "disk_type": "pd-ssd",
      "ssh_username": "debian",
      "startup_script_file": "startup-script.sh"
    }
  ]
}
Now, run the packer build command from within the chapter-8/market-data directory:
packer build -var "commit=`git rev-parse HEAD`" instance-template.json
If you watch the console output, it’ll reflect the four steps I outlined above: using the GCE API, Packer will start an instance, run the startup script, and save the instance as a new template image, tagged with the source Git commit. You can use the Git commit to explicitly distinguish different versions of your code.
This approach builds an immutable, predictable, and self-contained artifact. This immutable server pattern, combined with a configuration tool like Packer, allows you to store a reproducible base state as code.
It has a few limitations:
Instead of distributing entire machines, containerization tools, such as Docker or rkt, provide a more lightweight approach to encapsulating an application and its dependencies. You can run multiple containers on one machine, isolated from each other but with lower resource overhead than a virtual machine because they share the kernel of one operating system. They avoid the overhead of virtualizing the disk and guest operating system of each virtual machine.
Try a quick example using Docker. (You can find instructions for installing Docker on the Docker website: https://docs.docker.com/install/.) You build a Docker image from a Dockerfile. Add the following file to the chapter-8/market-data folder.
Listing 8.2 Dockerfile for market-data service
FROM python:3.6
ADD . /app
WORKDIR /app
RUN pip install -r requirements.txt
CMD ["gunicorn", "-c", "config.py", "app:app", "--bind", "0.0.0.0:8080"]
EXPOSE 8080
Then, use the docker command-line tool to build the container:
$ docker build -t market-data:`git rev-parse HEAD` .
Sending build context to Docker daemon 71.17 kB
Step 1/3 : FROM python:3.6
---> 74145628c331
Step 2/3 : ADD . /app
---> bb3608d5143f
Removing intermediate container 74c250f83f8c
Step 3/3 : WORKDIR /app
---> 7a595179cc39
Removing intermediate container 19d3bffa4d2a
Successfully built 7a595179cc39
This will build a container image and tag it with the name market-data:<commit ID>.
Now that you’ve built an image for the application, you can run it locally. Try it out:
$ docker run -d -p 8080:8080 market-data:`git rev-parse HEAD`
Because you passed -d, the container runs detached; the command prints the new container's ID, and docker logs <container-id> will show gunicorn's startup logs. If you like, try to curl the service on port 8080. You probably noticed that build and startup times for the container were significantly faster than for the virtual machines on GCE. This is one of the key benefits of using containers.
In a few short steps, you can run this container image on GCE. First, you need to push the image to a container registry. Luckily, GCE already provides one:
TAG="market-data:$(git rev-parse HEAD)"
PROJECT_ID=<your-project-id>
docker tag $TAG eu.gcr.io/$PROJECT_ID/$TAG
gcloud docker -- push eu.gcr.io/$PROJECT_ID/$TAG
This registry acts as an artifact repository where you can store your Docker images for later use. After the push has completed, start an instance running this container:
gcloud beta compute instances create-with-container
market-data-service-c
--container-image eu.gcr.io/$PROJECT_ID/$TAG
--tags api-service
Success! You’ve deployed a container, and you’ve seen firsthand that it provides a more flexible — and easy-to-use — abstraction than a VM image.
As well as acting as a packaging mechanism, a container provides a runtime environment that isolates execution, effectively easing the operation of diverse containers on a single machine. This is compelling because it provides sane abstractions above individual hosts.
Unlike virtual machine images, container images are portable; you can run the same container on any infrastructure that supports the container runtime. This eases deployment in scenarios where multiple deployment targets are required, such as companies that run workloads in both cloud and on-premise environments. It also simplifies local development; running multiple containers on a typical developer machine is much more manageable than building and managing multiple virtual machines.
The service’s configuration is likely to differ based on deployment environment (staging, dev, production, and so on). For that and other reasons, you can’t represent all elements of a service within an artifact:
The third principle of The Twelve-Factor App manifesto (12factor.net) states that you should strictly separate deployment configuration from code and provide it as environment variables (figure 8.14). In practice, the deployment mechanism you choose will define how you store and provide environment-specific configuration. We recommend storing configuration in two places:
The process that starts a service artifact should pull this configuration and inject it into the application’s environment.
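In a Python service such as market-data, that injection can be as simple as reading the environment at startup. A hedged sketch (the variable names and defaults here are hypothetical, not taken from the book's code):

```python
import os

def load_config() -> dict:
    """Build runtime configuration from the environment.

    The artifact stays identical across environments; only the
    variables injected by the deployment process differ.
    """
    return {
        # Safe defaults baked into the artifact...
        "port": int(os.environ.get("PORT", "8080")),
        "log_level": os.environ.get("LOG_LEVEL", "INFO"),
        # ...and environment-specific values supplied at deploy time.
        "db_url": os.environ.get("DATABASE_URL", "sqlite:///dev.db"),
    }
```

The same image then runs unmodified in staging and production; only the injected variables change.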
Unfortunately, managing configuration separately can increase risk, as people may make changes to production outside of your immutable artifacts, affecting the predictability of your deployments. You should err on the side of restraint and attempt to include as much configuration as possible within your artifacts and rely on the speed and robustness of your deployment pipeline for rapidly changing configuration.
In this section, we’ll review three common models for deploying services to underlying hosts: single service to host, multiple services to host, and container scheduling.
In earlier examples, we’ve used a one-to-one relationship between service and underlying host. This approach is easy to understand and provides a clear and explicit isolation between the resource needs and runtime of multiple services. Figure 8.15 illustrates this approach. Although the analogy is somewhat cruel, using this model lets you treat servers as cattle: indistinguishable units that you can start, stop, and destroy on command.
This model isn’t perfect. Sizing virtual instances appropriately for the needs of each service requires ongoing effort and evaluation. If you’re not running in the cloud, you may run into the limits of your data center or virtualization solution. And as we touched on earlier, virtual machine startup time is comparatively slow, often taking several minutes.
It’s possible to run multiple services per host (figure 8.16). In the static variant of this model, the allocation of services to hosts is manual and static; the service owner makes a conscious choice, predeployment, about where each service should be run.
At first glance, this approach might seem desirable. If obtaining new hosts is costly or hosts are scarce, then the easiest route to production would be to maximize usage of your existing, limited number of hosts.
But this approach has several weaknesses. It increases coupling between services: deploying multiple services to a host ties their fates together, undermining your ability to release services independently. It also increases the complexity of dependency management: if one service needs package v1.1, but another needs v2.0, the difference is difficult to reconcile. It becomes unclear which service owns the deployment environment — and therefore which team has responsibility for managing that configuration.
This approach also leads to challenges in monitoring and scaling services independently. One noisy service on a box might adversely impact other services, and it can be difficult to monitor the resource usage (CPU, memory) of services independently.
It'd be even simpler if you could avoid thinking about the underlying hosts that run your services altogether and focus entirely on the unique runtime environment of each application. This was the initial promise of platform-as-a-service (PaaS) solutions, such as Heroku. A PaaS provides tools for deploying and running services with minimal operational configuration or exposure to underlying infrastructural resources. Although these platforms are easy to use, they often strike a difficult balance between automation and control — simplifying deployment but removing customization from the developer's hands — as well as being highly vendor-specific.
Containers provide a more elegant abstraction:
These three facets enable scheduling, or orchestration, of containers. A container scheduler is a software tool that abstracts away from underlying hosts by managing the execution of atomic, containerized applications across a shared pool of resources. Typically, a scheduler consists of a master node that distributes application workloads to a cluster of worker nodes. Developers, or a deployment automation tool, send instructions to this master node to perform container deployments. Figure 8.17 illustrates this setup.
Unlike the multiple static services per host model, the allocation of services in a scheduler model is dynamic and depends on the resources (CPU, disk, or memory needs) defined for each application. This avoids the pitfalls of the static model, as the scheduler aims to continually optimize resource usage within the cluster of nodes, while the container model preserves service independence.
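To make that dynamic allocation concrete, here's a toy first-fit placement loop of the kind a scheduler runs continuously. This is a deliberate simplification: real schedulers also weigh affinity, spread across failure zones, and rescheduling on node failure:

```python
def place(containers, nodes):
    """Assign each container to the first node with spare capacity.

    containers: list of (name, cpu, mem) resource requests.
    nodes: dict of node name -> {"cpu": free_cpu, "mem": free_mem}.
    Returns a mapping of container name -> node name; raises if the
    cluster can't fit a workload (a real scheduler would queue it).
    """
    placement = {}
    for name, cpu, mem in containers:
        for node, free in nodes.items():
            if free["cpu"] >= cpu and free["mem"] >= mem:
                # Reserve the resources and record the decision.
                free["cpu"] -= cpu
                free["mem"] -= mem
                placement[name] = node
                break
        else:
            raise RuntimeError(f"no capacity for {name}")
    return placement
```

The key point is that nobody decides up front which host runs which service: placement falls out of the declared resource requests and whatever capacity the cluster has at that moment.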
By using a scheduler as a deployment platform, a service developer can focus on the environment of their service in isolation from the underlying needs of machine configuration. Operations engineers can focus on running the underlying scheduler platform and defining common operational standards for running services.
Container schedulers such as Kubernetes are complex pieces of software and require significant expertise to operate, especially because the tools themselves are relatively new. We strongly recommend them as the ideal deployment platform for microservices, but only if you can use a managed scheduler (such as Google’s Kubernetes Engine) or have the operational resources to run it in-house. If not, the single service per host model, combined with container artifacts, is a great and flexible fallback.
So far, you've only deployed market-data once. But in a real application, you'll be deploying services often. You need to be able to deploy new versions without downtime to maintain overall application stability. Every service will rely on others to be up and running, so you also need to maximize the availability of every service.
Three common deployment patterns are available for zero-downtime deployments:
All of these patterns are built on a single primitive operation. You’re taking an instance, moving it to a running state in an environment, and directing traffic toward it.
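That primitive, applied repeatedly while enough healthy instances stay in service, is all a rolling update is. The following Python sketch simulates the loop (a real platform would also wait for health checks to pass between batches):

```python
def rolling_update(instances, new_version, max_unavailable=1):
    """Replace each instance's version in place, taking at most
    max_unavailable instances out of service at a time, and yield
    the fleet state after every batch so capacity can be observed."""
    fleet = list(instances)
    for start in range(0, len(fleet), max_unavailable):
        batch = range(start, min(start + max_unavailable, len(fleet)))
        for i in batch:
            # Stop old instance, start replacement, direct traffic to it.
            fleet[i] = new_version
        yield list(fleet)
```

With max_unavailable=1 on a fleet of three, the fleet passes through mixed states — one new instance, then two, then three — while two-thirds of capacity always stays in service.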
It’s always better when you can see things in action. You can deploy a new version of market-data to GCE. First, you’ll want to create a new instance template. You can use the container you built and pushed in section 8.4.3:
gcloud beta compute instance-templates create-with-container
market-data-service-template-2
--container-image eu.gcr.io/$PROJECT_ID/$TAG
--tags=api-service
Then, initiate a canary update:
gcloud beta compute instance-groups managed rolling-action start-update
market-data-service-group
--version template=market-data-service-template
--canary-version template=market-data-service-template-2,target-size=1
--region europe-west1
GCE will add the canary instance to the group and the backend service to begin receiving requests (figure 8.18). It’ll take a few minutes to come up. You also can see this on the GCE console (figure 8.19; Compute Engine > Instance Groups).
If you’re happy, you can proceed with the rolling update:
gcloud beta compute instance-groups managed rolling-action start-update
market-data-service-group
--version template=market-data-service-template-2
--region europe-west1
The speed at which this update occurs depends on how much capacity you want to maintain during the rollout. You also can elect to surge beyond your current capacity during rollout to ensure the target number of instances is always maintained. Figure 8.20 illustrates the stages of a rollout across three instances.
If you were unhappy, you could roll back the canary:
gcloud beta compute instance-groups managed rolling-action start-update
market-data-service-group
--version template=market-data-service-template
--region europe-west1
The command for a rollback is identical to a rollout, but it goes to a previous version. In the real world, rollback may not be atomic. For example, the incorrect operation of new instances may have left data in an inconsistent state, requiring manual intervention and reconciliation. Releasing small change sets and actively monitoring release behavior will limit the occurrence and extent of these scenarios.
We’ve covered a lot of ground in this chapter: you’ve deployed manually to a cloud provider, packaged a service as a container and a virtual machine, and practiced safe rollout patterns. By building immutable service artifacts and performing safe, downtime-free deployments, you’re well on your way to building a deployment process that works reliably across multiple services. Ultimately, the more stable, reliable, and seamless your deployment process, the easier it is to standardize services, release new services more rapidly, and deliver valuable new features without friction or risk.