Scaling, Sharding, and Replicating

We're now able to deploy our microservice almost anywhere with minimal effort. Let's take a look at how we can leverage this and scale our microservice to handle heavy usage.

Before starting, let's see what each of the topics covered in this chapter means. We'll start from the end. Replicating is the easiest one: it means replicating, or copying, your service. Basically, replicating a microservice means running multiple instances of it at the same time, usually in different locations.

Sharding is similar, but with a different purpose. When replicating, each replica can do the full service job. When sharding, each shard handles only part of the service, and you need all shards for your service to be online. This is a common practice with very large database servers.

Scaling is the term common to both replicating and sharding, as both of them allow you to scale your service. Scaling is the process of growing your microservice to handle more load or to withstand failure events.
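To make this concrete, here is a minimal sketch of replication using Docker Swarm, which we'll use later in this chapter. The image name my-service is hypothetical; substitute your own microservice image:

    # Minimal replication sketch (hypothetical my-service image)
    docker swarm init                    # turn this Docker engine into a single-node swarm
    docker service create \
      --name my-service \
      --replicas 3 \
      --publish 8080:8080 \
      my-service:latest                  # run three identical instances behind one published port
    docker service ls                    # confirm how many replicas are running
    docker service scale my-service=5    # grow (or shrink) the number of replicas on demand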

Being able to deploy consistently is important. It gives you better confidence when developing and testing because you have a consistent base layout for your service to run on. It also allows you to deploy to multiple locations faster.

This enables you to replicate your service across multiple locations. Replicating a microservice not only allows you to develop in parallel, enabling every developer to have an instance running on their own computer, but it also gives you several advantages in a production environment, such as:

  • Distribution: Your service is spread across geographic locations, placing it closer to every customer and reducing latency everywhere
  • Fault tolerance: When specific instances suffer an outage or a usage peak, you can route customers to less-used instances
  • Zero downtime: Your service has enough replicas that, even if a substantial part of your infrastructure is affected by an external incident, it remains generally available

Distributing your service geographically puts it near your customers. Usually, you want to distribute it across different continents to avoid inter-continental latency. If your service is broadly used, you may need several instances per continent, perhaps one in every country.

Fault tolerance and zero downtime are related to each other. Being able to operate while instances of your service are faulty gives users the perception of no downtime. This is also very important when you want to upgrade your service without bringing all the instances down. You can phase in the upgrade one instance at a time while routing your customers to the other instances, keeping the service as a whole online.
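Docker Swarm can perform this kind of phased upgrade for you. The following is a sketch using the same hypothetical my-service service; the flags tell Swarm to replace one replica at a time and to wait between replacements, so the remaining replicas keep serving traffic during the upgrade:

    # Rolling upgrade sketch (hypothetical service and image names)
    docker service update \
      --image my-service:2.0 \
      --update-parallelism 1 \
      --update-delay 10s \
      my-service
    docker service ps my-service         # watch old tasks shut down as new ones start up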

In this chapter, we'll see how we can use Docker tools to replicate our service using Swarm. Later on, we'll see how easy it is to migrate our microservice to Kubernetes locally.
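As a rough preview of the Kubernetes side, the same replication idea maps onto a Deployment, which keeps a desired number of replicas running, much like a Swarm service. The names below are hypothetical, and the details are covered later in the chapter:

    # Replication on a local Kubernetes cluster (hypothetical names)
    kubectl create deployment my-service --image=my-service:latest
    kubectl scale deployment my-service --replicas=3
    kubectl get pods                     # one pod per replica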
