Topology

A distributed system consists of multiple services connected over a network. Each service has a specific purpose. Some services might be exposed for interaction with clients (actors, in the use case parlance), while others might just host data and perform transformations for upstream services. The services communicate with each other to enable macro-level behavior and fulfil the requirements of the system.

The services interact with one another over the network using either of the following:

  • Application Programming Interface (API)
  • Messaging

Irrespective of the channel, the data is exchanged in a standardized format over the network.

The API paradigm is the most common. As described in Chapter 7, Building APIs, services communicate with each other over the network by sending requests to, and receiving responses from, specific endpoints. The most popular mechanism for engineering APIs is the Hypertext Transfer Protocol (HTTP) combined with the Representational State Transfer (REST) architectural style. Multiple service instances are hosted behind a virtual IP (VIP) address by a load balancer (LB).
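As a concrete illustration, the following Go sketch shows one service calling another over HTTP and decoding a JSON response. The endpoint, host name, and payload shape are hypothetical, not taken from the text; the point is that the call blocks until the response arrives and that the caller needs to know the callee's address, which ties into the downsides listed next.

```go
// Minimal sketch of the request/response (API) paradigm: the caller invokes a
// hypothetical endpoint on another service over HTTP and blocks until it responds.
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// OrderStatus is an illustrative response payload, not from the text.
type OrderStatus struct {
	OrderID string `json:"order_id"`
	Status  string `json:"status"`
}

func main() {
	client := &http.Client{Timeout: 2 * time.Second}

	// The caller must know the callee's address (here, a VIP fronted by an LB).
	resp, err := client.Get("http://orders.internal/api/orders/42")
	if err != nil {
		fmt.Println("call failed:", err)
		return
	}
	defer resp.Body.Close()

	var status OrderStatus
	if err := json.NewDecoder(resp.Body).Decode(&status); err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("order %s is %s\n", status.OrderID, status.Status)
}
```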

There are three downsides of this paradigm:

  • The communication is blocking.
  • The caller must know about the callee.
  • One-to-many communication is not efficiently achievable.

The second communication paradigm is messaging. Here, services communicate with each other asynchronously using messages, generally through brokers. This paradigm is much more loosely coupled and scalable, due to the following:

  • The message producers don't need to know about the consumers.
  • The consumers don't need to be up when the producers are producing messages.

However, this mode has its own set of complications: brokers become critical failure points for the system, and the communication is more difficult to change/extend compared to HTTP/JSON. Messaging is covered in detail in Chapter 6, Messaging.
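As a rough sketch of the messaging paradigm, the following Go snippet uses a buffered channel as a stand-in for a broker purely for illustration; in a real deployment, the producer and consumer would be separate services talking to an actual broker (see Chapter 6, Messaging). The message type and topic name are illustrative only.

```go
// Minimal sketch of asynchronous messaging: the producer publishes and moves
// on without knowing who consumes, and the consumer drains messages when ready.
package main

import "fmt"

// Message is an illustrative message envelope, not from the text.
type Message struct {
	Topic   string
	Payload string
}

func main() {
	// Buffered channel standing in for a broker topic/queue.
	broker := make(chan Message, 100)

	// Producer: fires messages and continues; it has no knowledge of consumers.
	go func() {
		for i := 0; i < 3; i++ {
			broker <- Message{Topic: "orders", Payload: fmt.Sprintf("order-%d created", i)}
		}
		close(broker)
	}()

	// Consumer: processes messages whenever it is ready to do so.
	for msg := range broker {
		fmt.Printf("[%s] %s\n", msg.Topic, msg.Payload)
	}
}
```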

A typical distributed system is depicted here:

Here's how this works:

  • There are four services. Service A, Service B, and Service C all serve requests from clients. These are behind an LB.
  • Service X handles requests from other services and is responsible for background tasks (such as sending an email to a customer).
  • Each service has more than one instance to enable redundancy and scalability (a minimal load-balancing sketch follows this list).
  • Each service has its own database. Database sharing is an anti-pattern and introduces coupling between services. 
  • Some of the databases might be replicated data stores, so instead of one database instance to connect to, there are several.
  • The services communicate with each other asynchronously using messaging.
  • The whole cluster is replicated in a remote datacenter to enable business continuity in case of a datacenter-wide outage in the primary datacenter.
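To make the load-balancing part of the picture concrete, here is a minimal Go sketch of a round-robin reverse proxy that spreads incoming requests across several instances of one service. The backend addresses and port are hypothetical, and a production LB would add health checks, TLS termination, and more.

```go
// Minimal round-robin reverse proxy: each incoming request is forwarded to the
// next backend instance in turn, illustrating multiple instances behind one LB.
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"sync/atomic"
)

func main() {
	// Hypothetical instances of a single service.
	backends := []string{
		"http://10.0.0.1:8080",
		"http://10.0.0.2:8080",
		"http://10.0.0.3:8080",
	}

	var proxies []*httputil.ReverseProxy
	for _, b := range backends {
		u, err := url.Parse(b)
		if err != nil {
			log.Fatal(err)
		}
		proxies = append(proxies, httputil.NewSingleHostReverseProxy(u))
	}

	var next uint64
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		// Pick the next instance in round-robin fashion.
		i := atomic.AddUint64(&next, 1) % uint64(len(proxies))
		proxies[i].ServeHTTP(w, r)
	})

	log.Fatal(http.ListenAndServe(":8000", nil))
}
```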

In this chapter, we will look at various aspects of building such systems. Let's start off by listing some not-so-obvious quirks of distributed systems.
