Going Distributed

Modern systems are rarely deployed on a single machine. With the availability of high-speed LAN interconnects, cloud-based pay-per-use environments, and microservices-based architectures, systems are increasingly composed of independent services deployed on multiple computers. These services work together to give users a single, coherent experience.

Distributed architectures have two key ingredients:

Components: Modular units with well-defined interfaces (such as services and databases)
Interconnects: The communication links between the components (sometimes with the additional responsibility of mediation/coordination between components)

In the early days of computing, before systems were distributed, all components were hosted within a single process; they were essentially software modules orchestrated by a driver (Main) program. Soon, however, systems began to outgrow a single machine, and components hosted on different machines had to talk to each other. The interconnects started to include network links.

This shift meant that programs had to use message passing, rather than shared memory within a single process, for communication and synchronization.
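The difference between the two styles can be sketched in a few lines of Go. This is only an illustration under stated assumptions: the function names `sumShared` and `sumMessages` are hypothetical, and an in-process channel stands in for a network link. The first function coordinates workers through a shared variable guarded by a lock; the second has workers send their values as messages to a single owner that keeps the state.

```go
package main

import (
	"fmt"
	"sync"
)

// sumShared adds the numbers using shared memory: every goroutine
// updates the same variable, and a mutex guards each update.
func sumShared(nums []int) int {
	var (
		mu    sync.Mutex
		wg    sync.WaitGroup
		total int
	)
	for _, n := range nums {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			mu.Lock()
			total += n
			mu.Unlock()
		}(n)
	}
	wg.Wait()
	return total
}

// sumMessages adds the same numbers using message passing: workers
// send values over a channel, and only the receiver touches the total.
// The channel is an in-process stand-in for a network link.
func sumMessages(nums []int) int {
	msgs := make(chan int)
	var wg sync.WaitGroup
	for _, n := range nums {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			msgs <- n
		}(n)
	}
	// Close the channel once every worker has sent its message.
	go func() {
		wg.Wait()
		close(msgs)
	}()
	total := 0
	for n := range msgs {
		total += n
	}
	return total
}

func main() {
	nums := []int{1, 2, 3, 4, 5}
	fmt.Println(sumShared(nums), sumMessages(nums)) // prints "15 15"
}
```

In the message-passing version no lock is needed, because only one goroutine owns the running total. Across real machines the same idea applies, except that messages can also be delayed or lost, which a local channel does not model.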

Besides fulfilling the main requirements, every distributed system has a few generic goals:

Scalability: It should be easy to scale the resources allocated to the system as per demand.
Distribution transparency: The system should hide the fact that it is distributed, so that clients do not need to know where each service or resource resides.
Consistency: The system should make the consistency guarantees it offers to clients explicit. For example, is it guaranteed that a read after a write returns the last written value?
Using the right tool for a specific job: The services and interconnects that make up the distributed system should be extensible. It should be relatively easy to add a new service to the existing milieu, even if it runs on a different operating system or is written in a different programming language. The freedom to mix technology stacks is a key advantage of distributed systems.
Security: When computation happens on multiple machines and data flows through messages, it is important to consider the authentication/authorization and privacy implications of various choices.
Debuggability: It should be possible to debug and localize issues or problems in the services that make up the system. We should have the ability to trace user requests as they are fulfilled by various components in the system.
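To make the consistency question above concrete, here is a minimal sketch of a store whose replica lags behind its primary. Everything in it is a hypothetical illustration (the `Store` type, its methods, and the explicit `Replicate` step are assumptions, not the API of any real system). It shows how a read served by a replica can return an older value than the one just written.

```go
package main

import "fmt"

// write is a queued update waiting to reach the replica.
type write struct{ key, value string }

// Store is a hypothetical key-value store with one primary copy and
// one replica that is updated only when Replicate is called.
type Store struct {
	primary map[string]string
	replica map[string]string
	pending []write
}

// NewStore returns an empty store.
func NewStore() *Store {
	return &Store{
		primary: make(map[string]string),
		replica: make(map[string]string),
	}
}

// Write updates the primary immediately and queues the change
// for the replica.
func (s *Store) Write(key, value string) {
	s.primary[key] = value
	s.pending = append(s.pending, write{key, value})
}

// Replicate applies all queued changes to the replica, in order.
func (s *Store) Replicate() {
	for _, w := range s.pending {
		s.replica[w.key] = w.value
	}
	s.pending = nil
}

// ReadPrimary always sees the latest write.
func (s *Store) ReadPrimary(key string) string { return s.primary[key] }

// ReadReplica may return a stale value if replication is behind.
func (s *Store) ReadReplica(key string) string { return s.replica[key] }

func main() {
	s := NewStore()
	s.Write("x", "1")
	s.Replicate()
	s.Write("x", "2") // not yet replicated

	fmt.Println(s.ReadPrimary("x")) // prints "2"
	fmt.Println(s.ReadReplica("x")) // prints "1" (stale read)

	s.Replicate()
	fmt.Println(s.ReadReplica("x")) // prints "2"
}
```

A system built like this would have to tell its clients that reads from a replica may lag behind writes, or route reads that must see the latest value to the primary.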

This chapter discusses what happens when we go from one machine to several. It covers the following topics:

Topology: A top-level overview of distributed systems
Quirks: Unique characteristics of distributed systems
Consistency: How consistency is achieved when data is distributed
Consensus: How multiple independent systems agree on something
Architecture patterns: Common design patterns in distributed systems
