Scaling Applications

Scalability is the attribute of a software system that allows it to handle an increased amount of work with proportionally more resources, while still maintaining the service level agreements (SLAs) that the system offers. A scalable system lets you handle increased traffic or work by throwing money at the problem; that is, by adding more hardware. A non-scalable system simply cannot handle the load, even with increased resources.

For example, consider a backend service that provides an API for an app. It is not enough for the API to return the right data; it must also return it within a guaranteed amount of time so that users don't experience latency or unresponsiveness in the app. A system not designed with scalability in mind cannot keep that guarantee as load grows.

With an increase in traffic, its response times go through the roof! In contrast, in a system that is designed to be scalable, response times stay more or less the same as traffic increases.
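
The contrast between the two behaviors can be illustrated with a toy queueing model. The following is only a sketch under stated assumptions: each server is modeled as an independent M/M/1 queue (mean response time 1 / (service rate - arrival rate)), traffic is split evenly across servers, and the 100 requests/second capacity and the 50 requests/second per-server target are made-up numbers, not measurements from any real system.

```python
def mean_response_time(arrival_rate, service_rate, servers=1):
    """Mean response time in seconds for a toy model in which traffic is
    split evenly across independent M/M/1 servers (an assumption).
    Returns infinity when a server receives more work than it can process."""
    per_server_load = arrival_rate / servers
    if per_server_load >= service_rate:
        return float("inf")
    return 1.0 / (service_rate - per_server_load)


SERVICE_RATE = 100.0  # requests/second one server can process (hypothetical)
TARGET_LOAD = 50      # requests/second we allow per server when scaling (hypothetical)

for traffic in (50, 90, 99, 180, 360):
    # Non-scalable case: one server, no matter how much traffic arrives.
    fixed = mean_response_time(traffic, SERVICE_RATE, servers=1)
    # Scalable case: add servers in proportion to the traffic.
    servers = max(1, -(-traffic // TARGET_LOAD))  # ceiling division
    scaled = mean_response_time(traffic, SERVICE_RATE, servers=servers)
    print(f"{traffic:>4} req/s  fixed: {fixed * 1000:8.1f} ms  "
          f"scaled ({servers} servers): {scaled * 1000:6.1f} ms")
```

In this model the single-server latency grows without bound as traffic approaches the server's capacity, while the proportionally scaled deployment stays at roughly 20 ms. Real systems rarely scale this cleanly, which is why the rest of the chapter looks at bottlenecks.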

There is a key difference between performance and scalability:

A system has a performance problem if it cannot serve a single user's request within the required SLA.
A system has a scalability problem if it meets the SLA for a single user but the SLA is compromised as the number of concurrent users increases.
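
The two definitions above can be turned into a simple decision rule. This is a minimal sketch, assuming you have already measured latency for a single user and under concurrent load; the function name `diagnose` and the sample numbers are hypothetical.

```python
def diagnose(single_user_latency_ms, concurrent_latency_ms, sla_ms):
    """Classify a system using the two definitions above.

    single_user_latency_ms: latency measured with one user (assumed input)
    concurrent_latency_ms:  latency measured under concurrent load (assumed input)
    sla_ms:                 the latency the SLA promises
    """
    if single_user_latency_ms > sla_ms:
        # Even one user is not served in time.
        return "performance problem"
    if concurrent_latency_ms > sla_ms:
        # Fine for one user, but the SLA breaks under concurrent load.
        return "scalability problem"
    return "meets SLA"


# Hypothetical measurements against a 200 ms SLA.
print(diagnose(350, 900, sla_ms=200))  # performance problem
print(diagnose(80, 900, sla_ms=200))   # scalability problem
print(diagnose(80, 120, sla_ms=200))   # meets SLA
```

Note that the single-user check comes first: adding hardware does not help a system that is already too slow for one user.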

In this chapter, we will look at how scalability is impacted by things such as the following:

Algorithms
Data structures
Threading model
Local state

We shall also look at the following:

Bottlenecks
Different options for scaling systems
Scaling deployments
