Y-axis scaling

The objective of scaling along the y-axis is to split the application into multiple, distinct services, each responsible for one or more closely related functions. This relates to our microservices discussion, and is essentially the ideal deployment strategy for a service-oriented architecture. The benefit of this type of architecture is that hardware can be dedicated to only those areas of the application that need it. For example, on a travel website, search typically sees much higher traffic than booking (people search many times before they book). This means that we can dedicate more machines to search than to booking. We can also choose the right set of hardware for each microservice.

One thing to think about with this architecture is how to enable aggregated functionality. For example, on the typical Product Details pages of the travel website, the data may be served from various services such as these:

  • Product catalog: Mostly static information about the product such as hotel name, address
  • Pricing service: Product price
  • Availability service: Product inventory/availability
  • Reviews and ratings: Customer reviews, photos, and so on
  • Wallet: Shows details about the customer's applicable reward points

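To make the composition concrete, the following Go sketch models the data that such a Product Details page might aggregate. The struct and field names are illustrative assumptions, one per backend service listed above, not an API from the original text:

```go
package main

import "fmt"

// ProductDetails is a hypothetical composite model for the Product Details
// page; each field is populated by one of the backend services.
type ProductDetails struct {
	Catalog      CatalogInfo  // product catalog: static product information
	Price        PriceInfo    // pricing service
	Availability Availability // availability service
	Reviews      []Review     // reviews and ratings
	Rewards      Rewards      // wallet: applicable reward points
}

type CatalogInfo struct {
	HotelName string
	Address   string
}

type PriceInfo struct {
	Currency string
	Amount   float64
}

type Availability struct {
	RoomsLeft int
}

type Review struct {
	Rating  int
	Comment string
}

type Rewards struct {
	ApplicablePoints int
}

func main() {
	pd := ProductDetails{
		Catalog: CatalogInfo{HotelName: "Grand Hotel", Address: "1 Main St"},
		Price:   PriceInfo{Currency: "USD", Amount: 120},
	}
	fmt.Println(pd.Catalog.HotelName, pd.Price.Amount)
}
```

A client rendering this page needs all of these fragments at once, which raises the question discussed next: who stitches them together?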
When a client needs to compose behavior from such varied services, does it need to make n calls? There are a few things to consider here:

  • The granularity of APIs provided by microservices often differs from what a client needs. This granularity may also change over time as the number of services (the partitioning) changes. Such refactoring should be hidden from clients.
  • Different clients need different data. For example, the browser version of the Hotel Details page will have a different layout, and thus information needs, as compared to the mobile version.
  • Network performance might be variable. A native mobile client uses a network with very different performance characteristics than the LAN used by a server-side web application. This difference manifests as different round-trip times and variable latencies for the client. Depending on the link, API communication may be tuned (for example, batched).

The solution to these issues is to implement an API gateway: a single endpoint that clients call, which in turn handles the orchestration and composition of calls to the backend services to get clients what they need. Nginx is a popular high-performance web server and, besides a host of configuration options, offers Lua scripting. All this enables it to serve a wide variety of use cases as an API gateway.
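The gateway's core job, fanning out to n services and composing one response, can be sketched in a few lines of Go. This is a minimal illustration, not a production gateway: the `Fetcher` signature and service names are assumptions, and the stubs stand in for real HTTP calls:

```go
package main

import (
	"fmt"
	"sync"
)

// Fetcher retrieves one fragment of page data from a backend service.
// In a real gateway this would wrap an HTTP call; here it is an assumption
// made for illustration.
type Fetcher func(productID string) (string, error)

// Compose fans out to all backends concurrently and merges the results,
// so the client makes a single call to the gateway instead of n calls.
func Compose(productID string, fetchers map[string]Fetcher) map[string]string {
	var (
		mu  sync.Mutex
		wg  sync.WaitGroup
		out = make(map[string]string)
	)
	for name, fetch := range fetchers {
		wg.Add(1)
		go func(name string, fetch Fetcher) {
			defer wg.Done()
			if v, err := fetch(productID); err == nil {
				mu.Lock()
				out[name] = v
				mu.Unlock()
			}
		}(name, fetch)
	}
	wg.Wait()
	return out
}

func main() {
	// Stub backends standing in for the catalog and pricing services.
	page := Compose("hotel-42", map[string]Fetcher{
		"catalog": func(id string) (string, error) { return "Grand Hotel", nil },
		"pricing": func(id string) (string, error) { return "120 USD", nil },
	})
	fmt.Println(page["catalog"], page["pricing"])
}
```

Because the fan-out is concurrent, the gateway's latency is governed by the slowest backend rather than the sum of all of them, which is what makes this composition practical over variable links.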

Another great example of an API gateway is the Netflix API gateway (http://techblog.netflix.com/2013/02/rxjava-netflix-api.html). The Netflix streaming service is used by hundreds of different kinds of devices, each with different requirements and characteristics. Initially, Netflix attempted to provide a single API for their streaming service. However, the company discovered that this does not scale because of the diversity of requests and the issues we have discussed. So, it pivoted to an API gateway that provides an API tailored for each device by running device-specific adapter code. This code performs a composition over six to seven backend services on average. The Netflix API gateway handles billions of requests per day. For more details, visit https://medium.com/netflix-techblog/embracing-the-differences-inside-the-netflix-api-redesign-15fd8b3dc49d.

Another interesting variant of this pattern is called Backend-For-Frontend. Many times, the device specifics that we saw previously quickly gain complexity, and it becomes difficult to engineer them in constrained environments such as Nginx (with Lua scripting). The solution is to have a dedicated backend service act as the API gateway for each type of client.
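The idea can be sketched as two small shaping functions over the same upstream data, one per client type. The types, field names, and payload shapes below are illustrative assumptions, not from the original text:

```go
package main

import "fmt"

// hotelDetails is the shared upstream data both frontends draw from.
type hotelDetails struct {
	Name    string
	Address string
	Reviews []string
}

// mobileBFF is the mobile client's dedicated backend: it returns a trimmed
// payload suited to a small layout and a constrained network.
func mobileBFF(h hotelDetails) map[string]interface{} {
	return map[string]interface{}{"name": h.Name}
}

// webBFF is the browser's dedicated backend: it returns the richer payload
// that the desktop layout needs.
func webBFF(h hotelDetails) map[string]interface{} {
	return map[string]interface{}{
		"name":    h.Name,
		"address": h.Address,
		"reviews": h.Reviews,
	}
}

func main() {
	h := hotelDetails{Name: "Grand Hotel", Address: "1 Main St", Reviews: []string{"Great stay"}}
	fmt.Println(len(mobileBFF(h)), len(webBFF(h)))
}
```

Each BFF is owned and evolved by the team building that frontend, which keeps device-specific complexity out of a single shared gateway.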
