Load Balancers

A Load Balancer (LB) is a device that distributes incoming API requests across multiple computing resources (instances). It makes it possible to do the following:

  • Handle more work than a single instance can serve
  • Increase reliability and availability through redundancy
  • Gracefully upgrade code in instances with zero downtime
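At its simplest, distributing requests across instances can be done round-robin. The following is a minimal sketch of that idea; the backend addresses are hypothetical, and a real LB would discover instances dynamically rather than hard-code them:

```python
import itertools

# Hypothetical backend pool; a real LB discovers instances dynamically.
BACKENDS = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]

def round_robin(backends):
    """Yield backends in a repeating cycle, one per incoming request."""
    return itertools.cycle(backends)

picker = round_robin(BACKENDS)
# With three backends, the fourth request wraps around to the first.
first_four = [next(picker) for _ in range(4)]
```

Production load balancers offer richer strategies (least connections, weighted, latency-aware), but round-robin illustrates the core redundancy benefit: if one instance is drained for an upgrade, the rotation simply continues over the rest.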

LBs can work at either the Transport or the Application layer of the networking stack. The different levels are as follows:

  • HTTP: Here the LB routes HTTP requests to a set of backend instances. The routing logic is typically based on URLs but can also use other characteristics of the requests (User Agent, Headers, and so on). The LB typically sets some standard headers such as the X-Forwarded-For, X-Forwarded-Proto, and X-Forwarded-Port headers to give the backends information about the original request.
  • HTTPS: HTTPS is very similar to HTTP, but with a key difference: to route requests based on HTTP headers/components, SSL needs to be terminated. Once the payload is decrypted and inspected for routing, the requests to the backend can be carried over either HTTPS or HTTP. This means an SSL/TLS certificate must be deployed on the LB. The SSL and TLS protocols use X.509 certificates to authenticate the client and the backend application. These certificates are a form of digital identity and are issued by a certificate authority (CA). They typically contain information such as a public key for encryption, a validity period, a serial number, and the digital signature of the issuer. Managing certificates becomes a key maintenance activity once the LB terminates SSL.
  • TCP: Traffic can also be routed at the TCP level. A typical example might be connecting to a set of redundant caches.
  • User Datagram Protocol (UDP): This is similar to TCP load balancing but used more rarely. Typical use cases are protocols such as DNS and syslogd, which run over UDP.
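The X-Forwarded-* headers mentioned above are what let a backend see the original client even though the TCP connection now comes from the LB. A sketch of how an L7 LB might build them before proxying (header names are the de facto standard ones from the text; the exact merging behavior varies between real load balancers):

```python
def forwarded_headers(client_ip, scheme, port, existing=None):
    """Build the standard X-Forwarded-* headers an L7 LB adds before
    proxying a request to a backend. A sketch: real LBs also handle
    edge cases such as spoofed or malformed inbound values."""
    headers = dict(existing or {})
    prior = headers.get("X-Forwarded-For")
    # Append this hop's client IP to any chain set by upstream proxies.
    headers["X-Forwarded-For"] = f"{prior}, {client_ip}" if prior else client_ip
    headers["X-Forwarded-Proto"] = scheme  # original scheme (http/https)
    headers["X-Forwarded-Port"] = str(port)  # original destination port
    return headers
```

A backend that trusts its LB can then read the first (or last trusted) entry of X-Forwarded-For to recover the client address for logging or rate limiting.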

Depending on the level at which the LB is operating, different routing strategies are possible, including the following:

  • Direct routing: The LB simply rewrites the L2 network address and redirects the packet to the backend. This has huge performance benefits, as the LB does very little work and the return traffic (from the backend to the client) can flow without the LB in between. It does, however, require the LB and the backend servers to be on the same L2 network.
  • Network Address Translation (NAT): Here the L3 addresses are rewritten, and the LB stores a mapping so that the return traffic can be redirected to the correct client.
  • Terminate and Connect: For most application-level load balancers, operating at the packet level is very difficult. So they terminate the TCP connection, read the data, buffer the payload across packets, and then, once enough information is available to make a routing decision, forward the data to a backend server via another set of sockets.
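The "buffer until enough information is available" step of terminate-and-connect can be sketched as follows. The function returns no decision while the HTTP headers are still incomplete, then routes on the Host header; the route table and addresses are illustrative only:

```python
def route_when_ready(buffer, routes, default):
    """Sketch of the 'terminate and connect' routing step: the LB
    accumulates bytes from the client socket until the HTTP headers
    are complete, then inspects them to choose a backend.
    Returns None while more data is still needed."""
    end = buffer.find(b"\r\n\r\n")
    if end == -1:
        return None  # headers not complete yet; keep reading packets
    # Skip the request line; scan header lines for Host.
    for line in buffer[:end].split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"host":
            return routes.get(value.strip().decode(), default)
    return default

# Hypothetical route table mapping hostnames to backend addresses.
ROUTES = {"api.example.com": "10.0.1.10:8080"}
```

The same buffering pattern applies when routing on URLs or other headers: the LB cannot decide until a complete view of the relevant request components has been reassembled across packets.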

Each service (microservice) has a specific endpoint, typically a fully qualified domain name (FQDN). The Domain Name System (DNS) resolves this name to an IP address. Typically this is a virtual IP address (VIP) that identifies a set of instances (each having an actual IP address). The application-level (L7) LB is where one provisions the VIP for each service.
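The FQDN-to-VIP-to-instances chain described above can be modeled as two lookups. The names and addresses below are purely illustrative; in practice DNS performs the first hop and the LB's provisioning holds the second:

```python
# Hypothetical provisioning: each service FQDN resolves (via DNS) to a
# VIP, and the LB maps each VIP to the set of real instance addresses.
SERVICE_VIPS = {"orders.internal.example.com": "192.0.2.10"}
VIP_BACKENDS = {"192.0.2.10": ["10.0.3.1:9000", "10.0.3.2:9000"]}

def instances_for(fqdn):
    """Follow the FQDN -> VIP -> instances chain that DNS plus the
    L7 LB implement in a real deployment."""
    vip = SERVICE_VIPS[fqdn]
    return VIP_BACKENDS[vip]
```

Because clients only ever see the VIP, instances can be added, removed, or replaced behind it without any client-side change.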

Load balancers should forward traffic only to healthy backend servers. To gauge health, backend services are typically expected to expose an endpoint that the LB can query for instance health.
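A sketch of that health-checking loop, split into a pure filtering step and one possible HTTP probe. The /health path and timeout are assumed conventions, not a standard; the probe is injected into the filter so the decision logic stays testable without a network:

```python
import urllib.request

def healthy_backends(backends, probe):
    """Return only the backends whose health probe succeeds.
    `probe` is any callable that contacts the instance's health
    endpoint and returns True/False; injecting it keeps this
    routing decision independent of the transport."""
    return [b for b in backends if probe(b)]

def http_probe(backend, path="/health", timeout=1.0):
    """One possible probe: succeed only on an HTTP 200 within the
    timeout. The /health path is an assumed convention."""
    try:
        with urllib.request.urlopen(f"http://{backend}{path}",
                                    timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False
```

Real load balancers additionally apply hysteresis (e.g. requiring several consecutive failures before marking an instance down) so that a single slow response does not evict a healthy backend.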

A typical production setup consists of a combination of L4 (TCP) and L7 (HTTP) load balancers with a layer of machines terminating SSL, as described in the previous diagram.
