Peer-to-peer (P2P) architecture

In the P2P architecture model, all the actors are peers, and there is no central coordinator between them. Each instance takes on a part of the workload or shares its resources with the other peers in the network. Generally, in this architecture, the nodes or instances act as both servers and clients.

Though traditionally this paradigm implied equally privileged peers, in practice, there can be variations of this:

  • Hybrid: Some instances might have a specific functionality.
  • Structured P2P: Nodes might organize themselves into some sort of overlay, and processing may flow as in the layered architecture pattern described previously. However, the key difference is that the overlay is temporary and disposable.

BitTorrent is an example of a P2P architecture. It is a file-sharing protocol where each node serves whatever content it has and, for new content, queries a central service (the tracker) to figure out which nodes host which chunks of the required content:

Source: http://www.bittorrent.org/

An example of a Structured P2P system is a distributed hash table (DHT). If you recall, a hash table is a data structure that allows for the storage of Key-Value objects. Internally, the data is stored in various buckets. A good hash function maps the Key to a hash value, and a modulo operation with a set number of buckets identifies the bucket that contains the Key. Within a bucket, Keys are stored in a format amenable to searching (such as Red-Black trees), but each bucket has to work with a much smaller amount of data than the hash table as a whole.
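As a quick refresher, the bucket selection step looks like this in Go (the FNV hash and the bucket count of 8 are arbitrary choices for illustration):

    package main

    import (
        "fmt"
        "hash/fnv"
    )

    // bucketFor maps a key to one of numBuckets buckets:
    // hash the key, then take the value modulo the bucket count.
    func bucketFor(key string, numBuckets uint32) uint32 {
        h := fnv.New32a()
        h.Write([]byte(key))
        return h.Sum32() % numBuckets
    }

    func main() {
        for _, k := range []string{"Alice", "Bob", "Chris"} {
            fmt.Printf("%s -> bucket %d\n", k, bucketFor(k, 8))
        }
    }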

The DHT allows for designing a hash table that is distributed across machines. A common need for a DHT is request-specific routing: here, any instance can get a request, but it consults the DHT to figure out which instance the request should be routed to in order to fulfil it. So, how do we build a DHT?

An initial (naive) solution might be to hash the key and do modulo-n to get a server address. For example, with three servers, six keys could be distributed as follows:

    Key      Hash          Server at (Hash mod 3)
    Alice    3333333333    0
    Bob      7733228434    1
    Chris    3734343434    2
    Doug     6666666666    0
    Elgar    3000034135    1
    Fred     6000799124    2

This scheme is simple and works fine. However, the problem is with redistribution. One of the main requirements of distributed systems is scalability: the ability to add or remove servers to scale with load. Here, if we change the number of servers to four, then the hash values, and thus the server assignments of nearly all the keys, will change! This means a lot of wasteful data movement and downtime while the cluster reconfigures.
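The problem is easy to demonstrate in code. The sketch below (the key set and the FNV hash are arbitrary choices) assigns keys with hash(key) mod n and counts how many assignments change when the cluster grows from three servers to four:

    package main

    import (
        "fmt"
        "hash/fnv"
    )

    // serverFor implements the naive scheme: hash the key and
    // take the result modulo the number of servers.
    func serverFor(key string, numServers uint64) uint64 {
        h := fnv.New64a()
        h.Write([]byte(key))
        return h.Sum64() % numServers
    }

    func main() {
        keys := []string{"Alice", "Bob", "Chris", "Doug", "Elgar", "Fred"}

        moved := 0
        for _, k := range keys {
            before := serverFor(k, 3) // three servers
            after := serverFor(k, 4)  // one server added
            if before != after {
                moved++
            }
            fmt.Printf("%-6s server %d -> server %d\n", k, before, after)
        }
        fmt.Printf("%d of %d keys moved\n", moved, len(keys))
    }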

One scheme that overcomes this limitation is called consistent hashing, and was first described by Karger et al. at MIT in an academic paper from 1997. The basic idea behind the algorithm is to hash both the servers hosting the cache and the keys using the same hash function.

The reason to do this is to map the cache to an interval, which will contain a number of object hashes. If the cache is removed, its interval is taken over by a cache with an adjacent interval. All the other caches remain unchanged.

To understand how consistent hashing works, consider a circle with values on it ranging from 0 to 1; that is, any point on the circle has a value between 0 and 1. Next, we pick a hash function and scale its output to the [0, 1] range. For example, if the hash function has a range of [0, X], we can use the following:

    ringKey = hash(key) / X

Using this function, we can map both machines (instances) and objects (using their keys) onto the [0, 1] range.

If we have three machines, we use the modified hash function to map each machine to a point on the circle:

Now, we can see that the [0, 1] range has been split into intervals among the machines! To store a key-value pair in the hash table, we need to do two things:

  • Use the modified hash function to locate the key on the circle
  • Find the first machine that appears clockwise from that point and store the key there

This is demonstrated in the following diagram: KeyX maps to a point, and the closest machine in the clockwise direction is machine 3. Hence, KeyX is assigned to machine 3:

From a programming perspective, finding the closest machine clockwise is easily achieved by storing the hash values of the machines in a structure that makes it easy to find "the next higher number after y." One way is to keep the machine hash values in a sorted list. To find the assignment, just walk this list (or use binary search) to find the first machine with a hash value greater than, or equal to, the hash of the key. We can treat the list as circular so that, if no machine with a larger hash is found, the computation wraps around and the first server in the list is assigned.
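Here is a minimal sketch of this in Go (the Ring type, the method names, and the choice of FNV as the hash function are illustrative, not the API of any particular library). It keeps the machine hashes in a sorted slice and uses binary search for lookups; it also works directly in the hash function's integer range rather than normalizing to [0, 1], which is equivalent:

    package ring

    import (
        "hash/fnv"
        "sort"
    )

    // Ring is a consistent-hash ring: each node is hashed to a point,
    // and a key is assigned to the first node clockwise from its hash.
    type Ring struct {
        hashes []uint64          // sorted node hashes (points on the ring)
        nodes  map[uint64]string // hash -> node name
    }

    func New() *Ring {
        return &Ring{nodes: make(map[uint64]string)}
    }

    func hashOf(s string) uint64 {
        h := fnv.New64a()
        h.Write([]byte(s))
        return h.Sum64()
    }

    // Add places a node on the ring.
    func (r *Ring) Add(node string) {
        h := hashOf(node)
        r.nodes[h] = node
        r.hashes = append(r.hashes, h)
        sort.Slice(r.hashes, func(i, j int) bool { return r.hashes[i] < r.hashes[j] })
    }

    // Get returns the node responsible for key: the first node whose
    // hash is >= the key's hash, wrapping around to the start if needed.
    func (r *Ring) Get(key string) string {
        if len(r.hashes) == 0 {
            return ""
        }
        h := hashOf(key)
        i := sort.Search(len(r.hashes), func(j int) bool { return r.hashes[j] >= h })
        if i == len(r.hashes) { // wrap around the circle
            i = 0
        }
        return r.nodes[r.hashes[i]]
    }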

Now, let's say we add one more machine to the cluster:

As you can see, most of the assignments are not affected by this change, in contrast to the naive hashing approach, where nearly every assignment changes. The only reassignment happens between the new machine and the machine that was originally next in the clockwise direction.
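Assuming the Ring sketch above is packaged as a module (the import path below is hypothetical), the effect can be observed directly: record each key's owner, add a fourth machine, and print only the keys whose owner changed:

    package main

    import (
        "fmt"

        "example.com/ring" // hypothetical module path for the Ring sketch above
    )

    func main() {
        r := ring.New()
        for _, n := range []string{"machine-1", "machine-2", "machine-3"} {
            r.Add(n)
        }

        keys := []string{"Alice", "Bob", "Chris", "Doug", "Elgar", "Fred"}
        before := map[string]string{}
        for _, k := range keys {
            before[k] = r.Get(k)
        }

        r.Add("machine-4") // provision one more machine

        for _, k := range keys {
            if after := r.Get(k); after != before[k] {
                fmt.Printf("%s moved from %s to %s\n", k, before[k], after)
            }
        }
    }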

To smooth out irregularities in the distribution of the hash function, instead of one point on the ring, each machine is assigned a set of points (called vnodes). The following diagram depicts the scheme:

Courtesy of http://efcasado.github.io/riak-core_intro/
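One way to extend the earlier Ring sketch with vnodes is to insert several points per machine, each derived from the machine name plus a replica index. The method below is an illustrative addition to that sketch (it also needs "fmt" added to the sketch's imports); the replica count and the name#index naming convention are arbitrary choices:

    // AddWithVnodes places `replicas` points on the ring for one node,
    // which smooths out the interval sizes each machine owns.
    func (r *Ring) AddWithVnodes(node string, replicas int) {
        for i := 0; i < replicas; i++ {
            h := hashOf(fmt.Sprintf("%s#%d", node, i)) // one vnode per replica index
            r.nodes[h] = node
            r.hashes = append(r.hashes, h)
        }
        sort.Slice(r.hashes, func(i, j int) bool { return r.hashes[i] < r.hashes[j] })
    }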

There have been recent improvements in consistent hashing to make it load-aware. One such algorithm is Consistent Hashing with Bounded Loads: https://research.googleblog.com/2017/04/consistent-hashing-with-bounded-loads.html.

It uses a uniformity parameter (ε) that is greater than 1 and controls how much imbalance there can be in terms of load (the number of keys at each server). For example, if ε = 1.5, no server will get more than 150% of the average load. The key modification to consistent hashing is to simply move on to the next node on the ring if the closest node has already reached its load limit. With large ε values, the algorithm becomes equivalent to the original consistent hashing; as ε approaches 1, the distribution becomes more uniform, but more keys need to be rebalanced when the cluster changes (that is, fewer of consistent hashing's desirable properties are retained). You can find a Go implementation of this paper at https://github.com/lafikl/consistent.
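The following is a hedged sketch of that modification, again building on the earlier Ring type rather than reproducing the lafikl/consistent API: the caller supplies the current per-node load counts, and the lookup walks clockwise past any node that is already at ε times the average load:

    // GetBounded walks clockwise from the key's position and skips any
    // node whose load already exceeds eps * average load, where
    // loads[node] counts the keys currently assigned to that node.
    func (r *Ring) GetBounded(key string, loads map[string]int, totalKeys int, eps float64) string {
        if len(r.hashes) == 0 {
            return ""
        }
        // Count distinct machines (a node may own several vnodes).
        distinct := map[string]bool{}
        for _, name := range r.nodes {
            distinct[name] = true
        }
        avg := float64(totalKeys+1) / float64(len(distinct))
        limit := eps * avg

        h := hashOf(key)
        i := sort.Search(len(r.hashes), func(j int) bool { return r.hashes[j] >= h })
        // Try each node clockwise, at most once around the ring.
        for n := 0; n < len(r.hashes); n++ {
            idx := (i + n) % len(r.hashes)
            node := r.nodes[r.hashes[idx]]
            if float64(loads[node])+1 <= limit {
                return node
            }
        }
        // Every node is at its limit; fall back to the closest node.
        return r.nodes[r.hashes[i%len(r.hashes)]]
    }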

Sometimes, in P2P networks, the overlay can become hierarchical, with Superpeers:

These Superpeers are responsible for communication within their own cluster of peers; they also peer with other Superpeers to structure interactions. One example of this type of architecture is content-delivery networks (CDNs). Each edge server here is a peer.
