Chapter 9. Networking Foundations

When I started as an SRE, one of the topics that I found most daunting, and one that was incredibly hard to find good writing on, was networking. Networking is one of the most integral parts of our industry, though, because it is how services communicate. Without networking, we would not have distributed systems or the internet. I personally wouldn't be able to save copies of this book to Dropbox while flying between New York and San Francisco, nor would I be able to work regularly with my editors in the United Kingdom and India. However, wonderful as modern networking is, it also has to be resilient to the physical problems that constantly occur, such as cables getting cut, weather interfering with wireless signals, and sharks damaging undersea cables.

As we have highlighted throughout the book, communication between people is very important. We're moving toward a world of more distributed systems, which means that consistent and reliable communication between services is more and more important. As we'll discuss later, networks are very finicky. In this chapter, I will show you how fragile the internet is and how impressive it is that it works at all. The chapter will also discuss how to debug and understand issues that might arise between two services.

The other goal of this chapter (and Chapter 10, Linux and Cloud Foundations) is to help you to pass an SRE interview. Often, employers ask system debugging questions or want to know how fundamental parts of a system might work. If you're coming from a more traditional programming role, you might not know any of these things or only have rough ideas. I hope to flesh that out or at least provide you with a direction for future research.

The internet

The internet is a constantly evolving system. When I say internet, I mean the global network of cables, Internet Service Providers (ISPs), routers, switches, and computers that let all of us communicate. There are cables on poles, underground, underwater, hanging between buildings, inside buildings, and really anywhere you can shove a cable.

Figure 1: A map of all submarine cables from submarinecablemap.com in 2018

Some countries have one ISP, while others have hundreds. In some countries, the government owns the cables, while in others they are owned by private businesses. Many places do not have any cables and receive all of their internet via wireless from long-distance towers or from satellites.

Often, when people try to draw a graph to describe the internet, they draw something that looks like a complete graph, where every node connects directly to every other node. In reality, it's more like a city sewer system: lots of small branching pipes that connect to large pipes, which in turn connect to other large pipes. These pipes are of varying lengths, and at each intersection there is a computer deciding which pipe you should go down next.
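To make the sewer analogy a little more concrete, here is a minimal Python sketch of hop-by-hop forwarding. The node names and the `NEXT_HOP` table are made up for illustration; real routers build their tables with routing protocols and match on address prefixes, which this toy version ignores.

```python
# Hypothetical next-hop tables: each node only knows which neighbor to hand
# a message to for a given destination, not the full path.
NEXT_HOP = {
    "laptop": {"server": "home_router"},
    "home_router": {"server": "isp"},
    "isp": {"server": "backbone"},
    "backbone": {"server": "datacenter"},
    "datacenter": {"server": "server"},
}


def route(source, destination, max_hops=16):
    """Follow next-hop entries one node at a time and return the path taken."""
    path = [source]
    node = source
    while node != destination:
        if len(path) > max_hops:
            # Guard against loops in the (hypothetical) tables.
            raise RuntimeError("too many hops, giving up")
        table = NEXT_HOP.get(node, {})
        if destination not in table:
            raise LookupError(f"{node} does not know how to reach {destination}")
        node = table[destination]
        path.append(node)
    return path


if __name__ == "__main__":
    print(" -> ".join(route("laptop", "server")))
```

Notice that no single entry in the table describes the whole journey. The path only emerges as each node makes its own local decision.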

This is all fraught, because each node only knows how to send you on to the next node. Most nodes do not know the health of other nodes, or even whether the pipes you will go down are intact. To deal with this, most networking software adds some overhead to verify that messages are received in full. The starting node adds a way for the final node to verify that each message is fully intact and to tell whether any more messages follow it.
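As a rough illustration of that overhead, the following Python sketch splits a message into pieces and attaches a checksum, a sequence number, and a "more pieces follow" flag to each one. The frame layout and the function names `split_message` and `reassemble` are assumptions made for this example, not the format of any real protocol; real protocols such as TCP use their own header fields and also handle retransmission, which is left out here.

```python
import zlib


def split_message(data: bytes, chunk_size: int = 8):
    """Sender side: cut data into frames carrying integrity metadata."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    frames = []
    for index, chunk in enumerate(chunks):
        frames.append({
            "seq": index,                      # position of this piece
            "more": index < len(chunks) - 1,   # True if more pieces follow
            "checksum": zlib.crc32(chunk),     # lets the receiver detect damage
            "payload": chunk,
        })
    return frames


def reassemble(frames):
    """Receiver side: verify every frame, then rebuild the original message."""
    data = b""
    for expected_seq, frame in enumerate(frames):
        if frame["seq"] != expected_seq:
            raise ValueError("frame missing or out of order")
        if zlib.crc32(frame["payload"]) != frame["checksum"]:
            raise ValueError(f"frame {frame['seq']} is corrupted")
        data += frame["payload"]
    if not frames or frames[-1]["more"]:
        raise ValueError("message is truncated")
    return data


if __name__ == "__main__":
    frames = split_message(b"hello, distributed world")
    print(reassemble(frames))
```

The receiver never has to trust the nodes in between. It only needs the metadata the sender attached to decide whether what arrived is complete and undamaged.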

Figure 2: Here, we have a chaotic jumble of lines and dots. Each dot is connected to other dots via lines, and a path between two dots often has to pass through other dots. This image is purely random, but if you view the dots as computers and the lines as cables, you have a very rough representation of a network.

This chaotic mesh of nodes is constantly changing and evolving. Things are removed, added, and changed every second, and the scale is such that it is often quite surprising that anything works. There is no true master architect, just a lot of people saying, "Okay, we'll all do it this way," and then people mostly following the rules.

I hope this gives you a sense of the chaos existing in a network. The rest of the chapter will cover technologies used to deal with this chaos and how to understand what is going on when two computers talk.
