B
NAVIGATING PACKETS

In this appendix, we’ll examine ways that packets can be represented. We’ll look at fully interpreted and hexadecimal representations of packets, as well as how to read and reference packet values using a packet diagram.

Because you’ll find a wealth of software that can interpret packet data for you, you could perform packet sniffing and analysis without understanding the information contained in this appendix. But, if you take the time to learn about packet data and how it’s structured, you’ll be in a much better position to understand what tools like Wireshark are showing you. The less abstraction between you and the data you’re analyzing, the better.

Packet Representation

There are many ways a packet can be represented for interpretation. Raw packet data can be represented as binary, a combination of 1s and 0s in base 2, like this:

0110000001010011010111000000101011000001000000000001000000000000001000110000010
110101011011100000000000000000000000000000000000000000010000001000000000001011
0110100000000000001000000110000000000000000000000010000000100000000010000000010

Binary numbers represent digital information at the lowest level possible, with a 1 representing the presence of an electrical signal and a 0 representing the absence of a signal. Each digit is a bit, and eight bits is a byte. However, binary data is difficult for humans to read and interpret, so we usually convert binary data to hexadecimal, a combination of letters and numbers in base 16. The same packet in hexadecimal looks like this:

4500 0034 40f2 4000 8006 535c ac10 1080
4a7d 5f68 0646 0050 7c23 5ab7 0000 0000
8002 2000 0b30 0000 0204 05b4 0103 0302
0101 0402

Hexadecimal (also referred to as hex) is a numbering system that uses the numbers 0 through 9 and letters A through F to represent values. It is one of the most common ways that packets are represented because it’s concise and can easily be converted to the even more fundamental binary interpretation. In hex, two characters represent a byte, which contains eight bits. Each character within a byte is a nibble (4 bits), with the leftmost value being the higher-order nibble and the rightmost value being the lower-order nibble. Using the example packet, this means that the first byte is 45, the higher-order nibble is 4, and the lower-order nibble is 5.
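To make the byte and nibble relationship concrete, here is a minimal Python sketch that splits the first byte of the example packet into its two nibbles. The variable names are illustrative only.

```python
# First four bytes of the example packet, written as hex text.
raw = bytes.fromhex("45000034")

first = raw[0]                # the byte 0x45
high_nibble = first >> 4      # upper four bits
low_nibble = first & 0x0F     # lower four bits

# The same byte rendered as binary and as hex text.
as_binary = format(first, "08b")
as_hex = format(first, "02x")

print(high_nibble, low_nibble, as_binary, as_hex)
```

Shifting right by four bits discards the lower-order nibble, and masking with 0x0F discards the higher-order one.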

The position of bytes within a packet is represented using offset notation, starting from zero. Therefore, the first byte in the packet (45) is at position 0x00, the second byte (00) is at 0x01, the third byte (00) is at 0x02, and so on. The 0x prefix indicates that hex notation is being used. When referencing a position spanning more than one byte, the total number of bytes in the span is indicated numerically after a colon. For example, to reference the position of the first four bytes in the example packet (4500 0034), you would use 0x00:4. This explanation will be important when we use packet diagrams to dissect unknown protocols in “Navigating a Mystery Packet” on page 330.
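Offset notation maps directly onto slicing. The following sketch assumes the example packet has been loaded into a Python bytes object; the helper name field is hypothetical and exists only for illustration.

```python
PACKET_HEX = """
4500 0034 40f2 4000 8006 535c ac10 1080
4a7d 5f68 0646 0050 7c23 5ab7 0000 0000
8002 2000 0b30 0000 0204 05b4 0103 0302
0101 0402
"""
# Strip all whitespace before converting the hex text to bytes.
packet = bytes.fromhex("".join(PACKET_HEX.split()))


def field(data: bytes, offset: int, length: int = 1) -> bytes:
    """Return `length` bytes starting at the zero-based `offset`."""
    if offset < 0 or length < 1 or offset + length > len(data):
        raise ValueError("field lies outside the packet")
    return data[offset:offset + length]


print(field(packet, 0x00).hex())     # the byte at 0x00
print(field(packet, 0x00, 4).hex())  # the four bytes at 0x00:4
```

Because Python indexes from zero as well, the offsets in the text can be used unchanged.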

NOTE

The most common mistake I see people make when trying to dissect packets is forgetting to start counting from zero. This is very hard to get used to, since most people are taught to start counting from one. I’ve been slicing and dicing packets for years, and I still make this mistake. The best advice I can give here is don’t be afraid to count on your fingers. You might feel like it looks dumb, but there’s absolutely no shame in it, especially if it helps you arrive at the correct answer.

At a much higher level, a tool like Wireshark can represent a packet in a fully interpreted manner by using a protocol dissector, which we’ll discuss next. The same packet we just looked at is shown in Figure B-1, fully interpreted by Wireshark.

Figure B-1: A packet interpreted by Wireshark

Wireshark shows the information in a packet with labels that describe it. Packets don’t contain labels, but their data does map to a precise format specified by the protocol standard. Fully interpreting a packet means reading the data based on the protocol standard and dissecting it into labeled, human-friendly text.

Wireshark and similar tools are able to fully interpret packet data because they have protocol dissectors built into them that define the position, length, and values of each field within a protocol. For example, the packet in Figure B-1 is broken into sections based on the Transmission Control Protocol (TCP). Within TCP, there are labeled fields and values. Source Port is one label, and 1606 is its decimal value. This makes it easy to find the information you’re looking for when performing analysis. Whenever this option is available to you, it’s usually the most efficient way to get the job done.
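As a rough illustration of the idea, a dissector can be thought of as a table of field labels, positions, and lengths applied to raw bytes. The sketch below is a deliberately simplified assumption and does not reflect how Wireshark implements its dissectors; the table and function names are hypothetical.

```python
# Hypothetical field table: (label, offset in bytes, length in bytes).
TCP_FIELDS = [
    ("Source Port", 0, 2),
    ("Destination Port", 2, 2),
    ("Sequence Number", 4, 4),
    ("Acknowledgment Number", 8, 4),
]


def dissect(data: bytes, fields) -> dict:
    """Label each field by reading its bytes as a big-endian integer."""
    return {
        label: int.from_bytes(data[offset:offset + length], "big")
        for label, offset, length in fields
    }


# The first 12 bytes of the TCP header from the example packet.
tcp_header = bytes.fromhex("0646 0050 7c23 5ab7 0000 0000")
labeled = dissect(tcp_header, TCP_FIELDS)
print(labeled["Source Port"])
```

Applied to the example packet, the Source Port entry comes out as the same decimal value that Wireshark displays.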

Wireshark has thousands of dissectors, but you might encounter protocols that Wireshark doesn’t know how to interpret. This is often the case with vendor-specific protocols that aren’t widely used and custom malware protocols. When this happens, you’ll be left with only partially interpreted packets. This is why Wireshark provides the raw hexadecimal packet data at the bottom of the screen by default (see Figure B-1).

Command line programs like tcpdump, which show raw hex, typically have far fewer dissectors than Wireshark. This is especially true for more complex application-layer protocols, which are trickier to parse. Thus, encountering partially interpreted packets is the norm when using these tools. An example of tcpdump output is shown in Figure B-2.

When you are working with partially interpreted packets, you’ll have to rely on knowledge of packet structure at a more fundamental level. Wireshark, tcpdump, and most other tools enable this by showing the raw packet data in hex format.

Figure B-2: Partially interpreted packets from tcpdump

Using Packet Diagrams

As we learned in Chapter 1, a packet represents data that is formatted based on the rules of protocols. Because common protocols format packet data in a specific manner so that hardware and software can interpret this data, the packets must follow explicit formatting rules. We can identify this formatting and use it to interpret packet data by using packet diagrams. A packet diagram is a graphical representation of a packet that allows an analyst to map bytes within a packet to fields used by any given protocol. Derived from the protocol’s RFC specification document, it shows the fields present within the protocol, their length, and their order.

Let’s take another look at the example packet diagram for IPv4 we saw in Chapter 7 (provided here for your convenience as Figure B-3).

Figure B-3: A packet diagram for IPv4

In this diagram, the horizontal axis represents individual binary bits that are numbered from 0 to 31. The bits are grouped into 8-bit bytes that are numbered from 0 to 3. The vertical axis also is labeled according to bits and bytes, and each row is divided into 32-bit (or 4-byte) sections. We use the axes to count field positions using offset notation by first reading from the vertical axis to determine which 4-byte section the field resides in, and then counting off each byte in the section using the horizontal axis. The first row consists of the first four bytes, 0 through 3, which are labeled accordingly on the horizontal axis. The second row consists of the next four bytes, 4 through 7, which can also be counted off using the horizontal axis. Here we start with byte 4, which is byte 0 on the horizontal axis, then byte 5, which corresponds to byte 1 on the horizontal axis, and so on.

For example, we can determine that for IPv4, byte 0x01 is the Type of Service field, since we start at offset 0 and then count to byte 1. On the vertical axis, the first four bytes are in the first row, so we would then use the horizontal axis and start counting from 0 to byte 1. As another example, byte 0x08 is the Time to Live field. Using the vertical axis, we determine that byte 8 is in the third row down, which contains bytes 8 through 11. We then use the horizontal axis to count to byte 8 starting from 0. Since byte 8 is the first in the section, the horizontal axis column is just 0, which is the Time to Live field.
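The row-and-column counting described above amounts to integer division by four. A minimal sketch, assuming a diagram drawn four bytes (32 bits) wide and using a hypothetical helper name:

```python
def diagram_position(offset: int, bytes_per_row: int = 4) -> tuple:
    """Map a zero-based byte offset to (row, column) on a packet diagram.

    Rows and columns are both counted from zero, so row 2 is the
    third row down.
    """
    return divmod(offset, bytes_per_row)


print(diagram_position(1))  # Type of Service
print(diagram_position(8))  # Time to Live
```

Byte 8 lands in row 2, column 0, which matches the third row and first column found by hand.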

Some fields, such as the Source IP field, span multiple bytes, as we see at 0x0c:4 (bytes 12 through 15 in decimal). Other fields are divided into nibbles. An example is 0x00, which contains the Version field in the higher-order nibble and the IP Header Length in the lower-order nibble. Byte 0x06 contains even more granularity, with individual bits used to represent specific fields. When a field is a single binary value, it is often referred to as a flag. Examples are the Reserved, Don’t Fragment, and More Fragments fields in the IPv4 header. A flag can only have a binary value of 1 (true) or 0 (false), so the flag is “set” when the value is 1. The exact implication of a flag setting will vary based on protocol and field.
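The sketch below reads each kind of field from the IPv4 header of the example packet shown earlier in this appendix: a multibyte field, two nibbles, and three single-bit flags. It assumes the header starts at byte 0 of the data, and the variable names are illustrative.

```python
import ipaddress

# The 20-byte IPv4 header from the example packet.
ip_header = bytes.fromhex(
    "4500 0034 40f2 4000 8006 535c ac10 1080 4a7d 5f68"
)

# Byte 0 holds two nibble-sized fields.
version = ip_header[0] >> 4
header_len = (ip_header[0] & 0x0F) * 4   # stored in 4-byte words

# The top three bits of byte 6 are single-bit flags.
flags_byte = ip_header[6]
reserved = bool(flags_byte & 0x80)
dont_fragment = bool(flags_byte & 0x40)
more_fragments = bool(flags_byte & 0x20)

# Single-byte and multibyte fields.
ttl = ip_header[8]
source_ip = str(ipaddress.IPv4Address(ip_header[12:16]))

print(version, header_len, dont_fragment, more_fragments, ttl, source_ip)
```

Each flag is tested with a bit mask, so a result of True means the flag is set.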

Let’s look at another example in Figure B-4 (you may recognize this diagram from Chapter 8).

Figure B-4: A packet diagram for TCP

This image shows the TCP header. Looking at this image, we can answer a lot of questions about a TCP packet without knowing exactly what TCP does. Consider an example TCP packet header represented in hex here:

0646 0050 7c23 5ab7 0000 0000 8002 2000
0b30 0000 0204 05b4 0103 0302 0101 0402

Using the packet diagram, we can locate and interpret specific fields. For example, we can determine the following:

•     The Source Port number is at 0x00:2 and has a hex value of 0646 (Decimal: 1606).

•     The Destination Port number is at 0x02:2 and has a hex value of 0050 (Decimal: 80).

•     The header length is in the Data Offset field at the higher-order nibble of 0x0c (byte 12 in decimal) and has a hex value of 8. Because this field counts 4-byte words, the header is 32 bytes long.
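The same three lookups can be sketched in Python with the standard struct module. The format string "!HH" reads two unsigned 16-bit integers in network (big-endian) byte order; the variable names are illustrative.

```python
import struct

tcp_header = bytes.fromhex(
    "0646 0050 7c23 5ab7 0000 0000 8002 2000"
    "0b30 0000 0204 05b4 0103 0302 0101 0402"
)

# Source and Destination Port: two 16-bit values at offsets 0 and 2.
source_port, dest_port = struct.unpack_from("!HH", tcp_header, 0)

# Data Offset: higher-order nibble of byte 12, counted in 4-byte words.
data_offset_words = tcp_header[12] >> 4
header_len = data_offset_words * 4

print(source_port, dest_port, data_offset_words, header_len)
```

The computed header length equals the number of bytes in the hex listing, which confirms that this header carries TCP options beyond the 20-byte minimum.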

Let’s apply this knowledge by dissecting a mystery packet.

Navigating a Mystery Packet

In Figure B-2, I showed you a packet that was only partially interpreted. You can ascertain through the interpreted portion of the data that this is a TCP/IP packet transmitted between two devices on the same network, but other than that, you don’t know much about the data being transmitted. Here’s the complete hex output of the packet:

4500 0034 8bfd 4000 8006 1068 c0a8 6e83
c0a8 6e8a 081a 01f6 41d2 eac6 e115 3ace
5018 fcc6 0032 0000 00d1 0000 0006 0103
0001 0001

A quick count finds that there are 52 bytes in this packet. The packet diagram for IP tells us that the normal size of the IP header is 20 bytes, which is confirmed by the header length value in the lower-order nibble of 0x00: a value of 5, counted in 4-byte words. The diagram for the TCP header tells us that it is also 20 bytes if no additional options are present (there aren’t any here, but we discuss TCP options in more depth in Chapter 8). This means that the first 40 bytes of this output are related to the TCP and IP data that has already been interpreted. This leaves the remaining 12 bytes uninterpreted:

00d1 0000 0006 0103 0001 0001
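The byte counting above can be checked with a short sketch. It assumes the hex output begins at the IP header (no link-layer header precedes it), as the listing suggests; the variable names are illustrative.

```python
MYSTERY_HEX = """
4500 0034 8bfd 4000 8006 1068 c0a8 6e83
c0a8 6e8a 081a 01f6 41d2 eac6 e115 3ace
5018 fcc6 0032 0000 00d1 0000 0006 0103
0001 0001
"""
packet = bytes.fromhex("".join(MYSTERY_HEX.split()))

# IP header length: lower-order nibble of byte 0, in 4-byte words.
ip_header_len = (packet[0] & 0x0F) * 4

# TCP header length: higher-order nibble of byte 12 of the TCP header.
tcp_header_len = (packet[ip_header_len + 12] >> 4) * 4

# Everything after both headers is the uninterpreted payload.
payload = packet[ip_header_len + tcp_header_len:]

print(len(packet), ip_header_len, tcp_header_len, payload.hex())
```

Reading the lengths from the headers, rather than assuming 20 bytes each, keeps the split correct when options are present.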

Without knowledge of how to navigate packets, this might leave you stumped, but you now know how to apply a packet diagram to the uninterpreted bytes. In this case, the interpreted TCP data tells us that the destination port for this data is 502. Reviewing the ports used by traffic isn’t a foolproof method for identifying uninterpreted bytes, but it’s a good place to start. A quick Google search reveals that port 502 is most commonly used for Modbus over TCP, which is a protocol used in Industrial Control System (ICS) networks. We can validate this is the case and navigate this packet by comparing the hex output to the packet diagram for Modbus, shown in Figure B-5.

Figure B-5: Packet diagram for Modbus over TCP

This packet diagram was created based on the information in the Modbus implementation guide: http://www.modbus.org/docs/Modbus_Messaging_Implementation_Guide_V1_0b.pdf. This tells us that there should be a 7-byte header that includes the Length field at 0x04:2 (relative to the start of the header). Counting to that position, we arrive at a hex value of 0006 (or a decimal value of 6), indicating there should be 6 bytes following that field, which is exactly the case. It appears that this is indeed Modbus over TCP data.

By comparing the packet diagram to the entirety of the hex output, the following information is derived:

•     The Transaction Identifier is at 0x00:2 and has a hex value of 00d1. This field is used to pair a request with a response.

•     The Protocol Identifier is at 0x02:2 and has a hex value of 0000. This identifies the protocol as Modbus.

•     The Length is at 0x04:2 and has a hex value of 0006. This defines the number of bytes that follow this field.

•     The Unit Identifier is at 0x06 and has a hex value of 01. This is used for intrasystem routing.

•     The Function Code is at 0x07 and has a hex value of 03. This is the Read Holding Registers function, which reads a data value from a system.

•     Based on the function code value of 3, two more data fields are expected. The Reference Number and Word Count are found at 0x08:4, and each has a hex value of 0001.
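The field-by-field reading above can be sketched with the struct module. The layout follows the offsets listed in this section, and the final two fields are read only because the function code is 3; the variable names are illustrative, and this is not a general Modbus parser.

```python
import struct

payload = bytes.fromhex("00d1 0000 0006 0103 0001 0001")

# 7-byte header: three 16-bit fields followed by one 8-bit field.
transaction_id, protocol_id, length, unit_id = struct.unpack_from(
    "!HHHB", payload, 0
)
function_code = payload[7]

# For function code 3, two 16-bit fields follow at 0x08:4.
reference_number, word_count = struct.unpack_from("!HH", payload, 8)

# The Length field counts the bytes that come after it (offset 6 onward).
length_matches = length == len(payload) - 6

print(transaction_id, protocol_id, length, unit_id,
      function_code, reference_number, word_count, length_matches)
```

The length check is the same validation performed by hand earlier: a Length of 6 matches the six bytes that follow the field.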

The mystery packet can now be fully explained in the context of the Modbus protocol. If you were troubleshooting the system responsible for this packet, this information should be all you need to proceed onward. Even if you never encounter Modbus, this is an example of how you can approach an unknown protocol and uninterpreted packet using a packet diagram.

It’s always best practice to be aware of the abstraction between yourself and the data being analyzed. This helps you make sounder and more knowledgeable decisions and allows you to work with packets in a variety of situations. I’ve found myself in many scenarios in which I’ve only been able to use command line–based tools such as tcpdump to analyze packets. Because most of these tools lack dissection for many layer 7 protocols, the ability to manually dissect specific bytes in these packets has been crucial.

NOTE

A colleague once had to help perform incident response in a highly secure environment. He was cleared to review the data he needed to look at, but not to access the specific system the data was stored on. The only thing he could do in the amount of time he had was print out the packets from specific conversations. Thanks to his fundamental knowledge of how packets are built and of how to navigate them, he was able to find the information he needed in the printed data. Of course, the process was slower than cold molasses running down a frozen branch. This is an extreme scenario, but it’s a prime example of why universal tool-agnostic knowledge is important.

For all of these reasons, it’s helpful to spend time breaking apart packets in order to gain experience viewing multiple interpretations. I do this enough that I’ve printed out several common packet diagrams, had them laminated, and keep them beside my desk. I also maintain a digital version on my laptop and tablet for quick reference when traveling. For convenience, I’ve included several common packet diagrams in the ZIP file containing the packet captures that goes along with this book (https://www.nostarch.com/packetanalysis3/).

Final Thoughts

In this appendix, we learned how to interpret packet data in a variety of formats and how to use packet diagrams to navigate uninterpreted packet data. Given this fundamental knowledge, you should have no trouble understanding how to dissect packets regardless of the tool you are using to view packet data.
