15

Test-Driven Development for Networks

In the previous chapters, we were able to use Python to communicate with network devices, monitor and secure a network, automate processes, and extend an on-premises network to public cloud providers. We have come a long way from having to exclusively use a terminal window and manage the network with a CLI. When working together, the services we have built function like a well-oiled machine that gives us a beautiful, automated, programmable network. However, the network is never static and is constantly undergoing changes to meet the demands of the business. What happens when the services we build are not working optimally? Just as we did with monitoring and source control systems, we need to actively detect faults.

In this chapter, we are extending the active detection concept with test-driven development (TDD). We will cover the following topics:

  • An overview of test-driven development
  • Topology as code
  • Writing tests for networking
  • pytest integration with Jenkins
  • pyATS and Genie

We'll begin this chapter with an overview of TDD before diving into its applications within networks. We will look at examples of using Python with TDD and gradually move from specific tests to larger network-based tests.

Test-driven development overview

The idea of TDD has been around for a while. American software engineer Kent Beck, among others, is typically credited with leading the TDD movement, along with agile software development. Agile software development requires very short build-test-deploy development cycles; all of the software requirements are turned into test cases. These test cases are usually written before the code is written, and the software code is only accepted when the test passes.

The same idea can be drawn in parallel with network engineering. For example, when we face the challenge of designing a modern network, we can break the process down into the following steps, from high-level design requirements to the network tests that we can deploy:

  1. We start with the overall requirement for the new network. Why do we need to design a new network, or part of a new network? Maybe it is for new server hardware, a new storage network, or a new microservice software architecture.
  2. The new requirements are broken down into smaller, more specific requirements. This could be evaluating a new switch platform, testing a possibly more efficient routing protocol, or a new network topology (for example, fat-tree). Each of the smaller requirements can be broken down into the categories of required or optional.
  3. We draw out the test plan and evaluate it against the potential candidates for solutions.
  4. The test plan will work in reverse order; we will start by testing the features, then integrate the new feature into a bigger topology. Finally, we will try to run our test as close to a production environment as possible.

What I am trying to get at is, even without realizing it, we might already be adopting some of the TDD methodology in the normal network engineering process. This was part of my revelation when I was studying the TDD mindset. We are already implicitly following this best practice without formalizing the method.

By gradually moving parts of the network to code, we can use TDD for the network even more. If our network topology is described in a hierarchical format in XML or JSON, each of the components can be correctly mapped and expressed in the desired state, which some might call "the source of truth." This is the desired state that we can write test cases against to test production deviation from this state. For example, if our desired state calls for a full mesh of iBGP neighbors, we can always write a test case to check against our production devices for the number of iBGP neighbors it has.
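
For instance, here is what such a check might look like (a minimal sketch; the desired_state dictionary and the get_ibgp_neighbor_count() helper are hypothetical stand-ins for a real source of truth and a real device query):

#!/usr/bin/env python3
# Minimal sketch: compare the production iBGP neighbor count against the
# desired state. desired_state and get_ibgp_neighbor_count() are
# hypothetical placeholders; in practice we would query the device via an
# API such as NX-API or NETCONF.

desired_state = {'iosv-1': {'ibgp_neighbors': 3}}

def get_ibgp_neighbor_count(device_name):
    # Hardcoded so the sketch runs on its own; replace with a device query
    return 3

def test_ibgp_full_mesh():
    for device, state in desired_state.items():
        assert get_ibgp_neighbor_count(device) == state['ibgp_neighbors']

if __name__ == '__main__':
    test_ibgp_full_mesh()
    print('iBGP neighbor count matches desired state')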

The sequence of TDD is loosely based on the following six steps:

  1. Write a test with the result in mind
  2. Run all tests and see whether the new test fails
  3. Write the code
  4. Run the test again
  5. Make the necessary changes if the test fails
  6. Repeat

As with any process, how closely we follow the guideline is a judgment call. Personally, I prefer to treat these guidelines as goals and follow them somewhat loosely. For example, the TDD process calls for writing test cases before writing any code, or in our instance, before any components of the network are built. As a matter of personal preference, I always like to see a working version of the network or code before writing test cases. It gives me a higher level of confidence, so if anybody is judging my TDD process, I might just get a big fat "F." I also like to jump around between different levels of testing; sometimes I test a small portion of the network; other times I conduct a system-level end-to-end test, such as a ping or traceroute test.

The point is, I do not believe there is a one-size-fits-all approach when it comes to testing. It depends on personal preference and the scope of the project. This is true for most of the engineers I have worked with. It is a good idea to keep the framework in mind, so we have a working blueprint to follow, but you are the best judge of your style of problem-solving.

Before we delve further into TDD, let's cover some of the most common terminology in the following section so that we have a good conceptual grounding before getting into more details.

Test definitions

Let's look at some of the terms commonly used in TDD:

  • Unit test: Checks a small piece of code. This is a test that is run against a single function or class.
  • Integration test: Checks multiple components of a code base; multiple units are combined and tested as a group. This can be a test that checks against a Python module or multiple modules.
  • System test: Checks from end to end. This is a test that runs as close to what an end user would see as possible.
  • Functional test: Checks against a single function.
  • Test coverage: A measure of how completely our test cases cover the application code, typically determined by examining how much of the code is exercised when we run the test cases.
  • Test fixtures: A fixed state that forms a baseline for running our tests. The purpose of a test fixture is to ensure there is a well-known and fixed environment in which tests are run, so they are repeatable.
  • Setup and teardown: All the prerequisite steps are added in the setup and cleaned up in the teardown (see the example below).
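
As a quick illustration of a fixture with setup and teardown (a minimal sketch using Python's built-in unittest module, which we will look at shortly; the device dictionary is a stand-in for a real device connection):

#!/usr/bin/env python3
# Minimal sketch of setup and teardown with unittest. The device
# dictionary stands in for a real device connection.
import unittest

class TestDeviceVersion(unittest.TestCase):
    def setUp(self):
        # Prerequisite steps: build the well-known environment for each test
        self.device = {'name': 'iosv-1', 'os': '15.6(3)M2'}

    def tearDown(self):
        # Clean up whatever setUp created, so the tests stay repeatable
        self.device = None

    def test_os_version(self):
        self.assertEqual(self.device['os'], '15.6(3)M2')

if __name__ == '__main__':
    unittest.main()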

The terms might seem very software development-centric, and some might not be relevant to network engineering. Keep in mind that the terms are a way for us to communicate a concept or step. We will be using these terms in the rest of this chapter. As we use the terms more in the network engineering context, they might become clearer. With that covered, let's dive into treating network topology as code.

Topology as code

When we discuss topology as code, an engineer might jump up and declare: "The network is too complex, it is impossible to summarize it into code!" From personal experience, this has happened in some of the meetings I have been in. In the meeting, we would have a group of software engineers who want to treat infrastructure as code, but the traditional network engineers in the room would declare that it was impossible. Before you do the same and yell at me across the pages of this book, let's keep an open mind. Would it help if I tell you we have been using code to describe our topology in this book already?

If you take a look at any of the VIRL topology files that we have been using in this book, they are simply XML files that include a description of the relationship between nodes. For example, in this chapter, we will use the following topology for our lab:

Figure 1: The topology graph for our lab

If we open up the topology file, chapter15_topology.virl, with a text editor, we will see that the file is an XML file describing the node and the relationship between the nodes. At the top, or root, level is the <topology> node with child nodes of <node>. Each of the child nodes consists of various extensions and entries:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<topology xmlns="http://www.cisco.com/VIRL" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" schemaVersion="0.95" xsi:schemaLocation="http://www.cisco.com/VIRL https://raw.github.com/CiscoVIRL/schema/v0.95/virl.xsd">
<extensions>
<entry key="management_network" type="String">flat</entry>
</extensions>

Each child node carries attributes such as name, type, and location. We can also see the configuration of each node in the text value of the <entry key="config"> element:

<node name="iosv-1" type="SIMPLE" subtype="IOSv" location="182,162" ipv4="192.168.0.3">
<extensions>
<entry key="static_ip" type="String">172.16.1.20</entry>
<entry key="config" type="string">
! IOS Config generated on 2018-07-24 00:23
! by autonetkit_0.24.0
!
hostname iosv-1
boot-start-marker
boot-end-marker
!
...
</node>
<node name="nx-osv-1" type="SIMPLE" subtype="NX-OSv" location="281,161" ipv4="192.168.0.1">
    <extensions>
        <entry key="static_ip" type="String">172.16.1.21</entry>
        <entry key="config" type="string">! NX-OSv Config generated on 2018-07-24 00:23
! by autonetkit_0.24.0
!
version 6.2(1)
license grace-period
!
hostname nx-osv-1

Even when the node is a host, it is represented as an XML element in the same file:

...
<node name="host2" type="SIMPLE" subtype="server" location="347,66">
    <extensions>
         <entry key="static_ip" type="String">172.16.1.23</entry>
         <entry key="config" type="string">#cloud-config
bootcmd:
ln -s -t /etc/rc.d /etc/rc.local
hostname: host2
manage_etc_hosts: true
runcmd:
start ttyS0
systemctl start getty@ttyS0.service
systemctl start rc-local</entry>
    </extensions>
</node>
<annotations/>
<connection dst="/virl:topology/virl:node[1]/virl:interface[1]" src="/virl:topology/virl:node[3]/virl:interface[1]"/>
<connection dst="/virl:topology/virl:node[2]/virl:interface[1]" src="/virl:topology/virl:node[1]/virl:interface[2]"/>
<connection dst="/virl:topology/virl:node[4]/virl:interface[1]" src="/virl:topology/virl:node[2]/virl:interface[2]"/>
</topology>

By expressing the network as code, we can declare a source of truth for our network. We can write test code to compare the actual production value against this blueprint. We will use this topology file as the base and compare the production network value against it.

We can use Python to extract the element from this topology file and store it as a Python data type so we can work with it. In chapter15_1_xml.py, we will use ElementTree to parse the virl topology file and construct a dictionary consisting of the information of our devices:

#!/usr/bin/env python3

import xml.etree.ElementTree as ET
import pprint

with open('chapter15_topology.virl', 'rt') as f: 
    tree = ET.parse(f)

devices = {}

for node in tree.findall('./{http://www.cisco.com/VIRL}node'):
    name = node.attrib.get('name')
    devices[name] = {}
    for attr_name, attr_value in sorted(node.attrib.items()):
        devices[name][attr_name] = attr_value

# Custom attributes
devices['iosv-1']['os'] = '15.6(3)M2'
devices['nx-osv-1']['os'] = '7.3(0)D1(1)'
devices['host1']['os'] = '16.04'
devices['host2']['os'] = '16.04'

pprint.pprint(devices)

The result is a Python dictionary of the devices built from our topology file, including the custom items (such as os) that we added:

(venv) $ python chapter15_1_xml.py
{'host1': {'location': '117,58',
           'name': 'host1',
           'os': '16.04',
           'subtype': 'server',
           'type': 'SIMPLE'},
 'host2': {'location': '347,66',
           'name': 'host2',
           'os': '16.04',
           'subtype': 'server',
           'type': 'SIMPLE'},
 'iosv-1': {'ipv4': '192.168.0.3',
            'location': '182,162',
            'name': 'iosv-1',
            'os': '15.6(3)M2',
            'subtype': 'IOSv',
            'type': 'SIMPLE'},
 'nx-osv-1': {'ipv4': '192.168.0.1',
              'location': '281,161',
              'name': 'nx-osv-1',
              'os': '7.3(0)D1(1)',
              'subtype': 'NX-OSv',
             'type': 'SIMPLE'}}

If we wanted to compare this "source of truth" to the production device version, we can use our script from Chapter 3, APIs and Intent-Driven Networking, cisco_nxapi_2.py, to retrieve the production NX-OSv device's software version. We can then compare the value we received from our topology file with the production device's information. Later, we can use Python's built-in unittest module to write test cases.

We will discuss the unittest module in just a bit. Feel free to skip ahead and come back to this example if you'd like.

Here is the relevant unittest code in chapter15_2_validation.py:

import unittest
<skip>
# Unittest Test case
class TestNXOSVersion(unittest.TestCase):
    def test_version(self):
        self.assertEqual(nxos_version, devices['nx-osv-1']['os'])

if __name__ == '__main__':
    unittest.main()

When we run the validation test, we can see that the test passes because the software version in production matches what we expected:

(venv) $ python chapter15_2_validation.py
.
----------------------------------------------------------------------
Ran 1 test in 0.000s
OK

If we manually change the expected NX-OSv version value to introduce a failure case, we will see the following failed output:

(venv) $ python chapter15_3_test_fail.py
F
======================================================================
FAIL: test_version (__main__.TestNXOSVersion)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "chapter15_3_test_fail.py", line 50, in test_version
    self.assertEqual(nxos_version, devices['nx-osv-1']['os'])
AssertionError: '7.3(0)D1(1)' != '7.4(0)D1(1)'
- 7.3(0)D1(1)
?   ^
+ 7.4(0)D1(1)
?   ^

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (failures=1)

We can see that the test case result was returned as failed; the reason for failure was the version mismatch between the two values. As we saw in the last example, the Python unittest module is a great way to test our existing code based on our expected result. Let's take a deeper look at the module.

Python's unittest module

The Python standard library includes a module named unittest, which handles test cases where we can compare two values to determine whether a test passes or not. In the previous example, we saw how to use the assertEqual() method to compare two values to return either True or False. Here is an example, chapter15_4_unittest.py, that uses the built-in unittest module to compare two values:

#!/usr/bin/env python3

import unittest

class SimpleTest(unittest.TestCase):
    def test(self):
        one = 'a'
        two = 'a'
        self.assertEqual(one, two)

Using the python3 command line interface, the unittest module can automatically discover the test cases in the script:

(venv) $ python -m unittest chapter15_4_unittest.py
.
----------------------------------------------------------------------
Ran 1 test in 0.000s
OK

Besides comparing two values, here are more examples, in chapter15_5_more_unittest.py, of testing whether the expected value is True or False. We can also generate a custom failure message when a failure occurs:

#!/usr/bin/env python3
# Examples from https://pymotw.com/3/unittest/index.html#module-unittest

import unittest

class Output(unittest.TestCase):
    def testPass(self):
        return

    def testFail(self):
        self.assertFalse(True, 'this is a failed message')

    def testError(self):
        raise RuntimeError('Test error!')

    def testAssertTrue(self):
        self.assertTrue(True)

    def testAssertFalse(self):
        self.assertFalse(False)

We can use the -v option to display a more detailed output:

(venv) $ python -m unittest -v chapter15_5_more_unittest.py
testAssertFalse (chapter15_5_more_unittest.Output) ... ok
testAssertTrue (chapter15_5_more_unittest.Output) ... ok
testError (chapter15_5_more_unittest.Output) ... ERROR
testFail (chapter15_5_more_unittest.Output) ... FAIL
testPass (chapter15_5_more_unittest.Output) ... ok

======================================================================
ERROR: testError (chapter15_5_more_unittest.Output)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/echou/Mastering_Python_Networking_third_edition/Chapter15/chapter15_5_more_unittest.py", line 14, in testError
    raise RuntimeError('Test error!')
RuntimeError: Test error!

======================================================================
FAIL: testFail (chapter15_5_more_unittest.Output)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/echou/Mastering_Python_Networking_third_edition/Chapter15/chapter15_5_more_unittest.py", line 11, in testFail
    self.assertFalse(True, 'this is a failed message')
AssertionError: True is not false : this is a failed message

----------------------------------------------------------------------
Ran 5 tests in 0.001s

FAILED (failures=1, errors=1)

Starting from Python 3.3, the unittest module includes the mock object library by default (https://docs.python.org/3/library/unittest.mock.html). This is a very useful module that you can use to make a fake HTTP API call to a remote resource without actually making the call. For example, we have seen the example of using NX-API to retrieve the NX-OS version number. What if we want to run our test but we do not have an NX-OS device available? We can use the unittest mock object.

In chapter15_5_more_unittest_mocks.py, we created a class with a method to make HTTP API calls and expect a JSON response:

# Our class making API calls using requests
import requests

class MyClass:
    def fetch_json(self, url):
        response = requests.get(url)
        return response.json()

We also created a function that mocks two URL calls:

# This method will be used by the mock to replace requests.get
def mocked_requests_get(*args, **kwargs):
    class MockResponse:
        def __init__(self, json_data, status_code):
            self.json_data = json_data
            self.status_code = status_code

        def json(self):
            return self.json_data

    if args[0] == 'http://url-1.com/test.json':
        return MockResponse({"key1": "value1"}, 200)
    elif args[0] == 'http://url-2.com/test.json':
        return MockResponse({"key2": "value2"}, 200)

    return MockResponse(None, 404)

Finally, we make the API call to the two URLs in our test case. However, we are using the mock.patch decorator to intercept the API calls:

import unittest
from unittest import mock

# Our test case class
class MyClassTestCase(unittest.TestCase):
    # We patch 'requests.get' with our own method. The mock object is passed in to our test case method.
    @mock.patch('requests.get', side_effect=mocked_requests_get)
    def test_fetch(self, mock_get):
        # Assert requests.get calls
        my_class = MyClass()
        # call to url-1
        json_data = my_class.fetch_json('http://url-1.com/test.json')
        self.assertEqual(json_data, {"key1": "value1"})
        # call to url-2
        json_data = my_class.fetch_json('http://url-2.com/test.json')
        self.assertEqual(json_data, {"key2": "value2"})
        # call to url-3 that we did not mock
        json_data = my_class.fetch_json('http://url-3.com/test.json')
        self.assertIsNone(json_data)

if __name__ == '__main__':
    unittest.main()

When we run the test, we will see that the test passes without needing to make an actual API call to the remote endpoint. Neat, huh?

(venv) $ python chapter15_5_more_unittest_mocks.py
.
----------------------------------------------------------------------
Ran 1 test in 0.000s
OK

For more information on the unittest module, Doug Hellmann's Python module of the week (https://pymotw.com/3/unittest/index.html#module-unittest) is an excellent source of short and precise examples on the unittest module. As always, the Python documentation is a good source of information as well: https://docs.python.org/3/library/unittest.html.

More on Python testing

In addition to the built-in unittest library, there are lots of other testing frameworks from the Python community. pytest is one of the most robust, intuitive Python testing frameworks and is worth a look. pytest can be used for all types and levels of software testing. It can be used by developers, QA engineers, individuals practicing TDD, and open source projects.

Many large-scale open source projects have switched from unittest or nose (another Python test framework) to pytest, including Mozilla and Dropbox. The attractive features of pytest include the third-party plugin model, a simple fixture model, and assert rewriting.
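
To give a taste of the fixture model (a minimal sketch; the device dictionary is a stand-in for a real device connection object):

#!/usr/bin/env python3
# Minimal sketch of a pytest fixture. pytest injects the fixture into any
# test that names it as an argument; the device dictionary is a stand-in
# for a real connection object.
import pytest

@pytest.fixture
def device():
    # Setup: runs before each test that requests this fixture
    dev = {'hostname': 'iosv-1', 'os': '15.6(3)M2'}
    yield dev
    # Teardown: runs after the test finishes
    dev.clear()

def test_device_os(device):
    assert device['os'] == '15.6(3)M2'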

If you want to learn more about the pytest framework, I highly recommend Python Testing with pytest by Brian Okken (ISBN 978-1-68050-240-4). Another great source is the pytest documentation: https://docs.pytest.org/en/latest/.

pytest is command line-driven; it automatically discovers the tests we have written by looking for files and functions whose names begin with the test prefix, and runs them. We will need to install pytest before we can use it:

(venv) $ pip install pytest
(venv) $ python
Python 3.6.8 (default, Oct  7 2019, 12:59:55)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pytest
>>> pytest.__version__
'5.2.2'

Let's look at some examples using pytest.

pytest examples

The first pytest example, chapter15_6_pytest_1.py, will be a simple assert for two values:

#!/usr/bin/env python3

def test_passing():
    assert(1, 2, 3) == (1, 2, 3)

def test_failing():
    assert(1, 2, 3) == (3, 2, 1)

When we run pytest with the -v option, pytest will give us a pretty robust answer for the reason for the failure. The verbose output is one of the reasons people like pytest:

(venv) $ pytest -v chapter15_6_pytest_1.py
=================================== test session starts ===================================
platform linux -- Python 3.6.8, pytest-5.2.2, py-1.8.0, pluggy-0.13.0 -- /home/echou/venv/bin/python3
cachedir: .pytest_cache
rootdir: /home/echou/Mastering_Python_Networking_third_edition/Chapter15
collected 2 items

chapter15_6_pytest_1.py::test_passing PASSED                                        [ 50%]

chapter15_6_pytest_1.py::test_failing FAILED                                        [100%]

======================================== FAILURES =========================================

______________________________________ test_failing _______________________________________
    def test_failing():
>       assert(1, 2, 3) == (3, 2, 1)
E       assert (1, 2, 3) == (3, 2, 1)
E         At index 0 diff: 1 != 3
E         Full diff:
E         - (1, 2, 3)
E         ?  ^     ^
E         + (3, 2, 1)
E         ?  ^     ^
chapter15_6_pytest_1.py:7: AssertionError
=============================== 1 failed, 1 passed in 0.03s ===============================

In the second pytest example, chapter15_7_pytest_2.py, we will create a router object. The router object will be initialized with some attributes set to None and others set to default values. We will use pytest to test one instance with the defaults and one instance without:

#!/usr/bin/env python3

class router(object):
    def __init__(self, hostname=None, os=None, device_type='cisco_ios'):
        self.hostname = hostname
        self.os = os
        self.device_type = device_type
        self.interfaces = 24

def test_defaults():
    r1 = router()
    assert r1.hostname == None
    assert r1.os == None
    assert r1.device_type == 'cisco_ios'
    assert r1.interfaces == 24

def test_non_defaults():
    r2 = router(hostname='lax-r2', os='nxos', device_type='cisco_nxos')
    assert r2.hostname == 'lax-r2'
    assert r2.os == 'nxos'
    assert r2.device_type == 'cisco_nxos'
    assert r2.interfaces == 24

When we run the test, we will see whether the instances were created with the correct default and non-default values:

(venv) $ pytest chapter15_7_pytest_2.py
=================================== test session starts ===================================
platform linux -- Python 3.6.8, pytest-5.2.2, py-1.8.0, pluggy-0.13.0
rootdir: /home/echou/Mastering_Python_Networking_third_edition/Chapter15
collected 2 items
chapter15_7_pytest_2.py ..                                                          [100%]
==================================== 2 passed in 0.01s ====================================

If we were to replace the previous unittest example with pytest, in chapter15_8_pytest_3.py, we can see the syntax with pytest is simpler:

# pytest test case
def test_version():
    assert devices['nx-osv-1']['os'] ==  nxos_version

Then we run the test with the pytest command line:

(venv) $ pytest chapter15_8_pytest_3.py
=================================== test session starts ===================================
platform linux -- Python 3.6.8, pytest-5.2.2, py-1.8.0, pluggy-0.13.0
rootdir: /home/echou/Mastering_Python_Networking_third_edition/Chapter15
collected 1 item

chapter15_8_pytest_3.py .                                                           [100%]

==================================== 1 passed in 0.09s ====================================

Between unittest and pytest, I find pytest more intuitive to use. However, since unittest is included in the standard library, many teams might have a preference for using the unittest module for their testing.

Besides doing tests on code, we can also write tests to test our network as a whole. After all, users care more about their services and applications functioning properly and less about individual pieces. We will take a look at writing tests for the network in the next section.

Writing tests for networking

So far, we have been mostly writing tests for our Python code. We have used both the unittest and pytest libraries to assert True/False and equal/non-equal values. We were also able to write mocks to intercept our API calls when we do not have an actual API-capable device but still want to run our tests.

A few years ago, Matt Oswalt announced the Testing On Demand: Distributed (ToDD) validation tool for network changes. It is an open source framework aimed at testing network connectivity and distributed capacity. You can find more information about the project on its GitHub page: https://github.com/toddproject/todd. Oswalt also talked about the project on Packet Pushers Priority Queue episode 81, Network Testing with ToDD: https://packetpushers.net/podcast/podcasts/pqshow-81-network-testing-todd/.

In this section, let's look at how we can write tests that are relevant to the networking world. There is no shortage of commercial products when it comes to network monitoring and testing. Over the years, I have come across many of them. However, in this section, I prefer to use simple, open source tools for my tests.

Testing for reachability

Often, the first step of troubleshooting is to conduct a small reachability test. For network engineers, ping is our best friend when it comes to network reachability tests. It is a way to test the reachability of a host on an IP network by sending a small packet across the network to the destination.

We can automate the ping test via the os module or the subprocess module:

>>> import os
>>> host_list = ['www.cisco.com', 'www.google.com']
>>> for host in host_list:
...     os.system('ping -c 1 ' + host)
...
PING www.cisco.com(2001:559:19:289b::b33 (2001:559:19:289b::b33)) 56 data bytes
64 bytes from 2001:559:19:289b::b33 (2001:559:19:289b::b33): icmp_seq=1 ttl=60 time=11.3 ms

--- www.cisco.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 11.399/11.399/11.399/0.000 ms
0
PING www.google.com(sea15s11-in-x04.1e100.net (2607:f8b0:400a:808::2004)) 56 data bytes
64 bytes from sea15s11-in-x04.1e100.net (2607:f8b0:400a:808::2004): icmp_seq=1 ttl=54 time=10.8 ms

--- www.google.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 10.858/10.858/10.858/0.000 ms
0

The subprocess module offers the additional benefit of capturing the output:

>>> import subprocess
>>> for host in host_list:
...     print('host: ' + host)
...     p = subprocess.Popen(['ping', '-c', '1', host], stdout=subprocess.PIPE)
...
host: www.cisco.com
host: www.google.com
>>> print(p.communicate())
(b'PING www.google.com(sea15s11-in-x04.1e100.net (2607:f8b0:400a:808::2004)) 56 data bytes
64 bytes from sea15s11-in-x04.1e100.net (2607:f8b0:400a:808::2004): icmp_seq=1 ttl=54 time=16.9 ms

--- www.google.com ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 16.913/16.913/16.913/0.000 ms
', None)
>>>

These two modules prove to be very useful in many situations. Any command we can execute in a Linux or Unix environment can be executed via the os or subprocess module.
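
For example, we can fold the reachability check into a test function (a minimal sketch; on Linux, ping exits with a return code of 0 when it receives at least one reply):

#!/usr/bin/env python3
# Minimal sketch: turn the ping check into a pytest-style reachability
# test. On Linux, ping returns 0 when at least one reply was received.
import subprocess

def is_reachable(host):
    result = subprocess.run(['ping', '-c', '1', host],
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0

def test_reachability():
    for host in ['www.cisco.com', 'www.google.com']:
        assert is_reachable(host), host + ' is unreachable'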

Testing for network latency

The topic of network latency can sometimes be subjective. Working as a network engineer, we are often faced with the user saying that the network is slow. However, "slow" is a very subjective term.

If we could construct tests that turn subjective terms into objective values, it would be very helpful. We should do this consistently so that we can compare the values over a time series of data.

This can sometimes be difficult to do since the network is stateless by design; just because one packet succeeds does not guarantee success for the next packet. The best approach I have seen over the years is to ping across many hosts frequently, log the data, and build a ping-mesh graph. We can leverage the same tools we used in the previous example, capture the return-trip time, and keep a record. We do this in chapter15_10_ping.py:

#!/usr/bin/env python3

import subprocess

host_list = ['www.cisco.com', 'www.google.com']

ping_time = []

for host in host_list:
    p = subprocess.Popen(['ping', '-c', '1', host], stdout=subprocess.PIPE)
    result = p.communicate()[0]
    host = result.split()[1]
    time = result.split()[13]
    ping_time.append((host, time))
print(ping_time)

In this case, the result is kept in a tuple and put into a list:

(venv) $ python chapter15_10_ping.py
[(b'www.cisco.com(2001:559:19:289b::b33', b'time=16.0'), (b'www.google.com(sea15s11-in-x04.1e100.net', b'time=11.4')]

This is by no means perfect and is merely a starting point for monitoring and troubleshooting. However, in the absence of other tools, this offers some baseline of objective values.
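
Once we have objective values, we can also turn them into a pass/fail latency test (a minimal sketch; the 50 ms threshold is an arbitrary example value, and the regular expression assumes Linux ping output):

#!/usr/bin/env python3
# Minimal sketch: parse the round-trip time from the ping output and
# assert that it stays below a threshold. The 50 ms threshold is an
# arbitrary example; the regex assumes Linux ping output.
import re
import subprocess

def ping_time_ms(host):
    p = subprocess.Popen(['ping', '-c', '1', host], stdout=subprocess.PIPE)
    output = p.communicate()[0].decode()
    match = re.search(r'time=([\d.]+) ms', output)
    return float(match.group(1)) if match else None

def test_latency_under_threshold():
    for host in ['www.cisco.com', 'www.google.com']:
        rtt = ping_time_ms(host)
        assert rtt is not None and rtt < 50.0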

Testing for security

We saw one of the best tools for security testing in Chapter 6, Network Security with Python, which was Scapy. There are lots of open source tools for security, but none offers the flexibility that comes with constructing our packets.

Another great tool for network security testing is hping3 (http://www.hping.org/). It offers a simple way to generate a lot of packets at once. For example, you can use the following one-liner to generate a TCP SYN flood:

# DON'T DO THIS IN PRODUCTION #
echou@ubuntu:/var/log$ sudo hping3 -S -p 80 --flood 192.168.1.202
HPING 192.168.1.202 (eth0 192.168.1.202): S set, 40 headers + 0 data bytes hping in flood mode, no replies will be shown
^C
--- 192.168.1.202 hping statistic ---
2281304 packets transmitted, 0 packets received, 100% packet loss round-trip min/avg/max = 0.0/0.0/0.0 ms
echou@ubuntu:/var/log$

Again, since this is a command line tool, we can use the subprocess module to automate any hping3 tests that we want.
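
For example (a minimal sketch that sends a small, fixed number of SYN packets with -c instead of flooding; hping3 needs root privileges, and 192.168.1.202 is a lab placeholder address):

#!/usr/bin/env python3
# Minimal sketch: drive hping3 from Python. We send a fixed count of SYN
# packets (-c 3) rather than flooding; run this only against lab devices
# you own. Requires root privileges; the target is a lab placeholder.
import subprocess

target = '192.168.1.202'
p = subprocess.Popen(
    ['sudo', 'hping3', '-S', '-p', '80', '-c', '3', target],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = p.communicate()
# hping3 splits its output between stdout and stderr, so capture both
print(stdout.decode())
print(stderr.decode())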

Testing for transactions

The network is a crucial part of the infrastructure, but it is only a part of it. What the users care about is often the service that runs on top of the network. If the user is trying to watch a YouTube video or listen to a podcast but cannot, in their opinion, the service is broken. We might know that the network transport is not at fault, but that doesn't comfort the user.

For this reason, we should implement tests that are as similar to the user's experience as possible. In the example of a YouTube video, we might not be able to duplicate the YouTube experience 100% (unless you are part of Google), but we can implement a layer-seven service as close to the network edge as possible. We can then simulate the transaction from a client at a regular interval as a transactional test.

Python's http.server module from the standard library is one that I often use when I need to quickly test layer-seven reachability on a web service. We already saw how to use it when we were performing network monitoring in Chapter 5, The Python Automation Framework – Beyond Basics, but it's worth seeing again:

# Python 3
(venv) $ python3 -m http.server 8080
Serving HTTP on 0.0.0.0 port 8080 ...
127.0.0.1 - - [25/Jul/2018 10:15:23] "GET / HTTP/1.1" 200 -

If we can simulate a full transaction for the expected service, that is even better. But Python's simple HTTP server module in the standard library is always a great one for running some ad hoc web service tests.
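
On the client side, we can check the transaction with the standard library as well (a minimal sketch using urllib; the URL assumes the http.server instance above is running locally on port 8080):

#!/usr/bin/env python3
# Minimal sketch: a layer-seven transactional check. The URL assumes the
# http.server instance above is running locally on port 8080.
import urllib.request

def test_http_transaction():
    with urllib.request.urlopen('http://localhost:8080/', timeout=5) as resp:
        assert resp.status == 200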

Testing for network configuration

In my opinion, the best test for network configuration is using standardized templates to generate the configuration and back up the production configuration often. We have seen how we can use the Jinja2 template to standardize our configuration per device type or role. This will eliminate many of the mistakes caused by human error, such as copy and paste.
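
As a quick reminder of what that looks like (a minimal sketch; the template string and variable values are illustrative):

#!/usr/bin/env python3
# Minimal sketch: render a device configuration from a Jinja2 template.
# The template string and variable values are illustrative.
from jinja2 import Template

template = Template(
    'hostname {{ hostname }}\n'
    'interface Loopback0\n'
    ' ip address {{ loopback_ip }} 255.255.255.255\n'
)

print(template.render(hostname='iosv-1', loopback_ip='192.168.0.3'))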

Once the configuration is generated, we can write tests against the configuration for known characteristics that we would expect before we push the configuration to production devices. For example, there should be no overlap of IP addresses in all of the network when it comes to loopback IP, so we can write a test to see whether the new configuration contains a loopback IP that is unique across our devices.
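
Here is what such a check might look like (a minimal sketch; the config strings are illustrative, and the regular expression assumes IOS-style loopback stanzas):

#!/usr/bin/env python3
# Minimal sketch: assert that generated configurations do not reuse a
# loopback IP. The config snippets are illustrative; in practice they
# would come from our Jinja2-rendered files.
import re

configs = {
    'iosv-1': 'interface Loopback0\n ip address 192.168.0.3 255.255.255.255\n',
    'nx-osv-1': 'interface loopback0\n ip address 192.168.0.1/32\n',
}

def loopback_ips(config):
    # Grab the address after each loopback interface stanza (IOS-style)
    return re.findall(r'[Ll]oopback\d+\n\s*ip address (\S+)', config)

def test_unique_loopbacks():
    all_ips = []
    for config in configs.values():
        all_ips.extend(loopback_ips(config))
    assert len(all_ips) == len(set(all_ips)), 'duplicate loopback IP found'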

Testing for Ansible

In all the time I have been using Ansible, I cannot recall using a unittest-like tool to test a playbook. For the most part, playbooks use modules that were already tested by the module developers.

If you want a lightweight data validation tool, please check out Cerberus (https://docs.python-cerberus.org/en/stable/).

Ansible provides unit tests for their library of modules. Unit tests in Ansible are currently the only way to drive tests from Python within Ansible's continuous-integration process. The unit tests that are run today can be found under /test/units (https://github.com/ansible/ansible/tree/devel/test/units).

The Ansible testing strategy is documented in the Ansible Developer Guide: https://docs.ansible.com/ansible/latest/dev_guide/testing.html.

One of the interesting Ansible testing frameworks is Molecule (https://pypi.org/project/molecule/2.16.0/). It intends to aid in the development and testing of Ansible roles. Molecule provides support for testing with multiple instances, operating systems, and distributions. I have not used this tool, but it is where I would start if I wanted to perform more testing on my Ansible roles.

We should now know how to write tests for our network, whether testing for reachability, latency, security, transaction, or network configuration. Can we integrate testing with a source control tool such as Jenkins? The answer is yes. We will take a look at how to do so in the next section.

pytest integration with Jenkins

Continuous Integration (CI) systems, such as Jenkins, are frequently used to launch tests after each of the code commits. This is one of the major benefits of using a CI system.

Imagine that there is an invisible engineer who is always watching for any changes in the network; upon detecting a change, the engineer will faithfully test a bunch of functions to make sure that nothing breaks. Who wouldn't want that?

Let's look at an example of integrating pytest into Jenkins tasks.

Jenkins integration

Before we can insert the test cases into our CI, let's install some of the plugins that can help us visualize the operation. The two plugins we will install are build-name-setter and Test Results Analyzer:

Figure 2: Jenkins plugin installation

The test we will run will reach out to the NX-OS device and retrieve the operating system version number. This will ensure that we have API reachability to the Nexus device. The full script content can be read in chapter15_9_pytest_4.py. The relevant pytest portion and result are as follows:

def test_transaction():
    assert nxos_version != False

(venv) $ pytest chapter15_9_pytest_4.py
=================================== test session starts ===================================
platform linux -- Python 3.6.8, pytest-5.2.2, py-1.8.0, pluggy-0.13.0
rootdir: /home/echou/Mastering_Python_Networking_third_edition/Chapter15
collected 1 item

chapter15_9_pytest_4.py .                                                           [100%]

==================================== 1 passed in 0.10s ====================================

We will use the --junit-xml=result.xml option to produce the file Jenkins needs:

(venv) $ pytest --junit-xml=result.xml chapter15_9_pytest_4.py

=================================== test session starts ===================================
platform linux -- Python 3.6.8, pytest-5.2.2, py-1.8.0, pluggy-0.13.0
rootdir: /home/echou/Mastering_Python_Networking_third_edition/Chapter15
collected 1 item

chapter15_9_pytest_4.py .                                                           [100%]
- generated xml file: /home/echou/Mastering_Python_Networking_third_edition/Chapter15/result.xml -

==================================== 1 passed in 0.10s ====================================

The next step is to check this script into the GitHub repository. I prefer to put the test in its own directory. Therefore, I created a /tests directory and put the test file there:

Figure 3: Project repository

We will create a new project named chapter15_example1:

Figure 4: Name your project in Jenkins

We can copy over the previous task, so we do not need to repeat all the steps:

Figure 5: Use the copy from function in Jenkins

In the execute shell section, we will add the pytest step:

Figure 6: The execute shell

We will add a post-build step of Publish JUnit test result report:

Figure 7: Post-build step

We will specify the result.xml file as the JUnit result file:

Figure 8: Test report XML location

After we run the build a few times, we will be able to see the Test Results Analyzer graph:

Figure 9: Test Results Analyzer graph in Jenkins

The test result can also be seen on the project home page. Let's introduce a test failure by shutting down the management interface of the Nexus device. If there is a test failure, we will be able to see it right away on the Test Result Trend graph on the project dashboard:

Figure 10: Test Result Trend graph in Jenkins

This is a simple but complete example. We can use the same pattern to build other integrated tests in Jenkins.

In the next section, we will take a look at an extensive testing framework developed by Cisco (and recently released as open source) called pyATS. Much to their credit, releasing such an extensive framework as open source for the benefit of the community was a great gesture by Cisco.

pyATS and Genie

pyATS (https://developer.cisco.com/pyats/) is an open source, end-to-end testing ecosystem originally developed by Cisco and made available to the public in late 2017. The pyATS library was formerly known as Genie; the two names are often used in the same context. Because of its roots, the framework is very focused on network testing.

pyATS and the pyATS library (also known as Genie) won the 2018 Cisco Pioneer Award. We should all applaud Cisco for making the framework open source and available to the public. Good job, Cisco DevNet!

The framework is available on PyPI:

(venv) echou@network-dev-2:~$ pip install pyats

To get started, we can take a look at some of the example scripts in the GitHub repository, https://github.com/CiscoDevNet/pyats-sample-scripts. The tests start with creating a testbed file in YAML format. We will create a simple chapter15_pyats_testbed_1.yml testbed file for our iosv-1 device. The file should look similar to the Ansible inventory files that we have seen before:

testbed:
    name: Chapter_15_pyATS
    tacacs:
      username: cisco
    passwords:
      tacacs: cisco
      enable: cisco

devices:
   iosv-1:
       alias: iosv-1
       type: ios
       connections:
         defaults:
           class: unicon.Unicon
         management:
           ip: 172.16.1.20
           protocol: ssh

topology:
    iosv-1:
        interfaces:
            GigabitEthernet0/2:
                ipv4: 10.0.0.5/30
                link: link-1
                type: ethernet
            Loopback0:
                ipv4: 192.168.0.3/32
                link: iosv-1_Loopback0
                type: loopback

In our first script, chapter15_11_pyats_1.py, we will load the testbed file, connect to the device, issue a show version command, then disconnect from the device:

from pyats.topology import loader

testbed = loader.load('chapter15_pyats_testbed_1.yml')

testbed.devices
ios_1 = testbed.devices['iosv-1']

ios_1.connect()

print(ios_1.execute('show version'))

ios_1.disconnect()

When we execute the command, we can see the output is a mixture of the pyATS setup as well as the actual output of the device. This is similar to the Paramiko scripts we have seen before, but note that pyATS took care of the underlying connection for us:

(venv) $ python chapter15_11_pyats_1.py
[2019-11-10 08:11:55,901] +++ iosv-1 logfile /tmp/iosv-1-default-20191110T081155900.log +++
[2019-11-10 08:11:55,901] +++ Unicon plugin generic +++
<skip>
[2019-11-10 08:11:56,249] +++ connection to spawn: ssh -l cisco 172.16.1.20, id: 140357742103464 +++
[2019-11-10 08:11:56,250] connection to iosv-1
[2019-11-10 08:11:56,314] +++ initializing handle +++
[2019-11-10 08:11:56,315] +++ iosv-1: executing command 'term length 0' +++
term length 0
iosv-1#
[2019-11-10 08:11:56,354] +++ iosv-1: executing command 'term width 0' +++
term width 0
iosv-1#
[2019-11-10 08:11:56,386] +++ iosv-1: executing command 'show version' +++
show version
<skip>

In the second example, we will see a full example of connection setup, test cases, then connection teardown. First, we will add the nxosv-1 device to our testbed in chapter15_pyats_testbed_2.yml. The additional device is needed as the connected device to iosv-1 for our ping test:

    nxosv-1:
        alias: nxosv-1
        type: ios
        connections:
          defaults:
            class: unicon.Unicon
          vty:
            ip: 172.16.1.21
            protocol: ssh

In chapter15_12_pyats_2.py, we will use the aetest module from pyATS with various decorators. Besides setup and cleanup, the ping test is in the PingTestcase class:

@aetest.loop(device = ('ios1',))
class PingTestcase(aetest.Testcase):

    @aetest.test.loop(destination = ('10.0.0.5', '10.0.0.6'))
    def ping(self, device, destination):
        try:
            result = self.parameters[device].ping(destination)
        except Exception as e:
            # Mark the test case as failed if the ping raises an exception
            self.failed('Ping {} failed: {}'.format(destination, str(e)))

It is best practice to reference the testbed file at the command line during runtime:

(venv) $ python chapter15_12_pyats_2.py --testbed chapter15_pyats_testbed_2.yml

The output is similar to our first example, with the additions of STEPS Report and Detailed Results with each test case. The output also indicates the log filename that is written to the /tmp directory:

2019-11-10T08:23:08: %AETEST-INFO: Starting common setup
<skip>
2019-11-10T08:23:22: %AETEST-INFO: +----------------------------------------------------------+
2019-11-10T08:23:22: %AETEST-INFO: |                       STEPS Report                       |
2019-11-10T08:23:22: %AETEST-INFO: +----------------------------------------------------------+
<skip>
2019-11-10T08:23:22: %AETEST-INFO: +------------------------------------------------------------------------------+
2019-11-10T08:23:22: %AETEST-INFO: |                               Detailed Results                               |
2019-11-10T08:23:22: %AETEST-INFO: +------------------------------------------------------------------------------+
2019-11-10T08:23:22: %AETEST-INFO:  SECTIONS/TESTCASES  RESULT
2019-11-10T08:23:22: %AETEST-INFO: +------------------------------------------------------------------------------+
2019-11-10T08:23:22: %AETEST-INFO: |                    Summary|
2019-11-10T08:23:22: %AETEST-INFO: +------------------------------------------------------------------------------+
<skip>
2019-11-10T08:23:22: %AETEST-INFO:  Number of PASSED                                                             3

The pyATS framework is a great framework for automated testing. However, because of its origin, the support for vendors outside of Cisco is a bit lacking.

Another open source tool for network validation is Batfish, https://github.com/batfish/batfish, from the folks at Intentionet. A primary use case for Batfish is to validate configuration changes before deployment.

There is a bit of a learning curve involved with pyATS; the framework has its own way of performing tests that takes some getting used to. Understandably, given its origin, it is also heavily focused on Cisco platforms in its current iteration. But since it is now an open source project, we are all encouraged to make contributions if we would like to add other vendor support or to suggest syntax or process changes. We are near the end of the chapter, so let's go over what we have done.

Summary

In this chapter, we looked at test-driven development and how it can be applied to network engineering. We started with an overview of TDD; then we looked at examples of using the unittest and pytest Python modules. Python and simple Linux command line tools can be used to construct various tests for network reachability, configuration, and security.

We also looked at how we can utilize testing in Jenkins, a CI tool. By integrating tests into our CI tool, we can gain more confidence in the sanity of our changes. At the very least, we hope to catch any errors before our users do. pyATS is an open source tool that Cisco recently released. It is a network-centric automated testing framework that we can leverage.

Simply put, if it is not tested, it is not trusted. Everything in our network should be programmatically tested as much as possible. As with many software concepts, TDD is a never-ending cycle: we strive for as much test coverage as possible, but even at 100% test coverage, we can always find new ways and test cases to implement. This is especially true in networking, where the network is often the internet, and 100% test coverage of the internet is simply not possible.

We are at the end of the book. I hope you have found it as much of a joy to read as it was for me to write. I want to say a sincere 'Thank You' for spending time with this book. I wish you success and happiness on your Python networking journey!
