4

The Python Automation Framework – Ansible Basics

The previous two chapters incrementally introduced different ways to interact with network devices. In Chapter 2, Low-Level Network Device Interactions, we discussed the Pexpect and Paramiko libraries, which manage interactive sessions to control the interactions. In Chapter 3, APIs and Intent-Driven Networking, we started to think of our network in terms of APIs and intent. We looked at various APIs that contain a well-defined command structure and provide a structured way of getting feedback from the device. As we moved from Chapter 2, Low-Level Network Device Interactions, to Chapter 3, APIs and Intent-Driven Networking, we began to think about our intent for the network and gradually expressed our network in terms of code.

In this chapter, let's expand upon the idea of translating our intention into network requirements. If you have worked on network designs, chances are the most challenging part of the process is not the different pieces of network equipment, but rather the qualifying and translating of business requirements into the actual network design. Your network design needs to solve business problems. For example, you might be working within a larger infrastructure team that needs to accommodate a thriving online e-commerce site that experiences slow site response times during peak hours. How do you determine whether the network is the problem? If the slow response on the website was indeed due to network congestion, which part of the network should you upgrade? Can the rest of the system take advantage of the greater speeds and feeds?

The following diagram is an illustration of a simple process of the steps that we might go through when trying to translate our business requirements into a network design:

Figure 1: Business logic to network deployment

In my opinion, network automation is not just about faster configuration. It should also be about solving business problems, and accurately and reliably translating our intention into device behavior. These are the goals that we should keep in mind as we march on the network automation journey. In this chapter, we will start to look at a Python-based framework called Ansible, which allows us to declare our intention for the network and abstract even more from the API and CLI.

In this chapter, we will take a look at the following topics:

  • An introduction to Ansible
  • A quick Ansible example
  • The advantages of Ansible
  • The Ansible architecture
  • Ansible Cisco modules and examples
  • Ansible Juniper modules and examples
  • Ansible Arista modules and examples

Ansible – a more declarative framework

Let us imagine ourselves in a hypothetical situation: you woke up one morning in a cold sweat from a nightmare you had about a potential network security breach. You realized that your network contains valuable digital assets that should be protected. You have been doing your job as a network administrator, so it is pretty secure, but you want to put more security measures around your network devices just to be sure.

To start with, you break the objective down into two actionable items:

  • Upgrading the devices to the latest version of the software, which requires:
    1. Uploading the image to the device.
    2. Instructing the device to boot from the new image.
    3. Proceeding to reboot the device.
    4. Verifying that the device is running with the new software image.
  • Configuring the appropriate access control list on the networking devices, which includes the following:
    1. Constructing the access list on the device.
    2. Configuring the access list on the interface, which in most cases is under the interface configuration section so that it can be applied to the interfaces.

Being an automation-focused network engineer, you want to write scripts to reliably configure the devices and receive feedback from the operations. You begin to research the necessary commands and APIs for each of the steps, validate them in the lab, and finally deploy them in production. Having done a fair amount of work for OS upgrade and ACL deployment, you hope the scripts are transferable to the next generation of devices.

Wouldn't it be nice if there was a tool that could shorten this design-develop-deployment cycle?

In this chapter and in Chapter 5, The Python Automation Framework – Beyond Basics, we will work with an open source automation tool called Ansible. It is a framework that can simplify the process of going from business logic to network commands. It can configure systems, deploy software, and orchestrate a combination of tasks.

Ansible is written in Python and has emerged as one of the leading automation tools for Python developers as well as one of the most supported by network equipment vendors. In the 'Python Developers Survey 2018' sponsored by the Python Software Foundation, Ansible is ranked at #1 for configuration management:

Figure 2: Python Software Foundation survey result for Configuration Management (source: https://www.jetbrains.com/research/python-developers-survey-2018/)

At the time of writing this third edition, Ansible release 2.8 can be run from any machine with Python 2 (version 2.7) or Python 3 (version 3.5 and higher). Just like Python, many of the useful features of Ansible come from community-driven extension modules. Even though the Ansible core modules support Python 3, many of the extension modules and production deployments are still in Python 2 mode. It will take some time to bring all the extension modules up from Python 2 to Python 3. For this reason, for the rest of this book, we will use Python 2.7 with Ansible 2.8.

Ansible 2.5, released in March 2018, marked an important release for network automation. Starting from version 2.5, Ansible offers many new network module features with new connection methods, playbook syntax, and best practices. Given that this release is relatively recent, many production deployments are still on pre-2.5 releases. To maintain backward compatibility, some of the examples will show the pre-2.5 format first, followed by examples using the latest release. The differences will be pointed out as we move along in the chapter. Going from the older style to the newer one also makes for a good learning experience, as the reasoning behind the newer changes will make more sense.

For the latest information on Ansible Python 3 support, check out: http://docs.ansible.com/ansible/python_3_support.html.

As one can tell from the previous chapters, I am a believer in learning by example. Just like the underlying Python code for Ansible, the syntax for Ansible constructs is easy enough to understand, even if you have not worked with Ansible before. If you have some experience with YAML or Jinja2, you will quickly draw a correlation between the syntax and the intended procedure. Let's take a look at an example first.

A quick Ansible example

As with other automation tools, Ansible started out by managing servers before expanding its ability to manage networking equipment. For the most part, the modules and what Ansible refers to as playbooks are similar between server and network management, with subtle differences. In this chapter, we will look at a server task example first and draw comparisons later on with network modules.

The control node installation

First, let's clarify the terminology we will use in the context of Ansible. We will refer to the virtual machine with Ansible installed as the control machine or control node, and the machines being managed as the target machines or managed nodes. Ansible can be installed on most Unix systems, with the only dependency being Python 2.7 or Python 3.5+. Currently, the Windows operating system is not officially supported as the control machine. Windows hosts can still be managed by Ansible; they are just not supported as the control machine.

As Windows 10 starts to adopt the Windows subsystem for Linux, Ansible might soon be ready to run on Windows as well. For more information, please check the Ansible documentation for Windows (https://docs.ansible.com/ansible/latest/user_guide/windows_faq.html).

Regarding the managed node requirements, you may notice that some documentation mentions that Python 2.7 or later is required. This is true for managing target nodes with operating systems such as Linux, but obviously not all network equipment supports Python. We will see how this requirement is bypassed for networking modules via local execution on the control node.

For Windows, Ansible modules are implemented in PowerShell. Windows modules in the core and extra repository live in a Windows subdirectory if you would like to take a look.

We will be installing Ansible on our Ubuntu virtual machine. For instructions on installation on other operating systems, check out the installation documentation (http://docs.ansible.com/ansible/intro_installation.html). In the following code block, you will see the steps for installing the software packages:

$ sudo apt update
$ sudo apt-get install software-properties-common
$ sudo apt-add-repository ppa:ansible/ansible
$ sudo apt-get install ansible

We can also use pip to install Ansible: pip install ansible. My personal preference is to use the operating system's package management system, such as Apt on Ubuntu.

We can now do a quick verification as follows:

$ ansible --version
ansible 2.8.5
  config file = /etc/ansible/ansible.cfg

Now, let's see how we can run different versions of Ansible on the same control node. This is a useful feature to adopt if you'd like to try out the latest development features without permanent installation. We can also use this method if we intend on running Ansible on a control node for which we do not have root permissions.

Running different versions of Ansible from source

You can run Ansible from a source code checkout (we will look at Git as a version control mechanism in Chapter 13, Working with Git):

$ git clone https://github.com/ansible/ansible.git --recursive
$ cd ansible/
$ source ./hacking/env-setup
...
Setting up Ansible to run out of checkout...
$ ansible --version
ansible 2.10.0.dev0
  config file = /etc/ansible/ansible.cfg
...

As illustrated, we are now running ansible 2.10.0.dev0 in the shell, which is different from the system version of 2.8.5. To run another version, we can simply use git checkout for the different branch or tag and perform the environment setup again:

$ git branch -a
$ git tag --list
$ git checkout v2.5.6
...
HEAD is now at 0c985fee8a New release v2.5.6
$ source ./hacking/env-setup
$ ansible --version
ansible 2.5.6 (detached HEAD 0c985fee8a) last updated 2019/09/23 07:05:28 (GMT -700) 
  config file = /etc/ansible/ansible.cfg

If the Git commands seem a bit strange to you, we will cover Git in more detail in Chapter 13, Working with Git.

Let us switch to Ansible version 2.2 and run the update for the core module:

$ git checkout v2.2.3.0-1
HEAD is now at f5be18f409 New release v2.2.3.0-1
$ source ./hacking/env-setup
$ ansible --version
ansible 2.2.3.0 (detached HEAD f5be18f409) last updated 2019/09/23 07:09:11 (GMT -700) 
   

Git allows the maintainer to include other Git repositories, called submodules, in a repository. As recommended by Ansible's running-from-source instructions (https://docs.ansible.com/ansible/latest/installation_guide/intro_installation.html#running-from-source), update the submodules to synchronize with the current release:

$ git submodule update --init --recursive
Submodule 'lib/ansible/modules/core' (https://github.com/ansible/ansible-modules-core) registered for path 'lib/ansible/modules/core'

Let's take a look at the lab topology we will use in this chapter and the next one.

Lab setup

In this chapter and in Chapter 5, The Python Automation Framework – Beyond Basics, our lab will have an Ubuntu 18.04 control node machine with Ansible installed. This control machine will have reachability to the management network of our VIRL devices, which consist of IOSv and NX-OSv devices. We will also have a separate Ubuntu virtual machine for our playbook example when the target machine is a Linux host:

Figure 3: Lab topology

Now, we are ready to see our first Ansible playbook example.

Your first Ansible playbook

Our first playbook will be used between the control node and a remote Ubuntu host. We will take the following steps:

  1. Make sure the control node can use key-based authorization
  2. Create an inventory file
  3. Create a playbook
  4. Execute and test it

The public key authorization

The first thing to do is copy your SSH public key from your control machine to the target machine. A full public key infrastructure tutorial is outside the scope of this book, but here is a quick walk-through on the control node:

$ ssh-keygen -t rsa # generates the public-private key pair on the host machine if you have not done so already
$ cat ~/.ssh/id_rsa.pub # copy the content of the output and paste it into the ~/.ssh/authorized_keys file on the target host for the same user; create the file with a text editor such as Vi or Emacs if it does not exist

Because we are using key-based authentication, we can turn off password-based authentication on the remote node and be more secure. You will now be able to use SSH to connect from the control node to the remote node using the private key without being prompted for a password.

Can you automate the initial public key copying? It is possible, but is highly dependent on your use case, regulation, and environment. It is comparable to the initial console setup for network gear to establish initial IP reachability. Do you automate this? Why or why not?

In the next section, let us take a look at how we can indicate the target machines to be managed by Ansible.

The inventory file

We wouldn't need Ansible if we had no remote target to manage, right? Everything starts with the fact that we need to perform some task on a remote host. In Ansible, the way we specify the potential remote target is with an inventory file. We can have this inventory file as the /etc/ansible/hosts file or use the -i option to specify the file during playbook runtime. Personally, I prefer to have this file in the same directory as where my playbook is and use the -i option.

Technically, this file can be named anything you like as long as it is in a valid format. However, the convention is to name this file hosts. You can potentially save yourself and your colleagues some headaches in the future by following this convention.

The inventory file is a simple, plaintext INI-style (https://en.wikipedia.org/wiki/INI_file) file that states your target. By default, the target can either be a DNS FQDN or an IP address:

$ cat hosts 
192.168.2.122

In this case, 192.168.2.122 is the IP address of a Linux machine that is reachable from the Ansible control host. We can now use the command line option to test Ansible and the hosts file:

$ ansible -i hosts 192.168.2.122 -m ping
192.168.2.122 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

By default, Ansible assumes that the same user executing the playbook exists on the remote host. For example, I am executing the playbook as echou locally; the same user also exists on my remote host. If you want to execute as a different user, you can use the -u option when executing, that is, -u REMOTE_USER.

The previous command line execution shown reads in the host file as the inventory file and executes the ping module on the host with the IP address of 192.168.2.122. Ping (http://docs.ansible.com/ansible/ping_module.html) is a trivial test module that connects to the remote host, verifies a usable Python installation, and returns the output pong upon success.

You may take a look at the ever-expanding module list (http://docs.ansible.com/ansible/list_of_all_modules.html) if you have any questions about the use of existing modules that were shipped with Ansible.

If the host's key is not in the control node's ~/.ssh/known_hosts file, you will get a prompt. Answer 'yes' to add the host. You can disable this check by adding the following to /etc/ansible/ansible.cfg or ~/.ansible.cfg:

[defaults]
host_key_checking = False

Now that we have validated the inventory file and the Ansible package, we can make our first playbook.

Our first playbook

Playbooks are Ansible's blueprint to describe what you would like to do to the hosts, using modules. This is where we will be spending the majority of our time as operators when working with Ansible. If we use an analogy of building a tree house with Ansible, the playbook will be your manual, the modules will be your tools, while the inventory will be the components that you will be working on when using the tools.

The playbook is designed to be human-readable, and is in YAML format. We will look at the common syntax used in the Ansible architecture section. For now, our focus is to run an example playbook to get the look and feel of Ansible.

Originally, YAML was said to mean Yet Another Markup Language, but now http://yaml.org/ has repurposed the acronym to be YAML Ain't Markup Language.

Let's look at this simple six-line playbook, df_playbook.yml:

---
- hosts: 192.168.2.122
  
  tasks:
      - name: check disk usage
        shell: df > df_temp.txt

In a playbook, there can be one or more plays. In this case, we have one play (lines two to six). In any play, we can have one or more tasks. In our example play, we have just one task (lines four to six). The name field specifies the purpose of the task in a human-readable format, and the shell module is used. The module takes the command df > df_temp.txt as its argument; the shell module reads in the command in the argument and executes it on the remote host. In this case, we execute the df command to check the disk usage and redirect the output to a file named df_temp.txt.

We can execute the playbook via the following code:

$ ansible-playbook -i hosts df_playbook.yml
PLAY [192.168.2.122] ***********************************************************
TASK [setup] *******************************************************************
ok: [192.168.2.122]
TASK [check disk usage] ********************************************************
changed: [192.168.2.122]
PLAY RECAP *********************************************************************
192.168.2.122              : ok=2    changed=1    unreachable=0    failed=0

If you log into the managed host (192.168.2.122, in this example), you will see that the df_temp.txt file contains the output of the df command. Neat, huh?

You may have noticed that there were actually two tasks executed in our output, even though we only specified one task in the playbook; the setup module is automatically added by default. It is executed by Ansible to gather information about the remote host, which can be used later on in the playbook. For example, one of the facts that the setup module gathers is the host's operating system type. What is the purpose of gathering facts about the remote target? You can use this information as a conditional for additional tasks in the same playbook. For example, the playbook can contain additional tasks to install packages. By knowing the operating system type, Ansible can install packages with apt for Debian-based hosts and yum for Red Hat-based hosts.
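As a sketch of that pattern, a hypothetical play might use the gathered ansible_os_family fact as a conditional for each package manager; the package name (tcpdump) and target host are illustrative assumptions, not from the chapter's lab:

```yaml
---
- hosts: 192.168.2.122
  tasks:
    - name: install tcpdump on Debian-based hosts
      apt:
        name: tcpdump
        state: present
      when: ansible_os_family == "Debian"

    - name: install tcpdump on Red Hat-based hosts
      yum:
        name: tcpdump
        state: present
      when: ansible_os_family == "RedHat"
```

Only the task whose condition matches the gathered fact is executed; the other is skipped.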

If you are curious about the output of a setup module, you can find out what information Ansible gathers via $ ansible -i hosts <host> -m setup.

Under the hood, a few things actually happened in relation to our simple task. When the playbook was executed, the control node copied the Python module to the remote host, executed the module, copied the module output to a temporary file, then captured the output and deleted the temporary file. For now, we can probably safely ignore these underlying details until we need them.
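As a rough, hypothetical sketch of that copy-execute-capture-delete sequence (not Ansible's actual implementation), run locally for simplicity:

```python
import os
import subprocess
import sys
import tempfile


def run_module_like_ansible(module_source):
    """Write a module to a temporary file, execute it, capture the
    output, and delete the temporary file, mimicking Ansible's flow."""
    # 1. "Copy" the module to the host as a temporary file
    fd, path = tempfile.mkstemp(suffix=".py")
    with os.fdopen(fd, "w") as f:
        f.write(module_source)
    try:
        # 2. Execute the module and capture its output
        result = subprocess.run(
            [sys.executable, path], capture_output=True, text=True
        )
        return result.stdout.strip()
    finally:
        # 3. Delete the temporary file
        os.remove(path)


# A toy "module" that just reports a ping result
module = 'print({"changed": False, "ping": "pong"})'
print(run_module_like_ansible(module))
```

The real mechanism handles connections, serialization, and error reporting, but the lifecycle of the module on the target is essentially this.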

It is important that we fully understand the simple process that we have just gone through because we will be referring back to these elements later in this chapter. I purposely chose a server example to be presented here, because this will make more sense as we dive into the networking modules when we need to deviate from them (remember we mentioned the Python interpreter is most likely not available on the network equipment we want to manage).

Congratulations on executing your first Ansible playbook! We will look more into the Ansible architecture, but for now let's take a look at why Ansible is a good fit for network management. Remember that Ansible modules are written in Python; that is one advantage for a Pythonic network engineer, right?

The advantages of Ansible

There are many infrastructure automation frameworks besides Ansible—namely Chef, Puppet, and SaltStack. Each framework offers its own unique features and models; there is no one right framework that fits all organizations. In this section, I would like to list some of the advantages of Ansible over other frameworks and why I think this is a good tool for network automation.

I will list the advantages of Ansible without comparing them to other frameworks. Other frameworks might adopt some of the same philosophies or certain aspects of Ansible, but rarely do they contain all of the features that I will be mentioning. I believe it is the combination of all the following features and philosophy that makes Ansible ideal for network automation.

Agentless

Unlike some of its peers, Ansible does not require a strict master-client model. No software or agent needs to be installed on the client that communicates back to the server. Outside of the Python interpreter, which many platforms have by default, there is no additional software needed.

For network automation modules, instead of relying on remote host agents, Ansible uses SSH or API calls to push the required changes to the remote host. This also removes the need for a Python interpreter on the managed device. This is huge for network device management, as network vendors are typically reluctant to put third-party software on their platforms. SSH, on the other hand, already exists on the network equipment. This mentality has changed a bit in the last few years, but overall, SSH is the common denominator for all network equipment while configuration management agent support is not. As you will remember from Chapter 3, APIs and Intent-Driven Networking, newer network devices also provide an API layer, which can be leveraged by Ansible.

Because there is no agent on the remote host, Ansible uses a push model to push the changes to the device, as opposed to the pull model where the agent pulls the information from the master server. The push model, in my opinion, is more deterministic as everything originates from the control machine. In a pull model, the timing of the pull might vary from client to client, and therefore results in timing variance.

Again, the importance of being agentless cannot be stressed enough when it comes to working with the existing network equipment. This is usually one of the major reasons network operators and vendors embrace Ansible.

Idempotence

According to Wikipedia, idempotence is the property of certain operations in mathematics and computer science that can be applied multiple times without changing the result beyond the initial application (https://en.wikipedia.org/wiki/Idempotence). In more common terms, it means that running the same procedure over and over again does not change the system after the first time. Ansible aims to be idempotent, which is good for network operations that require a certain order of operations.

The advantage of idempotence is best illustrated by comparison with the Pexpect and Paramiko scripts that we have written. Remember that those scripts were written to push out commands as if an engineer were sitting at the terminal. If you were to execute the script 10 times, the script would make changes 10 times. If we write the same task via an Ansible playbook, the existing device configuration will be checked first, and the playbook will only execute if the changes do not already exist. If we execute the playbook 10 times, the change will only be applied during the first run, with the next 9 runs suppressing the configuration change.

Being idempotent means we can repeatedly execute the playbook without worrying that there will be unnecessary changes made. This is important as we need to automatically check for state consistency without any extra overhead.
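The check-before-change behavior can be sketched in plain Python; modeling the device configuration as a simple list is purely an assumption for illustration (this is not how Ansible modules are implemented):

```python
def ensure_config_line(running_config, desired_line):
    """Apply desired_line only if it is absent; return True when a
    change was made, mirroring Ansible's 'changed' status."""
    if desired_line in running_config:
        return False  # already compliant; nothing to do
    running_config.append(desired_line)
    return True


config = ["hostname edge-router"]

# Running the same "task" 10 times only changes the system once
results = [ensure_config_line(config, "ip domain-lookup") for _ in range(10)]
print(results.count(True))  # 1: changed on the first run only
print(config)
```

The first run reports a change; the remaining nine runs verify the state and do nothing, which is exactly the property that makes repeated playbook runs safe.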

Simple and extensible

Ansible is written in Python and uses YAML for the playbook language, both of which are considered relatively easy to learn. Remember the Cisco IOS syntax? It is a domain-specific language that is only applicable when you are managing Cisco IOS devices or other similarly structured equipment; it is not a general-purpose language beyond its limited scope. Luckily, unlike some other automation tools, there is no extra domain-specific language (DSL) to learn for Ansible, because YAML and Python are both widely used as general-purpose languages.

As you can see from the previous example, even if you have not seen YAML before, it is easy to accurately guess what the playbook is trying to do. Ansible also uses Jinja2 as a template engine, which is a common tool used by Python web frameworks such as Django and Flask, so the knowledge is transferable.

I cannot stress enough the extensibility of Ansible. As illustrated by the preceding example, Ansible started out with automating server (primarily Linux) workloads in mind. It then branched out to managing Windows machines with PowerShell. As more and more people in the industry started to adopt Ansible, the network became a topic that started to get more attention.

The right people and teams were hired at Ansible, network professionals started to get involved, and customers started to demand Ansible support from their vendors. Starting with Ansible 2.0, network automation has become a first-class citizen alongside server management. The ecosystem is alive and well, with continuous improvement in each release.

Just like the Python community, the Ansible community is friendly, and its attitude is inclusive of new members and ideas. I have first-hand experience of being a noob and trying to make sense of contribution procedures and wishing to write modules to be merged upstream. I can testify to the fact that I felt welcomed and respected for my opinions at all times.

The simplicity and extensibility really speak well for future-proofing. The technology world is evolving fast, and we are constantly trying to adapt to it. Wouldn't it be great to learn a technology once and continue to use it, regardless of the latest trend? Obviously, nobody has a crystal ball to accurately predict the future, but Ansible's track record speaks well for future technology adaptation.

Network vendor support

Let's face it, we don't live in a vacuum. There is a running joke in the industry that the OSI layer should include a layer 8 (money) and 9 (politics). Every day, we need to work with network equipment made by various vendors.

Take API integration as an example. We saw the difference between the Pexpect and API approach in previous chapters. API clearly has an upper hand in terms of network automation. However, the API interface does not come cheap for the vendors. Each vendor needs to invest time, money, and engineering resources to make the integration happen. The willingness of the vendor to support a technology matters greatly in our world. Luckily, all the major vendors support Ansible, as clearly indicated by the ever-increasing available network modules (http://docs.ansible.com/ansible/list_of_network_modules.html).

Why do vendors support Ansible more than other automation tools? Being agentless certainly helps, since having SSH as the only dependency greatly lowers the bar of entry. Engineers who have been on the vendor side know that the feature request process is usually months long and many hurdles have to be jumped over. Any time a new feature is added, it means more time spent on regression testing, compatibility checking, integration reviews, and much more. Lowering the bar of entry is usually the first step in getting vendor support.

The fact that Ansible is based on Python, a language liked by many networking professionals, is another strong driver of vendor support. Vendors such as Juniper and Arista, who have already made investments in PyEZ and Pyeapi, can easily leverage their existing Python modules and quickly integrate their features into Ansible. As you will see in Chapter 5, The Python Automation Framework – Beyond Basics, we can use our existing Python knowledge to easily write our own modules.

Ansible already had a large number of community-driven modules before it focused on networking. The contribution process is somewhat baked and established, or as baked as an open source project can be. The core Ansible team is familiar with working with the community for submissions and contributions.

Another reason for the increased network vendor support has to do with Ansible letting vendors express their own strengths in the module context. We will see in the coming section that, besides SSH, Ansible modules can also be executed locally and communicate with devices by using an API. This ensures that vendors can expose their latest and greatest features as soon as they make them available through the API. For network professionals, this means that you can use cutting-edge features to select vendors when you are using Ansible as an automation platform.

We have spent a relatively large portion of space discussing vendor support because I feel that this is often an overlooked part in the Ansible story. Having vendors willing to put their weight behind the tool means you, the network engineer, can sleep at night knowing that the next big thing in networking will have a high chance of Ansible support, and you are not locked into your current vendor as your network needs grow.

Now that we've covered the advantages of Ansible, let's explore its architecture.

The Ansible architecture

The Ansible architecture consists of playbooks, plays, and tasks. Take a look at df_playbook.yml, which we used previously:

Figure 4: An Ansible playbook

The whole file is called a playbook, which contains one or more plays. Each play can consist of one or more tasks. In our simple example, we only have one play, which contains a single task. In this section, we will take a look at the following components and terms related to Ansible, some of which we have already seen:

  • YAML: This format is extensively used in Ansible to express playbooks and variables.
  • Inventory: The inventory is where you can specify and group hosts in your infrastructure. You can also optionally specify host and group variables in the inventory file.
  • Variables: Each network device is different. It has a different hostname, IP, neighbor relations, and so on. Variables allow for a standard set of plays while still accommodating these differences.
  • Templates: Templates are nothing new in networking. In fact, you are probably using one without thinking of it as a template. What do we typically do when we need to provision a new device or replace one under a return merchandise authorization (RMA)? We copy the old configuration over and replace the differences, such as the hostname and the loopback IP addresses. Ansible standardizes template formatting with Jinja2, which we will dive deeper into later on.
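Jinja2 templating is covered in depth later in the book; purely to illustrate the idea of one skeleton configuration with substituted differences, here is a minimal sketch using only Python's built-in string.Template (the hostnames and addresses are made up):

```python
from string import Template

# A skeleton configuration; the per-device differences are variables
config_template = Template(
    "hostname $hostname\n"
    "interface Loopback0\n"
    " ip address $loopback_ip 255.255.255.255\n"
)

# Render the same template for two different devices
for device in (
    {"hostname": "iosv-1", "loopback_ip": "192.168.0.1"},
    {"hostname": "iosv-2", "loopback_ip": "192.168.0.2"},
):
    print(config_template.substitute(device))
```

Jinja2 works the same way conceptually, but adds loops, conditionals, filters, and more.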

In Chapter 5, The Python Automation Framework – Beyond Basics, we will cover some more advanced topics such as conditionals, loops, blocks, handlers, playbook roles, and how they can be included with network management.

YAML

YAML is the syntax used for Ansible playbooks and some other files. The official YAML documentation contains the full specifications of the syntax. Here is a compact version as it pertains to the most common usage for Ansible:

  • A YAML file starts with three dashes (---)
  • Whitespace indentation is used to denote structures when they are lined up, just like Python
  • Comments begin with the hash (#) sign
  • List members are denoted by a leading hyphen (-), with one member per line
  • Lists can also be denoted by square brackets ([]), with elements separated by a comma (,)
  • Dictionaries are denoted by key: value pairs, with a colon for separation
  • Dictionaries can be denoted by curly braces, with elements separated by a comma (,)
  • Strings can be unquoted, but can also be enclosed in double or single quotes

As you can see, YAML maps well into JSON and Python datatypes. If I were to rewrite df_playbook.yml in df_playbook.json, this is what it would look like:

[
  {
    "hosts": "192.168.199.170",
    "tasks": [
      {
        "name": "check disk usage",
        "shell": "df > df_temp.txt"
      }
    ]
  }
]

This is obviously not a valid playbook, but serves as an aid in helping to understand the YAML formats while using the JSON format as a comparison. Most of the time, comments (#), lists (-), and dictionaries (key: value) are what you will see in a playbook.
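To see the YAML-to-Python mapping concretely, we can load the JSON version with the standard library and inspect the resulting datatypes. This is only an illustration of the data structures involved, not how Ansible parses playbooks:

```python
import json

# The JSON rendition of df_playbook.yml: a list of plays, each play a
# dictionary, with tasks as a list of dictionaries.
playbook_json = """
[
  {
    "hosts": "192.168.199.170",
    "tasks": [
      {"name": "check disk usage", "shell": "df > df_temp.txt"}
    ]
  }
]
"""

plays = json.loads(playbook_json)
print(type(plays))                      # a Python list of plays
print(plays[0]["hosts"])                # the play targets this host
print(plays[0]["tasks"][0]["shell"])    # the task's shell command
```

Loading the YAML original with a YAML parser would yield the exact same Python list and dictionaries, which is why the two formats are interchangeable here.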

Inventories

By default, Ansible looks at the /etc/ansible/hosts file for hosts specified in your playbook. As mentioned previously, I find it more expressive to specify the host file via the -i option. This is what we have been doing up to this point. To expand on our previous example, we can write our inventory host file as follows:

[ubuntu]
192.168.2.122
[nexus]
172.16.1.142
172.16.1.143
[nexus:vars]
username=cisco
password=cisco
[nexus_by_name]
switch1 ansible_host=172.16.1.142
switch2 ansible_host=172.16.1.143

As you may have guessed, the square bracket headings specify group names; later on in the playbook, we can point to a group by its name. For example, in cisco_1.yml and cisco_2.yml, I can act on all of the hosts under the nexus group by setting hosts to the group name nexus:

---
- name: Configure SNMP Contact
  hosts: "nexus"
  gather_facts: false
  connection: local
  <skip>

A host can exist in more than one group. The group can also be nested as children:

[cisco]
router1
router2
[arista]
switch1
switch2
[datacenter:children] 
cisco
arista

In the previous example, the datacenter group includes both the cisco and arista members with four total devices.

We will discuss variables in the next section. There are a few places where you can declare variables; in fact, you have already seen some of them in use. In our first inventory file example, we declared variables for both hosts and groups: [nexus:vars] specifies variables for the whole nexus group, while the ansible_host variable is declared per host, on the same line as the host itself.

For more information on the inventory file, check out the official documentation (http://docs.ansible.com/ansible/intro_inventory.html).
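Since the inventory file is INI-like, we can illustrate its group/variable structure by parsing it with Python's standard configparser module. This is purely illustrative — Ansible uses its own inventory parser — but it happens to work when bare host lines are allowed as valueless keys:

```python
import configparser

inventory_text = """
[nexus]
172.16.1.142
172.16.1.143

[nexus:vars]
username=cisco
password=cisco
"""

# allow_no_value=True lets the bare host lines parse as keys with no value
parser = configparser.ConfigParser(allow_no_value=True)
parser.read_string(inventory_text)

hosts = list(parser["nexus"])          # the hosts in the nexus group
group_vars = dict(parser["nexus:vars"])  # variables applied to the whole group
print(hosts)
print(group_vars)
```

Seeing the file through this lens makes the structure obvious: section headers are groups, bare lines are hosts, and key=value lines under a :vars section are group variables.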

Variables

We discussed variables a bit in the previous section. Why do we need variables? Because our managed nodes are not exactly alike, we need to accommodate the differences via variables. Variable names should consist of letters, numbers, and underscores, and should always start with a letter. Variables are commonly defined in three locations:

  • The playbook
  • The inventory file
  • Separate files, to be included by playbooks and roles

Let's look at an example of defining variables in a playbook, cisco_1.yml:

---
- name: Configure SNMP Contact
  hosts: "nexus"
  gather_facts: false
  connection: local
  vars:
    cli:
      host: "{{ inventory_hostname }}"
      username: cisco
      password: cisco
      transport: cli
  tasks:
    - name: configure snmp contact
      nxos_snmp_contact:
        contact: TEST_1
        state: present
        provider: "{{ cli }}"
      register: output
    - name: show output
      debug:
        var: output

In the playbook, you can see the cli variable declared under the vars section; it is referenced using double curly brackets ("{{ cli }}") in the nxos_snmp_contact task.

For more information on the nxos_snmp_contact module, check out the online documentation (http://docs.ansible.com/ansible/nxos_snmp_contact_module.html).

To reference a variable, you can use the Jinja2 templating system convention of a double curly bracket. You don't need to put quotes around the curly bracket unless you are starting a value with it, but I typically find it easier to just always put quotes around the curly brackets so I do not need to think about the differences.

You may have also noticed the {{ inventory_hostname }} reference, which is not declared in the playbook. It is one of the default variables that Ansible provides automatically, and it refers to the host's name as defined in the inventory file, whether an IP address or a DNS-resolvable hostname. Sometimes in the documentation these variables are referred to as 'magic' variables.

There are not many magic variables, and you can find the list in the documentation (http://docs.ansible.com/ansible/playbooks_variables.html#magic-variables-and-how-to-access-information-about-other-hosts).

We have declared variables in an inventory file in the previous section:

[nexus:vars]
username=cisco
password=cisco

[nexus_by_name]
switch1 ansible_host=172.16.1.142
switch2 ansible_host=172.16.1.143

To use the variables in the inventory file instead of declaring them in the playbook, let's add the group variables for [nexus_by_name] in the host file:

[nexus_by_name]
switch1 ansible_host=172.16.1.142
switch2 ansible_host=172.16.1.143

[nexus_by_name:vars]
username=cisco
password=cisco

Then, modify the playbook to match what we can see here in cisco_2.yml, to reference the variables:

---
- name: Configure SNMP Contact
  hosts: "nexus_by_name"
  gather_facts: false
  connection: local

  vars:
    cli:
      host: "{{ ansible_host }}"
      username: "{{ username }}"
      password: "{{ password }}"
      transport: cli

  tasks:
    - name: configure snmp contact
      nxos_snmp_contact:
        contact: TEST_1
        state: present
        provider: "{{ cli }}"

      register: output

    - name: show output
      debug:
        var: output

Notice that in this example, we are referring to the [nexus_by_name] group in the inventory file, the ansible_host host variable, and the username and password group variables.

By offloading the username and password to a separate file, we can write-protect that file for a better security posture.

To see more examples of variables, check out the Ansible documentation (http://docs.ansible.com/ansible/playbooks_variables.html).

To access complex variable data provided in a nested data structure, you can use two different notations. As noted in the nxos_snmp_contact task, we registered the output in a variable and displayed it using the debug module.

You will see something like the following during playbook execution:

$ ansible-playbook -i hosts cisco_2.yml
TASK [show output] *************************************************************
ok: [switch1] => {
    "output": {
        "changed": false,
        "end_state": {
            "contact": "TEST_1"
        },
        "existing": {
            "contact": "TEST_1"
        },
        "proposed": {
            "contact": "TEST_1"
        },
        "updates": []
    }
}

In order to access the nested data, we can use the following notation, as specified in cisco_3.yml:

  tasks:
    - name: configure snmp contact
      nxos_snmp_contact:
        contact: TEST_1
        state: present
        provider: "{{ cli }}"

      register: output

    - name: show output in output["end_state"]["contact"]
      debug:
        msg: '{{ output["end_state"]["contact"] }}'

    - name: show output in output.end_state.contact
      debug:
        msg: '{{ output.end_state.contact }}'

You will receive just the value indicated:

$ ansible-playbook -i hosts cisco_3.yml
TASK [show output in output["end_state"]["contact"]] ***************************
ok: [switch1] => {
    "msg": "TEST_1"
}
ok: [switch2] => {
    "msg": "TEST_1"
}

TASK [show output in output.end_state.contact] *********************************
ok: [switch1] => {
    "msg": "TEST_1"
}
ok: [switch2] => {
    "msg": "TEST_1"
}

Lastly, we mentioned that variables can also be stored in a separate file. To see how we can use variables in a role or an included file, we should get a few more examples under our belt, because they are a bit more complicated to start with. We will see more examples of roles in Chapter 5, The Python Automation Framework – Beyond Basics.

Templates with Jinja2

In the previous section, we used variables with the Jinja2 syntax of {{ variable }}. While you can do a lot of complex things in Jinja2, luckily, we only need some of the basic things to get started with Ansible templates.

Jinja2 (http://jinja.pocoo.org/) is a full-featured, powerful template engine that originated in the Python community. It is widely used in Python web frameworks such as Django and Flask.

For now, it is enough to just keep in mind that Ansible utilizes Jinja2 as the template engine. We will revisit the topics of Jinja2 filters, tests, and lookups as the situations call for them.

You can find more information on the Ansible Jinja2 template here: http://docs.ansible.com/ansible/playbooks_templating.html.
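To get a feel for the double-curly-bracket substitution, here is a toy stand-in written with only the standard library. This is emphatically not Jinja2 — the real engine also supports filters, loops, conditionals, inheritance, and more — but it shows the basic idea of rendering variables into a template:

```python
import re

def render(template, variables):
    # Toy substitution of {{ name }} placeholders; a crude sketch of
    # what Jinja2 does for simple variable references
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda match: str(variables[match.group(1)]),
        template,
    )

config = render(
    "hostname {{ hostname }}\nsnmp-server contact {{ contact }}",
    {"hostname": "iosv-1", "contact": "TEST_1"},
)
print(config)
```

Running this produces a two-line configuration snippet with the variables filled in, which is essentially what Ansible's template module does with device configuration templates, only at full scale.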

This concludes our rundown of Ansible's architecture. In the next section, we will look at the Ansible networking modules, which handle most of the network tasks we will encounter.

Ansible networking modules

Ansible was originally made for managing nodes with full operating systems such as Linux and Windows before it was extended to support network equipment. You may have already noticed the subtle differences in playbooks that we have used so far for network devices, such as the lines of gather_facts: false and connection: local; we will take a closer look at the differences in the following sections.

Ansible provides nicely written documentation on 'How Network Automation is Different': https://docs.ansible.com/ansible/latest/network/getting_started/network_differences.html.

Local connections and facts

Ansible modules are Python code that is, by default, executed on the remote host. Because most network equipment does not expose Python directly, or simply does not contain Python, we almost always execute the playbook locally on the control node. This means that the playbook is interpreted locally first, and commands or configurations are pushed out later, as needed.

Recall that in our server example, the remote host facts were gathered via the setup module, which was added by default. Since we are executing the playbook locally, the setup module would gather the facts of the localhost instead of the remote host. This is certainly not needed, so when the connection is set to local, we can skip this unnecessary step by setting gather_facts to false. Starting from release 2.5, there are fact-gathering modules specific to each platform. You can take a look at the fact-demo.yml example: https://docs.ansible.com/ansible/latest/network/user_guide/network_best_practices_2.5.html#step-2-creating-the-playbook.

Because network modules are executed locally, for those modules that offer a backup option, the files are backed up locally on the control node as well.

One of the most important changes introduced in Ansible 2.5 was the introduction of different network communication protocols (https://docs.ansible.com/ansible/latest/network/getting_started/network_differences.html#multiple-communication-protocols).

The connection method now includes network_cli, netconf, httpapi, and local. If the network device uses CLI over SSH, you indicate the connection method as network_cli in one of the device variables. It is good to be aware of both the pre-2.5 and post-2.5 connection syntax. In general, you will find the post-2.5 syntax more streamlined and concise.

Provider arguments

As we have seen in Chapter 2, Low-Level Network Device Interactions, and Chapter 3, APIs and Intent-Driven Networking, network equipment can be connected to via both SSH and API, depending on the platform and software release. All core networking modules implement a provider argument, which is a collection of arguments used to define how to connect to the network device. Some modules only support cli, while others support additional values; for example, Arista supports eapi and Cisco supports nxapi on the Nexus platform.

Starting with Ansible 2.5, the recommended way to specify the transport method is by using the connection variable. You will start to see the provider parameter being gradually phased out from future Ansible releases. Using the ios_command module as an example, https://docs.ansible.com/ansible/latest/modules/ios_command_module.html#ios-command-module, the provider parameter still works, but is being labeled as deprecated. We will see an example of this later in this chapter.

Some of the basic arguments supported by the provider transport are as follows:

  • host: This defines the remote host
  • port: This defines the port to connect to
  • username: This is the username to be authenticated
  • password: This is the password to be authenticated
  • transport: This is the type of transport for the connection
  • authorize: This enables privilege escalation for devices that require it
  • auth_pass: This defines the privilege escalation password

As you can see, not all arguments need to be specified in the provider variable. For example, for our previous playbooks, our user always has the admin privilege when logged in, therefore we do not need to specify the authorize or the auth_pass arguments.
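As a sketch of how only the needed keys end up in the provider dictionary, here is a small hypothetical helper (build_provider is my own illustrative name, not an Ansible function): required connection arguments are always present, while optional ones such as authorize and auth_pass are included only when set.

```python
def build_provider(host, username, password, transport="cli",
                   port=None, authorize=None, auth_pass=None):
    """Assemble a provider-style dictionary, omitting unset optional
    arguments — mirroring how our playbook's cli variable only carried
    the keys it actually needed."""
    provider = {
        "host": host,
        "username": username,
        "password": password,
        "transport": transport,
    }
    for key, value in (("port", port),
                       ("authorize", authorize),
                       ("auth_pass", auth_pass)):
        if value is not None:
            provider[key] = value
    return provider

cli = build_provider("172.16.1.142", "cisco", "cisco")
print(cli)  # no authorize/auth_pass keys, since the user has admin privilege
```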

These arguments are just variables, so they follow the same rules for variable precedence. For example, let's say I change cisco_3.yml to cisco_4.yml and observe the following precedence:

---
- name: Configure SNMP Contact
  hosts: "nexus_by_name"
  gather_facts: false
  connection: local
  vars:
    cli:
      host: "{{ ansible_host }}"
      username: "{{ username }}"
      password: "{{ password }}"
      transport: cli
  tasks:
    - name: configure snmp contact
      nxos_snmp_contact:
        contact: TEST_1
        state: present
        username: cisco123 #new
        password: cisco123 #new
        provider: "{{ cli }}"
      register: output
    - name: show output in output["end_state"]["contact"]
      debug:
        msg: '{{ output["end_state"]["contact"] }}'
    - name: show output in output.end_state.contact
      debug:
        msg: '{{ output.end_state.contact }}'

The username and password defined at the task level will override the username and password at the playbook level. I will receive the following error when trying to connect because the user does not exist on the device:

PLAY [Configure SNMP Contact] **************************************************
TASK [configure snmp contact] **************************************************
fatal: [switch2]: FAILED! => {"changed": false, "failed": true, "msg": "failed to connect to 172.16.1.143:22"}
fatal: [switch1]: FAILED! => {"changed": false, "failed": true, "msg": "failed to connect to 172.16.1.142:22"}
to retry, use: --limit @/home/echou/Mastering_Python_Networking_third_edition/Chapter04/cisco_4.retry
PLAY RECAP *********************************************************************
switch1    : ok=0    changed=0    unreachable=0    failed=1
switch2    : ok=0    changed=0    unreachable=0    failed=1
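Variable precedence of this kind can be illustrated in plain Python with collections.ChainMap — a hypothetical sketch of the shadowing behavior, not how Ansible resolves variables internally. The more specific scope is searched first, so its values win:

```python
from collections import ChainMap

# Scopes from broadest to most specific; values mirror our playbook
inventory_vars = {"username": "cisco", "password": "cisco"}
play_vars = {"username": "cisco", "password": "cisco", "transport": "cli"}
task_vars = {"username": "cisco123", "password": "cisco123"}

# ChainMap searches left to right: task-level shadows play-level,
# which shadows the inventory
effective = ChainMap(task_vars, play_vars, inventory_vars)

print(effective["username"])   # cisco123 — the task-level value wins
print(effective["transport"])  # cli — falls through to the play level
```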

In the next section, we will dive deeper into examples for managing Cisco devices.

The Ansible Cisco example

Cisco's support in Ansible is categorized by the operating systems IOS, IOS-XR, and NX-OS. We have already seen a number of NX-OS examples, so in this section let's try to manage IOS-based devices.

Our host file will consist of two hosts, ios-r1 and ios-r2:

[ios-devices]
ios-r1 ansible_host=172.16.1.134
ios-r2 ansible_host=172.16.1.135

[ios-devices:vars]
username=cisco
password=cisco

Our playbook, cisco_5.yml, will use the ios_command module to execute arbitrary show commands:

---
- name: IOS Show Commands
  hosts: "ios-devices"
  gather_facts: false
  connection: local

  vars:
    cli:
      host: "{{ ansible_host }}"
      username: "{{ username }}"
      password: "{{ password }}"
      transport: cli

  tasks:
    - name: ios show commands
      ios_command:
        commands:
          - show version | i IOS
          - show run | i hostname
        provider: "{{ cli }}"

      register: output

    - name: show output in output["end_state"]["contact"]
      debug:
          var: output

The result is what we would expect as the show version and show run output:

$ ansible-playbook -i hosts cisco_5.yml

PLAY [IOS Show Commands] *******************************************************

TASK [ios show commands] *******************************************************
ok: [ios-r1]
ok: [ios-r2]

TASK [show output in output["end_state"]["contact"]] ***************************
ok: [ios-r1] => {
    "output": {
        "changed": false,
        "stdout": [
            "Cisco IOS Software, IOSv Software (VIOS-ADVENTERPRISEK9-M), Version 15.6(3)M2, RELEASE SOFTWARE (fc2)
ROM: Bootstrap program is IOSv
Cisco IOSv (revision 1.0) with  with 460033K/62464K bytes of memory.",
            "hostname iosv-1"
        ],
        "stdout_lines": [
            [
                "Cisco IOS Software, IOSv Software (VIOS-ADVENTERPRISEK9-M), Version 15.6(3)M2, RELEASE SOFTWARE (fc2)",
                "ROM: Bootstrap program is IOSv",
                "Cisco IOSv (revision 1.0) with  with 460033K/62464K bytes of memory."
            ],
            [
                "hostname iosv-1"
            ]
        ],
        "warnings": []
    }
}
ok: [ios-r2] => {
    "output": {
        "changed": false,
        "stdout": [
            "Cisco IOS Software, IOSv Software (VIOS-ADVENTERPRISEK9-M), Version 15.6(3)M2, RELEASE SOFTWARE (fc2)
ROM: Bootstrap program is IOSv
Cisco IOSv (revision 1.0) with  with 460033K/62464K bytes of memory.",
            "hostname iosv-2"
        ],
        "stdout_lines": [
            [
                "Cisco IOS Software, IOSv Software (VIOS-ADVENTERPRISEK9-M), Version 15.6(3)M2, RELEASE SOFTWARE (fc2)",
                "ROM: Bootstrap program is IOSv",
                "Cisco IOSv (revision 1.0) with  with 460033K/62464K bytes of memory."
            ],
            [
                "hostname iosv-2"
            ]
        ],
        "warnings": []
    }
}

PLAY RECAP *********************************************************************
ios-r1    : ok=2    changed=0    unreachable=0    failed=0
ios-r2    : ok=2    changed=0    unreachable=0    failed=0

I want to point out a few things illustrated by this example:

  • The playbooks for NX-OS and IOS are largely identical
  • The nxos_snmp_contact and ios_command modules follow the same pattern, with the only difference being the arguments for each module
  • The IOS version running on the devices is fairly old, with no API support, but the modules still have the same look and feel

As you can see from the preceding example, once we have the basic syntax down for the playbooks, the subtle difference relies on the different modules for the task we would like to perform.

Ansible 2.8 playbook example

We have briefly talked about the network connection changes introduced in Ansible playbooks, starting with version 2.5 and beyond. Along with the changes, Ansible also released a network best practices document: https://docs.ansible.com/ansible/latest/network/user_guide/network_best_practices_2.5.html. Let's build an example based on the best practices guide. Since there are multiple files involved in this example, they are grouped into a subdirectory named ansible_2-8_example in this book's accompanying code files.

Either use the system-installed version or switch back to Ansible version 2.8 using the Git source code, as illustrated before:

$ ansible --version
ansible 2.8.5

In our previous examples, we mainly used just the inventory host file to contain both the inventory information as well as the associated variables. In this example, we will offload the variables to a separate directory named host_vars:

$ tree .
.
├── hosts
├── host_vars
│   ├── iosv-1
│   └── iosv-2
└── my_playbook.yml
1 directory, 4 files

Our inventory file is reduced to the group and the name of the hosts:

$ cat hosts
[ios-devices]
iosv-1
iosv-2

In the host_vars directory there are two files. Each corresponds to the name specified in the inventory file:

$ ls host_vars/
iosv-1
iosv-2

The variable file for each host contains what was previously included in the cli variable. The additional ansible_connection variable specifies network_cli as the transport:

$ cat host_vars/iosv-1
---
ansible_host: 172.16.1.134
ansible_user: cisco
ansible_ssh_pass: cisco
ansible_connection: network_cli
ansible_network_os: ios
ansible_become: yes
ansible_become_method: enable
ansible_become_pass: cisco

$ cat host_vars/iosv-2
---
ansible_host: 172.16.1.135
ansible_user: cisco
ansible_ssh_pass: cisco
ansible_connection: network_cli
ansible_network_os: ios
ansible_become: yes
ansible_become_method: enable
ansible_become_pass: cisco

Our playbook will use the ios_config module with the backup option enabled. Notice the use of the when condition in this example, so that if there are other hosts with a different operating system, this task will not be applied:

$ cat ansible2-8_playbook.yml
---
- name: Chapter 4 Ansible 2.8 Best Practice Demonstration
  connection: network_cli
  gather_facts: false
  hosts: all
  tasks:
    - name: backup
      ios_config:
        backup: yes
      register: backup_ios_location
      when: ansible_network_os == 'ios'

When the playbook is run, a new backup folder will be created with the configuration backed up for each of the hosts:

$ ansible-playbook -i hosts ansible2-8_playbook.yml
PLAY [Chapter 4 Ansible 2.8 Best Practice Demonstration] *********************************
TASK [backup] ****************************************************************************
changed: [iosv-2]
changed: [iosv-1]
PLAY RECAP *******************************************************************************
iosv-1    : ok=1    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
iosv-2    : ok=1    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

We can see the newly created backup directory with two files inside:

$ tree
.
├── ansible2-8_playbook.yml
├── backup
│   ├── iosv-1_config.2019-09-24@10:40:36
│   └── iosv-2_config.2019-09-24@10:40:36
├── hosts
└── host_vars
    ├── iosv-1
    └── iosv-2
2 directories, 6 files
$ head -20 backup/iosv-1_config.2019-09-24@10:40:36
Building configuration...

Current configuration : 4598 bytes
!
! Last configuration change at 17:02:29 UTC Sun Sep 22 2019
!
version 15.6
service timestamps debug datetime msec
service timestamps log datetime msec
no service password-encryption
!
hostname iosv-1
!
boot-start-marker
boot-end-marker
!
!
vrf definition Mgmt-intf
 !

This example illustrates the ansible_connection variable and the recommended structure based on the Ansible network best practices. We will look at offloading variables into the host_vars directory and conditionals in Chapter 5, The Python Automation Framework – Beyond Basics. This structure can also be used for the Juniper and Arista examples in this chapter; for different devices, we would simply use different values for ansible_connection and ansible_network_os, as we will see in the Juniper example in the next section.

The Ansible Juniper example

The Ansible Juniper module requires the Juniper PyEZ package and NETCONF. If you have been following the API example in Chapter 3, APIs and Intent-Driven Networking, you are good to go. If not, refer back to that section for installation instructions, as well as some test scripts to make sure PyEZ works. The Python package called jxmlease is also required:

(venv) $ pip install jxmlease

In the host file, we will specify the device and connection variables:

[junos_devices]
J1 ansible_host=192.168.24.252

[junos_devices:vars]
username=juniper
password=juniper!

In our Juniper playbook, we will use the junos_facts module to gather basic facts for the device. This module is equivalent to the setup module and will come in handy if we need to take action depending on the returned value. Note the different values of transport and port in the example here:

---
- name: Get Juniper Device Facts
  hosts: "junos_devices"
  gather_facts: false
  connection: local

  vars:
    netconf:
      host: "{{ ansible_host }}"
      username: "{{ username }}"
      password: "{{ password }}"
      port: 830
      transport: netconf

  tasks:
    - name: collect default set of facts
      junos_facts:
        provider: "{{ netconf }}"

      register: output

    - name: show output
      debug:
          var: output

When executed, you will receive this output from the Juniper device:

PLAY [Get Juniper Device Facts] ************************************************

TASK [collect default set of facts] ********************************************
ok: [J1]

TASK [show output] *************************************************************
ok: [J1] => {
<skip>

PLAY RECAP *********************************************************************
J1                         : ok=2    changed=0    unreachable=0    failed=0

Now that we have seen examples of managing Cisco and Juniper devices, let us compare them with some examples where the target machines are Arista devices.

The Ansible Arista example

The final playbook example we will look at uses the Arista eos_command module. At this point, we are quite familiar with our playbook syntax and structure. The Arista device can be configured to use either cli or eapi as the transport; in this example, we will use cli.

This is the host file:

[eos-devices]
arista1 ansible_host=192.168.199.158

The playbook is also similar to what we have seen previously:

---
- name: EOS Show Commands
  hosts: "eos-devices"
  gather_facts: false
  connection: local

  vars:
    cli:
      host: "{{ ansible_host }}"
      username: "arista"
      password: "arista"
      authorize: true
      transport: cli

  tasks:
    - name: eos show commands
      eos_command:
        commands:
          - show version | i Arista
        provider: "{{ cli }}"

      register: output

    - name: show output
      debug:
          var: output

As you can see from the Arista example, it is not that different from Cisco or Juniper in terms of structure. This speaks to the strength of using Ansible; even with a new vendor that we have never worked with before, using Ansible can provide a layer of structure to be followed.

Summary

In this chapter we took a grand tour of the open source automation framework, Ansible. Unlike Pexpect-based and API-driven network automation scripts, Ansible provides a higher layer of abstraction called a playbook to automate our network devices.

Ansible was originally constructed to manage servers and was later extended to network devices; therefore, we took a look at a server example. Then, we compared and contrasted the differences when it came to network management playbooks. Later, we looked at the example playbooks for Cisco IOS, Juniper JUNOS, and Arista EOS devices. We also looked at the best practices recommended by Ansible if you are using the latest Ansible release, version 2.8.

In Chapter 5, The Python Automation Framework – Beyond Basics, we will leverage the knowledge we gained in this chapter and start to look at some of the more advanced features of Ansible, such as group variables, templates, and conditional statements.
