Chapter 5. Externalizing Puppet Configuration

In Chapter 2 we talked about the ways that you could define your hosts or nodes to Puppet. We talked about specifying them in a variety of forms as node statements in your Puppet manifest files. We also mentioned that Puppet has the capability to store node information in external sources. This avoids the need to specify large numbers of nodes manually in your manifest files, a solution which is time-consuming and not scalable.

Puppet has two ways to store node information externally:

  • External Node Classification

  • LDAP server classification

The first capability is called External Node Classification (ENC). ENC is a script-based integration system that Puppet queries for node data. The script returns classes, inheritance, variables and environment configuration that Puppet can then use to define a node and configure your hosts.

Tip

External node classifiers are also one of the means by which tools like the Puppet Dashboard and Foreman can be integrated into Puppet and provide node information, as you will see in Chapter 7.

The second capability allows you to query Lightweight Directory Access Protocol (LDAP) directories for node information. This integration is used less often than ENCs, but it is especially useful because you can specify an existing LDAP directory, for example your asset management database or an LDAP DNS back end, for your node data.

Using external node classification, either via an ENC or via LDAP, is the recommended way to scale your Puppet implementation to cater for large numbers of hosts. Most of the multi-thousand node sites using Puppet, for example Google and Zynga, use external node classification systems to deal with their large numbers of nodes. With external classification, you no longer need to manage files containing hundreds, thousands or even tens of thousands of node statements like these:

node mail.example.com { . . . }
node web.example.com { . . . }
node db.example.com { . . . }
. . .

Instead, you specify a single source of node information and can make quick and easy changes to that information without needing to edit files.

In this chapter, we discuss both approaches to storing node information in external sources. First we look at creating an external node classifier, and we provide some simple examples of these for you to model your own on; then we demonstrate the use of the LDAP node classifier.

External Node Classification

Writing an ENC is very simple. An ENC is merely a script that takes a node name, for example mail.example.com, and then returns the node's configuration in the form of YAML data. YAML, which stands for YAML Ain't Markup Language (http://www.yaml.org/), is a data serialization language with libraries available for a variety of programming languages. YAML is human-friendly, meaning it's structured and designed to be easy for humans to read. It is often used as a configuration file format; for example, the database configuration file used in Ruby on Rails applications, database.yml, is a YAML file.

Let's look at some simple YAML examples to get an idea of how it works. YAML expresses data as lists and hashes, and its structure, including indentation, is significant. Let's start by specifying a list of items:

---
- foo
- bar
- baz
- qux

The start of a YAML document is identified with three dashes, "---". Every ENC needs to return these three dashes as the start of its output. We've then got a list of items preceded by dashes.

We can also express the concept of assigning a value to an item, for example:

---
foo: bar

Here we've added our three dashes and then expressed that the value of item "foo" is "bar." We can also express grouped collections of items (which we're going to use heavily in our ENCs):

---
foo:
 - bar
baz:
 - qux

We've again started with our three dashes and then specified the names of the lists we're creating: foo and baz. Inside each list are the list items, again preceded with a dash, but this time indented one space to indicate their membership of the list.

This indentation is very important. For the YAML to be valid, it must be structured correctly. This can sometimes be a real challenge, but there are tools that can help you produce well-structured YAML. For example, Vim syntax highlighting will recognize YAML (if the file you're editing has a .yml or .yaml extension), or you can use the excellent Online YAML Parser to confirm the YAML you're generating is valid: http://yaml-online-parser.appspot.com/.
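If Ruby is available, its bundled yaml library offers another quick check. The following is a minimal sketch, not part of Puppet, that parses the three snippets shown earlier and prints the Ruby structures they map to. The SAMPLES and PARSED names are our own; invalid YAML would raise an exception instead of printing a result.

```ruby
require 'yaml'

# The three YAML snippets from this section, keyed by a descriptive label.
SAMPLES = {
  'list'    => "---\n- foo\n- bar\n- baz\n- qux\n",
  'mapping' => "---\nfoo: bar\n",
  'grouped' => "---\nfoo:\n - bar\nbaz:\n - qux\n"
}

# Parse each snippet; YAML.load raises an error if the YAML is malformed.
PARSED = SAMPLES.map { |label, text| [label, YAML.load(text)] }.to_h

# Show the list, the hash, and the hash of lists that result.
PARSED.each { |label, data| puts "#{label}: #{data.inspect}" }
```

The grouped example becomes a hash whose values are lists, which is the same shape an ENC uses for its classes and parameters.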

But before we generate our first YAML node, we need to configure Puppet to use an external node classifier instead of our file-based node configuration.

Note

You can see a more complete example of structured YAML at http://www.yaml.org/start.html.

Configuring Nodes Using An External Node Classifier

To use external nodes, we first need to tell Puppet to use a classifier to configure our nodes rather than use node definitions. We do this by specifying the node_terminus option and the name and location of our classifier in the [master] (or [puppetmasterd] in pre-2.6.0 versions) section of the puppet.conf configuration file on our Puppet master. You can see this in Listing 5-1, where we've specified a classifier called puppet_node_classifier located in the /usr/bin directory.

Example 5-1. The external_nodes configuration option

[master]
node_terminus = exec
external_nodes = /usr/bin/puppet_node_classifier

The node_terminus configuration option is used to configure Puppet for node sources other than the default flat file manifests. The exec option tells Puppet to use an external node classifier script.

A classifier can be written in any language, for example shell script, Ruby, Perl, Python, or a variety of other languages. The only requirement is that the language can output the appropriate YAML data. For example, you could also easily add a database back end to a classifier that queries a database for the relevant hostname and returns the associated classes and any variables.

Following are some example node classifiers written in different languages.

Note

You can have nodes specified both in Puppet manifests and external node classifiers. For this to work correctly, though, your ENC must return an empty YAML hash.

An External Node Classifier in a Shell Script

In Listing 5-2, you can see a very simple node classifier, the puppet_node_classifier script we specified in Listing 5-1. This classifier is written in shell script.

Example 5-2. Simple Node Classifier

#!/bin/sh
cat <<"END"
---
classes:
  - base
parameters:
  puppetserver: puppet.example.com
END
exit 0

The script in Listing 5-2 will return the same classes and variables each time it is called, regardless of which hostname is passed to it. For example, running:

$ puppet_node_classifier web.example.com

will return:

---
classes:
  - base
parameters:
  puppetserver: puppet.example.com

The classes block holds a list of the classes that belong to this node, and the parameters block contains a list of the variables that this node specifies. In this case, the node includes the base class and has a variable called $puppetserver with a value of puppet.example.com.
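Before wiring a classifier into Puppet, it can help to confirm that its output has this shape. The following is a minimal sketch under our own assumptions: enc_problems is a hypothetical helper, not a Puppet API, and it checks only the classes and parameters keys described above.

```ruby
require 'yaml'

# Hypothetical helper: returns a list of problems found in ENC output.
# An empty list means the output has the shape described above.
# Malformed YAML is not caught here and raises an exception.
def enc_problems(yaml_text)
  data = YAML.load(yaml_text)
  return ['top level is not a hash'] unless data.is_a?(Hash)

  problems = []
  classes = data.fetch('classes', [])
  params  = data.fetch('parameters', {})
  # A hash of classes is only meaningful for parameterized classes.
  problems << 'classes must be a list or a hash' unless classes.is_a?(Array) || classes.is_a?(Hash)
  problems << 'parameters must be a hash' unless params.is_a?(Hash)
  problems
end

output = <<~ENC_OUTPUT
  ---
  classes:
    - base
  parameters:
    puppetserver: puppet.example.com
ENC_OUTPUT

problems = enc_problems(output)
puts problems.empty? ? 'ENC output looks valid' : problems.join(', ')
```

In practice you would feed the helper the text printed by your classifier for a given node name rather than a fixed string.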

Puppet will use this data to construct a node definition as if we'd defined a node statement. That node statement would look like Listing 5-3.

Example 5-3. Node definition from Listing 5-2's classifier

node web.example.com {
       $puppetserver = 'puppet.example.com'
       include base
}

This is the simplest ENC that we can devise. Let's look at some more complex variations of this script that can return different results depending on the particular node name being passed to the classifier, in the same way different nodes would be configured with different classes, definitions, and variables in your manifest files.

Tip

Any parameters specified in your ENC will be available as top-scope variables.

A Ruby External Node Classifier

Let's look at another example of an ENC, this time specifying a list of hosts or returning an empty YAML hash if the host is not found. This ENC is written in Ruby, and you can see it in Listing 5-4.

Example 5-4. Ruby node classifier

#!/usr/bin/env ruby

require 'yaml'

node = ARGV[0]
default = { 'classes' => [] }

# Reject anything that is not shaped like host.domain.tld
unless node =~ /(^\S+)\.(\S+\.\S+)$/
  print default.to_yaml
  exit 0
end

hostname = $1

base = { 'environment' => 'production',
         'parameters' => {
                    'puppetserver' => 'puppet.example.com'
         },
         'classes' => [ 'base' ],
       }

# Add a role class based on the host name, for example web, web1 or web123
case hostname
  when /^web\d*$/
     web = { 'classes' => 'apache' }
     base['classes'] << web['classes']
     puts YAML.dump(base)
  when /^db\d*$/
     db = { 'classes' => 'mysql' }
     base['classes'] << db['classes']
     puts YAML.dump(base)
  when /^mail\d*$/
     mail = { 'classes' => 'postfix' }
     base['classes'] << mail['classes']
     puts YAML.dump(base)
  else
    print default.to_yaml
end

exit 0

Our simple ENC here captures the incoming node name. If the name is not an appropriately formed fully-qualified domain name (FQDN), the ENC rejects it and returns an empty classification (defined in the default variable).

We then set up some basic defaults: the puppetserver variable, our environment, and a base class. The ENC then takes the host name portion of the FQDN and checks it against a list of host name patterns. For example, web, web1, web123 and so on are treated as web hosts, with similar patterns for database and mail hosts.

For example, if we passed the ENC a node name of web.example.com, it would return a YAML hash of:

---
parameters:
  puppetserver: puppet.example.com
classes:
 - base
 - apache
environment: production

Which would result in a node definition of:

node web.example.com {
  $puppetserver = 'puppet.example.com'
  include base
  include apache
}

This would specify that this node belonged to the production environment.

If the ENC doesn't match any host names, then it will return an empty YAML hash.
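As the number of roles grows, the case statement can be replaced with a lookup table. The following is a sketch of that variation, not a drop-in replacement: the ROLES table and the classify method are our own names, and the patterns assume hosts are named web, web1, db2, mail and so on.

```ruby
require 'yaml'

# Hypothetical role table: host name pattern => extra class for that role.
ROLES = {
  /\Aweb\d*\z/  => 'apache',
  /\Adb\d*\z/   => 'mysql',
  /\Amail\d*\z/ => 'postfix'
}

# Returns the classification hash for a node name (an FQDN is expected).
def classify(node)
  hostname, domain = node.to_s.split('.', 2)
  # Require a host part plus at least a two-label domain.
  return { 'classes' => [] } unless domain && domain.include?('.')

  # Find the first role whose pattern matches the host name.
  _, role_class = ROLES.find { |pattern, _| hostname =~ pattern }
  return { 'classes' => [] } unless role_class

  { 'environment' => 'production',
    'parameters'  => { 'puppetserver' => 'puppet.example.com' },
    'classes'     => ['base', role_class] }
end

# When run as a script, print the classification for the first argument.
puts YAML.dump(classify(ARGV[0])) if __FILE__ == $PROGRAM_NAME
```

Adding a new role then means adding one line to the table rather than another branch of logic.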

A Perl External Node Classifier

In Listing 5-5, you can see another node classifier written in Perl.

Example 5-5. Perl-based node classifier

#!/usr/bin/perl -w
use strict;
use YAML qw( Dump );

my $hostname = shift || die "No hostname passed";

# Split host.domain.tld into its three parts
$hostname =~ /^(\w+)\.(\w+)\.(\w{3})$/
    or die "Invalid hostname: $hostname";

my ( $host, $domain, $net ) = ( $1, $2, $3 );

my @classes = ( 'base', $domain );
my %parameters = (
    puppetserver  => "puppet.$domain.$net"
    );

# Dump needs references to the array and the hash
print Dump( {
    classes    => \@classes,
    parameters => \%parameters,
} );

In Listing 5-5, we've created a Perl node classifier that makes use of the Perl YAML module. The YAML module can be installed via CPAN or your distribution's package management system. For example, on Debian it is the libyaml-perl package, or on Fedora it is the perl-YAML package.

The classifier slices our hostname into sections; it assumes the input will be a fully qualified domain name and will fail if no hostname or an inappropriately structured hostname is passed. The classifier then uses those sections to classify the nodes and set parameters. If we called this node classifier with the hostname web.example.com, it would return a node classification of:

---
classes:
  - base
  - example
parameters:
  puppetserver: puppet.example.com

This would result in a node definition in Puppet structured like:

node 'web.example.com' {
       include base, example

       $puppetserver = "puppet.example.com"
}

Note

From Puppet 2.6.5 and later, you can also specify parameterized classes and resources in external node classifiers (see http://docs.puppetlabs.com/guides/external_nodes.html for more details).
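For parameterized classes, the classes key becomes a hash that maps each class name to its parameters instead of a plain list. The following is a minimal sketch of a classifier producing that form; the ntp class and its servers parameter are hypothetical names used only for illustration.

```ruby
require 'yaml'

# Builds a classification in which 'classes' is a hash rather than a list.
# 'ntp' and 'servers' are made-up names for this example.
def parameterized_classification
  { 'classes' => {
      'base' => nil,                                   # class without parameters
      'ntp'  => { 'servers' => ['ntp1.example.com'] }  # class with one parameter
    },
    'parameters' => { 'puppetserver' => 'puppet.example.com' } }
end

puts YAML.dump(parameterized_classification)
```

Check the external nodes guide for the exact form supported by your Puppet version before relying on this layout.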

Back-Ending a Node Classification

Lastly, as mentioned, we could also back-end our node classification script with a database, as you can see in Listing 5-6.

Example 5-6. A database back-end node classifier

#!/usr/bin/perl -w
use strict;
use YAML qw( Dump );
use DBI;

my $hostname = shift || die "No hostname passed";

$hostname =~ /^(\w+)\.(\w+)\.(\w{3})$/
    or die "Invalid hostname: $hostname";

my ( $host, $domain, $net ) = ( $1, $2, $3 );

# MySQL Configuration
my $data_source = "dbi:mysql:database=puppet;host=localhost";
my $username = "puppet";
my $password = "password";

# Connect to the server
my $dbh = DBI->connect($data_source, $username, $password)
    or die $DBI::errstr;

# Build the query, using a placeholder rather than interpolating the hostname
my $sth = $dbh->prepare( q{SELECT class FROM nodes WHERE node = ?} )
    or die "Can't prepare statement: $DBI::errstr";

# Execute the query, binding the hostname to the placeholder
my $rc = $sth->execute($hostname)
    or die "Can't execute statement: $DBI::errstr";

# Set parameters
my %parameters = (
    puppet_server  => "puppet.$domain.$net"
    );

# Set classes
my @class;
while (my @row = $sth->fetchrow_array) {
    push(@class, @row);
}

# Check for problems
die $sth->errstr if $sth->err;

# Disconnect from database
$dbh->disconnect;

# Print the YAML; Dump needs references to the array and the hash
print Dump( {
    classes    => \@class,
    parameters => \%parameters,
} );

This node classifier would connect to a MySQL database called puppet running on the local host. The script would use the hostname it receives to query the database and return a list of classes to assign to the node. The nodes and classes would be stored in a table. The following SQL statement creates a very simple table to do this:

CREATE TABLE `nodes` (
`node` varchar(80) NOT NULL,
`class` varchar(80) NOT NULL ) ENGINE=MyISAM;

The classes, and whatever parameters we set (which you could also place in the database in another table), are then returned and output as the required YAML data.

Tip

You can also access fact values in your node classifier scripts. Before the classifier is called, the $vardir/yaml/facts/ directory is populated with a YAML file named for the node containing fact values, for example /var/lib/puppet/yaml/facts/web.example.com.yaml. This file can be queried for fact values.
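Reading such a file from a Ruby classifier might look like the following sketch. It rests on several assumptions that you should verify against your own files: that the file is a YAML mapping with a values key, that it carries Ruby-specific tags such as !ruby/object:Puppet::Node::Facts, and that stripping those tags is acceptable for read-only use. The facts_from_yaml and facts_for methods are hypothetical helpers, not part of Puppet.

```ruby
require 'yaml'
require 'date'

# Sketch only: extracts the fact values from the text of a cached facts file.
# Ruby-specific tags (for example "!ruby/object:Puppet::Node::Facts") are
# stripped so the file can be parsed without loading Puppet itself.
def facts_from_yaml(text)
  cleaned = text.gsub(%r{!ruby/\S+ ?}, '')
  data = YAML.safe_load(cleaned, permitted_classes: [Time, Date, Symbol])
  data.is_a?(Hash) ? (data['values'] || {}) : {}
end

# Hypothetical usage inside a classifier. The default directory matches the
# example path in the Tip above; returns an empty hash if no file exists.
def facts_for(node, dir = '/var/lib/puppet/yaml/facts')
  path = File.join(dir, "#{node}.yaml")
  File.exist?(path) ? facts_from_yaml(File.read(path)) : {}
end
```

A classifier could then branch on a fact, for example assigning a different class depending on the value of facts_for(node)['operatingsystem'].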

All of these external node classifiers are very simple and could easily be expanded upon to provide more sophisticated functionality. It is important to remember that external nodes override node configuration in your manifest files. If you enable an external node classifier, any duplicate node definitions in your manifest files will not be processed and will in fact be ignored by Puppet.

Note

In Puppet versions earlier than 0.23, external node scripts were structured differently. We're not going to cover these earlier scripts, but you can read about them at http://docs.puppetlabs.com/guides/external_nodes.html.

Storing Node Configuration in LDAP

In addition to external node classifiers, Puppet also allows the storage of node information in LDAP directories. Many organizations already have a wide variety of information about their environments, such as DNS, user and group data, stored in LDAP directories. Storing node information in LDAP allows organizations to leverage these existing assets or to decouple their configuration from Puppet and centralize it. It also gives LDAP-enabled applications access to your configuration data.

Note

The use of LDAP nodes overrides node definitions in your manifest files and your ENC. If you use LDAP node definitions, you cannot define nodes in your manifest files or in an ENC.

Installing Ruby LDAP Libraries

The first step in using LDAP for your node configuration is to ensure the Ruby LDAP libraries are installed. First, check for the presence of the LDAP libraries:

# ruby -rldap -e "puts :installed"

If this command does not return installed, the libraries are not installed. You can either install them via your distribution's package management system or download them from the Ruby/LDAP site. For Red Hat and derivatives, this is the ruby-ldap package. For Ubuntu/Debian, the package is libldap-ruby1.8.

If there isn't a package for your distribution, you can download the required libraries either in the form of an RPM or a source package from the Ruby/LDAP site. The Ruby/LDAP site is located at http://ruby-ldap.sourceforge.net/.

Check out the current Ruby LDAP source code:

$ svn checkout http://ruby-activeldap.googlecode.com/svn/ldap/trunk/ ruby-ldap-ro

Then change into the resulting directory, build the code, and install it:

$ cd ruby-ldap-ro
$ ruby extconf.rb
$ make && sudo make install

Setting Up the LDAP Server

Next, you need to set up your LDAP server. We're going to assume you've either already got one running or can set one up yourself. For an LDAP server, you can use OpenLDAP, Red Hat Directory Server (or Fedora Directory Server), Sun's Directory Server, or one of a variety of other servers. We're going to use OpenLDAP for the purposes of demonstrating how to use LDAP node definitions.

Tip

For some quick start instructions on setting up OpenLDAP, you can refer to http://www.openldap.org/doc/admin23/quickstart.html.

Adding the Puppet Schema

Now we need to add the Puppet schema to our LDAP directory's configuration.

Warning

You may need to tweak or translate the default LDAP schema for some directory servers, but it is suitable for OpenLDAP.

The Puppet schema document is available in the Puppet source package in the ext/ldap/puppet.schema file, or you can take it from the project's Git repository at https://github.com/puppetlabs/puppet/blob/master/ext/ldap/puppet.schema.

We need to add it to our schema directory and slapd.conf configuration file. For example, on an Ubuntu or Debian host, the schema directory is /etc/ldap/schema, and the slapd.conf configuration is located in the /etc/ldap directory. On Red Hat, the configuration file is located in /etc/openldap and the schemas are located in /etc/openldap/schema. Copy the puppet.schema file into the appropriate directory, for example on Ubuntu:

$ cp puppet/ext/ldap/puppet.schema /etc/ldap/schema

Now you can add an include statement to your slapd.conf configuration file; there should be a number of existing statements you can model:

include        /etc/ldap/schema/puppet.schema

Or, if you have the schema in LDIF format (puppet.ldif in this example), you can add it to a running OpenLDAP server, like so:

$ ldapadd -x -H ldap://ldap.example.com/ -D "cn=config" -W -f puppet.ldif

To update OpenLDAP with the new schema, you may also now need to restart your server.

# /etc/init.d/slapd restart

Now that you've added the schema and configured the LDAP server, you need to tell Puppet to use an LDAP server as the source of its node configuration.

Configuring LDAP in Puppet

LDAP configuration is very simple. Let's look at the required configuration options from the [master] section of the puppet.conf configuration file in Listing 5-7.

Example 5-7. LDAP configuration in Puppet

[master]
node_terminus = ldap
ldapserver = ldap.example.com
ldapbase = ou=Hosts,dc=example,dc=com

First, we set the node_terminus option to ldap to tell Puppet to look to an LDAP server as our node source. Next, we specify the hostname of our LDAP server, in this case ldap.example.com, in the ldapserver option. Lastly, in the ldapbase option, we specify the base search path. Puppet recommends that hosts be stored in an OU called Hosts under our main directory structure, but you can configure this to suit your environment.

If required, you can specify a user and password using the ldapuser and ldappassword options and override the default LDAP port of 389 with the ldapport option. There is some limited support for TLS or SSL, but only if your LDAP server does not require client-side certificates.

Tip

You can see a full list of the potential LDAP options at http://docs.puppetlabs.com/references/stable/configuration.html.

After configuring Puppet to use LDAP nodes, you should restart your Puppet master daemon to ensure that the new configuration is updated.

Now you need to add your node configuration to the LDAP server. Let's take a quick look at the Puppet LDAP schema in Listing 5-8.

Example 5-8. The LDAP schema

attributetype ( 1.3.6.1.4.1.34380.1.1.3.10 NAME 'puppetClass'
        DESC 'Puppet Node Class'
        EQUALITY caseIgnoreIA5Match
        SYNTAX 1.3.6.1.4.1.1466.115.121.1.26 )

attributetype ( 1.3.6.1.4.1.34380.1.1.3.9 NAME 'parentNode'
        DESC 'Puppet Parent Node'
        EQUALITY caseIgnoreIA5Match
        SYNTAX 1.3.6.1.4.1.1466.115.121.1.26
        SINGLE-VALUE )

attributetype ( 1.3.6.1.4.1.34380.1.1.3.11 NAME 'environment'
        DESC 'Puppet Node Environment'
        EQUALITY caseIgnoreIA5Match
        SYNTAX 1.3.6.1.4.1.1466.115.121.1.26 )

attributetype ( 1.3.6.1.4.1.34380.1.1.3.12 NAME 'puppetVar'
        DESC 'A variable setting for puppet'
        EQUALITY caseIgnoreIA5Match
        SYNTAX 1.3.6.1.4.1.1466.115.121.1.26 )

objectclass ( 1.3.6.1.4.1.34380.1.1.1.2 NAME 'puppetClient' SUP top AUXILIARY
        DESC 'Puppet Client objectclass'
        MAY ( puppetclass $ parentnode $ environment $ puppetvar ))

The Puppet schema is made up of an object class, puppetClient, and four attributes: puppetclass, parentnode, environment and puppetvar. The object class puppetClient is assigned to each host that is a Puppet node. The puppetclass attribute contains all of the classes defined for that node. At this stage, you cannot add definitions, just classes. The parentnode attribute allows you to specify node inheritance, environment specifies the environment of the node, and puppetvar specifies any variables assigned to the node.

In addition, any attributes defined in your LDAP node entries are available as variables to Puppet. This works much like Facter facts (see Chapter 1); for example, if the host entry has the ipHost class, the ipHostNumber attribute of the class is available as the variable $ipHostNumber. You can also specify attributes with multiple values; these are created as arrays.

You can also define a default node, just as you can in your manifest node definitions, by creating a host in your directory called default. The classes assigned to this host will be applied to any node that does not match a node in the directory. If no default node exists and no matching node definition is found, Puppet will return an error.

You can now add your hosts, or the relevant object class and attributes to existing definitions for your hosts, in the LDAP directory. You can import your host definitions using LDIF files or manipulate your directory using your choice of tools such as phpldapadmin (http://phpldapadmin.sourceforge.net/wiki/index.php/Main_Page).
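If you generate host entries from another data source, a small script can emit the LDIF for you. The following is a minimal sketch: ldif_entry is a hypothetical helper, the base DN defaults to the ldapbase value used in this chapter, and the object classes are assumptions (device plus the puppetClient class from the schema) that you may need to adapt to your directory.

```ruby
# Hypothetical helper that renders one Puppet node as an LDIF entry.
# No escaping of special DN characters is attempted, so it is only
# suitable for plain host names.
def ldif_entry(cn, classes: [], parent: nil, base_dn: 'ou=Hosts,dc=example,dc=com')
  lines = ["dn: cn=#{cn},#{base_dn}",
           "cn: #{cn}",
           'objectClass: device',
           'objectClass: top',
           'objectClass: puppetClient']
  lines << "parentnode: #{parent}" if parent
  classes.each { |klass| lines << "puppetclass: #{klass}" }
  lines.join("\n") + "\n"
end

# Print an entry for a host that inherits from a node called web.
puts ldif_entry('web1.example.com', parent: 'web')
```

The resulting text can be saved to a file and imported with your usual LDAP tools.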

Listing 5-9 is an LDIF file containing examples of node definitions.

Example 5-9. LDIF nodes

# LDIF Export for: ou=Hosts,dc=example,dc=com
dn: ou=Hosts,dc=example,dc=com
objectClass: organizationalUnit
objectClass: top
ou: Hosts

dn: cn=default,ou=Hosts,dc=example,dc=com
cn: default
description: Default
objectClass: device
objectClass: top
objectClass: puppetClient
puppetclass: base

dn: cn=basenode,ou=Hosts,dc=example,dc=com
cn: basenode
description: Basenode
objectClass: device
objectClass: top
objectClass: puppetClient
puppetclass: base

dn: cn=web,ou=Hosts,dc=example,dc=com
cn: web
description: Webserver
objectClass: device
objectClass: top
objectClass: puppetClient
parentnode: basenode
puppetclass: apache

dn: cn=web1.example.com,ou=Hosts,dc=example,dc=com
cn: web1.example.com
description: webserving host
objectclass: device
objectclass: top
objectclass: puppetClient
objectclass: ipHost
parentnode: web
ipHostNumber: 192.168.1.100

This listing includes a default node, a node called basenode, and a template node called web. Each node has particular classes assigned to it, and the web node has the basenode defined as its parent node and thus inherits its classes also. Lastly, we define a client node, called web1, which inherits the web node as a parent.
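To make the inheritance concrete, the following sketch resolves the classes for a node from an in-memory copy of these entries. It is an illustration of the parent-node idea only, not Puppet's own lookup code; the NODES hash and the classes_for method are our own names.

```ruby
# In-memory stand-in for the directory entries in the LDIF listing.
NODES = {
  'default'          => { 'classes' => ['base'] },
  'basenode'         => { 'classes' => ['base'] },
  'web'              => { 'parent' => 'basenode', 'classes' => ['apache'] },
  'web1.example.com' => { 'parent' => 'web', 'classes' => [] }
}

# Collects classes for a node by walking its parentnode chain. Unknown
# nodes fall back to the 'default' entry, and an error is raised if there
# is neither a matching entry nor a default.
def classes_for(name, nodes = NODES)
  entry = nodes[name] || nodes['default']
  raise ArgumentError, "no entry for #{name} and no default node" unless entry

  classes = []
  seen = []
  while entry
    classes.concat(entry['classes'])
    parent = entry['parent']
    break if parent.nil? || seen.include?(parent) # stop at the root or on a loop
    seen << parent
    entry = nodes[parent]
  end
  classes.uniq
end

puts classes_for('web1.example.com').inspect
# prints ["apache", "base"]
```

A node that is not in the table, such as ftp.example.com, would receive only the base class from the default entry.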

Summary

In this chapter we've explored how you can use both external node classification and the LDAP node terminus. Both of these allow you to scale to larger numbers of nodes without needing to maintain large numbers of nodes in your manifest files. In Chapter 7, we'll also look at how you can use Puppet Dashboard or the Foreman dashboard as an external node classifier.

Resources

The following links will take you to Puppet documentation related to external nodes:

  • External nodes http://docs.puppetlabs.com/guides/external_nodes.html

  • LDAP nodes http://projects.puppetlabs.com/projects/puppet/wiki/Ldap_Nodes

  • Puppet configuration reference http://docs.puppetlabs.com/references/stable/configuration.html
