Chapter 9. Scaling Puppet Infrastructures

There is one thing I particularly like about Puppet: its usage patterns can grow with the user's involvement. We can start using it to explore and modify our system with puppet resource, we can use it with local manifests to configure our machine with puppet apply, and then we can have a central server where a puppet master service provides configurations for all our nodes, on which we run the puppet agent command.

Eventually, the number of our nodes may grow, and we may find ourselves with an overwhelmed Puppet Master that needs to scale accordingly.

In this chapter, we review how to make our Master grow with our infrastructure and how to measure and optimize Puppet performance. You will learn the following:

  • Optimizing Puppet Master with Passenger
  • Optimizing Puppet Server based on Trapperkeeper
  • Horizontally scaling Puppet Masters
  • Load balancing alternatives
  • Masterless setups
  • Store configs with PuppetDB
  • Profiling Puppet performance
  • Code optimization

Scaling Puppet

Generally, we don't have to care about the Puppet Master's performance when we have only a few nodes to manage.

Few is definitely a relative word; I would say any number lower than one hundred nodes, though this threshold varies according to several factors, such as the following:

  • System resources: The raw performance of the system, physical or virtual, where our Puppet Master runs is, obviously, a decisive factor. The most needed resource is CPU, which is devoured by the puppet master process when it compiles catalogs for its clients and when it computes MD5 checksums of the files served via the fileserver. Memory can be a limit too, while disk I/O should generally not be a bottleneck.
  • Average number of resources per node: The more resources we manage on a node, the bigger the catalog becomes, and the longer it takes to compile it on the Puppet Master, deliver it over the network, and finally receive and process the client's report.
  • Number of managed nodes: The more nodes we have in our infrastructure, the more work is expected from the Puppet Master. More precisely, what really matters for the Master is how many catalogs it has to compile per unit of time, so the number of nodes is just one factor of a multiplication that also involves the next point.
  • Frequency of Puppet runs for each node: The default of 30 minutes, used when Puppet runs as a service, may be changed and has a big impact on the work submitted to the Master.
  • Exported resources: If we use exported resources, they may have a huge impact on performance, especially if we don't use PuppetDB as a backend for storeconfigs.

As simple as puppet apply

The simplest way we can use Puppet is via the apply command. It is simple but powerful, because with a single puppet apply, we can do exactly what a catalog retrieved from the Puppet Master would do on the local node.

The manifest file we apply can be similar to our site.pp on the Puppet Master; we just have to specify the modulepath and, if needed, the hiera_config parameters to be able to reproduce the same result we would have with a client-server setup:

puppet apply --modulepath=/etc/puppetlabs/code/modules:/etc/puppetlabs/code/site \
          --hiera_config=/etc/puppetlabs/code/hiera.yaml \
          /etc/puppetlabs/code/manifests/site.pp

We can mimic an ENC by placing, in our manifest file, all the top scope variables and classes that it would otherwise provide.
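
For example, a minimal site.pp for puppet apply might set the top scope variables and include the classes that an ENC would otherwise provide (the variable and class names here are just hypothetical):

# Top scope variables and classes that an ENC would normally provide
$role = 'webserver'
$env  = 'production'

node default {
  include ::profile::base
  include ::profile::webserver
}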

This usage pattern is the simplest and most direct way to use Puppet and, curiously, it is also a popular choice in some large installations; later, we will see how a Masterless approach, based on puppet apply, can be an alternative for scaling.

Default Puppet Master

A basic Puppet Master installation is rather straightforward: we just have to install the server package and we have what is needed to start working with Puppet:

  • A puppet master service, which can start without any further configuration
  • Automatic creation of the CA and of the Master certificates
  • Default configurations that involve the following settings:
    • First manifest file processed by the Master in /etc/puppetlabs/code/manifests/site.pp
    • Modules searched for in /etc/puppetlabs/code/modules and /opt/puppetlabs/puppet/modules

Now, we just have to run Puppet on the clients with puppet agent -t --server <puppetmaster fqdn> and sign their certificates on the Master (puppet cert sign <client certname>) to have a working client/server environment.

We can work with such a setup if we have no more than a few dozen nodes to manage.

We have already seen the elements that affect the Puppet Master's resources, but there is another key factor that should interest us: what are our acceptable catalog compilation and application times?

Compilation occurs on the Puppet Master and can last from a few seconds to minutes; it is heavily affected by the number of resources and relationships to manage, but also, obviously, by the load on the Puppet Master, which is directly related to how frequently it has to compile catalogs.

If our compilation times are too long for us, we have to verify the following conditions:

  • Compilation time is long even with a single catalog processed at a time. In this case, we will have to work on two factors: code optimization and CPU power. Our manifests may be particularly resource heavy, so we have to work on how to optimize our code (we will see how later). We can also provide more CPU power to our Puppet Master, as that is the most needed resource during compilation. Of course, we should verify its overall health: it shouldn't regularly swap memory pages to disk and it should not have faulty hardware that might affect performance. If we use stored configs, we should definitely use PuppetDB as the backend, either on the same server or on a dedicated one. We may also consider upgrading our Puppet version, especially if we are not using Puppet 4 yet.
  • Compilation takes a long time because many catalogs are processed concurrently. Our default Puppet Master setup can't handle the number of nodes that query it. Many options are available in this case; we order them by ease of implementation:
    • Reduce the frequency of each Puppet run (the default 30-minute interval may be made longer, especially if we have a way to trigger Puppet runs in a centrally managed way, for example, via MCollective, so that we can easily force urgent runs).
    • If using a version older than Puppet 4, use Apache Passenger instead of the default web server.
    • Have a multi-Master setup with load-balanced servers.

Puppet Master with Passenger

Passenger, also known as mod_rails or mod_passenger, is a fast application server that can work as a module with Apache or Nginx to serve Ruby, Python, Node.js, or Meteor web applications. Before version 4 (and some of the latest 3.x versions), Puppet was a pure Ruby application that used HTTPS for client/server communication, and it could gain great benefits by using Passenger as a web server instead of the default, embedded WEBrick.

When we use an older Puppet version and need to scale the Puppet Master, the first element to consider is definitely the introduction of Passenger. It brings two major benefits, which are listed as follows:

  • Generally better performance in serving HTTP requests (either via Apache or Nginx, which are definitely more efficient than WEBrick)
  • Multi-CPU support: on a standalone Puppet Master, there is just one process that handles all the connections, and that process uses only one CPU. With Passenger, we can have more concurrent processes that better use all the available CPUs.

On modern systems, where multiprocessors are the rule and not the exception, this leads to huge benefits.

Installing and configuring Passenger

Let's quickly see how to install and configure Passenger, using plain Puppet resources.

For the sake of brevity, we simulate an installation on a RedHat 6 derivative here. For other distributions, there are different methods to set up the source repos for packages, and possibly different names and paths for resources.

The following Puppet code can be placed on a file such as setup.pp and be run with puppet apply setup.pp.

First of all, we need to set up the EPEL repo, which contains the extra packages for Enterprise Linux that we need:

yumrepo { 'epel':
  mirrorlist => 'http://mirrors.fedoraproject.org/mirrorlist?repo=epel-6&arch=$basearch',
  gpgcheck   => 1,
  enabled    => 1,
  gpgkey     => 'https://fedoraproject.org/static/0608B895.txt',
}

Then, we set up Stealthy Monkeys' upstream yum repo for Passenger:

yumrepo { 'passenger':
  baseurl    => 'http://passenger.stealthymonkeys.com/rhel/$releasever/$basearch',
  mirrorlist => 'http://passenger.stealthymonkeys.com/rhel/mirrors',
  enabled    => 1,
  gpgkey     => 'http://passenger.stealthymonkeys.com/RPM-GPG-KEY-stealthymonkeys.asc',
}

We will then install all the required packages with the following code:

package { [ 'mod_passenger' , 'httpd' , 'mod_ssl' , 'rubygems']:
  ensure => present,
}

Since there is no native RPM package, we install rack, a required dependency, as a Ruby gem:

package { 'rack':
  ensure   => present,
  provider => gem,
}

We also need to configure an Apache virtual host file:

file { '/etc/httpd/conf.d/passenger.conf':
  ensure  => present,
  content => template('puppet/apache/passenger.conf.erb')
}

In our template ($modulepath/puppet/templates/apache/passenger.conf.erb would be its path for the previous sample), we need to configure different things. The basic Passenger settings, which can optionally be placed in a dedicated file, are as follows:

PassengerHighPerformance on
PassengerMaxPoolSize 12 # Lower this if you have memory issues
PassengerPoolIdleTime 1500
PassengerStatThrottleRate 120
RackAutoDetect On
RailsAutoDetect Off

Then, we configure Apache to listen to the Puppet Master's port 8140 and create a Virtual Host on it:

Listen 8140
<VirtualHost *:8140>

On the Virtual Host, we terminate the SSL connection. Apache must behave as a Puppet Master when clients connect to it, so we have to configure the paths of the Puppet Master's SSL certificates as follows:

  SSLEngine on
  SSLProtocol all -SSLv2 -SSLv3
  SSLCipherSuite ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES256-SHA:ECDHE-ECDSA-DES-CBC3-SHA:ECDHE-RSA-DES-CBC3-SHA:EDH-RSA-DES-CBC3-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:DES-CBC3-SHA:!DSS
  SSLCertificateFile /var/lib/puppet/ssl/certs/<%= @fqdn %>.pem
  SSLCertificateKeyFile /var/lib/puppet/ssl/private_keys/<%= @fqdn %>.pem
  SSLCertificateChainFile /var/lib/puppet/ssl/certs/ca.pem
  SSLCACertificateFile /var/lib/puppet/ssl/certs/ca.pem
  SSLCARevocationFile /var/lib/puppet/ssl/certs/ca_crl.pem
  SSLVerifyClient optional
  SSLVerifyDepth 1
  SSLOptions +StdEnvVars

Note

A good reference for recommended values for SSLCipherSuite can be found at https://mozilla.github.io/server-side-tls/ssl-config-generator/.

We also need to add some extra HTTP headers to the connection that is made to the Puppet Master in order to let it identify the original client (details on this later):

RequestHeader set X-SSL-Subject %{SSL_CLIENT_S_DN}e
RequestHeader set X-Client-DN %{SSL_CLIENT_S_DN}e
RequestHeader set X-Client-Verify %{SSL_CLIENT_VERIFY}e

Then, we enable Passenger and define a document root where we will create the rack environment to run Puppet:

  PassengerEnabled On
  DocumentRoot /etc/puppet/rack/public/
  RackBaseURI /
  <Directory /etc/puppet/rack/public/>
    Options None
    AllowOverride None
    Order allow,deny
    allow from all
  </Directory>

Finally, we add normal logging directives as follows:

  ErrorLog /var/log/httpd/passenger-error.log
  CustomLog /var/log/httpd/passenger-access.log combined
</VirtualHost>

Then, we need to create the rack environment working directories and configuration as follows:

file { ['/etc/puppet/rack',
        '/etc/puppet/rack/public',
        '/etc/puppet/rack/tmp']:
  ensure => directory,
  owner  => 'puppet',
  group  => 'puppet',
}
file { '/etc/puppet/rack/config.ru':
  ensure  => present,
  content => template('puppet/apache/config.ru.erb'),
  owner   => 'puppet',
  group   => 'puppet',
}

In our config.ru, we need to instruct rack on how to run Puppet as follows:

# if puppet is not in your RUBYLIB:
# $LOAD_PATH.unshift('/opt/puppet/lib')
$0 = "master"
# ARGV << "--debug" # Uncomment to debug
ARGV << "--rack"
ARGV << "--confdir" << "/etc/puppet"
ARGV << "--vardir"  << "/var/lib/puppet"
require 'puppet/util/command_line'
run Puppet::Util::CommandLine.new.execute

Once things are configured, we can start Apache. However, before doing this, we need to disable the standalone Puppet Master service, as it listens on the same port 8140 and would conflict with our Apache service:

service { 'puppetmaster':
  ensure => stopped,
  enable => false,
}

Then, we can finally start Apache with Passenger. Remember that whenever we make changes to Puppet's configuration, the service to restart to apply them is Apache: the standalone Puppet Master process should remain stopped:

service { 'httpd':
  ensure => running,
  enable => true,
  require => Service['puppetmaster'], # We start apache after having managed the puppetmaster service shutdown
}

All this code, with the ERB templates it uses, should be placed in a module that allows autoloading of classes and files.

Puppet Master based on Trapperkeeper

One of the major changes in Puppet version 4 is that Puppet Server runs on a Java Virtual Machine. The Ruby implementation was fine while it offered an agile development environment in the initial versions of the project, but as the software consolidated and the Puppet language became more mature and stable, better performance and improvements in scalability and speed were required.

Reimplementing a whole application just to change the language is rarely a good idea; it is a huge effort that could block the evolution of the project. That's why, wisely, the Puppet Labs team decided to do it in a way that allowed them to reimplement just some parts at a time. The JVM can execute Ruby code with JRuby, so their first step was to make sure that Puppet worked with this interpreter while they implemented a services framework for the JVM that could serve as glue for the different parts implemented in different languages. This framework is known as Trapperkeeper and is implemented in Clojure.

This Trapperkeeper-based implementation currently offers two basic points of performance tuning: controlling the maximum number of JRuby instances running at a time, and controlling the memory usage of the whole application.

Puppet Server maintains a pool of JRuby instances. When it needs to execute some Ruby code, it picks one of these instances, uses it until it is finished, and then releases it. If a request needs an instance and none is available, the request is blocked until one is released. So, with too few instances, you can suffer a lot of contention on requests to your Puppet Server; with too many of them, the server can get overloaded. It's important to choose a good number of instances for your deployment.

The maximum number of instances can be controlled with the max-active-instances setting in the puppetserver.conf file. The default (leaving it commented out in the configuration) makes Puppet Server select a safe value based on the number of CPUs; however, depending on your deployment, you may see that the CPUs of your server are underused, or that the server is overloaded because it has other responsibilities. In that case, you can evaluate other values to see which one makes better use of your resources.
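
For reference, the setting lives in the jruby-puppet section of puppetserver.conf; the value of 4 below is just an illustrative choice for a 4-CPU server:

# /etc/puppetlabs/puppetserver/conf.d/puppetserver.conf (excerpt)
jruby-puppet: {
    # Tune according to the available CPUs and memory
    max-active-instances: 4
}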

You also have to take into account the memory usage of the application. The more JRuby instances it has and the bigger your catalogs are, the more memory it will need. A common recommendation is to assign 512 MB as a base plus an additional 512 MB for each JRuby instance. If your Puppet catalogs are very big, or if your servers have spare memory to dedicate to Puppet Server, you may consider increasing the available memory. This has to be configured at JVM start-up, with the -Xms and -Xmx parameters, which respectively control the minimum and the maximum heap size. In a JVM, most of the memory used is in the heap, but a little more is also needed, so leave some margin with respect to the memory available on the server. This value is configured in the defaults file (/etc/sysconfig/puppetserver or /etc/default/puppetserver, depending on the distribution). For example, for a Puppet Server with four JRuby instances on a machine with 4 GB of memory, applying the recommendation we could set it to 2560 MB, but it would probably be safe to set it to 3 GB; a value that is too tight could trigger the garbage collector too often, which penalizes CPU performance. This would be the setting in the defaults file:

JAVA_ARGS="-Xms3072m -Xmx3072m -XX:MaxPermSize=256m"

You can see that MaxPermSize is also set; this limits the size of the permanent generation, which is where the JVM stores classes, methods, and so on. Of course, any other setting available for the JVM could be used here.

Multi-Master scaling

A Puppet Server running on a decently sized server (ideally at least 4 CPUs and 4 GB of memory) should be able to cope with hundreds of nodes.

When this number starts to enter the range of thousands, or the compiled catalogs start to become big and complex, a single server will begin to have problems handling all the traffic. Then, we need to scale horizontally, adding more Puppet Masters to manage clients' requests in a balanced way.

There are some issues to manage in such a scenario; the most important ones are as follows:

  • How to manage the CA and the server certificates
  • How to manage SSL termination
  • How to manage Puppet code and data

Managing certificates

Puppet's certificates are issued by a Certificate Authority, which is automatically created on the server when we start it for the first time. We usually don't care much about it; we just sign certificate requests with puppet cert and have everything we need to work with clients.

On a multi-Master setup, an accurate management of the Puppet Certification Authority and of the Puppet Masters' certificates becomes essential.

The main element to consider is that the first time puppet master is executed, it automatically creates two different certificates, which are as follows:

  • The CA certificate is used to sign all the other certificates:
    • The public key is stored in /etc/puppetlabs/puppet/ssl/ca/ca_pub.pem
    • The private key is in /etc/puppetlabs/puppet/ssl/ca/ca_key.pem
    • The certificate file is in /etc/puppetlabs/puppet/ssl/ca/ca_crt.pem
  • The Puppet Server's own host certificate is used to communicate with clients:
    • The public key is stored in /etc/puppetlabs/puppet/ssl/public_keys/<fqdn>.pem
    • The private key is stored in /etc/puppetlabs/puppet/ssl/private_keys/<fqdn>.pem; the same paths are used on clients for their own certificates

On the Puppet Master, all the clients' public keys that still need to be signed by the CA are placed in /etc/puppetlabs/puppet/ssl/ca/requests, and the ones that have been signed are in /etc/puppetlabs/puppet/ssl/ca/signed.

The CA, which is managed via the puppet ca command, performs the following functions:

  • Signs Certificate Signing Requests (CSRs) from clients and transforms them into x509v3 certificates (when we issue puppet cert sign <certname>)
  • Manages the Certificate Revocation List (CRL) of certificates we revoke with puppet cert revoke <certname>
  • Authenticates Puppet clients and Masters, making them establish a trust relationship and communicate over SSL

There are a couple of important certificate-related parameters that should be considered in puppet.conf before launching the Puppet Master for the first time:

  • dns_alt_names: This allows us to define a comma-separated list of names by which a node can be referred to when using its certificate. By default, Puppet creates a certificate that automatically adds the names puppet and puppet.$domain to the host's fqdn. We should be sure to have in this list both the local server hostname and the name the clients use to refer to the Puppet Master (probably associated with the IP of a load balancer).
  • ca_ttl: This sets the duration, in seconds, of the certificates signed by the CA. The default value is 157680000, which means that 5 years after starting your Puppet Master, its certificates expire and have to be reissued. This is an experience most of us have already faced, and it involves the recreation and signing of all the clients' certificates (see the example after this list).
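
Both settings can be placed in puppet.conf before the Master's first start; the hostnames and the 10-year duration in the following example are just illustrative:

# puppet.conf on the Puppet Master, set before its first start
[master]
  dns_alt_names = puppet,puppet.example42.com,puppet01.example42.com
  ca_ttl = 315360000   # 10 years, expressed in seconds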

Note

Note that the whole /etc/puppetlabs/puppet/ssl directory and the certificates it contains are recreated from scratch if the directory doesn't exist when Puppet runs. Therefore, if we want to recreate our Puppet Master's certificates with corrected settings, we have to move the existing ssldir to a backup place (just as a precaution in case we change our mind; otherwise, we won't need it anymore), configure puppet.conf as needed, and restart the Puppet Master service.

This is an activity we can do lightheartedly only when the Master has just been installed and there are no (or few) signed clients, because once we recreate the ssldir with new certificates on the Master, communication with the existing clients won't be possible: all the previously signed certificates are no longer valid and have to be recreated.
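
The procedure sketched below assumes Puppet 4 paths and a systemd-managed puppetserver service; adapt the service name and paths to your setup:

systemctl stop puppetserver
mv /etc/puppetlabs/puppet/ssl /etc/puppetlabs/puppet/ssl.bak   # Keep a backup, just in case
vi /etc/puppetlabs/puppet/puppet.conf                          # Adjust dns_alt_names, ca_ttl, and so on
systemctl start puppetserver                                   # The ssldir and its certificates are recreated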

CA management in a multi-master setup can be done in the following different ways:

  • Configure one of the load balanced Puppet Masters as the CA server, and have all the other ones using it for CA activities. In this case, all the servers act as Puppet Servers and one of them also as the CA.
  • Configure an external Puppet Master, possibly set up in High Availability, that provides only the CA service and is not used to compile clients' catalogs.

On puppet.conf, configuration is quite straightforward when the CA server is (or might be) different from the Puppet Master:

  • On all the clients, explicitly set the ca_server hostname (which, by default, is the Puppet Master itself):
    [agent]
      ca_server = puppetca.example42.com
  • On the CA server, no particular configuration is needed; here we just make the relevant settings explicit:
    [master]
      certname = puppetca.example42.com
      ca = true
  • On the other Puppet Masters, we just have to define that the local server is not a CA and, as done for all the clients, point to the external ca_server:
    [agent]
      ca_server = puppetca.example42.com
    [master]
      certname = puppet01.example42.com
      ca = false

Managing SSL termination

When we deal with Puppet's client-server traffic, we can apply all the logic that is valid for HTTPS connections. We can, therefore, have different scenarios, as follows:

  • Clients' proxy (clients can use a proxy to reach remote or not directly accessible Puppet Masters)
  • Master's reverse proxy (all clients communicate with frontend servers that proxy their requests to backend workers)
  • Load balanced Masters at the network level (clients communicate directly with a load balanced server)
  • Load balanced Master at application level (clients communicate with an intermediate host that balances and reverse proxies the Master)

When configuring the involved components, we have to take care of the following:

  • The SSL certificates used where the SSL connection is terminated must always be the ones of the Puppet Master and of the CA. If they are on different servers, we need to copy them.
  • We have to communicate the client's identity to the Puppet Master, and this is done by setting, where SSL is terminated, these HTTP headers: X-SSL-Subject, X-Client-DN, and X-Client-Verify, the last of which indicates to the Master whether the certificate is authenticated.

In our puppet.conf file, there are always the following default settings, which define the names of the HTTP headers the Master reads (with an HTTP_ prefix and underscores instead of dashes).

The first header contains the client's SSL Distinguished Name (DN), while the second contains the status message of the client verification (the expected value for a trusted, non-revoked client certificate is SUCCESS):

ssl_client_header = HTTP_X_CLIENT_DN
ssl_client_verify_header = HTTP_X_CLIENT_VERIFY

On the web server(s) where SSL is terminated (it might be Passenger in a single-server setup, or an Apache that balances and reverse proxies the Puppet Master backends), we need to set these HTTP headers, extracting the info from SSL environment variables as follows:

RequestHeader set X-SSL-Subject %{SSL_CLIENT_S_DN}e
RequestHeader set X-Client-DN %{SSL_CLIENT_S_DN}e
RequestHeader set X-Client-Verify %{SSL_CLIENT_VERIFY}e

These servers are the ones that communicate directly with clients and terminate the SSL connection; we can define them as frontend servers. They act as proxies and generate a new connection to the backend Puppet Masters that do the real work and compile the catalogs.

Since SSL has been terminated on the frontends, traffic from them to the backend servers is in clear text (they are supposed to be on the same LAN), and on the backend Apache, we need to state where to get the client certificate's DN and verification status, using the extra headers set above:

SetEnvIf X-Client-Verify "(.*)" SSL_CLIENT_VERIFY=$1
SetEnvIf X-Client-DN "(.*)" SSL_CLIENT_S_DN=$1

Also, on a backend server, we do not need to configure all the other SSL settings; we just need a Virtual Host with the rack configuration.

Given this information, we can compose our topology of web servers handling Puppet traffic in a very flexible way, with one or more frontend servers that terminate SSL and proxy requests to the backend Puppet Masters, which run Puppet via Passenger.
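
As a reference, a minimal backend Virtual Host might look like the following sketch; the port and the rack paths are assumptions based on the Passenger setup shown earlier:

# Backend Puppet Master Virtual Host (no SSL, reached only by the frontends)
Listen 18140
<VirtualHost *:18140>
  SetEnvIf X-Client-Verify "(.*)" SSL_CLIENT_VERIFY=$1
  SetEnvIf X-Client-DN "(.*)" SSL_CLIENT_S_DN=$1

  PassengerEnabled On
  DocumentRoot /etc/puppet/rack/public/
  RackBaseURI /
  <Directory /etc/puppet/rack/public/>
    Options None
    AllowOverride None
    Order allow,deny
    allow from all
  </Directory>
</VirtualHost>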

Managing code and data

Deployment of Puppet code and data is another factor to consider. We probably want the same code deployed on all our Puppet Masters. We can do this in various ways: all of them basically require either the remote and/or triggered execution of some commands (if we want to avoid having to log into each server every time a change to Puppet code is made) or a way to keep files synced across different servers.

How a deployment script or command works is definitely tied to how we manage our code: we might execute r10k or librarian-puppet, or make a git pull on our local directories to fetch changes from a central repo.

Alternatively, we might decide to have our Puppet code and data on a shared file system or keep them synced with tools such as rsync.

In any case, we have to copy, sync, or share all the directories where our code and data are placed: the modules, manifests, and, if used, the Hiera data directories.
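
A deployment command can be as simple as a loop that updates a git checkout on every Master; the hostnames below are hypothetical, and the same approach works with r10k or librarian-puppet:

# Deploy the same Puppet code and data to all the Puppet Masters
for master in puppet01.example42.com puppet02.example42.com; do
  ssh "$master" 'cd /etc/puppetlabs/code && git pull'
done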

Load balancing alternatives

When we have to balance a pool of Puppet Masters, we have different options, which are as follows:

  • HTTP load balancing, where SSL termination is done on the load balancer, which then proxies the clients' requests to the Puppet Masters.
  • TCP load balancing, where SSL termination is done on the Puppet Masters, which communicate directly with clients. In this case, the load balancer (which may be software like haproxy or a dedicated network appliance) listens on the Virtual IP used by the clients to contact the Master (for example, a common puppet.example42.com). It then redirects all the TCP connections to the Puppet Masters (which need to have the name of the Puppet Master host configured on the clients in their dns_alt_names).
  • DNS round robin can be considered the poor man's alternative to TCP load balancing. Clients are configured to use a single hostname for the Puppet Master, which is resolved, via DNS, to the multiple Masters' addresses. In this case too, SSL connections are terminated directly on the Masters, and they must have the name used by clients in their dns_alt_names. This solution is quite easy to implement, as it does not require additional systems to manage load balancing, but it has the (major) drawback of not being able to detect failures and remove non-responding Puppet Masters from the pool of balanced servers.
  • DNS SRV records can also be used to define the addresses of the Puppet Masters via DNS, with the possibility to set priorities and failover. This feature is available only on Puppet 3 and later. To use this option, instead of using the server parameter in puppet.conf, we have to indicate the srv_domain this way:
    [main]
      use_srv_records = true
      srv_domain = example.com

Note

DNS SRV records are used to define the hostnames and ports of servers that provide specific services. They can also set priorities and weights for the different servers. For example, for Puppet, the following records could be used:

_x-puppet._tcp.example.com. IN SRV 0 5 8140 p1.example.com.
_x-puppet._tcp.example.com. IN SRV 0 5 8140 p2.example.com.

Clients need to explicitly support these records in order to use this kind of configuration.

Masterless Puppet

An alternative approach to the Puppet Master scaling methods we have seen so far is not to use a Master at all. Masterless setups involve the direct execution of puppet apply on each node, where all the needed Puppet code and data has to be stored.

In this case, we have to find a way to distribute our modules, manifests, and, if used, Hiera data to all the clients. We can still use external components such as the following:

  • ENC: The external_nodes script can work as it does on the Puppet Master; it can interrogate any external source of knowledge on how to classify nodes. A concern here is whether it makes sense to introduce a central point of authority when we want a distributed, decentralized setup.
  • Reports: The reporting function can also work exactly as it does on the Puppet Master. Here, as for the ENC, the basic difference is that whatever tool is used, it must allow access from any node in our infrastructure, and not just from the Master.
  • Exported resources: These can be used too, with some caveats. If we use the active_record backend, we need to access the database from all the nodes. If we use PuppetDB, we need to establish trust between the certificates of the PuppetDB server and the ones of each client.

We also need a way to run Puppet on the clients in an automated or centrally managed way; it may be via a cron job or a remote command execution.
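
A minimal sketch of such a cron job, expressed as a Puppet resource (the paths are the Puppet 4 defaults; adapt them to your layout), could be:

# Run puppet apply twice per hour on a masterless node
$minute = fqdn_rand(30)
cron { 'puppet-apply':
  command => '/opt/puppetlabs/bin/puppet apply --modulepath=/etc/puppetlabs/code/modules /etc/puppetlabs/code/manifests/site.pp',
  user    => 'root',
  minute  => [ $minute, $minute + 30 ],
}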

Distribution of Puppet code and data may be done in different ways, as follows:

  • Executing git pull from central repositories
  • Update of native packages (rpm, deb, and so on) from a custom repo
  • Running a command such as rsync or rdiff
  • Mount from NFS or another network or shared filesystem
  • BitTorrent, with tools such as Murder (https://github.com/lg/murder)

Configurations and infrastructure optimizations

Whatever the layout of our Puppet infrastructure, we may consider some other options to optimize its performance.

Traffic compression

A first quick improvement may be achieved by activating the compression of HTTPS traffic between clients and the Master. The following option has to be set in puppet.conf at both ends:

http_compression = true

Enabling it makes sense mostly where clients reach the server via a WAN link, generally via a VPN, where throughput is definitely lower than on LAN communications. If we have large catalogs and reports, which are mostly text, their compression during transfer can be quite effective.

Caching

Another area where we can operate is catalog caching. This is a delicate topic, as it is not easy to determine what has changed on the client's side (some facts, like uptime, always change by definition; others are supposed to be more stable) and on the server's side (changes to Puppet code and data may or may not affect a specific node). The challenge, therefore, is to always provide a correct and updated catalog when a caching mechanism is in place.

Puppet provides some configuration options to manage caching. By default, Puppet doesn't recompile the catalog if it has a locally cached version with an up-to-date timestamp and facts that have not changed. When we want to be sure to obtain a new catalog, we have to enable the ignorecache option:

ignorecache = true # Default: false

Note that this is done automatically when we run the puppet agent -t command, which ensures that we always have a freshly compiled catalog.

We can also tell the client to always use a locally cached copy of the catalog, instead of requesting it from the Puppet Master:

use_cached_catalog = true # Default: false

This might be useful in cases where we want to temporarily freeze the configurations applied to a client, without having to disable the Puppet service and without worrying about changes on the Puppet Master.

Distributing Puppet run times

If we run Puppet via cron or another time-based mechanism, we need to avoid having all our clients hit the Master and request their catalogs at the same time. There are various options to distribute Puppet runs in order to avoid peaks of too many concurrent requests.

We can introduce a random sleep delay in the command we execute via cron, for example with cron entries based on ERB templates, such as:

0,30 * * * * root sleep <%= @sleep %> ; puppet agent --onetime

Here, the $sleep variable, holding the number of seconds to wait, may be defined in Puppet manifests with the fqdn_rand() function, which returns a pseudo-random value based on the node's full hostname (so it is random, though not in a cryptographically usable way, and doesn't change at every catalog compilation):

$sleep = fqdn_rand(1800) # Returns a number from 0 to 1799

Alternatively, we can use the splay configuration option in puppet.conf, which introduces a random (but consistent for each node) delay at every Puppet run, and which can be as long as defined by splaylimit (whose sane default is Puppet's run interval):

splay = true # Default: false
splaylimit = 1h # Default: $runinterval

File system changes check

On the Puppet Master, there is an option, filetimeout, which sets the minimum time to wait between checks for updates of configuration files (manifests, templates, and so on). This determines how quickly the Master notices that a file has changed on disk.

The default value is 15 seconds, and can be changed in puppet.conf.
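
For example, on a Master where code changes are deployed rarely, we might relax the check; the 60-second value below is just illustrative:

# puppet.conf on the Puppet Master
[master]
  filetimeout = 60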

This setting has very limited effects on performance (unless, I suppose, we lower it too much), but it's important to know that it exists, because it's the reason why, sometimes, nothing new happens on the client when we launch a Puppet run immediately after changing some file on the Puppet Master.

This may lead to some confusion: we make a change to some manifests, we run Puppet, and nothing happens; then we run Puppet again, the change is finally received, and we wonder what the hell is happening. Therefore, be aware that this option exists and, more importantly, be aware that the Master scans the directories where our Puppet code and files are placed at regular intervals and might not immediately process the very latest changes made to these files.

Scaling stored configs

We have seen that the usage of exported resources allows resources declared on one node to be applied on another node. In order to achieve this, Puppet needs the storeconfigs option enabled, and this involves the usage of an external database where all the information about the exported resources is stored.

The usage of stored configs has historically been a big performance killer for Puppet: the amount of database transactions involved in each run makes it a quite resource-intensive activity.

There are various options in puppet.conf that permit us to tune our configurations. The default settings are as follows:

storeconfigs = false
storeconfigs_backend = active_record
dbadapter = sqlite3
thin_storeconfigs = false

If we enable stored configs with storeconfigs = true, the default configuration involves the usage of the active_record backend and an SQLite database.

This solution performs quite badly and should therefore be used only in test or small environments. Its only benefit is that no further setup is needed; we just have to install the SQLite Ruby bindings package on our system. With such a setup, we will quickly run into access problems on the SQL backend with multiple concurrent Puppet runs.

The next step is to use a more performant backend for data persistence. Before the introduction of PuppetDB, MySQL was the only alternative. In order to enable it, we have to set the following options in puppet.conf:

dbadapter = mysql
dbname = puppet         # Default value
dbserver = localhost    # Default value
dbuser = puppet         # Default value
dbpassword = puppet     # Default value

Such a setup involves a local MySQL server where we have created a puppet database with the relevant grants, so from our MySQL console, we should run something like the following:

create database puppet;
GRANT ALL ON puppet.* to 'puppet'@'localhost' IDENTIFIED by 'puppet';
flush privileges;

This is enough to have a Puppet Master storing its data on a local MySQL backend. If the load on our system increases, we can move the MySQL service to another dedicated server and can tune our MySQL server.

Brice Figureau, who heavily contributed to the original store configs code, gave an interesting presentation on this topic at the first Puppet Camp (http://www.slideshare.net/masterzen/all-about-storeconfigs-2123814), where useful hints are provided on configuring MySQL on a dedicated server to scale for the inserts:

innodb_buffer_pool_size = 70% of physical RAM
innodb_log_file_size = up to 5% of physical RAM
innodb_flush_method = O_DIRECT
innodb_flush_log_at_trx_commit = 2

Also, to optimize the most common queries, it is suggested on Puppet's Wiki that the following index be created from the MySQL console:

use puppet;
create index exported_restype_title on resources (exported, restype, title(50));

We can limit the amount of information stored by setting thin_storeconfigs = true. This makes Puppet store just facts and exported resources in the database, and not the whole catalog and its related data. This option is useful with the active_record backend (with PuppetDB, it is not necessary).

What we have written so far about store configs using the active_record backend made a lot of sense some years ago, and we covered it here to give a view of how store configs used to be scaled. The truth is that the best and recommended way to use store configs is via the PuppetDB backend; this is done by placing these settings in puppet.conf:

storeconfigs = true
storeconfigs_backend = puppetdb

We have dedicated the whole of Chapter 3, Introducing PuppetDB, to PuppetDB because it is definitely a major player in the Puppet ecosystem. The performance improvements it brings are huge, so there is really no reason not to use it.

The components of PuppetDB can be distributed to scale better:

  • PuppetDB can be horizontally scaled (see the sketch after this list). It's a stateless service entirely based on a REST-like interface, so different PuppetDB servers can be load balanced either at the TCP or HTTP level.
  • The PostgreSQL server may be moved to a dedicated host and then scaled or configured in high availability mode following PostgreSQL's best practices.
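
As a sketch, with the PuppetDB termini installed on the Masters, multiple PuppetDB servers can be listed in puppetdb.conf (the hostnames are hypothetical; a load balancer address could be used instead):

# /etc/puppetlabs/puppet/puppetdb.conf on each Puppet Master
[main]
server_urls = https://puppetdb01.example42.com:8081,https://puppetdb02.example42.com:8081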