You can now move configuration settings to Hiera and dedicate your manifest to logic. This works seamlessly for classes and their parameters, because class parameters automatically retrieve their values from Hiera. For configuration that requires you to instantiate resources, you still need to write the full manifests and add manual lookup function calls.
For example, an Apache web server requires some global settings, but the interesting parts of its configuration are typically performed in virtual host configuration files. Puppet models them with defined resource types. If you want to configure an iptables firewall, you have to declare lots of resources of the firewall type (available through the puppetlabs-firewall module).
Such elaborate resources can clutter up your manifest, yet they mostly represent data. There is no inherent logic to many firewall rules (although a set of rules is derived from one or several key values sometimes). Virtual hosts often stand for themselves as well, with little or no relation to configuration details that are relevant to other parts of the setup.
Puppet comes with yet another function that allows you to move whole sets of such resources to Hiera data. The pattern is straightforward: a group of resources of the same type is represented by a hash. The keys are resource titles, and each value is another hash that holds the key/value pairs for that resource's attributes:
services:
  apache2:
    enable: true
    ensure: running
  syslog-ng:
    enable: false
This YAML data represents two service resources. To make Puppet add them as actual resources to the catalog, pass the hash to the create_resources function:
$resource_hash = hiera('services', {})
create_resources('service', $resource_hash)
The first argument is the name of the resource type, and the second must be the hash of actual resources. There are some more aspects to this technique, but do note that with Puppet 4, it is no longer necessary to rely on the create_resources function.
It is useful to know the basics of the function anyway. It is still in broad use in existing manifests, and it is still the most compact way of converting data into resources. To learn more, refer to the online documentation at https://docs.puppetlabs.com/references/latest/function.html#createresources
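One of the additional aspects of create_resources is its optional third argument, a hash of default attribute values. The following is a minimal sketch of that form. The defaults hash and its contents are assumptions made up for illustration and do not appear in the data above.

```puppet
# Look up the same 'services' hash as before.
$resource_hash = hiera('services', {})

# Hypothetical defaults, not part of the original data: they apply to
# every resource in the hash that does not set the attribute itself.
$service_defaults = {
  'ensure' => 'running',
  'enable' => true,
}

create_resources('service', $resource_hash, $service_defaults)
```

Attributes that are present in the Hiera data take precedence over the defaults, so the defaults only fill in what a resource leaves out.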
The iteration functions in the Puppet 4 language allow you to implement the preceding transformation and more:
$resource_hash.each |$res_title, $attributes| {
  service { $res_title:
    ensure => $attributes['ensure'],
    enable => $attributes['enable'],
  }
}
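This has a couple of advantages over the create_resources approach: the manifest spells out which attributes are taken from the data, and each iteration can do more than declare a single resource, for example, transform values or declare additional resources.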
For creating many simple resources (such as the services in the above example), you might wish to avoid create_resources in Puppet 4 manifests. Just keep in mind that if you don't take advantage of the added flexibility, you can keep the manifest more succinct by sticking to create_resources after all.
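To illustrate what the iteration approach can do beyond a plain create_resources call, here is a hedged sketch that fills in a missing value while iterating. The rule that a service without an ensure value should be running is a hypothetical policy chosen for this example only.

```puppet
$resource_hash = hiera('services', {})

$resource_hash.each |String $res_title, Hash $attributes| {
  # Hypothetical policy: a service without an explicit 'ensure' value
  # in the data is expected to be running.
  $ensure = $attributes['ensure'] ? {
    undef   => 'running',
    default => $attributes['ensure'],
  }

  service { $res_title:
    ensure => $ensure,
    enable => $attributes['enable'],
  }
}
```

The typed lambda parameters also make Puppet reject data whose titles are not strings or whose values are not hashes.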
Puppet 4 comes with a useful tool to generate YAML data that is suitable for create_resources. With the following command, you can make Puppet emit service type resources that represent the set of available services on the local system, along with their current property values:
puppet resource -y service
The -y switch selects YAML output instead of Puppet DSL.
In theory, these techniques allow you to move almost all your code to Hiera data. (The next section discusses how desirable that really is.) There is one more feature that goes one step further in this direction:
hiera_include('classes')
This call gathers values from all over the hierarchy, just like hiera_array. The resulting array is interpreted as a list of class names. All these named classes are included. This allows for some additional consolidation in your manifest:
# common.yaml
classes:
  - ssh
  - syslog
...
# role-webserver.yaml
classes:
  - apache
  - logrotate
  - syslog
You can even use hiera_include to declare these classes outside of any node block. The data will then affect all nodes. Additionally, some classes might in turn declare other classes via hiera_include, with the names stored under a different Hiera key.
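Both uses can be sketched as follows. The class name site::monitoring and the key monitoring_classes are hypothetical and serve only as an illustration. The sketch assumes that both keys are defined somewhere in the hierarchy.

```puppet
# site.pp, outside of any node block: this applies to every node.
hiera_include('classes')

# Hypothetical class that pulls in further classes from its own key.
# Both the class name and the key name are made up for illustration.
class site::monitoring {
  hiera_include('monitoring_classes')
}
```

Any node whose classes array contains site::monitoring would then also receive the classes listed under the second key.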
The ability to enumerate classes for each node to include is what Puppet's External Node Classifiers (ENCs) had originally been conceived for. Hiera can serve as a basic ENC thanks to the hiera_include function. This is most likely preferred over writing a custom ENC. However, it should be noted that some open source ENCs such as Foreman are quite powerful and can add much convenience; Hiera has not supplanted the concept as a whole.
The combination of these tools opens some ways for you to shrink your manifests to their essential parts and configure your machines gracefully through Hiera.
You can now move most of the concrete configuration to the data storage. Classes can be included from the manifest or through Hiera. Puppet looks up parameter values in the hierarchy, and you can flexibly distribute the configuration values there in order to achieve just the desired result for each node with minimal effort and redundancy.
This does not mean that you don't write actual manifest code anymore. The manifest is still the central pillar of your design. You will often need logic that uses the configuration data as input. For example, there might be classes that should only be included if a certain value is retrieved from Hiera:
if hiera('use_caching_proxy', false) {
  include nginx
}
If you try to rely on Hiera exclusively, you will have to add nginx to the classes array at all places in the hierarchy that set the use_caching_proxy flag to true. This is prone to mistakes. What's worse is that the flag can be overridden from true to false at a more specific layer, but the nginx element cannot be removed from an array that is retrieved by hiera_include.
It is important to keep in mind that the manifest and data should complement each other. Build manifests primarily and add lookup function calls at opportune places. Defining flags and values in Hiera should allow you (or the user of your modules) to alter the behavior of the manifest. The data should not be the driver of the catalog composition, except for places in which you replace large numbers of static resources with large data structures.
To round things off, let's build a complete example of a module that is enhanced with Hiera. Create a demo module in the environment of your choice. We will go with production:
# /etc/puppetlabs/code/environments/production/modules/demo/manifests/init.pp
class demo(
  Boolean $auto   = false,
  Boolean $syslog = true,
  String  $user   = 'nobody'
) {
  file { '/usr/local/bin/demo': … }
  if $auto {
    cron { 'auto-demo':
      user    => $user,
      command => '/usr/local/bin/demo'
      ...
    }
  }
  $atoms = hiera('demo::atoms', {})
  if $atoms !~ Hash[String, Hash] {
    fail "The demo::atoms value must be a hash of resource descriptions"
  }
  $atoms.each |$title, $attributes| {
    demo::atom { $title:
      address => $attributes['address'],
      port    => $attributes['port'],
    }
  }
}
This class implicitly looks up three Hiera keys for its parameters:
demo::auto
demo::syslog
demo::user
There is also an explicit lookup of an optional demo::atoms hash that creates configuration items for the module. The data is converted to resources of the following defined type:
# /etc/puppetlabs/code/.../modules/demo/manifests/atom.pp
define demo::atom(
  String $address,
  Integer[1,65535] $port = 14193
) {
  file { "/etc/demo.d/${name}":
    ensure  => 'file',
    content => "---\nhost: ${address}\nport: ${port}\n",
    mode    => '0644',
    owner   => 'root',
    group   => 'root',
  }
}
The module uses a default of nobody for user. Let's assume that your site does not run scripts with this account, so you set your preference in common.yaml. Furthermore, assume that you also don't commonly use syslog:
demo::user: automation
demo::syslog: false
If this user account is restricted on your guest workstations, Hiera should set an alternative value in role/public_desktop.yaml:
demo::user: maintenance
Now, let's assume that your cron jobs are usually managed in site modules, but not for web servers. So, you should let the demo module itself take care of this on web servers through the $auto parameter:
# role/webserver.yaml
demo::auto: true
Finally, the web server int01-web01.example.net should be an exception. Perhaps it's supposed to run no background jobs whatsoever:
# int01-web01.example.net.yaml
demo::auto: false
Concerning configuration resources, this is how to make each machine add itself as a peer:
# common.yaml
demo::atoms:
  self:
    address: localhost
Let's assume that the Kerberos servers should not try this. Just make them override the definition with an empty hash:
# role/kerberos.yaml
demo::atoms: {}
Here is how to make the database servers also contact the custom server running on the Puppet master machine on a nonstandard port:
# role-dbms.yaml
demo::atoms:
  self:
    address: localhost
  master:
    address: master.example.net
    port: 60119
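The database role repeats the self entry because the hiera function returns only the value from the most specific hierarchy level that defines the key. If you would rather not repeat entries, a merging lookup with hiera_hash is one possible alternative. The following is only a sketch of that variation and not how the demo class is written above.

```puppet
# Alternative to the hiera() call in the demo class: merge the hash
# from all hierarchy levels instead of using only the most specific one.
$atoms = hiera_hash('demo::atoms', {})
```

Note the trade-off: with a merged lookup, the empty hash in role/kerberos.yaml would no longer remove the self entry, because the entry from common.yaml would still be merged in.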
With all your Hiera data in place, you can now include the demo class from your site.pp (or an equivalent) file indiscriminately. It is often a good idea to allow certain agent machines to opt out of this behavior in the future. Just add an optional Hiera flag for this:
# site.pp
if hiera('enable_demo', true) {
  include demo
}
Agents that must not include the module can now be given a false value for the enable_demo key in their data.