Looking up a key value in Hiera is easy. Puppet comes with a very straightforward function for this:
$plugins = hiera('reporting::plugins')
Whenever the compiler encounters such a call in the manifest of the current agent node, it triggers a search in the hierarchy. The specific data sources are determined by the hierarchy in your hiera.yaml file. The hierarchy almost always relies on fact values provided by the agent to make flexible data source selections.
If the named key cannot be found in the agent's hierarchy, the master aborts the catalog compilation with an error. To prevent this, it is often sensible to supply a default value with the lookup:
$plugins = hiera('reporting::plugins', [])
In this case, Puppet uses an empty array if the hierarchy mentions no plugins. On the other hand, you can purposefully omit the default value. Just like with class and define parameters, this signals that the Hiera value is required. If the user fails to supply it, Puppet will abort the manifest compilation.
You have seen how to invoke the hiera function for value retrieval. There is not much more to it, except for an optional third parameter. It names an additional data source that is placed at the top of your hierarchy. If the key is found in the named data source, it will override the result from the regular hierarchy:
$plugins = hiera('reporting::plugins', [], 'global-overrides')
If the reporting::plugins key is found in the global-overrides data source, the value is taken from there. Otherwise, the normal hierarchy is searched.
Generally, assigning the retrieved value to a manifest variable is quite common. However, you can also invoke the hiera function in other useful contexts, such as the following:
@@cacti_device { $::fqdn:
  ip => hiera('snmp_address', $::ipaddress),
}
The lookup result can be handed to a resource directly as a parameter value. This is an example of how to allow Hiera to define a specific IP address per machine that should be used for a specific service. It acts as a simple way to manually override Facter's assumptions.
It is generally safer to store Hiera lookup results in a variable first. This allows you to check their data type. In Puppet 3, you need to use a validation function from the stdlib module, such as validate_integer. Puppet 4 has an operator for this purpose:
$max_threads = hiera('max_threads')
if $max_threads !~ Integer {
  fail('The max_threads value must be an integer number')
}
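As an alternative to the explicit comparison, Puppet 4 also offers the assert_type function, which returns the value if it matches the given type and fails the compilation otherwise. The following is only a minimal sketch. It assumes that max_threads is defined somewhere in your hierarchy, and the range limit is an arbitrary choice for illustration:

```puppet
# Puppet 4 only: assert_type returns the value unchanged when it matches
# the type, and aborts the compilation with a type mismatch error otherwise.
$max_threads = assert_type(Integer, hiera('max_threads'))

# A narrower type can reject implausible values as well.
# The range 1 to 64 is an assumption made up for this illustration.
$checked_threads = assert_type(Integer[1, 64], $max_threads)
```

This keeps the lookup and the check in one expression, at the cost of a more generic error message than a hand-written fail call.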
Another frequent occurrence is a parameter default that is made dynamic through a Hiera lookup:
define logrotate::config(
  Integer $rotations = hiera('logrotate::rotations', 7)
) {
  # regular define code here
}
For logrotate::config resources that are declared with an explicit parameter value, the Hiera value is ignored:
logrotate::config { '/var/log/cacti.log': rotations => 12 }
This can be a little confusing. Still, the pattern adds some convenience. Most agents can rely on the default. The hierarchy allows you to tune this default on multiple levels of granularity.
The concept of parameterized classes might have gotten a somewhat tarnished reputation, judging from our coverage of it so far. It allegedly makes it difficult to include classes from multiple places in the manifest, or silently allows it under shifting circumstances. While that is true, you can avoid these issues by relying on Hiera for your class parameterization needs.
Since Puppet 3.0, it has been possible to choose the values for any class's parameters right in the Hiera data. Whenever you include a class that has any parameters, Puppet will query Hiera to find a value for each of them. The keys must be named after the class and parameter names, joined by a double colon. Remember the cacti class from Chapter 5, Extending Your Puppet Infrastructure with Modules? It had a $redirect parameter. To define its value in Hiera, add the cacti::redirect key:
# node/cacti01.example.net.yaml
cacti::redirect: false
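On the manifest side, nothing Hiera-specific is needed. The following sketch uses a hypothetical stand-in for the cacti class, because the real class body is not repeated here. The default value of true is an assumption for illustration only:

```puppet
# Hypothetical stand-in for the cacti class; only the parameter matters here.
# The default of true is an assumption, not taken from the original class.
class cacti($redirect = true) {
  # ... resources that evaluate $redirect go here ...
}

# No parameter is passed, so Puppet looks up cacti::redirect in Hiera.
# On cacti01.example.net, $redirect becomes false. On nodes whose hierarchy
# holds no cacti::redirect key, the class default applies.
include cacti
```

A value passed explicitly in a resource-like class declaration takes precedence over the Hiera data, and the class default is used only when neither is present.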
Some classes have very elaborate interfaces; the apache class from the Puppet Labs Apache module accepts 49 parameters at the time of writing. If you need many of those, you can put them into the target machine's dedicated YAML file as one coherent block of keys with values. It will be quite readable, because the apache:: prefixes line up.
You don't save any lines compared to specifying the parameters right in the manifest, but at least the wall of options will not get in your way while you're programming in your manifests—you separated data from code.
The point that is perhaps the most redeeming for class parameterization is that each key is independent in your hierarchy. Many parameters can most likely be defined for many or all of your machines. Clusters of application servers can share some settings (if your hierarchy includes a layer on which they are grouped together), and you can override parameters for single machines as you see fit:
# common.yaml
apache::default_ssl_cert: /etc/puppetlabs/puppet/ssl/certs/%{::clientcert}.pem
apache::default_ssl_key: /etc/puppetlabs/puppet/ssl/private_keys/%{::clientcert}.pem
apache::purge_configs: false
The preceding example prepares your site to use the Puppet certificates for HTTPS. This is a good choice for internal services, because trust in the Puppet CA can be easily established, and the certificates are available on all agent machines. The third parameter, purge_configs, prevents the module from obliterating any existing Apache configuration that is not under Puppet's management.
Let's see an example of a more specific hierarchy layer which overrides this setting:
# role/httpsec.yaml
apache::purge_configs: true
apache::server_tokens: Minimal
# 'Off' is quoted so that YAML does not turn it into a Boolean
apache::server_signature: 'Off'
apache::trace_enable: 'Off'
On machines that have the httpsec role, the Apache configuration should be purged so that it matches the managed configuration completely. The hierarchy of such machines also defines some additional values that are not defined in the common layer. The SSL settings from common remain untouched.
A specific machine's YAML can override keys from either layer if need be:
# node/sec02-sxf12.yaml
apache::default_ssl_cert: /opt/ssl/custom.pem
apache::default_ssl_key: /opt/ssl/custom.key
apache::trace_enable: extended
All these settings require no additional work. They take effect automatically, provided that the apache class from the puppetlabs-apache module is included.
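To illustrate what the three layers add up to, the following sketch contrasts the plain include with the explicit declaration it replaces. It assumes that the node sec02-sxf12 carries the httpsec role. The commented-out declaration is for comparison only and must not be used alongside the include:

```puppet
# All apache parameters come from Hiera; the manifest stays this short.
include apache

# For comparison only, NOT to be declared in addition to the include above:
# the explicit equivalent for sec02-sxf12, assuming it has the httpsec role.
#
# class { 'apache':
#   default_ssl_cert => '/opt/ssl/custom.pem',  # node layer
#   default_ssl_key  => '/opt/ssl/custom.key',  # node layer
#   purge_configs    => true,                   # role layer
#   server_tokens    => 'Minimal',              # role layer
#   server_signature => 'Off',                  # role layer
#   trace_enable     => 'extended',             # node layer overrides role
# }
```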
For some users, this might be the only way in which Hiera is employed on their master, which is perfectly valid. You can even design your manifests specifically to expose all configurable items as class parameters. However, keep in mind that another advantage of Hiera is that any value can be retrieved from many different places in your manifest.
For example, if your firewalled servers are reachable through dedicated NAT ports, you will want to add this port to each machine's Hiera data. The manifest can export this value not only to the firewall server itself, but also to external servers that use it in scripts and configurations to reach the exporting machine:
$nat_port = hiera('site::net::nat_port')
@@firewall { "650 forward port ${nat_port} to ${::fqdn}":
  proto       => 'tcp',
  dport       => $nat_port,
  destination => hiera('site::net::nat_ip'),
  jump        => 'DNAT',
  todest      => $::ipaddress,
  tag         => hiera('site::net::firewall_segment'),
}
The values will most likely be defined on different hierarchical layers. nat_port is agent-specific and can only be defined in the %{::fqdn} (or %{::clientcert} for better security) derived data source. nat_ip is probably identical for all servers in the same cluster. They might share a server role. firewall_segment could well be identical for all servers that share the same location:
# stor03.example.net.yaml
site::net::nat_port: 12020
...
# role/storage.yaml
site::net::nat_ip: 198.58.119.126
...
# location/portland.yaml
site::net::firewall_segment: segment04
...
As previously mentioned, some of this data will be helpful in other contexts as well. Assume that you deploy a script through a defined type. The script sends messages to remote machines. The destination address and port are passed to the defined type as parameters. Each node that should be targeted can export this script resource:
@@site::maintenance_script { "/usr/local/bin/maint-${::fqdn}":
  address => hiera('site::net::nat_ip'),
  port    => hiera('site::net::nat_port'),
}
It would be impractical to do all this in one class that takes the port and address as parameters. You would want to retrieve the same value from within different classes or even modules, each taking care of the respective exports.
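One possible layout is sketched here. The class names site::export::firewall and site::export::maintenance are hypothetical and serve only to show that each class performs its own lookups, without any parameters being passed between them:

```puppet
# Hypothetical class: exports the NAT rule for this node.
class site::export::firewall {
  $nat_port = hiera('site::net::nat_port')
  @@firewall { "650 forward port ${nat_port} to ${::fqdn}":
    proto       => 'tcp',
    dport       => $nat_port,
    destination => hiera('site::net::nat_ip'),
    jump        => 'DNAT',
    todest      => $::ipaddress,
    tag         => hiera('site::net::firewall_segment'),
  }
}

# Hypothetical class: exports the maintenance script for this node.
# It retrieves the same keys on its own instead of receiving parameters.
class site::export::maintenance {
  @@site::maintenance_script { "/usr/local/bin/maint-${::fqdn}":
    address => hiera('site::net::nat_ip'),
    port    => hiera('site::net::nat_port'),
  }
}
```

Both classes can live in different modules, because the Hiera keys are the only interface they share.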
Some examples in this chapter defined array values in Hiera. The good news is that retrieving arrays and hashes from Hiera is not at all different from simple strings, numbers, or Boolean values. The hiera function will return all these values, which are ready for use in the manifest.
There are two more functions that offer special handling for such values: the hiera_array and hiera_hash functions.
The presence of these functions can be somewhat confusing. New users might assume that these are required whenever retrieving hashes or arrays from the hierarchy. When inheriting Puppet code, it can be a good idea to double-check that these derived functions are actually used correctly in a given context.
When the hiera_array function is invoked, it gathers all named values from the whole hierarchy and merges them into one long array that comprises all elements that were found. Take the distributed firewall configuration once more, for example. Each node should be able to export a list of rules that open ports for public access. The manifest for this would be completely driven by Hiera:
if hiera('site::net::nat_ip', false) {
  @@firewall { "200 NAT ports for ${::fqdn}":
    port        => hiera_array('site::net::nat_ports'),
    proto       => 'tcp',
    destination => hiera('site::net::nat_ip'),
    jump        => 'DNAT',
    todest      => $::ipaddress,
  }
}
Please note that the title "200 NAT ports" does not allude to the number of ports, but just adheres to the naming conventions for firewall resources. The numeric prefix makes it easy to maintain order. Also, note the seemingly nonsensical default value of false for the site::net::nat_ip key in the if clause. This forms a useful pattern, though: the resource should only be exported if site::net::nat_ip is defined for the respective node.
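The same pattern can be written with a single lookup by storing the result in a variable first. This is a minimal sketch of that variant. It behaves like the manifest above, under the assumption that site::net::nat_ports is defined on at least one layer for every node that has a NAT address:

```puppet
# Look the address up once; false marks nodes without a NAT address.
$nat_ip = hiera('site::net::nat_ip', false)

if $nat_ip {
  @@firewall { "200 NAT ports for ${::fqdn}":
    port        => hiera_array('site::net::nat_ports'),
    proto       => 'tcp',
    destination => $nat_ip,
    jump        => 'DNAT',
    todest      => $::ipaddress,
  }
}
```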
The hierarchy can then hold ports on several layers:
# common.yaml
site::net::nat_ports: 22
The SSH port should be available for all nodes that get a public address. Note that this value is not an array itself. This is fine; Hiera will include scalar values in the resulting list without any complaint.
# role-webserver.yaml
site::net::nat_ports: [ 80, 443 ]
Standalone web application servers present their HTTP and HTTPS ports to the public.
# tbt-backend-test.example.net.yaml
site::net::nat_ports:
  - 5973
  - 5974
  - 5975
  - 6630
The testing instance for your new cloud service should expose a range of ports for custom services. If it has the webserver role (somehow), it will lead to an export of ports 22, 80, and 443 as well as its individually chosen list.
When designing such a construct, keep in mind that the array merge is only ever cumulative. There is no way to exclude values that were added in lower layers from the final result. In this example, you will have no opportunity to disable the SSH port 22 for any given machine. You should take good care when adding common values.
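To see the difference between the two lookup styles on a given node, both results can be written to the master's log. This is a small debugging sketch rather than production code; the variable names are made up for illustration:

```puppet
# Priority lookup: only the most specific layer that defines the key is used.
# On tbt-backend-test.example.net, this yields just the four custom ports.
$own_ports = hiera('site::net::nat_ports')

# Array merge: the values from all layers of the hierarchy are collected.
# On the same node, this also contains 22, plus 80 and 443 if the node
# has the webserver role.
$all_ports = hiera_array('site::net::nat_ports')

notice("priority lookup: ${own_ports}")
notice("array merge: ${all_ports}")
```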
A similar alternative lookup function exists for hashes. The hiera_hash function also traverses the whole hierarchy and constructs a hash by merging all hashes it finds under the given Hiera key from all hierarchy layers. Hash keys in higher layers overwrite those from lower layers. All values must be hashes. Strings, arrays, or other data types are not allowed in this case:
# common.yaml
haproxy_settings:
  log_socket: /dev/log
  log_level: info
  user: haproxy
  group: haproxy
  daemon: true
These are the default settings for haproxy at the lowest hierarchy level. On web servers, the daemon should run as the general web service user:
# role/webserver.yaml
haproxy_settings:
  user: www-data
  group: www-data
When retrieved using hiera('haproxy_settings'), this will just evaluate to the hash, {'user'=>'www-data','group'=>'www-data'}. The hash at the role-specific layer completely overrides the default settings.
To get all values, merge the layers by calling hiera_hash('haproxy_settings') instead. The result is likely to be more useful:
{ 'log_socket' => '/dev/log',
  'log_level'  => 'info',
  'user'       => 'www-data',
  'group'      => 'www-data',
  'daemon'     => true }
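In the manifest, the merged hash can be stored in a variable and its members accessed by key. The following is a minimal sketch. The directory path is a hypothetical example and is not taken from any haproxy module:

```puppet
# Merge the haproxy_settings hashes from all hierarchy layers.
$haproxy_settings = hiera_hash('haproxy_settings')

# Individual settings are read by key. On a web server, these are the
# role-level values, while keys such as log_level still come from common.
$haproxy_user  = $haproxy_settings['user']
$haproxy_group = $haproxy_settings['group']

# Hypothetical use of the values; the path is an assumption.
file { '/var/lib/haproxy':
  ensure => directory,
  owner  => $haproxy_user,
  group  => $haproxy_group,
}
```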
The limitations are similar to those of hiera_array. Keys from any hierarchy level cannot be removed; they can only be overwritten with different values. The end result is quite similar to what you would get from replacing the hash with a group of keys:
# role/webserver.yaml
haproxy::user: www-data
haproxy::group: www-data
If you opt to do this, the data also maps easily onto a class, which can bind these values to its parameters automatically. Preferring flat structures can, therefore, be beneficial. Defining hashes in Hiera is still generally worthwhile, as the next section explains.