By now, we have seen how comprehensive the information stored in PuppetDB is: it provides a complete view of all our nodes' facts, catalogs, and reports. This is useful for reviewing what happens on our infrastructure and for the metrics we can extract by querying all the resources managed by Puppet, but that's not enough.
One of Puppet's limitations is that a node basically knows only about itself, via its catalog, and can "interact" with other nodes only via exported resources. This limitation would be wiped out if all PuppetDB data were at our disposal when compiling a catalog for a node.
Well, this is possible, and can be easily done via Erik Dalén's puppetdbquery module.
Consider it the key that opens PuppetDB's wonders to our Puppet code. It provides a query command for the command line (as a Puppet face), functions to use in our manifests, and a Hiera backend. This module takes PuppetDB integration to the next level.
We can install it from the Forge:
puppet module install dalen-puppetdbquery
The puppetdbquery module uses a custom query format that is different from (and easier to use than) the native one. All the queries we can run with this module are in the following format:
Type[Name]{attribute1=foo and attribute2=bar}
By default, queries are made on normal resources; use the @@ prefix to query exported resources.
The comparison operators are: =, !=, >, < and ~ (regexp matching).
The expressions can be combined with and, not, and or.
The module introduces, as a Puppet face, the query command, which allows direct interaction with PuppetDB from the command line. For inline help, type:
puppet help query
To search for all the nodes of the RedHat family with major release 6, we can type the following:
puppet query nodes '(osfamily=RedHat and lsbmajdistrelease=6)'
The same query done on facts shows all the facts for the resulting nodes:
puppet query facts '(osfamily=RedHat and lsbmajdistrelease=6)'
To show only the IP address of the queried nodes:
puppet query facts --facts ipaddress '(osfamily=RedHat and lsbmajdistrelease=6)'
The functions provided by the module can be used inside manifests to populate the catalog with data retrieved from PuppetDB.
query_nodes has two arguments: the query to use and (optionally) the fact to return (by default, it returns the certname). It returns an array, as follows:
$webservers = query_nodes('osfamily=Debian and Class[Apache]')
$webserver_ip = query_nodes('osfamily=Debian and Class[Apache]', 'ipaddress')
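To make the use of the returned array more concrete, here is a minimal sketch of a manifest that writes the queried IP addresses to a file. The file path and the variable name are hypothetical, chosen only for illustration; inline_template is a core Puppet function.

```puppet
# Sketch only: the file path and variable name are hypothetical.
# query_nodes returns an array; here we ask for the ipaddress fact
# of every Debian node whose catalog includes Class[Apache].
$webserver_ips = query_nodes('osfamily=Debian and Class[Apache]', 'ipaddress')

# Write one IP address per line. The array is sorted so that the
# file content stays stable if PuppetDB returns a different order.
file { '/etc/haproxy/webservers.list':
  ensure  => file,
  content => inline_template('<%= @webserver_ips.sort.join("\n") %>'),
}
```

Sorting the array before rendering it is a small precaution: without it, a change in the order of the results could cause the file to be rewritten on every run.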
query_facts requires two arguments: the query used to discover nodes and the list of facts to return for them. It returns a nested hash, keyed by node certname, of the requested facts and their values. It is called as follows:
query_facts('Class[Apache]{port=443}', ['osfamily', 'ipaddress'])
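As an illustration of how the nested hash might be consumed, the following sketch renders one line per node. The node names and values in the comment are made up, the file path is hypothetical, and the hash shape (certname as key, facts as inner hash) is an assumption that should be checked against the module's documentation.

```puppet
# Sketch only: node names, values, and the file path are hypothetical.
$apache_facts = query_facts('Class[Apache]{port=443}', ['osfamily', 'ipaddress'])

# Assumed shape of $apache_facts (a hash keyed by node certname):
# {
#   'web01.example.com' => { 'osfamily' => 'Debian', 'ipaddress' => '10.0.0.11' },
#   'web02.example.com' => { 'osfamily' => 'Debian', 'ipaddress' => '10.0.0.12' },
# }

# Render "<ipaddress> <certname>" lines, sorted by certname.
file { '/tmp/apache_nodes.txt':
  ensure  => file,
  content => inline_template('<%= @apache_facts.sort.map { |node, facts| "#{facts["ipaddress"]} #{node}" }.join("\n") %>'),
}
```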
These functions are extremely useful for retrieving data from PuppetDB and for providing resources on a node according to the resources in catalogs compiled for other nodes.
Another powerful feature of the puppetdbquery module is its Hiera backend, which allows us to use PuppetDB data for our Hiera keys. It requires at least one other backend, so it's configured as follows:
---
:backends:
  - yaml
  - puppetdb
:hierarchy:
  - nodes/%{fqdn}
  - common
The fun begins when we use queries in our keys. Instead of something like:
ntp::servers:
  - 'ntp1.example.com'
  - 'ntp2.example.com'
We can have a dynamic query like:
ntp::servers::_nodequery: 'Class[Ntp::Server]'
This returns an array with the certnames of all the nodes in our infrastructure that include the ntp::server class. If we want their IP addresses instead (the same applies to any other fact), we can use:
ntp::servers::_nodequery: ['Class[Ntp::Server]', 'ipaddress']
The above can also be written with this format (the result is the same):
ntp::servers::_nodequery:
  query: 'Class[Ntp::Server]'
  fact: 'ipaddress'
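On the manifest side, such a key can then be looked up like any other Hiera key. The following is a minimal sketch, assuming that a lookup of ntp::servers is answered by the puppetdb backend from the ntp::servers::_nodequery entry, as the examples above suggest. The managed file and its content are hypothetical.

```puppet
# Sketch only: assumes the puppetdb Hiera backend answers a lookup of
# 'ntp::servers' from the ntp::servers::_nodequery entry shown above.
# The second argument is the default used when no backend has a value.
$ntp_servers = hiera('ntp::servers', [])

# Hypothetical use of the result: one "server" line per NTP server.
file { '/etc/ntp.conf':
  ensure  => file,
  content => inline_template('<%= @ntp_servers.sort.map { |s| "server #{s}" }.join("\n") %>'),
}
```

With this approach the manifest does not change when NTP servers are added or removed: the list follows whatever nodes currently include the ntp::server class in PuppetDB.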