Docs: Hiera 1: Complete Example


Hiera 1: Complete Example

In this example, we’ll use the popular Puppet Labs ntp module, an exemplar of the package/file/service pattern in common use among Puppet users. We’ll start simply, using Hiera to provide the ntp module with parameter data based on particular nodes in our organization. Then we’ll use Hiera to assign the ntp class provided by the module to specific nodes.

What Can We Do With Hiera?

Let’s get started by looking at the ntp module. It does all of its work in a single ntp class, which lives in the init.pp manifest. The ntp class also evaluates some ERB templates stored in the module’s template directory. So, what can we do with Hiera?

Express Organizational Information

The ntp class takes five parameters:

  • servers
  • restrict
  • autoupdate
  • enable
  • template

Most of these parameters reflect decisions we have to make about each of the nodes to which we’d apply the ntp class: Can it act as a time server for other hosts? (restrict), which servers should it consult? (servers), or should we allow Puppet to automatically update the ntp package or not? (autoupdate).

Without Hiera, we might find ourselves adding organizational data to our module code as default parameter values, reducing how shareable it is. We might find ourselves repeating configuration data in our site manifests to cover minor differences in configuration between nodes.

With Hiera, we can move these decisions into a hierarchy built around the facts that drive these decisions, increase sharability, and repeat ourselves less.

Classify Nodes With Hiera

We can also use Hiera to assign classes to nodes using the hiera_include function, adding a single line to our site.pp manifest, then assigning classes to nodes within Hiera instead of within our site manifests. This can be a useful shortcut when we’re explicitly assigning classes to specific nodes within Hiera, but it becomes very powerful when we implicitly assign classes based on a node’s characteristics. In other words, we’ll show you how you don’t need to know the name of very VMWare guest in your organization to make sure they all have a current version of VMWare Tools installed.

Describing Our Environment

For purposes of this walkthrough, we’ll assume a situation that looks something like this:

  • We have two ntp servers in the organization that are allowed to talk to outside time servers. Other ntp servers get their time data from these two servers.
  • One of our primary ntp servers is very cautiously configured — we can’t afford outages, so it’s not allowed to automatically update its ntp server package without testing. The other server is more permissively configured.
  • We have a number of other ntp servers that will use our two primary servers.
  • We have a number of VMWare guest operating systems that need to have VMWare Tools installed.

Our Environment Before Hiera

How did things look before we decided to use Hiera? Classes are assigned to nodes via the puppet site manifest (/etc/puppet/manifests/sites.pp for Puppet open source), so here’s how our site manifest might have looked:

	node "kermit.example.com" {
	  class { "ntp":
		servers    => [ '0.us.pool.ntp.org iburst','1.us.pool.ntp.org iburst','2.us.pool.ntp.org iburst','3.us.pool.ntp.org iburst'],
		autoupdate => false,
		restrict => false,
		enable => true,
	  }
	}

	node "grover.example.com" {
	  class { "ntp":
		servers    => [ 'kermit.example.com','0.us.pool.ntp.org iburst','1.us.pool.ntp.org iburst','2.us.pool.ntp.org iburst'],
		autoupdate => true,
		restrict => false,
		enable => true,
	  }
	}

	node "snuffie.example.com", "bigbird.example.com", "hooper.example.com" {
	  class { "ntp":
		servers    => [ 'grover.example.com', 'kermit.example.com'],
		autoupdate => true,
		restrict => true,
		enable => true,
	  }
	}

Configuring Hiera and Setting Up the Hierarchy

All Hiera configuration begins with hiera.yaml. You can read a full discussion of this file, including where you should put it depending on the version of Puppet you’re using. Here’s the one we’ll be using for this walkthrough:

---
:backends:
  - json
:json:
  :datadir: /etc/puppet/hiera
:hierarchy:
  - node/%{::fqdn}
  - common

Step-by-step:

:backends: tells Hiera what kind of data sources it should process. In this case, we’ll be using JSON files.

:json: configures the JSON data backend, telling Hiera to look in /etc/puppet/hiera for JSON data sources.

:hierarchy: configures the data sources Hiera should consult. Puppet users commonly separate their hierarchies into directories to make it easier to get a quick top-level sense of how the hierarchy is put together. In this case, we’re keeping it simple:

  • A single node/ directory will contain any number of files named after some node’s fqdn (fully qualified domain name) fact. (E.g. /etc/puppet/hiera/node/grover.example.com.json) This lets us specifically configure any given node with Hiera. Not every node needs to have a file in node/ — if it’s not there, Hiera will just move onto the next hierarchy level.
  • Next, the common data source (the /etc/puppet/hiera/common.json file) will provide any common or default values we want to use when Hiera can’t find a match for a given key elsewhere in our hierarchy. In this case, we’re going to use it to set common ntp servers and default configuration options for the ntp module.

Hierarchy and facts note: When constructing a hierarchy, keep in mind that most of the useful Puppet variables are facts. Since facts are submitted by the agent node itself, they aren’t necessarily trustworthy. We don’t recommend using facts as the sole deciding factor for distributing sensitive credentials.

In this example, we’re using the fqdn fact to identify specific nodes; the special clientcert variable is another option, which the agent sets to the value of its certname setting. (This is usually the same as the fqdn, but not always.) Right now, Puppet doesn’t provide any variables with trusted data about a given node; we’re currently investigating the possibility of adding some.

Puppet master note: If you modify hiera.yaml between agent runs, you’ll have to restart your puppet master for your changes to take effect.

Configuring for the Command Line

The Hiera command line tool is useful when you’re in the process of designing and testing your hierarchy. You can use it to mock in facts for Hiera to look up without having to go through cumbersome trial-and-error puppet runs. Since the hiera command expects to find hiera.yaml at /etc/hiera.yaml, you should set a symbolic link from your hiera.yaml file to /etc/hiera.yaml:

$ ln -s /etc/puppet/hiera.yaml /etc/hiera.yaml

Writing the Data Sources

Now that we’ve got Hiera configured, we’re ready to return to the ntp module and take a look at the ntp class’s parameters.

Learning About Hiera Data Sources: This example won’t cover all the data types you might want to use, and we’re only using one of two built-in data backends (JSON). For a more complete look at data sources, please see our guide to writing Hiera data sources, which includes more complete examples written in JSON and YAML.

Identifying Parameters

We need to start by figuring out the parameters required by the ntp class. So let’s look at the ntp module’s init.pp manifest, where we see five:

servers
An array of time servers; UNSET by default. Conditional logic in init.pp provides a list of ntp servers maintained by the respective maintainers of our module’s supported operating systems.
restrict
Whether to restrict ntp daemons from allowing others to use as a server; true by default
autoupdate
Whether to update the ntp package automatically or not; false by default
enable
Whether to start the ntp daemon on boot; true by default
template
The name of the template to use to configure the ntp service. This is undef by default, and it’s configured within the init.pp manifest with some conditional logic.

Making Decisions and Expressing Them in Hiera

Now that we know the parameters the ntp class expects, we can start making decisions about the nodes on our system, then expressing those decisions as Hiera data. Let’s start with kermit and grover: The two nodes in our organization that we allow to talk to the outside world for purposes of timekeeping.

kermit.example.com.json

We want one of these two nodes, kermit.example.com, to act as the primary organizational time server. We want it to consult outside time servers, we won’t want it to update its ntp server package by default, and we definitely want it to launch the ntp service at boot. So let’s write that out in JSON, making sure to express our variables as part of the ntp namespace to insure Hiera will pick them up as part of its automatic parameter lookup.

{
   "ntp::restrict" : false,
   "ntp::autoupdate" : false,
   "ntp::enable" : true,
   "ntp::servers" : [
	   "0.us.pool.ntp.org iburst",
	   "1.us.pool.ntp.org iburst",
	   "2.us.pool.ntp.org iburst",
	   "3.us.pool.ntp.org iburst"
	   ]
}

Since we want to provide this data for a specific node, and since we’re using the fqdn fact to identify unique nodes in our hierarchy, we need to save this data in the /etc/puppet/hiera/node directory as kermit.example.com.json.

Once you’ve saved that, let’s do a quick test using the Hiera command line tool:

$ hiera ntp::servers fqdn=kermit.example.com

You should see this:

["0.us.pool.ntp.org iburst", "1.us.pool.ntp.org iburst", "2.us.pool.ntp.org iburst", "3.us.pool.ntp.org iburst"]

That’s just the array of outside ntp servers and options, which we expressed as a JSON array and which Hiera is converting to a Puppet-like array. The module will use this array when it generates configuration files from its templates.

Something Went Wrong? If, instead, you get nil, false, or something else completely, you should step back through your Hiera configuration making sure:

  • Your hiera.yaml file matches the example we provided
  • You’ve put a symlink to hiera.yaml where the command line tool expects to find it (/etc/hiera.yaml)
  • You’ve saved your kermit.example.com data source file with a .json extension
  • Your data source file’s JSON is well formed. Missing or extraneous commas will cause the JSON parser to fail
  • You restarted your puppet master if you modified hiera.yaml

Provided everything works and you get back that array of ntp servers, you’re ready to configure another node.

grover.example.com.json

Our next ntp node, grover.example.com, is a little less critical to our infrastructure than kermit, so we can be a little more permissive with its configuration: It’s o.k. if grover’s ntp packages are automatically updated. We also want grover to use kermit as its primary ntp server. Let’s express that as JSON:

{
   "ntp::restrict" : false,
   "ntp::autoupdate" : true,
   "ntp::enable" : true,
   "ntp::servers" : [
	   "kermit.example.com iburst",
	   "0.us.pool.ntp.org iburst",
	   "1.us.pool.ntp.org iburst",
	   "2.us.pool.ntp.org iburst",
	   ]
}

As with kermit.example.com, we want to save grover’s Hiera data source in the /etc/puppet/hiera/nodes directory using the fqdn fact for the file name: grover.example.com.json. We can once again test it with the hiera command line tool:

$ hiera ntp::servers fqdn=grover.example.com
["kermit.example.com iburst", "0.us.pool.ntp.org iburst", "1.us.pool.ntp.org iburst", "2.us.pool.ntp.org iburst"]

common.json

So, now we’ve configured the two nodes in our organization that we’ll allow to update from outside ntp servers. However, we still have a few nodes to account for that also provide ntp services. They depend on kermit and grover to get the correct time, and we don’t mind if they update themselves. Let’s write that out in JSON:

{
   "ntp::restrict" : true,
   "ntp::autoupdate" : true,
   "ntp::enable" : true,
   "ntp::servers" : [
	   "grover.example.com iburst",
	   "kermit.example.com iburst"
	  ]
}

Unlike kermit and grover, for which we had slightly different but node-specific configuration needs, we’re comfortable letting any other node that uses the ntp class use this generic configuration data. Rather than creating a node-specific data source for every possible node on our network that might need to use the ntp module, we’ll store this data in /etc/puppet/hiera/common.json. With our very simple hierarchy, which so far only looks for the fqdn facts, any node with a fqdn that doesn’t match the nodes we have data sources for will get the data found in common.json. Let’s test against one of those nodes:

$ hiera ntp::servers fqdn=snuffie.example.com
["kermit.example.com iburst", "grover.example.com iburst"]

Modifying Our site.pp Manifest

Now that everything has tested out from the command line, it’s time to get a little benefit from this work in production.

If you’ll remember back to our pre-Hiera configuration, we were declaring a number of parameters for the ntp class in our site.pp manifest, like this:

	node "kermit.example.com" {
	  class { "ntp":
		servers    => [ '0.us.pool.ntp.org iburst','1.us.pool.ntp.org iburst','2.us.pool.ntp.org iburst','3.us.pool.ntp.org iburst'],
		autoupdate => false,
		restrict => false,
		enable => true,
	  }
	}

In fact, we had three separate stanzas of that length. But now that we’ve moved all of that parameter data into Hiera, we can significantly pare down site.pp:

	node "kermit.example.com", "grover.example.com", "snuffie.example.com" {
      include ntp
      # or:
      # class { "ntp": }
	}

That’s it.

Since Hiera is automatically providing the parameter data from the data sources in its hierarchy, we don’t need to do anything besides assign the ntp class to the nodes and let Hiera’s parameter lookups do the rest. In the future, as we change or add nodes that need to use the ntp class, we can:

  • Quickly copy data source files to cover cases where a node needs a specialized configuration.
  • If the new node can work with the generic configuration in common.json, we can say include ntp in our site.pp without writing any new Hiera data.
  • Since Hiera looks up each parameter individually, we can also write a new JSON file that, for example, only changes ntp::autoupdate — Hiera will get the rest of the parameters from common.json.

If you’re interested in taking things a step further, using the decision-making skills you picked up in this example to choose which nodes even get a particular class, let’s keep going.

Assigning a Class to a Node With Hiera

In the first part of our example, we were concerned with how to use Hiera to provide data to a parameterized class, but we were assigning the classes to nodes in the traditional Puppet way: By making class declarations for each node in our site.pp manifest. Thanks to the hiera_include function, you can assign nodes to a class the same way you can assign values to class parameters: Picking a facter fact on which you want to base a decision, adding to the hierarchy in your hiera.yaml file, then writing data sources.

Using hiera_include

Where last we left off, our site.pp manifest was looking somewhat spare. With the hiera_include function, we can pare things down even further by picking a key to use for classes (we recommend classes), then declaring it in our site.pp manifest:

hiera_include('classes')

From this point on, you can add or modify an existing Hiera data source to add an array of classes you’d like to assign to matching nodes. In the simplest case, we can visit each of kermit, grover, and snuffie and add this to their JSON data sources in /etc/puppet/hiera/node:

"classes" : "ntp",

modifying kermit’s data source, for instance, to look like this:

{
   "classes" : "ntp",
   "ntp::restrict" : false,
   "ntp::autoupdate" : false,
   "ntp::enable" : true,
   "ntp::servers" : [
	   "0.us.pool.ntp.org iburst",
	   "1.us.pool.ntp.org iburst",
	   "2.us.pool.ntp.org iburst",
	   "3.us.pool.ntp.org iburst"
	   ]
}

hiera_include requires either a string with a single class, or an array of classes to apply to a given node. Take a look at the “classes” array at the top of our kermit data source to see how we might add three classes to kermit:

{
   "classes" : [
	   "ntp",
	   "apache",
	   "postfix"
   ],
   "ntp::restrict" : false,
   "ntp::autoupdate" : false,
   "ntp::enable" : true,
   "ntp::servers" : [
	   "0.us.pool.ntp.org iburst",
	   "1.us.pool.ntp.org iburst",
	   "2.us.pool.ntp.org iburst",
	   "3.us.pool.ntp.org iburst"
	   ]
}

We can test which classes we’ve assigned to a given node with the Hiera command line tool:

$ hiera classes fqdn=kermit.example.com
["ntp", "apache", "postfix"]

Note: The hiera_include function will do an array merge lookup, which can let more specific data sources add to common sources instead of replacing them. This helps you avoid repeating yourself.

Using Facts to Drive Class Assignments

That demonstrates a very simple case for hiera_include, where we knew that we wanted to assign a particular class to a specific host by name. But just as we used the fqdn fact to choose which of our nodes received specific parameter values, we can use that or any other fact to drive class assignments. In other words, you can assign classes to nodes based on characteristics that aren’t as obvious as their names, creating the possibility of configuring nodes based on more complex characteristics.

Some organizations might choose to make sure all their Macs have Homebrew installed, or assign a postfix class to nodes that have a mail server role expressed through a custom fact, or assign a vmware_tools class that installs and configures VMWare Tools packages on every host that returns vmware as the value for its virtual fact.

In fact, let’s use that last case for this part of the walkthrough: Installing VMWare Tools on a virtual guest. There’s a puppet-vmwaretools module on the Puppet Forge that addresses just this need. It takes two parameters:

version
The version of VMWare Tools we want to install
working_dir
The directory into which we want to install VMWare

Two ways we might want to use Hiera to help us organize our use of the class this module provides include making sure it’s applied to all our VMWare virtual hosts, and configuring where it’s installed depending on the guest operating system for a given virtual host.

So let’s take a look at our hiera.yaml file and make provisions for two new data sources. We’ll create one based on the virtual fact, which will return vmware when a node is a VMWare-based guest. We’ll create another based on the osfamily fact, which returns the general family to which a node’s operating system belongs (e.g. “Debian” for Ubuntu and Debian systems, or “RedHat” for RHEL, CentOS, and Fedora systems):

---
:backends:
  - json
:json:
  :datadir: /etc/puppet/hieradata
:hierarchy:
  - node/%{::fqdn}
  - virtual/%{::virtual}
  - osfamily/%{osfamily}
  - common

Next, we’ll need to create directories for our two new data sources:

`mkdir /etc/puppet/hiera/virtual; mkdir /etc/puppet/hiera/osfamily`

In our virtual directory, we’ll want to create the file vmware.json. In this data source, we’ll be assigning the vmwaretools class, so the file will need to look like this:

{
  "classes": "vmwaretools"
}

Next, we need to provide the data for the vmwaretools class parameters. We’ll assume we have a mix of Red Hat and Debian VMs in use in our organization, and that we want to install VMWare Tools in /opt/vmware in our Red Hat VMs, and /usr/local/vmware for our Debian VMs. We’ll need RedHat.json and Debian.json files in the /etc/puppet/hiera/osfamily directory.

RedHat.json should look like this:

{
  "vmwaretools::working_dir" : "/opt/vmware"
}

Debian.json should look like this:

{
 "vmwaretools::working_dir" : "/usr/local/vmware"
}

That leaves us with one parameter we haven’t covered: the version parameter. Since we don’t need to vary which version of VMWare Tools any of our VMs are using, we can put that in common.json, which should now look like this:

{
   "vmwaretools::version" : "8.6.5-621624",
   "ntp::restrict" : true,
   "ntp::autoupdate" : true,
   "ntp::enable" : true,
   "ntp::servers" : [
	   "grover.example.com iburst",
	   "kermit.example.com iburst"
	  ]
}

Once you’ve got all that configured, go ahead and test with the Hiera command line tool:

$ hiera vmwaretools::working_dir osfamily=RedHat
/opt/vmware

$ hiera vmwaretools::working_dir osfamily=Debian
/usr/local/vmware

$ hiera vmwaretools::version
8.6.5-621624

$ hiera classes virtual=vmware
vmwaretools

If everything worked, great. If not, consult the checklist we provided earlier and give it another shot.

Exploring Hiera Further

We hope this walkthrough gave you a good feel for the things you can do with Hiera. There are a few things we didn’t touch on, though:

  • We didn’t discuss how to use Hiera in module manifests, preferring to highlight its ability to provide data to parameterized classes. Hiera also provides a collection of functions that allow you to use Hiera within a module. Tread carefully here, though: Once you start requiring Hiera for your module to work at all, you’re reducing its shareability and potentially closing it off to some users.
  • We showed how to conduct priority lookups with Hiera; that is, retrieving data from the hierarchy based on the first match for a given key. This is the only way to use Hiera with parameterized classes, but Hiera’s lookup functions include special hash and array lookups, allowing you to collect data from sources throughout your hierarchy, or to selectively override your hierarchy’s normal precedence. This allows you to declare, for instance, certain base values for all nodes, then layer on additional values for nodes that match differing keys, receiving all the data back as an array or hash.

↑ Back to top