Only Run Puppet Resources If Service Exists

I recently ran into a bit of a struggle configuring some infrastructure with Puppet. I created a module that manages .jar files on several servers, and I was in the process of expanding the number of items that particular module applied to. My problem stemmed from the fact that I had a large number of devices in node definitions that would need these .jar files, but not all of the devices in each node definition needed them… only the ones running a particular service for LogicMonitor. My first thought was to break them out into new node definitions, but that would’ve meant taking the 3 node definitions I had for this particular set of devices and effectively doubling them to 6: one set for the devices running the service in question and one for those that weren’t. I didn’t really relish the idea of making these node definitions more complicated than they already were.

My hope was that I could instead apply my module to all of the devices and then, in the module’s definition itself, only manage the .jar files and the state of the service if the service existed. Unfortunately, this was easier said than done. While it’s trivial in Puppet to manage a service (e.g. to ensure that it’s running), there isn’t really a good way to use the existence of a service as part of a conditional that I could wrap all of the resources in. The closest I came was having an exec resource that checks the state of the service and writes it to a file. It’s technically possible to then save the content of that file to a variable:

$content = file('/path/to/file')

The problem is that it’s impossible to use a require or notify metaparameter with this. Functions such as file() are evaluated when the catalog is compiled on the Puppet master, before any resource is applied on the device, so the exec can’t be made to run first. The file wouldn’t exist yet, and the entire module would fail on the initial run. Back to the drawing board.

My next thought (which probably should have been my first thought) was to see if services were included by Facter when a device reports in with its facts. Services didn’t get included, but it sent me down the rabbit hole of custom facts. External facts, in particular, seemed like exactly what I needed. I could include an arbitrary script that a device would execute, and its output would be reported back as a fact. There are a few formats that can be used, but I opted to just use JSON. What was unclear to me at the time, though, was where this script needed to live. The documentation for external facts gave a few places that work but recommended this:

<MODULEPATH>/<MODULE>/facts.d/

MODULEPATH just means the modules directory in the environment where the module lives. It looks like:

Puppet repo
- environments
  - modules
    - jar module
      - files
      - facts.d
      - manifests
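
As a rough sketch of getting a script into that layout, something like the following could be used. The install_external_fact function, the module name "jar", and the file name "logicmonitor.sh" are all hypothetical names chosen for illustration. It also assumes the script needs its executable bit set, which is what the external facts documentation asks for on Unix-like systems.

```shell
#!/usr/bin/env bash
# Sketch only: install_external_fact is a hypothetical helper, not a Puppet tool.
# Usage: install_external_fact <script> <module directory>
install_external_fact() {
  local src="$1" module_dir="$2"

  # Create facts.d next to the module's files/ and manifests/ directories.
  mkdir -p "${module_dir}/facts.d"

  # Copy the script in and mark it executable so that it can be run.
  cp "${src}" "${module_dir}/facts.d/"
  chmod 0755 "${module_dir}/facts.d/$(basename "${src}")"
}
```

With the names assumed above, it would be called as: install_external_fact ./logicmonitor.sh ./modules/jar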

The code itself was super simple. I wouldn’t really recommend bespoke JSON in most instances, but this was so simple I didn’t want to add another dependency like jq.

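#!/usr/bin/env bash
# systemctl status exits 0 only when the unit is active, so a stopped or
# missing service is reported as "false".
IS_COLL=$(systemctl status logicmonitor-agent.service > /dev/null 2>&1 && echo -n true || echo -n false)
# Emit the fact as JSON; the value is the string "true" or "false".
echo "{\"logicmonitor\": {\"is_collector\": \"${IS_COLL}\"}}"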
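
One caveat: systemctl status reports whether the service is running, not strictly whether it exists. If that distinction matters, a minimal sketch along these lines could check for the unit itself. It assumes that systemctl cat exits non-zero when systemd cannot find the unit, and emit_fact is a hypothetical helper (not part of Facter or Puppet) that takes the probe command as arguments so the JSON output can be checked without systemd.

```shell
#!/usr/bin/env bash
# Sketch only: emit_fact is a hypothetical helper, not part of Facter or Puppet.
# It runs the probe command given as arguments and prints the fact as JSON.
emit_fact() {
  local is_coll=false
  # Any probe that exits 0 counts as "this device is a collector".
  if "$@" > /dev/null 2>&1; then
    is_coll=true
  fi
  printf '{"logicmonitor": {"is_collector": "%s"}}\n' "${is_coll}"
}

# Assumption: `systemctl cat` fails when no unit file exists for the service,
# so this reports true for an installed unit whether or not it is running.
emit_fact systemctl cat logicmonitor-agent.service
```

Either version reports the same logicmonitor.is_collector fact, so the rest of the module doesn’t need to change.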

Then I can use this like any other fact to make my conditional:

if $facts['logicmonitor']['is_collector'] == 'true' {
  # Do things.
}

My only remaining point of confusion was how the devices would know to report this fact. Puppet works with an agent on each device that executes Facter and phones in to the Puppet master. The Puppet master looks at those facts and uses them to determine if the device needs to do anything to bring its configuration into alignment. With a new fact being added to a module, I didn’t see how it would end up in Facter’s output. This confusion was cleared up after my first test, though. The Puppet master is aware of the existence of the external fact, so when a device checks in, the first thing that happens is that the script is synced down to the device, executed, and its result reported in along with the other facts. Only then does the normal Puppet process of making configuration changes resume. It does add an extra step (and an extra bit of communication between the two devices), but in most environments the overhead should be negligible.