Puppet/OpenVox: easy incremental rollout

We are trying something new: Some how-to posts that are useful to international audiences will be published in English, so they can be shared more widely 🙂

We use Puppet, or rather the OpenVox fork, as our configuration management software for Linux machines.

Sometimes there are changes you do not want to roll out everywhere at once, but gradually, step by step. One way to do this in Puppet is using environments: some machines will receive their configuration from a different Git branch. For manual testing on a handful of machines this is sufficient. But what if you have a change that you want to roll out to an increasing number of machines until you hit 100%? Environments are not a great fit:

  • Coworkers want to roll out their own changes; therefore we only keep machines on a different environment for a limited time
  • We’d have to manually assign the environment to an increasing number of machines
  • When choosing test machines, people will probably not pick machines perfectly at random, reducing the breadth of machine variations that are tested

As luck would have it, there is a way to roll out something to a fraction of machines, without having to manually assign machines, and without using an environment:


# fqdn_rand(100) returns a stable per-machine integer from 0 to 99
$configure_ssh = (fqdn_rand(100) < 10)
if $configure_ssh {
  include profile::ssh::daemon
}

This snippet of Puppet code loads the class profile::ssh::daemon on roughly 10% of machines, chosen pseudo-randomly. The rollout can be continued by increasing the number from 10 to 20, 25, 50, 75, etc. Machines that already received the change keep it, because a value below 10 is also below 20. Afterwards one can remove the condition and always load the class.
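To avoid editing the comparison by hand at every step, the threshold could be turned into a class parameter. The following is only a minimal sketch: the wrapper class profile::ssh::rollout and its parameter $percentage are hypothetical names chosen for illustration, not something from our code base.

```puppet
# Hypothetical wrapper class; the names are illustrative only.
class profile::ssh::rollout (
  # Share of machines (in percent) that should receive the change.
  Integer[0, 100] $percentage = 10,
) {
  # fqdn_rand(100) yields a stable per-machine integer from 0 to 99,
  # so 0 selects no machine and 100 selects every machine.
  if fqdn_rand(100) < $percentage {
    include profile::ssh::daemon
  }
}
```

With a class like this, the rollout step could be raised through Hiera's automatic parameter lookup (the key would be profile::ssh::rollout::percentage) instead of a change to the manifest.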

The key is the Puppet function fqdn_rand: when called with 100 as the argument, it returns a pseudo-random integer from 0 to 99. The value does not change between Puppet runs, because the machine’s FQDN is the seed for the RNG (random number generator). Therefore, every run on the same machine returns the same result.
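One consequence of the FQDN being the only seed is that every rollout written this way starts on the same machines. fqdn_rand accepts an optional second argument, a seed string that is combined with the FQDN. The sketch below assumes two unrelated rollouts; the seed strings and variable names are made up for illustration.

```puppet
# The optional second argument is an extra seed mixed in with the FQDN.
# Each seed gives a machine a different, but still stable, value,
# so the early machines should differ between the two rollouts.
$ssh_bucket = fqdn_rand(100, 'ssh daemon rollout')
$ntp_bucket = fqdn_rand(100, 'ntp rollout')

# Log both values during catalog compilation, to check which side
# of a threshold a given machine falls on.
notice("rollout buckets: ssh=${ssh_bucket} ntp=${ntp_bucket}")
```

Whether always hitting the same machines first is a problem depends on the fleet: a fixed set of early machines is predictable, while per-rollout seeds spread the risk more evenly.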

If we assume that the RNG’s values are uniformly distributed, every number from 0 to 99 has a 1% chance of being returned; think of it as a 100-sided die. Since we have over 1000 machines, this is accurate enough: a few machines more or fewer do not matter. We then compare the return value of fqdn_rand with our desired rollout percentage, e.g. 10. Since fqdn_rand(100) ranges from 0 to 99, fqdn_rand(100) < 10 is true for 10 of the 100 possible values (0 to 9), so the rollout condition holds on roughly 10% of machines. The other 90% will not receive the change.
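How many machines "a few more or fewer" means can be estimated with a short calculation. This is a rough sketch under two assumptions: each machine's value is uniform and independent of the others, and the fleet has exactly N = 1000 machines (we only said "over 1000" above). The number K of selected machines then follows a binomial distribution:

```latex
\[
K \sim \mathrm{Binomial}(N, p), \qquad
\mathbb{E}[K] = N p, \qquad
\sigma_K = \sqrt{N p (1 - p)}
\]
\[
N = 1000,\ p = 0.10: \qquad
\mathbb{E}[K] = 100, \qquad
\sigma_K = \sqrt{90} \approx 9.5
\]
```

So a 10% step would be expected to hit about 100 machines, give or take roughly ten.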

This small piece of Puppet code is a powerful tool to perform incremental rollouts.

If this snippet caught your attention, and you are living in Germany, we are hiring. 😉

(Header image: Dietmar Rabich / Wikimedia Commons / “Würfel — 2021 — 4265” / CC BY-SA 4.0)