Nagios is from Mars, and MySQL is from Venus (Monitoring Part 2)

In my previous post about Nagios, I showed how the rich Nagios charm simplifies adding basic monitoring to Juju environments. But, we need more than that. Services know more about how to verify they are working than a monitoring system could ever guess.

So, for that, we have the 'monitors' interface. Currently sparsely documented, as it is extremely new, the idea is simple.

Providing charms define what monitoring systems should look at generically
Requiring charms monitor any of the things they can, ignoring those they cannot.

This is defined through a YAML file. Here is the example.monitors.yaml included with the nagios charm:

# Version of the spec, mostly ignored but 0.3 is the current one
version: '0.3'
# Dict with just 'local' and 'remote' as parts
monitors:
    # local monitors need an agent to be handled. See nrpe charm for
    # some example implementations
    local:
        # procrunning checks for a running process named X (no path)
        procrunning:
            # Multiple procrunning can be defined, this is the "name" of it
            nagios3:
                min: 1
                max: 1
                executable: nagios3
    # Remote monitors can be polled directly by a remote system
    remote:
        # do a request on the HTTP protocol
        http:
            nagios:
                port: 80
                path: /nagios3/
                # expected status response (otherwise just look for 200)
                status: 'HTTP/1.1 401'
                # Use as the Host: header (the server address will still be used to connect() to)
                host: www.fewbar.com
        mysql:
            # Named basic check
            basic:
                username: monitors
                password: abcdefg123456

There are two main classes of monitors: local, remote. This is in reference to the service unit's location. Local monitors are intended to be run inside the same machine/container as the service. Remote monitors are then, quite obviously, meant to be run outside the machine/container. So, above, you see remote monitors for the mysql protocol and http, and a local monitor to see if processes are running.

The MySQL charm now includes some of these monitors:

version: '0.3'
monitors:
    local:
        procrunning:
            mysqld:
                name: MySQL Running
                min: 1
                max: 1
                executable: mysqld
    remote:
        mysql:
            basic:
                user: monitors

The remote part is fed directly to nagios, which knows how to monitor mysql remotely, and so translates it into a check_mysql command. The local bits are ignored by Nagios. But, when we also relate the subordinate charm NRPE to a MySQL service, then we'll have an agent which understands local. It actually converts those into remote monitors of type 'nrpe' which Nagios does understand. So upon relating NRPE to Nagios, each subordinate unit feeds its unique NRPE monitors back to Nagios and they are added to the target units' monitors.

Honestly, this all sounds very complicated. But luckily, you don't have to really grasp it to take advantage of it in a charm. The whole point is this: All one needs to do is write a monitors.yaml, and add the monitors relation with this joined hook to your charm:

#!/bin/bash
# .. Anything you need to do to enable the monitoring host to access ports/users/etc goes here
relation-set monitors="$(cat monitors.yaml)" target-id=${JUJU_UNIT_NAME//\//-} target-address=$(unit-get private-address)

If you have local things you want to give to a monitoring agent, you can use the 'local-monitors' interface, which is basically the same as monitors, but only ever used in container scoped relations required by subordinate charms such as NRPE or collectd.

Now you can easily provide monitors to any monitoring system. If Nagios doesn't support what you want to monitor, its fairly easy to add support. And as more monitoring systems are charmed and have the monitors interface added, your charm will be more useful out of the box.

In the next post, which will wrap up this series on monitoring, I'll talk about how to add monitors support to some other monitoring systems such as collectd, and also how to write a subordinate charm to communicate your monitors to an external monitoring service.