Prometheus monitoring: blackbox exporter

The blackbox exporter periodically checks endpoints (for example, an HTTP page) and measures latency and availability. Some things we could monitor with this:

  • ocfweb endpoints (webpages and APIs)
  • user web hosting
  • DNS
  • whether hosts respond to ping (ICMP)
  • IRC
  • probably lots more
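
To make the idea of a probe concrete, here is a minimal sketch in Python of what a single HTTP check measures. It is not the blackbox exporter's code, only an illustration: the names `ProbeResult`, `fetch_status`, and `probe_http` are hypothetical, and treating any 2xx status as "up" is an assumption that mirrors the exporter's default behaviour for HTTP probes. The two fields correspond roughly to the `probe_success` and `probe_duration_seconds` metrics the exporter reports.

```python
import time
import urllib.request
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ProbeResult:
    success: bool            # roughly what probe_success reports (up or down)
    duration_seconds: float  # roughly what probe_duration_seconds reports


def fetch_status(url: str, timeout: float) -> int:
    """Fetch the URL and return its HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status


def probe_http(
    url: str,
    timeout: float = 5.0,
    fetch: Optional[Callable[[str, float], int]] = None,
) -> ProbeResult:
    """Probe one URL once; hypothetical helper, not part of any exporter."""
    fetch = fetch or fetch_status
    start = time.perf_counter()
    try:
        status = fetch(url, timeout)
        # Assumption: only 2xx responses count as a successful probe.
        success = 200 <= status < 300
    except OSError:
        # Covers connection errors, timeouts, and urllib's HTTPError
        # (raised for 4xx/5xx responses).
        success = False
    return ProbeResult(success, time.perf_counter() - start)


if __name__ == "__main__":
    # Placeholder URL; substitute one of the endpoints listed above.
    print(probe_http("https://example.com/"))
```

The real exporter does this on every Prometheus scrape and exposes the results as metrics, so we would only write configuration, not code like this.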

Since the blackbox exporter would run from inside our network, it wouldn’t be able to detect outages in our connectivity to the outside world (which definitely happen). It would still be useful, since it would help us debug internal outages.

The blackbox exporter would run in a Docker container on Marathon, similar to how snmp_exporter is set up. Setting it up is a great way to learn how we spawn new Marathon services.

Once this is set up, we can discuss what things we’d like to alert on.