The ability to view and record metrics is incredibly useful in many situations, but isn't something we magically get. Having such a system in place allows you to analyse many different aspects of an application or system for both performance and health purposes. Metrics can be combined in countless ways to allow you to determine how resources are being used, to troubleshoot problems, or to find bottlenecks or extra resources which could be better utilised. In this post, I'm going to walk you through installing Prometheus - a very popular time series database - using Docker.
When it comes to monitoring, there's a lot of different paths you can go down, and as with most things, the choices you make are informed by what you aim to get out of the system. For me, I want to record the metric data which is already being exposed by the various servers and applications that I use. I also want to be able to access the metrics in a simple and performant manner.
In the past, I've mainly relied upon two time series databases, Prometheus and InfluxDB, and as with pretty much everything, they both have distinct pros and cons. I'm not going to discuss those in too much detail here, as that's worthy of a post in its own right.
The main reason I prefer Prometheus is simple: the apps I use all natively support exporting their metrics in Prometheus' format. So either I use Prometheus or I have to find a way to get another TSDB to interpret their format - and, well, that sounds like unnecessary effort. I do also like how Prometheus pulls the metrics from each endpoint, as opposed to Influx's default push method (where each client independently pushes the metrics to Influx) - another discussion worthy of its own post.
Getting started with Prometheus
Thanks to the simplicity of Docker, actually "installing" Prometheus is very easy. All we have to do is run the following command on a host that runs Docker:
docker run -d \
  --name=prometheus \
  -p 9090:9090 \
  -v /path/on/host/prometheus/data:/prometheus \
  -v /path/on/host/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml \
  --restart=unless-stopped \
  prom/prometheus:latest
The Prometheus interface/API operates on TCP port 9090 by default, and I have no reason to change that, so I have left it as is. Prometheus needs access to two things on the filesystem: firstly, a place to store the data, and secondly, a configuration file.
I like to have a single folder on my Docker hosts that contains the volumes for all of my containers (to greatly simplify management and backups), so all I had to do was add a prometheus folder with the prometheus.yml file and a "data" subfolder inside. Then we just need to tell Docker to bind these paths into the container, which is done using the -v (volume) parameter in the Docker command (the format is host-path:container-path).
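As a rough sketch of that layout, the helper below creates the prometheus folder, its "data" subfolder and an empty prometheus.yml. The function name and the example base path are my own placeholders, not anything Docker or Prometheus require. Creating the file before the first run matters, because when a bind-mount source doesn't exist, Docker's -v creates it as a directory rather than a file.

```shell
# make_prometheus_dirs is a hypothetical helper; pass it the folder that
# holds your container volumes.
make_prometheus_dirs() {
  base="$1"
  # Folder that will be bound to /prometheus inside the container
  mkdir -p "$base/prometheus/data"
  # Create the config file up front so Docker does not create a
  # directory at this path on first run
  touch "$base/prometheus/prometheus.yml"
}

# Example: make_prometheus_dirs /path/on/host
```

If the container then can't write to the data folder, check the folder's ownership and permissions, as the image runs Prometheus as a non-root user.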
The configuration file for Prometheus is straightforward, although it does use the YAML format (which is incredibly fussy about indentation and symbols). Anyway, here's just about the simplest config you can get, so go ahead and stick that inside the prometheus.yml file:
global:
  scrape_interval: 10s
  evaluation_interval: 10s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
That tells Prometheus to scrape its own metrics endpoint every 10 seconds and save the metrics it finds to the database. You can add additional jobs and targets just by adding to that file. I use Netdata to collate what is, frankly, a ridiculous number of metrics into the Prometheus format and expose that data on an HTTP endpoint. Adding the following config is all that is needed to get Prometheus to start recording that data (I've added some notes to explain what each part does, feel free to remove those):
  # The job_name can be anything you like
  - job_name: 'netdata'
    # This is the HTTP path on the target that the metrics can be accessed
    metrics_path: '/api/v1/allmetrics'
    # This allows you to supply query parameters
    # Here it'll add "?format=prometheus" to the request as Netdata needs that
    params:
      format: [prometheus]
    honor_labels: true
    # This tells Prometheus to ignore invalid certificates
    tls_config:
      insecure_skip_verify: true
    # This is where you define the targets for this job
    static_configs:
      - targets: ['192.168.111.250']
That section of config basically tells Prometheus to scrape 192.168.111.250/api/v1/allmetrics?format=prometheus. It might look a little daunting at first, but once you understand how it breaks down, it's really quite straightforward.
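To make that breakdown concrete, here's a small sketch that joins the three pieces of the job (target, metrics_path and params) into the URL that gets requested. build_scrape_url is a hypothetical helper of mine, and the http:// prefix is an assumption based on the job not setting a scheme.

```shell
# build_scrape_url is a hypothetical helper: target + metrics_path + params.
# "http" is assumed because the job above does not set a scheme.
build_scrape_url() {
  target="$1"         # an entry from static_configs targets
  metrics_path="$2"   # the job's metrics_path
  query="$3"          # the job's params, written as key=value
  printf 'http://%s%s?%s\n' "$target" "$metrics_path" "$query"
}

# Example, run from the Docker host, to see the first few metrics:
#   curl -s "$(build_scrape_url 192.168.111.250 /api/v1/allmetrics format=prometheus)" | head
```

Running the curl line from the Docker host is a quick way to confirm the endpoint answers before Prometheus ever tries to scrape it.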
An example of what Prometheus-format metrics look like: a few lines of the many hundreds that Netdata exports.
You can add multiple jobs, and multiple targets to each job (I have all of my Netdata hosts specified in the above job). I have a separate job for GitLab because it uses a different URL path, and it requires an authentication token as a parameter - basically, you only need to create a new job when the hosts/apps have different requirements - or for organisation purposes.
Prometheus needs to reload the config after any changes in order for them to apply. The simplest way to do this is just to restart the Docker container with docker restart <name_of_prometheus_container>. If there are any errors in the config file, Prometheus won't start back up, and you can see where the problem is by checking the container's log with docker logs <name_of_prometheus_container>.
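If you'd rather catch mistakes before the restart, one option (a sketch, not something this setup requires) is to run promtool, the checker that ships in the prom/prometheus image, against the file first. check_and_restart is a hypothetical helper name, and the container name and config path you pass in are whatever you used in your own docker run command.

```shell
# check_and_restart is a hypothetical helper. It validates the config with
# promtool from the prom/prometheus image and only restarts the container
# if that check passes.
check_and_restart() {
  container="$1"   # name of the Prometheus container
  config="$2"      # host path to prometheus.yml
  docker run --rm --entrypoint promtool \
    -v "$config:/etc/prometheus/prometheus.yml:ro" \
    prom/prometheus:latest \
    check config /etc/prometheus/prometheus.yml || return 1
  # Restart, then show the most recent log lines to confirm a clean start
  docker restart "$container" && docker logs --tail 20 "$container"
}

# Example: check_and_restart prometheus /path/on/host/prometheus/prometheus.yml
```

The running container is left alone if the check fails, so a typo in the YAML doesn't take your metrics collection down.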
If the config is correct, the web interface will start up and you'll be able to navigate to <docker-host-ip>:9090/config and see the full, parsed config. Also, if you go to <docker-host-ip>:9090/targets you can see a summary of the targets for each job and when they were last scraped. There's all sorts of information available on the web interface and, as with anything new, I'd suggest having a look around.
Viewing the Data
Prometheus comes with a built-in metric viewer on its web interface, which is incredibly handy for testing queries and finding which metrics you want to use in an interface such as Grafana. The viewer is exposed at <docker-host-ip>:9090/graph. It's got a rather nice auto-suggest feature that shows any metric names containing what you have typed.
Here's an example which shows one of Prometheus' own metrics, prometheus_tsdb_head_series. It's a count of how many individual time series Prometheus is storing (note that you can switch between a console (table) view and a graph view).
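The same metric can also be fetched from the command line through Prometheus' HTTP API, which is handy for scripting. This is a minimal sketch: query_prometheus is a hypothetical helper, and it assumes the default port 9090 from the docker run command earlier.

```shell
# query_prometheus is a hypothetical helper around the HTTP API's instant
# query endpoint; it prints the raw JSON response.
query_prometheus() {
  host="$1"     # Docker host IP or name
  promql="$2"   # the expression you would type into the viewer
  # -G sends the URL-encoded data as a GET query string
  curl -s -G "http://$host:9090/api/v1/query" --data-urlencode "query=$promql"
}

# Example: query_prometheus <docker-host-ip> prometheus_tsdb_head_series
```

Using --data-urlencode means expressions containing braces, quotes or spaces don't need escaping by hand.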
A couple of notes
I intended to cover a lot more than I have ended up covering in this post; I didn't quite realise how much there is to go through. I'll be posting more about Netdata, Grafana, the use of InfluxDB to store metrics beyond Prometheus' default 15 days of retention, and more, in the future.
If you've followed this post then, hopefully, you now have a Prometheus server that is scraping all of the metrics for your soon-to-be prize-worthy dashboard. Obviously, there's still a way to go, as we don't really have a way to view the data - at least, not a convenient one. I'd highly recommend taking a look at Grafana for that purpose.
Obviously Prometheus will need firewall (and whatever else) access to the targets/paths that you specify. The curl command is incredibly useful for making sure the Docker host (and, unless you've done something fancy, the Prometheus container) can access those pages.
There are many ways to specify multiple targets. You can have Prometheus automatically find targets through service discovery (prometheus.io/blog/2015/06/01/advanced-service-discovery/), or you can keep it simple and do as I have done, e.g.:
['192.168.111.221:19999', '192.168.111.222:19999', '192.168.111.223:19999']