
Introducing the Postgres Prometheus Adapter


Yogesh Sharma

5 min read

Prometheus is a popular open source monitoring tool, and many of our customers leverage it when using the Crunchy PostgreSQL Operator or Crunchy PostgreSQL High Availability. Prometheus ships out of the box with its own time series data store, but we’re big fans of Postgres, and we know Postgres can handle time series just fine. Besides, if you’re already running PostgreSQL and using Prometheus to monitor it, why not store that data in a Postgres database?

Just because you can do something doesn’t mean you should, but in this case it’s not such a bad idea. By storing Prometheus metric data natively in Postgres, we can leverage many of PostgreSQL’s other features, such as full SQL access to the metrics, mature backup and recovery tooling, streaming replication, and native partitioning.

To make it easier for anyone who wants to use Postgres as their backing store for Prometheus, we’re proud to announce the release of the PostgreSQL Prometheus Adapter.

Prometheus + Postgres

Prometheus, a Cloud Native Computing Foundation project, is a systems and service monitoring system. It collects metrics from configured targets at given intervals, evaluates rule expressions, displays the results, and can trigger alerts if some condition is observed to be true.

Postgres is a powerful, open source object-relational database system with over 20 years of active development, known for its reliability, feature robustness, and performance.

PostgreSQL 12, released in 2019, brought major enhancements to its partitioning functionality, with an eye towards time-series data. The improvements include:

  • Improved query performance on tables with thousands of partitions
  • Improved insert performance with INSERT and COPY into partitioned tables
  • The ability to execute ALTER TABLE ATTACH PARTITION without blocking queries
  • Efficient processing by operations that affect only a small number of partitions, even on tables with thousands of child partitions

These time series improvements make PostgreSQL 12 a great candidate for backing a Prometheus monitoring setup.
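
To make that concrete, here is a brief sketch of the kind of range-partitioned layout these improvements benefit, using Go’s database/sql with the pgx driver. The metrics table and connection string are illustrative for this example, not the adapter’s actual schema:

package main

import (
    "database/sql"
    "log"

    _ "github.com/jackc/pgx/v4/stdlib" // pgx database/sql driver
)

func main() {
    // Illustrative connection string; substitute your own details.
    db, err := sql.Open("pgx", "postgres://username:password@host:5432/database")
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // A parent table partitioned by sample timestamp. Hourly or daily
    // child partitions can then be created, attached, and dropped without
    // blocking queries, thanks to the PostgreSQL 12 improvements above.
    if _, err := db.Exec(`
        CREATE TABLE IF NOT EXISTS metrics (
            name   text,
            labels text,
            value  double precision,
            ts     timestamptz
        ) PARTITION BY RANGE (ts)`); err != nil {
        log.Fatal(err)
    }
}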

Prometheus Storage Adapter

Prometheus’s remote storage adapter concept allows time series data to be stored externally using the remote write protocol. That externally stored data can then be read back using the remote read protocol.

For Prometheus to use PostgreSQL as remote storage, the adapter must implement a write method. This method will be called by Prometheus when storing data.

func (c *Client) Write(samples model.Samples) error {
   ...
}
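
For illustration, here is a minimal, deliberately unbatched sketch of what such a write method can look like. The pool field on Client, the pgx connection pool, and the metrics table are assumptions for this example; the actual adapter batches rows and routes them into time-based partitions:

package adapter

import (
    "context"

    "github.com/jackc/pgx/v4/pgxpool"
    "github.com/prometheus/common/model"
)

// Client holds a pgx connection pool; the pool field and the metrics
// table are assumptions for this sketch.
type Client struct {
    pool *pgxpool.Pool
}

// Write inserts each incoming sample one row at a time. The actual
// adapter batches rows and routes them into time-based partitions.
func (c *Client) Write(samples model.Samples) error {
    for _, s := range samples {
        _, err := c.pool.Exec(context.Background(),
            `INSERT INTO metrics (name, labels, value, ts) VALUES ($1, $2, $3, $4)`,
            string(s.Metric[model.MetricNameLabel]), // the "__name__" label
            s.Metric.String(),                       // full label set as text
            float64(s.Value),
            s.Timestamp.Time(),
        )
        if err != nil {
            return err
        }
    }
    return nil
}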

For Prometheus to read remotely stored data, the adapter must also implement a read method. This method will be called by Prometheus when clients request data.

func (c *Client) Read(req *prompb.ReadRequest) (*prompb.ReadResponse, error) {
   ...
}
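
Sketching the read side similarly, reusing the hypothetical Client from the write example (prompb here is github.com/prometheus/prometheus/prompb): a real implementation translates each query’s label matchers and time range into SQL and fills the returned time series from the matching rows:

package adapter

import (
    "github.com/prometheus/prometheus/prompb"
)

// Read answers a remote read request. This skeleton returns one empty
// QueryResult per query; a real implementation would select the rows
// between StartTimestampMs and EndTimestampMs that match each query's
// Matchers and convert them into prompb.TimeSeries entries.
func (c *Client) Read(req *prompb.ReadRequest) (*prompb.ReadResponse, error) {
    resp := &prompb.ReadResponse{}
    for range req.Queries {
        resp.Results = append(resp.Results, &prompb.QueryResult{})
    }
    return resp, nil
}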

PostgreSQL Prometheus Adapter

PostgreSQL Prometheus Adapter is a remote storage adapter designed to utilize PostgreSQL 12 native partitioning enhancements to efficiently store Prometheus time series data in a PostgreSQL database.

The PostgreSQL Prometheus Adapter design is based on partitioning and threads. Incoming data is processed by one or more parser threads, and one or more writer threads store the data in PostgreSQL in daily or hourly partitions. The adapter auto-creates partitions based on the timestamp of the incoming data.
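
Here is a hedged sketch of that auto-creation idea (the ensurePartition helper and metrics table are illustrative, not the adapter’s actual code): derive the child partition’s name and bounds from a sample timestamp, then create it if it is missing:

package adapter

import (
    "context"
    "fmt"
    "time"

    "github.com/jackc/pgx/v4/pgxpool"
)

// ensurePartition creates the hourly or daily child partition of the
// metrics table that covers ts, if it does not already exist.
func ensurePartition(ctx context.Context, pool *pgxpool.Pool, ts time.Time, hourly bool) error {
    var name string
    var from, to time.Time
    if hourly {
        from = ts.UTC().Truncate(time.Hour)
        to = from.Add(time.Hour)
        name = "metrics_" + from.Format("2006_01_02_15")
    } else {
        u := ts.UTC()
        from = time.Date(u.Year(), u.Month(), u.Day(), 0, 0, 0, 0, time.UTC)
        to = from.AddDate(0, 0, 1)
        name = "metrics_" + from.Format("2006_01_02")
    }
    // Partition names are generated internally, so string formatting is
    // acceptable here; bounds are passed as ISO 8601 timestamps.
    _, err := pool.Exec(ctx, fmt.Sprintf(
        `CREATE TABLE IF NOT EXISTS %s PARTITION OF metrics
           FOR VALUES FROM ('%s') TO ('%s')`,
        name, from.Format(time.RFC3339), to.Format(time.RFC3339)))
    return err
}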

Let’s build our PostgreSQL Prometheus Adapter setup

git clone https://github.com/CrunchyData/postgresql-prometheus-adapter.git
cd postgresql-prometheus-adapter

# Build the adapter binary
make

# Or, to build a container image instead
make container

You can also tweak a number of settings for the adapter. To see everything you can configure:

./postgresql-prometheus-adapter --help
usage: postgresql-prometheus-adapter [<flags>]

Remote storage adapter [ PostgreSQL ]

Flags:
  -h, --help                           Show context-sensitive help (also try --help-long and --help-man).
      --adapter-send-timeout=30s       The timeout to use when sending samples to the remote storage.
      --web-listen-address=":9201"     Address to listen on for web endpoints.
      --web-telemetry-path="/metrics"  Path under which to expose the adapter's own metrics.
      --log.level=info                 Only log messages with the given severity or above. One of: [debug, info, warn, error]
      --log.format=logfmt              Output format of log messages. One of: [logfmt, json]
      --pg-partition="hourly"          daily or hourly partitions, default: hourly
      --pg-commit-secs=15              Write data to database every N seconds
      --pg-commit-rows=20000           Write data to database every N Rows
      --pg-threads=1                   Writer DB threads to run 1-10
      --parser-threads=5               parser threads to run per DB writer 1-10

Depending on your metric counts, we’d recommend configuring:

  • --parser-threads - controls how many threads will be started to process incoming data
  • --pg-partition - controls whether to use hourly or daily partitions
  • --pg-threads - controls how many database writer threads will be started to insert data into the database
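
For example, a busier instance might use daily partitions with more writer and parser threads (the values here are illustrative, not universal recommendations):

./postgresql-prometheus-adapter \
  --pg-partition="daily" \
  --pg-threads=2 \
  --parser-threads=4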

Putting it all together

First we’re going to configure our Prometheus setup with remote write and read endpoints. To do this, edit your prometheus.yml and then restart Prometheus:

remote_write:
  - url: 'http://<adapter-host>:9201/write'
remote_read:
  - url: 'http://<adapter-host>:9201/read'

Next we’re going to set the environment variable for our database and start the adapter.

export DATABASE_URL="postgres://username:password@host:5432/database"
cd postgresql-prometheus-adapter
./postgresql-prometheus-adapter

If you’re running everything inside a container, your setup will look a bit different:

podman run --rm  \
  --name postgresql-prometheus-adapter  \
  -p 9201:9201 \
  -e DATABASE_URL="postgres://username:password@host:5432/database"  \
  --detach \
  crunchydata/postgresql-prometheus-adapter:latest

The following environment settings can be passed to podman for tweaking adapter settings:

adapter_send_timeout=30s       The timeout to use when sending samples to the remote storage.
web_listen_address=":9201"     Address to listen on for web endpoints.
web_telemetry_path="/metrics"  Path under which to expose the adapter's own metrics.
log_level=info                 Only log messages with the given severity or above. One of: [debug, info, warn, error]
log_format=logfmt              Output format of log messages. One of: [logfmt, json]
pg_partition="hourly"          daily or hourly partitions, default: hourly
pg_commit_secs=15              Write data to database every N seconds
pg_commit_rows=20000           Write data to database every N Rows
pg_threads=1                   Writer DB threads to run 1-10
parser_threads=5               parser threads to run per DB writer 1-10
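
For example, the same daily-partition, multi-writer tuning shown earlier can be passed through the container environment (illustrative values):

podman run --rm \
  --name postgresql-prometheus-adapter \
  -p 9201:9201 \
  -e DATABASE_URL="postgres://username:password@host:5432/database" \
  -e pg_partition="daily" \
  -e pg_threads=2 \
  --detach \
  crunchydata/postgresql-prometheus-adapter:latest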

How do I test the adapter without running a Prometheus instance?

You can simulate the Prometheus interface using Avalanche, which supports load testing for services accepting data via the Prometheus remote_write API.

./avalanche \
  --remote-url="http://<adapter-host>:9201/write" \
  --metric-count=10 \
  --label-count=15 \
  --series-count=30 \
  --remote-requests-count=100 \
  --remote-write-interval=100ms

Feel free to tweak the above settings based on your test case.

Further Reading

Crunchy High Availability PostgreSQL provides an integrated high availability solution for enterprises with “always on” data requirements. It delivers high availability, load balancing, and scalability to meet demanding performance requirements.

Give our Postgres Prometheus adapter a try today, or visit the PostgreSQL Prometheus Adapter project to read more about it and how to utilize it.