
Reducing Cloud Spend: Migrating Logs from CloudWatch to Iceberg with Postgres

Craig Kerstiens


As a database service provider, we store a number of logs internally to audit and oversee what is happening within our systems. When we started out, the volume of these logs was predictably low, but it grew rapidly with scale. Given the number of databases we run for users on Crunchy Bridge, the volume of these logs has become sizable. Until last week, we retained those logs in AWS CloudWatch. Spoiler alert: this is expensive.

While we have a number of strategies to drive efficiency around the logs we retain, such as regularly removing unnecessary noise and pruning old logs, that growth has driven AWS CloudWatch to represent a sizable portion of our infrastructure spend.

We now have a new workflow that makes use of low-cost S3 storage with Iceberg tables and the power and simplicity of Crunchy Data Warehouse, and it has reduced our spend on logging by over $30,000 a month.

Using this new workflow, we can simply:

  • archive logs directly into S3
  • incrementally load those logs into Iceberg via Crunchy Data Warehouse
  • use SQL to query the logs we need, directly in Crunchy Data Warehouse

The crux of any log ingestion service is more or less: ingest log traffic, index the data, offload the logs to more cost efficient storage, and, when necessary, access later.

Historically, we used AWS CloudWatch, but there are many logging services available. These services offer a range of capabilities, but they come at a premium over the cost of storing logs directly in S3. Simply exporting logs to S3 always represented a potential cost savings, but without a query engine to efficiently investigate those logs when required, it was not previously a viable solution. Crunchy Data Warehouse's ability to easily query S3 was the breakthrough we needed.

Setting up logs with S3 and Iceberg

The first step? Get all of our logs flowing into S3.

Every server in our fleet, whether that be a server running our customers' Postgres workloads or one of the servers that make up the Crunchy Bridge service itself, runs a logging process that continuously collects a variety of logs. The logs come from various sources; a few examples are SSH access, the Linux kernel, and Postgres. These logs all have different schemas and encodings, which the logging agent transforms into a consistent CSV structure before batching and flushing them to durable, long-term storage. Once these logs make it off host, they are indexed and stored where they can be queried as needed.

Now that we have our logs flowing into S3, we provision a Crunchy Data Warehouse so we can:

  1. Move the data from CSV to Iceberg for better compression.

  2. Query our logs using standard SQL with Postgres.

Once the warehouse is provisioned, create a foreign table called logs from within Crunchy Data Warehouse that points at the CSV files in the S3 bucket:

create foreign table logs (
   /* column names and types */
)
server crunchy_lake_analytics
options (
   path 's3://crunchy-bridge/tmp/*.tsv.gz',
   format 'csv',
   compression 'gzip',
   delimiter E'\t',
   filename 'true'
);

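For illustration, here is what the full definition might look like if the logging agent flattens every record into a timestamp, host, source, and message field. These column names are hypothetical, not our actual schema; the point is that the foreign table simply mirrors whatever shape your TSV files have:

-- Hypothetical column list; your logging agent's layout will differ
create foreign table logs (
   log_time timestamptz,  -- when the event was recorded
   host     text,         -- server that emitted the log line
   source   text,         -- e.g. postgres, sshd, kernel
   message  text          -- the raw log line itself
)
server crunchy_lake_analytics
options (
   path 's3://crunchy-bridge/tmp/*.tsv.gz',
   format 'csv',
   compression 'gzip',
   delimiter E'\t',
   filename 'true'        -- exposes the source file per row, used by the pg_incremental job below
);
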
Now we create a fully managed Iceberg table that is an exact copy of the foreign table referencing the CSVs. Iceberg is beneficial here because it automatically compresses the data into Parquet files of 512 MB per file, makes it easy to add data across files, and pushes down queries that target only a narrow window. Essentially, we've gone from CSV to a columnar file format and from flat files to a full database:

-- Create an Iceberg table with the same schema
create table logs_iceberg (like logs)
using iceberg;

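To see why this matters, consider a query over a narrow time window: against the Iceberg table, only the relevant Parquet data needs to be read rather than every raw CSV file. Here is a sketch using the hypothetical columns from the example above:

-- Log volume per host and source over the last day
select host, source, count(*) as log_lines
from logs_iceberg
where log_time >= now() - interval '1 day'
group by host, source
order by log_lines desc;
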
Finally, we're going to layer in the open source extension pg_incremental. pg_incremental is a Postgres extension that makes it easy to do fast, reliable incremental batch processing within Postgres. It is most commonly used for incremental rollups of data, but here it is equally useful for processing new CSV data as it arrives and moving it into our Iceberg table in S3, all managed from Postgres.

-- Set up a pg_incremental job to process existing files and automatically process new files every hour
select incremental.create_file_list_pipeline('process-logs',
   file_pattern := 's3://crunchy-bridge/tmp/*.tsv.gz',
   batched := true,
   max_batch_size := 20000,
   schedule := '@hourly',
   command := $$
       insert into logs_iceberg select * from logs where _filename = any($1)
   $$);

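With the pipeline in place, new log files land in the Iceberg table every hour, and investigating them is just SQL. A couple of sketches, again using the hypothetical columns from earlier; adjust for your actual schema and retention policy:

-- Pull recent SSH activity for a specific (hypothetical) host
select log_time, message
from logs_iceberg
where source = 'sshd'
  and host = 'host-1234'
  and log_time >= now() - interval '7 days'
order by log_time desc
limit 100;

-- Prune old data with plain SQL, if that fits your retention policy
delete from logs_iceberg
where log_time < now() - interval '90 days';
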
Final thoughts

And there you have it! Cheaper, cleaner log management. As one of my colleagues described it: “Personally, I always hated the imitation SQL query languages of logging providers. Just get me real SQL.” From using SQL to query logs, to simplifying our stack, to the cost savings, this project showcases some of our favorite things about Crunchy Data Warehouse.

We often get questions about the architecture of Crunchy Bridge, and we have talked about it a bit before. The short version is that Crunchy Bridge is built from the ground up on public cloud primitives to create a highly scalable and efficiently managed Postgres service. At the time, AWS CloudWatch was chosen due to the lack of better options. We don't want to be a logging provider; it's a fundamentally different business. But seeing how well this works, who knows 😉