Introducing Crunchy Data Warehouse: A next-generation Postgres-native data warehouse. Crunchy Data Warehouse Learn more

Featured Post

8 min read

Crunchy Data Warehouse: Postgres with Iceberg for High Performance Analytics

Marco Slot

We are excited to release Crunchy Data Warehouse, a modern data warehouse for Postgres. Crunchy Data Warehouse combines Postgres with Iceberg, Parquet, and data lake formats for fast analytics queries and cost efficient storage.

Read This article
  • 7 min read

    Incremental Archival from Postgres to Parquet for Analytics

    Marco Slot

    PostgreSQL is commonly used to store event data coming from various kinds of devices. The data often arrives as individual events or small batches, which requires an operational database to capture. Features like time partitioning help optimize the storage layout for time range filtering and efficient deletion of old data. The PostgreSQL feature set gives you a lot of flexibility for handling a variety of IoT scenarios, but there are certain scenarios for it is less suitable, namely: • Long-te...

    Read More
  • Postgres Parallel Query Troubleshooting

    Brian Pace

    Postgres' ability to execute queries in parallel is a powerful feature that can significantly improve query performance, especially on large datasets. However, like all resources, parallel workers are finite. When there aren't enough available workers, Postgres may downgrade a parallel query to a serial (non-parallel) execution. This sounds reasonable unless the performance of the downgraded query is well beyond the required response times needed by the application. While helping our clients w...

    Read More
  • 10 min read

    Using Cloud Rasters with PostGIS

    Paul Ramsey

    With the extension, it is possible to access gigabytes of raster data from the cloud, without ever downloading the data . How? The venerable extension (released 13 years ago ) already has the critical core support built-in! Rasters can be stored inside the database, or outside the database, on a local file system or anywhere it can be accessed by the underlying GDAL raster support library. The storage options include S3, Azure, Google, Alibaba, and any HTTP server that supports RANG...

    Read More
  • Indexing Materialized Views in Postgres

    Elizabeth Christensen

    Materialized views are widely used in Postgres today. Many of us are working with using connected systems through foreign data wrappers, separate analytics systems like data warehouses , and merging data from different locations with Postgres queries. Materialized views let you precompile a query or partial table, for both local and remote data. Materialized views are static and have to be refreshed. One of the things that can be really important for using materialized views efficiently is inde...

    Read More
  • When Does ALTER TABLE Require a Rewrite?

    Greg Sabino Mullane

    It is rare that a Postgres table keeps the exact same structure year after year. New columns get added. Old columns get dropped. Column data types need to change. Those are all done with the ALTER TABLE command. One big drawback to these changes is that they may force a complete table rewrite. A rewrite means a completely new copy of the table is created, and then the old one is dropped. This can take a very long time for large tables. Worse, everything else is blocked/locked from using the tabl...

    Read More
  • 19 min read

    Postgres Tuning & Performance for Analytics Data

    Karen Jex

    Your database is configured for the needs of your day-to-day OLTP (online transaction processing) application workload, but what if you need to run analytics queries against your application data? How can you do that without compromising the performance of your application? Application data gradually builds up in your database over time, and at some point the business wants to glean insights from it by running analytics queries. Analytics activity, sometimes called OLAP (online analytical proces...

    Read More
  • Running an Async Web Query Queue with Procedures and pg_cron

    Paul Ramsey

    The number of cool things you can do with the http extension is large, but putting those things into production raises an important problem. The amount of time an HTTP request takes, 100s of milliseconds, is 10- to 20-times longer that the amount of time a normal database query takes. This means that potentially an HTTP call could jam up a query for a long time. I recently ran an HTTP function in an update against a relatively small 1000 record table. The query took 5 minutes to run, and durin...

    Read More
  • 9 min read

    Name Collision of the Year: Vector

    Elizabeth Christensen

    I can’t get through a zoom call, a conference talk, or an afternoon scroll through LinkedIn without hearing about vectors. Do you feel like the term vector is everywhere this year? It is. Vector actually means several different things and it's confusing. Vector means AI data, GIS locations, digital graphics, and a type of query optimization, and more. The terms and uses are related, sure. They all stem from the same original concept. However their practical applications are quite different. So...

    Read More
  • 12 min read

    Sidecar Service Meshes with Crunchy Postgres for Kubernetes

    Andrew L'Ecuyer

    One of the great new features recently added to Kubernetes - native Sidecar Containers - continues to get closer to GA with each new Kubernetes release. I was reviewing all of the great progress recently made by the Kubernetes Enhancement Proposal (KEP) on Sidecar Containers and realized this feature has already produced some exciting results. For instance, this feature is already making it easier than ever before to use Crunchy Postgres for Kubernetes with two important service mesh solut...

    Read More
  • 11 min read

    pg_incremental: Incremental Data Processing in Postgres

    Marco Slot

    Today I’m excited to introduce pg_incremental , a new open source PostgreSQL extension for automated, incremental, reliable batch processing. This extension helps you create processing pipelines for append-only streams of data, such as IoT / time series / event data workloads. Notable pg_incremental use cases include: • Creation and incremental maintenance of rollups, aggregations, and interval aggregations • Incremental data transformations • Periodic imports or export of new data using standa...

    Read More
  • 6 min read

    Smarter Postgres LLM with Retrieval Augmented Generation

    Paul Ramsey

    "Retrieval Augmented Generation" (RAG) is a useful technique in working with large language models (LLM) to improve accuracy when dealing with facts in a restricted domain of interest. Asking an LLM about Shakespeare: works pretty good. The model was probably fed a lot of Shakespeare in training. Asking it about holiday time off rules from the company employee manual: works pretty bad. The model may have ingested a few manuals in training, but not yours ! Is there a way around this LLM limi...

    Read More