Introducing Crunchy Data Warehouse: A next-generation Postgres-native data warehouse. Crunchy Data Warehouse Learn more

Posts about Production Postgres

  • Active Active in Postgres 16

    Brian Pace

    Support for logical replication arrived at Postgres just over five years ago with Postgres 10. Since then it's had a steady stream of improvements, but mostly logical replication has been limited to migrating data or unidirectional change data capture workflows. With Postgres 16 freshly released today, Postgres now has a better foundation to leverage logical replication for active-active setups.

    What is logical replication and active-active?

    Read More
  • Remote Access Anything from Postgres

    Paul Ramsey

    In my last blog post, I showed four ways to access a remotely hosted CSV file from inside PostgreSQL:

    • Using the COPY command with the PROGRAM option,
    • Using the http extension
    Read More
  • Performance Tips for Postgres FDW

    Christopher Winslett

    We have a lot of Postgres foreign data wrapper (FDW) users here at Crunchy. The postgres_fdw lets you connect to remote servers and in some cases can be an alternative for traditional ETL/ELT processes. As I see use of the Postgres foreign data wrapper expanding, I wanted to make some recommendations on how to approach performance. When you query from a single Postgres cluster, the database cleverly uses internal statistics to auto-tune performance

    Read More
  • Data Skews in Postgres

    Elizabeth Christensen

    We recently gave a talk at SCaLE (Southern California Linux Expo) about common problems and solutions for managing large Postgres databases. One of the topics we covered was data skewing and partial indexing. This piqued some conference discussion afterwards so we wanted to do a deeper dive.

    Skewed data is when your data is kind of bunched up - essentially it is not evenly distributed. You might have one really large customer with a customer id that takes up more than half the rows in your events table. Or a default value that gets created and many of the values in a certain column represent defaults. If you graphed table data, skewed data just means that data would not appear in a symmetrical distribution, it would be unevenly distributed.

    Under the hood, Postgres knows what kind of data you have in your database and uses that information to create query plans and when to use indexes. In some cases, skewed data will result in a situation where Postgres is not using an index - thus making some queries less efficient.

    As a general rule, Postgres generally doesn't use an index if a single value is greater than 30% of the total data. So skewed data can nullify an index in cases where you’re using a single or multi-column index and one of your columns has skewed data.

    Finding skewed data in Postgres

    Read More
  • pgBackRest File Bundling and Block Incremental Backup

    David Steele

    Crunchy Data is proud to support the pgBackRest project, an essential production grade backup tool used in our fully managed and self managed Postgres products. pgBackRest is also available as an open source project.

    pgBackRest

    Read More
  • High-compression Metrics Storage with Postgres Hyperloglog

    Christopher Winslett

    We have been talking a lot here about using Postgres for metrics, dashboards, and analytics. One of my favorite Postgres tools that makes a lot of this work easy and efficient is Hyperloglog

    Read More
  • Logical Replication on Standbys in Postgres 16

    Roberto Mello

    Postgres 16 is hot off the press with the beta release last week. I am really excited about the new feature that allows logical replication from standbys, allowing users to:

    • create logical decoding from a read-only standby
    • reduce the workload on the primary server
    • have new ways to achieve high-availability for applications that require data synchronization across multiple systems or for auditing purposes
    Read More
  • 5 Ways to Get Table Creation Information in Postgres

    Greg Sabino Mullane

    A question I hear from time to time with Crunchy Data clients and the Postgres community is:

    When was my Postgres database table created?

    Postgres does not store the creation date of tables, or any other database object. But fear not, there are a plethora of direct and indirect ways to find out when your table creation happened. Let's go through some ways to do this, ranging from easy to somewhat hard. All these solutions apply to indexes and other database objects, but tables are by far the most common request.

    1. Logging

    Read More
  • Building Customer-Facing Real-Time Dashboards with Postgres

    Craig Kerstiens

    Over the past few weeks we've had several customers ask how they should architect their analytics pipeline. Common questions are:

    • Should we use some kind of data warehouse or time series database?
    • Is Postgres suitable for that type of workload?
    • What are the pitfalls that I should worry about before I get started?
    Read More
  • Thinking Fast vs. Slow with Your Data in Postgres

    Craig Kerstiens

    Today, we wanted to address some basic principles for better managing data architecture. Postgres is well regarded as a database for traditional system of record. More recently we've been fielding questions on what else can it do, such as: Can it be good for analytics and metrics? The short answer is "yes". When applications expand outside their standard system of record, they add in new types of data and data stores, which introduces complexity managing multiple types of systems.

    Some common workloads for Postgres are:

    • Primary system of record
    • Metrics / analytics data
    • Log / event data
    Read More