  • Data Skews in Postgres

    Elizabeth Christensen

    We recently gave a talk at SCaLE (Southern California Linux Expo) about common problems and solutions for managing large Postgres databases. One of the topics we covered was data skewing and partial indexing. This sparked some discussion after the talk, so we wanted to do a deeper dive.

    Skewed data is data that is bunched up rather than evenly distributed. You might have one really large customer whose customer id accounts for more than half the rows in your events table, or a default value that fills most of the rows in a certain column. If you graphed the table data, skewed data would not form a symmetrical distribution.

    Under the hood, Postgres keeps statistics about the data in your database and uses that information to create query plans and decide when to use indexes. In some cases, skewed data means Postgres does not use an index, which makes some queries less efficient.

    As a rule of thumb, Postgres tends not to use an index for a value that makes up more than roughly 30% of the rows in a table. So skewed data can nullify an index in cases where you’re using a single or multi-column index and one of your columns has skewed data.
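
    To make the idea concrete, here is a minimal Python sketch of what "skewed" means numerically. It is not how Postgres gathers statistics; the sample data, the function name `skewed_values`, and the use of the 30% figure as a hard cutoff are all assumptions for illustration. The real planner decision is cost-based.

```python
from collections import Counter

# Rule-of-thumb threshold taken from the text above; the planner's real
# decision is cost-based, so treat this as an illustration only.
SKEW_THRESHOLD = 0.30


def skewed_values(column, threshold=SKEW_THRESHOLD):
    """Return {value: share_of_rows} for values whose share exceeds threshold."""
    total = len(column)
    if total == 0:
        return {}
    counts = Counter(column)
    return {
        value: count / total
        for value, count in counts.items()
        if count / total > threshold
    }


# Hypothetical events table: customer 42 owns 6 of the 10 rows.
customer_ids = [42, 42, 42, 42, 42, 42, 7, 8, 9, 10]
print(skewed_values(customer_ids))
```

    With this made-up column, customer 42 holds 60% of the rows, so a lookup on that id is the kind of query where an index is unlikely to help.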

    Finding skewed data in Postgres

    Read More
  • Holy Sheet! Remote Access CSV Files from Postgres

    Paul Ramsey

    An extremely common problem in fast-moving data architectures is providing a way to feed ad hoc user data into an existing analytical data system.

    Do you have time to whip up a web app? No! You have a database to feed, and events are spiraling out of control... what to do?

    How about a Google Sheet? The data layout is obvious, you can even enforce things like data types and required columns using locking and protecting, and unlike an Excel or LibreOffice document, it's always online, so you can hook the data into your system directly.

    Access Sheets Data Remotely

    Read More
  • pgBackRest File Bundling and Block Incremental Backup

    David Steele

    Crunchy Data is proud to support the pgBackRest project, an essential production-grade backup tool used in our fully managed and self-managed Postgres products. pgBackRest is also available as an open source project.

    pgBackRest

    Read More
  • 9 min read

    CI/CD with Crunchy Postgres for Kubernetes and Argo

    Bob Pacheco

    Continuous Integration / Continuous Delivery (CI/CD) is an automated approach in which incremental code changes are made, built, tested and delivered. Organizations want to get their software solutions to market as quickly as possible without sacrificing quality or stability. While CI/CD is often associated with application code, it can also be beneficial for managing changes to PostgreSQL database clusters.

    GitOps plays an important part in enabling CI/CD. If you are unfamiliar with GitOps, I recommend starting with my previous post on Postgres GitOps with Argo and Kubernetes.

    Read More
  • High-compression Metrics Storage with Postgres Hyperloglog

    Christopher Winslett

    We have been talking a lot here about using Postgres for metrics, dashboards, and analytics. One of my favorite Postgres tools that makes a lot of this work easy and efficient is Hyperloglog

    Read More
  • Logical Replication on Standbys in Postgres 16

    Roberto Mello

    Postgres 16 is hot off the press with the beta release last week. I am really excited about the new feature that allows logical replication from standbys, allowing users to:

    • create logical decoding from a read-only standby
    • reduce the workload on the primary server
    • have new ways to achieve high-availability for applications that require data synchronization across multiple systems or for auditing purposes
    Read More
  • 9 min read

    SVG Images from Postgres

    Martin Davis

    PostGIS excels at storing, manipulating and analyzing geospatial data. At some point you usually want to convert raw spatial data into a two-dimensional representation to utilize the integrative capabilities of the human visual cortex. In other words, to see things on a map.

    PostGIS is a popular backend for mapping technology, so there are many options to choose from to create maps. Data can be rendered to a raster image using a web map server like GeoServer

    Read More
  • Tags and Postgres Arrays, a Purrrfect Combination

    Paul Ramsey

    In a previous life, I worked on a CRM system that really loved the idea of tags. Everything could be tagged, users could create new tags, and tags were a key organizing principle of searching and filtering.

    The trouble was, modeled traditionally, tags can really make for some ugly tables and equally ugly queries. Fortunately, and as usual, Postgres has an answer.

    Today I’m going to walk through working with tags in Postgres with a sample database of 🐈 cats and their attributes

    • First, I’ll look at a traditional relational model
    • Second, I’ll look at using an integer array to store tags
    • Lastly, I’ll test text arrays directly embedding the tags alongside the feline information
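
    As a rough illustration of the last approach, the following Python sketch keeps each cat's tags inline with the cat, the way a text array column would. The cat names, tags, and helper functions are hypothetical. The two filters are loosely analogous to the Postgres array containment (`@>`) and overlap (`&&`) operators, but this is a sketch of the idea, not of Postgres itself.

```python
# Hypothetical sample data: each cat carries its tags inline, much like a
# text[] column would store them in a single row.
cats = {
    "Tom": ["orange", "playful", "indoor"],
    "Luna": ["black", "shy", "indoor"],
    "Milo": ["orange", "shy", "outdoor"],
}


def cats_with_all_tags(cats, wanted):
    """Names of cats whose tag list contains every wanted tag, sorted."""
    wanted = set(wanted)
    return sorted(name for name, tags in cats.items() if wanted <= set(tags))


def cats_with_any_tag(cats, wanted):
    """Names of cats whose tag list shares at least one tag with wanted."""
    wanted = set(wanted)
    return sorted(name for name, tags in cats.items() if wanted & set(tags))


print(cats_with_all_tags(cats, ["orange", "indoor"]))  # ['Tom']
print(cats_with_any_tag(cats, ["shy", "playful"]))     # ['Luna', 'Milo', 'Tom']
```

    Keeping the tags next to the row means a filter needs no join against a separate tag table, which is the appeal of the array models compared with the traditional relational one.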
    Read More
  • 5 Ways to Get Table Creation Information in Postgres

    Greg Sabino Mullane

    A question I hear from time to time with Crunchy Data clients and the Postgres community is:

    When was my Postgres database table created?

    Postgres does not store the creation date of tables, or any other database object. But fear not, there are a plethora of direct and indirect ways to find out when your table creation happened. Let's go through some ways to do this, ranging from easy to somewhat hard. All these solutions apply to indexes and other database objects, but tables are by far the most common request.

    1. Logging

    Read More
  • Practical AI with Postgres

    Craig Kerstiens

    There's a lot of excitement around AI, and even more discussion than excitement. The question of Postgres and AI isn't a single question; there are a ton of paths you can take under that heading...

    • Can I use Postgres for building AI-related apps? Absolutely
    Read More