Introducing Crunchy Data Warehouse: A next-generation Postgres-native data warehouse. Crunchy Data Warehouse Learn more

  • PostgreSQL Deep Dive: How Your Data Model Affects Storage

    Joe Conway

    I want to take a few minutes for a deep dive into the effect your data model has on storage density when using PostgreSQL. When this topic came up with a customer, I explained my thoughts on the matter, but I realized at the time that I had never done a reasonably careful apples-to-apples test to see just exactly what the effect is, at least for a model sample size of one. So here it is.

    Base Use Case

    Read More
  • Managing Transaction ID Exhaustion (Wraparound) in PostgreSQL

    Keith Fiske

    One of the most critical topics to understand when administering a PostgreSQL database is the concept of transaction IDs (TXID) and that they can be exhausted if not monitored properly. However, this blog post isn't going to go into the details of what it TXID exhaustion actually is. The Routine Vacuuming

    Read More
  • A Guide to Building an Active-Active PostgreSQL Cluster

    Dave Cramer

    One of the toughest challenges facing database engineers today is ensuring their data is always accessible so they can meet the high-availability requirements for their applications.

    While this problem may seem trivial in the world of applications where one can have many instances behind geographically distributed load balancers, in the database world where there is only one version of the data globally, the problem faces many obstacles.

    PostgreSQL replication has advanced considerably in recent major releases, including continuous improvements to streaming replication and the addition of logical replication in PostgreSQL 10. While these capabilities represent important enhancements to PostgreSQL, enabling users to address a wide variety of clustered architectures, they do not (easily) address the use cases where the application requires access to an updatable database in two or more geographic regions - often referred to as an "active-active" cluster.

    More specifically, an active-active cluster is one where the application can write to any instance in the cluster and the data will be written to all of the instances in the cluster, enabling each instance in the cluster to be used to:

    • Provide near zero downtime as the new instance is already in a read/write state; there is no need to reconfigure it.

    • Provide near zero downtime upgrades from one version to another

    • Improve latency for users in geographically distributed clusters. By providing an instance physically closer to the user latency is reduced.

    Read More
  • 4 min read

    What's New in Crunchy PostgreSQL Operator 3.5

    Jeff McCormick

    Crunchy Data is happy to announce the release of the open source PostgreSQL Operator 3.5 for Kubernetes project, which you can find here: https://github.com/CrunchyData/postgres-operator/

    Read More
  • 5 min read

    Scheduling Backups En Masse with the Postgres Operator

    Jonathan S. Katz

    An important part of running a production PostgreSQL database system (and for that matter, any database software) is to ensure you are prepared for disaster. There are many ways to go about preparing your system for disaster, but one of the simplest and most effective ways to do this is by taking periodic backups of your database clusters.

    How does one typically go about setting up taking a periodic backup? If you’re running PostgreSQL

    Read More
  • 3 min read

    Get Started Running PostgreSQL on Kubernetes

    Jonathan S. Katz

    Interested in running PostgreSQL natively on Kubernetes? Let's look at a few quick steps to get up and running with the open source Crunchy PostgreSQL Operator for Kubernetes on your choice of Kubernetes deployment.

    The Crunchy PostgreSQL Operator

    Read More
  • Performing a Major PostgreSQL Upgrade with pg_dumpall

    David Thomas

    For most major upgrades using a utility such as pg_upgrade or a replication tool such as pglogical will be the best solution. However if these options are not available, pg_dumpall

    Read More
  • 7 min read

    Won’t You Be My Neighbor? Quickly Finding Who is Nearby

    Jonathan S. Katz

    Many applications these days want us to know how close we are to things:

    • What are the three closest coffee shops to my current location?
    • Which is the nearest airport to the office?
    • What are the two closest subway stops to the restaurant?

    and countless more examples.

    Another way of asking these questions is to say “who are my nearest neighbors to me?” This maps to a classic algorithmic problem: efficiently finding the K-nearest neighbors

    Read More
  • Why Covering Indexes in Postgres Are Incredibly Helpful

    Jonathan S. Katz

    The PostgreSQL 11 release is nearly here (maybe in the next couple of weeks?!), and while a lot of the focus will be on the improvements to the overall performance of the system (and rightly so!), it's important to notice some features that when used appropriately, will provide noticeable performance improvements to your applications.

    One example of such feature is the introduction of "covering indexes" for B-tree indexes. A covering index

    Read More
  • 5 min read

    Using the CIS PostgreSQL Benchmark to Enhance Your Security

    Jonathan S. Katz

    Crunchy Data recently announced the publication of the CIS PostgreSQL Benchmark by the Center for Internet Security, a nonprofit organization that provides publications around standards and best practices for securing technologies systems. This CIS PostgreSQL Benchmark builds on earlier work that Crunchy started when it helped to publish the PostgreSQL Security Technical Implementation Guide

    Read More