Introducing Crunchy Data Warehouse: A next-generation Postgres-native data warehouse. Crunchy Data Warehouse Learn more

  • 12 min read

    Five Great Features of the PostgreSQL Partition Manager

    Keith Fiske

    After much testing and work the PostgreSQL Partition Manager, pg_partman, version 5 is now available for public release. Thanks to everyone involved for helping me get here!

    My recent post

    Read More
  • Understanding Postgres IOPS: Why They Matter Even When Everything Fits in Cache

    John Dalton

    Disk IOPS (Input/Output Operations Per Second) is a key metric for measuring the performance of any disk system. It represents the number of read and write operations that can be performed per second. For PostgreSQL, which relies heavily on disk access, understanding and optimizing disk IOPS is crucial for achieving optimal performance. Today I want to go through the key topics involved with IOPs - what is it, how does it impact Postgres, how to measure it, and what to change for better performance.

    What Are IOPS for Postgres?

    Read More
  • Rolling the Dice with the PostgreSQL Random Functions

    Paul Ramsey

    Generating random numbers is a surprisingly common task in programs, whether it's to create test data or to provide a user with a random entry from a list of items.

    PostgreSQL comes with just a few simple foundational functions that can be used to fulfill most needs for randomness.

    Almost all your random-ness needs will be met with the random()

    Read More
  • 6 min read

    JSON Updates in Postgres 16

    Christopher Winslett

    Postgres has been steadily building on the JSON functionality initially released more than 10 years ago. With Postgres 16, working with JSON has gotten a couple nice improvements. Primarily, this release added features that ease the manipulation of data into JSON and improve the standard SQL functionality using JSON.

    TL;DR:

    Read More
  • Active Active in Postgres 16

    Brian Pace

    Support for logical replication arrived at Postgres just over five years ago with Postgres 10. Since then it's had a steady stream of improvements, but mostly logical replication has been limited to migrating data or unidirectional change data capture workflows. With Postgres 16 freshly released today, Postgres now has a better foundation to leverage logical replication for active-active setups.

    What is logical replication and active-active?

    Read More
  • 6 min read

    Random Geometry Generation with PostGIS

    Paul Ramsey

    A user on the postgis-users had an interesting question today: how to generate a geometry column in PostGIS with random points, linestrings, or polygons?

    Random data is important for validating processing chains, analyses and reports. The best way to test a process is to feed it inputs!

    Random Points

    Read More
  • 12 min read

    HNSW Indexes with Postgres and pgvector

    Christopher Winslett

    Postgres’ pgvector extension recently added HNSW as a new index type for vector data. This levels up the database for vector-based embeddings output by AI models. A few months ago, we had written about approximate nearest neighbor pgvector performance using the available list-based indexes

    Read More
  • Top 10 Postgres Management Tasks

    Elizabeth Christensen

    1. Add a statement timeout

    Postgres databases are very compliant, they do what you tell them until you tell them to stop. It is really common for a runaway process, query, or even something a co-worker runs to accidentally start a never ending transaction

    Read More
  • 12 min read

    Scaling Vector Data with Postgres

    Christopher Winslett

    Note: We have additional articles in this Postgres AI series.

    Vector data has made its way into Postgres and I’m seeing more and more folks using it by the day. As I’ve seen use cases trickle in, I have been thinking a lot about scaling data and how to set yourself up for performance success from the beginning. The two primary trade-offs are performance versus accuracy. When seeking performance with vector data, we are using nearest neighbor algorithms, and those algorithms are built around probability of proximity. If your use-case requires 100% accuracy on nearest neighbor, performance will be sacrificed.

    After choosing between performance versus accuracy, the next tools in the toolbox are caching and partitioning. Caching is obvious in some situations, if your product is finding “similar meals” or “similar products” or “similar support questions”, then the similarities will not change rapidly.

    For the most part, the keys to scaling AI data are the same as scaling any other data type: reduce the number of rows in index and reduce the concurrent queries hitting the database. Once the index has done its work, CPU becomes the primary constraint: how fast can you calculate and compare distances between vectors? Scaling vector data is currently about performance mitigation as much as it is overpowering the data.

    In the next few weeks, the Postgres pg_vector extension is launching HNSW indexes (see the commit history for pgvector

    Read More
  • 10 min read

    Use CI/CD to Automatically Update Postgres Images with Argo CD

    Bob Pacheco

    When working with containers you always have to be mindful of the age of the containers. Every day new CVEs are being discovered and are turning up in image scans. One benefit of having a CI/CD pipeline is the ability to implement security automation. Let's assume you release a monthly update of your containers that are built on the latest version of the base image and all of the most recent patches have been applied. This ensures that each month you can remediate any CVEs that might have popped up in your images since their initial release. In this blog we show you how to use ARGO CD Image Updater as part of your CI/CD pipeline to automatically deploy, test, and promote your updated images. All by doing nothing more than putting them into your registry.

    workflow

    This is part 2 of CI/CD with Crunchy Postgres for Kubernetes and Argo series. We will pick up from where we left off in part 1

    Read More