
  • 4 min read

    Helm, GitOps and the Postgres Operator

    Jonathan S. Katz

    This post provides guidance for v4x. For the latest on PGO, GitOps and the Helm installer, please see: https://github.com/CrunchyData/postgres-operator-examples/tree/main/helm In the previous article, we explored GitOps and how to apply GitOps concepts to PostgreSQL in a Kubernetes environment with the Postgres Operator and custom resources. The article went on to mention additional tooling that has been created to help employ GitOps principles within an environment, including Helm. While the m...

  • 6 min read

    Fuzzy Name Matching in Postgres

    Paul Ramsey

    A surprisingly common problem in both application development and analysis is: given an input name, find the database record it most likely refers to. It's common because databases of names and people are common, and it's a problem because names are a very irregular identifying token. The page "Falsehoods Programmers Believe About Names" covers some of the ways names are hard to deal with in programming. This post will ignore most of those complexities, and deal with the problem of matching up...

  • 9 min read

    Using PostgreSQL to Shape and Prepare Scientific Data

    Steve Pousty

    Today we are going to walk through some of the preliminary data shaping steps in data science using SQL in Postgres. I have a long history of working in data science, including my Master's degree (in Forestry) and Ph.D. (in Ecology), and during this work I would often get raw data files that I had to get into shape to run analysis. Whenever you start to do something new there is always some uncomfortableness. That "why is this so hard" feeling often stops me from trying something new, but...

  • Query Optimization in Postgres with pg_stat_statements

    Kat Batuigas

    "I want to work on optimizing all my queries all day long because it will definitely be worth the time and effort," is a statement that has hopefully never been said. So when it comes to query optimizing, how should you pick your battles? Luckily, in PostgreSQL we have a way to take a system-wide look at database queries: which ones have taken up the most time cumulatively to execute, which ones are run the most frequently, and how long on average they take to execute. Which ones ha...

  • 9 min read

    Deep PostgreSQL Thoughts: Resistance to Containers is Futile

    Joe Conway

    Recently I ran across grand, sweeping statements that suggest containers are not ready for prime time as a vehicle for deploying your databases. The definition of "futile" is something like "serving no useful purpose; completely ineffective". See why I say this below, but in short, you probably are already, for all intents and purposes, running your database in a "container". Therefore, your resistance is futile. And I'm here to tell you that, at least insofar as PostgreSQL is concerned, those...

  • 8 min read

    ArcGIS Feature Service to PostGIS: The QGIS Way

    Kat Batuigas

    As a GIS newbie, I've been trying to use local open data for my own learning projects. I recently relocated to Tampa, Florida, and while browsing through the City of Tampa open data portal I saw that they have a Public Art map. That looked like a cool dataset to work with, but I couldn't find the data source anywhere in the portal. I reached out to the nice folks on the city's GIS team and they gave me an ArcGIS-hosted URL. To get the public art features into PostGIS I decided to use the "ArcG...

  • 5 min read

    Kubernetes Pod Tolerations and Postgres Deployment Strategies

    Jonathan S. Katz

    The desire to use Pod tolerations to schedule Postgres instances sometimes comes up around complex Kubernetes deployments. To address this feedback, we added support for tolerations to the 4.6 release of the Postgres Operator along with improvements to using node affinity. To use tolerations with PostgreSQL deployments, it helps to understand some of the mechanics behind several Kubernetes features to get the desired result of deploying PostgreSQL to a specific node group. Let's take a loo...

  • Deep PostgreSQL Thoughts: The Linux Assassin

    Joe Conway

    If you run Linux in production for any significant amount of time, you have likely run into the "Linux Assassin", that is, the OOM (out-of-memory) killer. When Linux detects that the system is using too much memory, it will identify processes for termination and, well, assassinate them. The OOM killer has a noble role in ensuring a system does not run out of memory, but this can lead to unintended consequences. For years the PostgreSQL community has made recommendations on how to set up Lin...

  • The Answer is Postgres; The Question is How?

    Paul Laurence

    There is increasing consensus that Postgres is a great choice of database for a broad range of use cases. As our friends at RedMonk have said: "the answer is postgres, now what's the question again? ;-)" (Elon Mook, @monkchips, April 29, 2017). You have a number of good options for how to run Postgres: run it in VMs, as a managed service, or on bare metal. Benjamin Good, a Google Cloud Solutions Architect, wrote a helpful blog post on when to run databases on Kubernetes; a common question and increa...

  • Change Data Capture in Postgres With Debezium

    Dave Cramer

    My colleague @craigkerstiens recently wrote about some guidance for cleaning up your Postgres database. One of the things he mentioned in his post, "Don't put your logs or messages in your database," prompted a number of questions from people along the lines of: "But what do I do with my logs, such as for an audit purpose?" Well, there is a great answer, and it plays really well with Postgres. The answer in many cases is, "Use Debezium." Debezium is built upon the Apache Kafka project and uses Kafka...
