Jonathan S. Katz
This post provides guidance for v4x. For the latest on PGO, GitOps, and the Helm installer, please see: https://github.com/CrunchyData/postgres-operator-examples/tree/main/helm
In the previous article, we explored GitOps and how to apply GitOps concepts to PostgreSQL in a Kubernetes environment with the Postgres Operator and custom resources. The article went on to mention additional tooling that has been created to help employ GitOps principles within an environment, including Helm. While the m...
Paul Ramsey
A surprisingly common problem in both application development and analysis is: given an input name, find the database record it most likely refers to. It's common because databases of names and people are common, and it's a problem because names are a very irregular identifying token. The page "Falsehoods Programmers Believe About Names" covers some of the ways names are hard to deal with in programming. This post will ignore most of those complexities, and deal with the problem of matching up...
Steve Pousty
Today we are going to walk through some of the preliminary data shaping steps in data science using SQL in Postgres. I have a long history of working in data science, including my Master's degree (in Forestry) and Ph.D. (in Ecology), and during this work I would often get raw data files that I had to get into shape to run analysis. Whenever you start to do something new there is always some uncomfortableness. That “why is this so hard” feeling often stops me from trying something new, but...
Kat Batuigas
"I want to work on optimizing all my queries all day long because it will definitely be worth the time and effort," is a statement that has hopefully never been said. So when it comes to query optimizing, how should you pick your battles? Luckily, in PostgreSQL we have a way to take a system-wide look at database queries:
• Which ones have taken up the most time cumulatively to execute
• Which ones are run the most frequently
• And how long on average they take to execute
...
Joe Conway
Recently I ran across grand sweeping statements that suggest containers are not ready for prime time as a vehicle for deploying your databases. The definition of "futile" is something like "serving no useful purpose; completely ineffective". See why I say this below, but in short, you probably are already, for all intents and purposes, running your database in a "container". Therefore, your resistance is futile. And I'm here to tell you that, at least in so far as PostgreSQL is concerned, those...
Kat Batuigas
As a GIS newbie, I've been trying to use local open data for my own learning projects. I recently relocated to Tampa, Florida and was browsing through the City of Tampa open data portal and saw that they have a Public Art map. That looked like a cool dataset to work with but I couldn't find the data source anywhere in the portal. I reached out to the nice folks on the city's GIS team and they gave me an ArcGIS-hosted URL. To get the public art features into PostGIS I decided to use the "ArcG...
Jonathan S. Katz
The desire to use Pod tolerations to schedule Postgres instances sometimes comes up around complex Kubernetes deployments. To address this feedback, we added support for tolerations to the 4.6 release of the Postgres Operator along with improvements to using node affinity. To use tolerations with PostgreSQL deployments, it helps to understand some of the mechanics behind several Kubernetes features to get the desired result of deploying PostgreSQL to a specific node group. Let's take a loo...
Joe Conway
If you run Linux in production for any significant amount of time, you have likely run into the "Linux Assassin", that is, the OOM (out-of-memory) killer. When Linux detects that the system is using too much memory, it will identify processes for termination and, well, assassinate them. The OOM killer has a noble role in ensuring a system does not run out of memory, but this can lead to unintended consequences. For years the PostgreSQL community has made recommendations on how to set up Lin...
Paul Laurence
There is increasing consensus that Postgres is a great choice of database for a broad range of use cases. As our friends at RedMonk have said: "the answer is postgres, now what's the question again? ;-)" (Elon Mook, @monkchips, April 29, 2017). You have a number of good options for how to run Postgres: run it in VMs, as a managed service, or on bare metal. Benjamin Good, a Google Cloud Solutions Architect, wrote a helpful blog post on when to run databases on Kubernetes; a common question and increa...
Dave Cramer
My colleague @craigkerstiens recently wrote about some guidance for cleaning up your Postgres database. One of the things he mentioned in his post, "Don't put your logs or messages in your database," got a number of questions from people similar to: "But what do I do with my logs such as for an audit purpose?" Well, there is a great answer and it does play really well with Postgres. The answer in many cases is, "Use Debezium." Debezium is built upon the Apache Kafka project and uses Kafka...