Elizabeth Christensen
We recently gave a talk at SCaLE (Southern California Linux Expo) about common problems and solutions for managing large Postgres databases. One of the topics we covered was data skew and partial indexing. This sparked some discussion after the talk, so we wanted to take a deeper dive. Skewed data is data that is bunched up rather than evenly distributed. You might have one really large customer with a customer id that takes up more than half the rows in your eve...
David Steele
Crunchy Data is proud to support the pgBackRest project, an essential production grade backup tool used in our fully managed and self managed Postgres products. pgBackRest is also available as an open source project. pgBackRest provides:
• Full, differential, and incremental backups
• Checksum validation of backup integrity
• Point-in-Time recovery
pgBackRest recently released v2.46 wi...
Christopher Winslett
We have been talking a lot here about using Postgres for metrics, dashboards, and analytics. One of my favorite Postgres tools that makes a lot of this work easy and efficient is HyperLogLog (HLL). HyperLogLog is like regex: once you understand it, it feels like a superpower. Also like regex, it can't solve everything. In this post I'll walk through getting started with HLL, building some sample queries, and doing some simple tuning. HyperLogLog is a compression and...
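To make the idea behind HyperLogLog concrete, here is a minimal sketch of the general algorithm in Python. It is a toy illustration, not the Postgres HLL extension the post goes on to use: the class name `HyperLogLog`, the precision parameter `p`, and the choice of SHA-1 as the hash are all assumptions made for this example.

```python
import hashlib
import math


class HyperLogLog:
    """Toy HyperLogLog: estimates distinct counts in fixed memory."""

    def __init__(self, p=12):
        self.p = p                      # number of bits used to pick a register
        self.m = 1 << p                 # number of registers (4096 when p=12)
        self.registers = [0] * self.m

    def add(self, item):
        # Hash the item to a 64-bit integer (first 8 bytes of SHA-1).
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        # The top p bits choose the register.
        idx = h >> (64 - self.p)
        # The remaining bits determine the rank: the position of the
        # leftmost 1-bit (all zeros gives the maximum rank).
        rest = h & ((1 << (64 - self.p)) - 1)
        rank = (64 - self.p) - rest.bit_length() + 1
        # Each register keeps only the largest rank it has seen.
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        # Standard bias-correction constant for large register counts.
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        # Small-range correction: fall back to linear counting.
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            return self.m * math.log(self.m / zeros)
        return raw


hll = HyperLogLog(p=12)
for i in range(10_000):
    hll.add(f"user-{i}")
estimate = hll.count()  # an approximation of 10,000, not an exact count
```

Adding the same item twice never changes a register, which is why the structure counts distinct values; the trade-off is that the answer is an estimate whose accuracy depends on the number of registers.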
Roberto Mello
Postgres 16 is hot off the press with the beta release last week. I am really excited about the new feature that allows logical replication from standbys, allowing users to:
• create logical decoding from a read-only standby
• reduce the workload on the primary server
• have new ways to achieve high-availability for applications that require data synchronization across multiple systems or for auditing purposes
...
Greg Sabino Mullane
A question I hear from time to time from Crunchy Data clients and the Postgres community is: When was my Postgres database table created? Postgres does not store the creation date of tables, or of any other database object. But fear not, there are a plethora of direct and indirect ways to find out when your table was created. Let's go through some ways to do this, ranging from easy to somewhat hard. All these solutions apply to indexes and other database objects, but tables are by far the mos...
Craig Kerstiens
Over the past few weeks we've had several customers ask how they should architect their analytics pipeline. Common questions are:
• Should we use some kind of data warehouse or time series database?
• Is Postgres suitable for that type of workload?
• What are the pitfalls that I should worry about before I get started?
...
Craig Kerstiens
Today, we wanted to address some basic principles for better managing data architecture. Postgres is well regarded as a database for a traditional system of record. More recently we've been fielding questions on what else it can do, such as: Can it be good for analytics and metrics? The short answer is "yes". When applications expand outside their standard system of record, they add in new types of data and data stores, which introduces the complexity of managing multiple types of systems. Some common wo...
Greg Sabino Mullane
In an earlier post, I went into a lot of detail about unlogged tables. But tables are not the only thing to get the unlogged treatment - as of version 15 of Postgres, sequences can be unlogged as well! If you want to create your own, it's simply a matter of adding the UNLOGGED keyword to your CREATE SEQUENCE statement. The use case for unlogged sequences in Postgres is primarily to keep the sequence data for an unlogged table out of the WAL stream. Although unlogged tables provide a significant performance boost,...
Craig Kerstiens
Is your database ready for production? You've been building your application for months, you've tested with beta users, you've gotten feedback and iterated. You've gone through your launch checklist: email beta users, publish the blog post, post to Hacker News, and hope the comments are friendly. But is your database ready for whatever may come on launch day or even 2 months in? Here's a handy checklist to make sure you're not caught flat-footed.
• Backups❓
• High availability❓
• Logs properly co...
Jesse Soyland
Integer overflow occurs when a computer program tries to store an integer whose value exceeds the maximum that the data type can represent. We have helped a few Crunchy Data clients navigate this recently and wanted to write up some notes. In Postgres, there are three integer types:
• smallint - A 2-byte integer, -32768 to 32767
• integer - A 4-byte integer, -2147483648 to 2147483647
• bigint - An 8-byte integer, -9223372036854775808 to 9223372036854775807
...
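The ranges above follow directly from the byte widths: a signed integer of n bytes spans -2^(8n-1) to 2^(8n-1)-1. The short Python sketch below derives them and checks how much room is left before a column overflows. It uses the standard Postgres names for the 2-, 4-, and 8-byte types (smallint, integer, bigint); the helper names `signed_range`, `fits`, and `remaining_ids` are hypothetical, chosen only for this illustration.

```python
# Byte widths of the three Postgres integer types.
PG_INTEGER_BYTES = {"smallint": 2, "integer": 4, "bigint": 8}


def signed_range(num_bytes):
    """Return (min, max) for a two's-complement signed integer of this width."""
    bits = 8 * num_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def fits(value, type_name):
    """True if value can be stored in the given integer type without overflow."""
    lo, hi = signed_range(PG_INTEGER_BYTES[type_name])
    return lo <= value <= hi


def remaining_ids(current_max, type_name):
    """How many more positive values are available above current_max."""
    _, hi = signed_range(PG_INTEGER_BYTES[type_name])
    return hi - current_max


# A key that has reached 2 billion is close to the 4-byte limit:
left = remaining_ids(2_000_000_000, "integer")   # 147483647 values remain
too_big = fits(2_147_483_648, "integer")         # False: one past the maximum
ok_as_bigint = fits(2_147_483_648, "bigint")     # True
```

Python integers are arbitrary precision, so the arithmetic here cannot itself overflow; the functions only model the limits of the database types.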