Latest posts from Craig Kerstiens

  • 5 min read

    Building a recommendation engine inside Postgres with Python and Pandas

    Craig Kerstiens

    I'm a big fan of data in general. Data can tell you a lot about what users are doing and can help you gain all sorts of insights. One such use is making recommendations based on a user's past history or on the choices of others with similar tastes. In fact, years ago I wrote a small app to see if I could recommend wines based on how other ones were rated. I shared it among just a handful of friends, some with similar taste, some with different taste. At first it was largely an academic exercise in writing a recommendation engine, but if I could find some new wines I liked along the way, then great. It turned out to be a lot more effective at recommending things than I expected, even with only a small handful of wines rated.

    The other thing I'm a fan of is Postgres

    Read More
  • Announcing pgBackRest for Azure - Fast, Reliable Postgres Backups

    Craig Kerstiens

    Backups are a key staple of running any database. Way back in the day, a good friend and colleague wrote one of the most used Postgres backup tools, wal-e. Wal-e was initially written in just a few days and rolled out to the fleet of databases we managed in the early days at Heroku. We got pretty lucky with that timing, because shortly after the rollout came the great AWS Apocalypse of 2011. This was a full-day outage of AWS with lingering effects for nearly a week... Reddit was down, Netflix was down, so you couldn't even kill time waiting for things to come back up. At the time, AWS came back to us saying they couldn't recover a number of disks. Had it not been for wal-e and our disaster recovery setup, customers would have lost data. Luckily no bytes of data were lost, and customers were back up and running much faster than they would have been on RDS.

    Fast forward nearly 10 years and now there are numerous backup options for Postgres. And while wal-e was a great tool for its time and place, it hasn't materially changed in the last 10 years. Enter pgBackRest

    Read More
  • Control Runaway Postgres Queries With Statement Timeout

    Craig Kerstiens

    Most queries against a database are short-lived. Whether you're inserting a new record or querying for a list of upcoming tasks for a user, you're not typically aggregating millions of records or sending back thousands of rows to the end user. A typical short-lived query in Postgres can easily complete in a few milliseconds or less. For the typical application, this means a well-tuned production Postgres

    Read More
  • Using Composite Types within Postgres

    Craig Kerstiens

    At a company where almost everyone has some Postgres expertise, you can easily learn something new about Postgres from your coworkers every day. In my first week I saw a question in our internal Slack that I could guess an answer to, but my guess wasn't definitive.

    It was "Why have composite types? Why would you use them?". I threw in an answer, and a few others did as well. Collectively we didn't have anything definitive, but all of the answers seemed like valid cases.

    But first, what are composite types?

    Read More