
Latest posts from Christopher Winslett

  • 10 min read

    Creating Histograms with Postgres

    Elizabeth Christensen, Christopher Winslett

    Histograms were first used in a lecture in 1892 by Karl Pearson, the godfather of mathematical statistics. With how many data presentation tools we have today, it’s hard to think that representing data as a graphic was classified as “innovation”, but it was. They are a graphic presentation of the distribution and frequency of data. If you haven’t seen one recently, or don’t know the word histogram off the top of your head, it is a bar chart in which each bar represents the count of data with a defined...

    Read More
  • 8 min read

    8 Steps in Writing Analytical SQL Queries

    Christopher Winslett

    It is never immediately obvious how to go from a simple SQL query to a complex one, especially if it involves intricate calculations. One of the “dangers” of SQL is that you can create an executable query but return the wrong data. For example, it is easy to inflate the value of a calculated field by joining to multiple rows. Use Crunchy Playground to follow along with this blog post using a Postgres terminal: Postgres Playground w/ Sample Data. Let’s take a look at a sample query. This appears...

    Read More
  • 8 min read

    4 Ways to Create Date Bins in Postgres: interval, date_trunc, extract, and to_char

    Christopher Winslett

    You followed all the best practices, your sales dates are stored in perfect timestamp format … but now you need to get reports by day, week, quarter, and month. You need to bin, bucket, and roll up sales data in easy-to-view reports. Do you need a BI tool? Not yet, actually. Your Postgres database has hundreds of functions that let you query data analytics by date. By using some good old-fashioned SQL, you have powerful analysis and business intelligence with date details on any data set. In...

    Read More
  • 6 min read

    Using acts_as_tenant for Multi-tenant Postgres with Rails

    Christopher Winslett

    Since its launch, Ruby on Rails has been a preferred open source framework for small-team B2B SaaS companies. Ruby on Rails uses a conventions-over-configuration mantra. This approach reduces common technical choices, thus elevating decisions. With this approach, the developers get an ORM (ActiveRecord), templating engine (ERB), helper methods (like ), controller (ActionController), directory setup defaults ( ), authentication methods ( ), and more. Multi-tenant is the backbone of B2B SaaS pro...

    Read More
  • Use GitHub Actions on Pull Requests to Automate Postgres on Crunchy Bridge

    Christopher Winslett

    Automating pull requests to deploy staging applications is a game changer for large teams shipping quality products. Using Crunchy Bridge’s CLI or API, you can easily automate the entire process for these staging deployments. The simplest workflow would look something like the following: In this example, during the “Create Postgres Cluster”, we’ll create a hobby-0 cluster for Postgres. Then, when the PR is closed the cluster will be torn down. We keep it simple for this example, bu...

    Read More
  • 3 min read

    Ruby on Rails Neighbor Gem for AI Embeddings

    Christopher Winslett

    Over the past 12 months, AI has taken over budgets and initiatives. Postgres is a popular store for AI embedding data because it can store, calculate, optimize, and scale using the pgvector extension. A recently introduced gem to the Ruby on Rails ecosystem, the neighbor gem, makes working with pgvector and Rails even better. An “embedding” is a set of floating point values that represent the characteristics of a thing (nothing new, we’ve had these since the 70s). Using the OpenAI API or any o...

    Read More
  • 6 min read

    Postgres Goodies in Ruby on Rails 7.1

    Christopher Winslett

    I just spent last week at Rails World in Amsterdam and had a blast digging back into the Rails and Active Record world. In conversations with developers over the week, I had some notable takeaways from the newest version of Ruby on Rails that I just had to get written up. A quick summary before we dig in: • async queries: send long-running queries to the background while the code runs along, great for pages with multiple long-running queries that can be run in parallel • composite primary keys: n...

    Read More
  • 6 min read

    JSON Updates in Postgres 16

    Christopher Winslett

    Postgres has been steadily building on the JSON functionality initially released more than 10 years ago. With Postgres 16, working with JSON has gotten a couple of nice improvements. Primarily, this release added features that ease the manipulation of data into JSON and improve the standard SQL functionality using JSON. TL;DR: • A SQL/JSON data-type check. For instance, this lets you ask with SQL if something • Addition of SQL-standard JSON functions: , , , and A SQL/JSON data-type check. For...

    Read More
  • 12 min read

    HNSW Indexes with Postgres and pgvector

    Christopher Winslett

    Postgres’ pgvector extension recently added HNSW as a new index type for vector data. This levels up the database for vector-based embeddings output by AI models. A few months ago, we wrote about approximate nearest neighbor pgvector performance using the available list-based indexes. Now, with the addition of HNSW, pgvector can use the latest graph-based algorithms to approximate nearest neighbor queries. As with all things databases, there are trade-offs, so don’t throw away the list...

    Read More
  • 12 min read

    Scaling Vector Data with Postgres

    Christopher Winslett

    Note: We have additional articles in this Postgres AI series. Vector data has made its way into Postgres and I’m seeing more and more folks using it by the day. As I’ve seen use cases trickle in, I have been thinking a lot about scaling data and how to set yourself up for performance success from the beginning. The primary trade-off is performance versus accuracy. When seeking performance with vector data, we are using nearest neighbor algorithms, and those algorithms are built around p...

    Read More