Introducing Crunchy Data Warehouse: A next-generation Postgres-native data warehouse. Crunchy Data Warehouse Learn more

Easy Totals and Subtotals in Postgres with Rollup and Cube

Avatar for Elizabeth Christensen

Elizabeth Christensen

5 min read

Postgres is being used more and more for analytical workloads. There’s a few hidden gems I recently ran across that are really handy for doing SQL for data analysis, ROLLUP and CUBE. Rollup and cube don’t get a lot of attention, but follow along with me in this post to see how they can save you a few steps and enhance your date binning and summary reporting.

We also have a web based tutorial that covers Postgres Functions for Rolling Up Data by Date if you want to try it yourself with a sample data set.

Superpowered Group By

Before we dig into rollup and cube, let’s look at GROUP BY statements. Here’s an example query where I want to get totals for all my months of sales and categories.

Using to_char with date_trunc I can roll up things by month. With GROUP BY I can rollup data by category.

-- Get totals for each month and category
SELECT
    to_char(date_trunc('month', order_date), 'FMMonth YYYY') AS month,
    category,
    COUNT(*) AS total_orders,
    SUM(total_amount) AS total_amount
FROM
    orders
GROUP BY
    date_trunc('month', order_date), category
ORDER BY
    date_trunc('month', order_date), category;
     month      |  category   | total_orders | total_amount
----------------+-------------+--------------+--------------
 October 2021   | Books       |            3 |      2375.73
 October 2021   | Clothing    |           18 |     13770.09
 October 2021   | Computers   |           17 |     13005.87

If I wanted to get subtotals for years, I would have to pull this into Excel or write separate select statements and unions. You would have to add a lot more SQL in snippets and section like this:

-- Get total for each month
....
UNION ALL

SELECT
    to_char(date_trunc('month', order_date), 'FMMonth YYYY') AS month,
    NULL AS category,
    COUNT(*) AS total_orders,
    SUM(total_amount) AS total_amount
FROM
    orders
GROUP BY
    date_trunc('month', order_date)
ORDER BY
    date_trunc('month', order_date)

-- Get grand total across all months and categories
...
UNION ALL

SELECT
    NULL AS month,
    NULL AS category,
    COUNT(*) AS total_orders,
    SUM(total_amount) AS total_amount
FROM
    orders

ROLLUP and CUBE will do the subtotals all for you though! Let’s take a closer look.

GROUP BY ROLLUP

ROLLUP is an extension you can add to the GROUP BY clause. When you use it, ROLLUP will give you both individual bins of totals and a sub-total. Here’s an example where I just add ROLLUP to my group by.

SELECT
    to_char(date_trunc('month', order_date), 'FMMonth YYYY') AS month,
    category,
    COUNT(*) AS total_orders,
    SUM(total_amount) AS total_amount
FROM
    orders
GROUP BY
    ROLLUP (date_trunc('month', order_date), category)
ORDER BY
    date_trunc('month', order_date), category;

If you see in this example, the sub-total has a null value for the category.

     month      |  category   | total_orders | total_amount
----------------+-------------+--------------+--------------
 October 2021   | Books       |            3 |      2375.73
 October 2021   | Clothing    |           18 |     13770.09
 October 2021   | Computers   |           17 |     13005.87
 October 2021   | Electronics |           25 |     16358.96
 October 2021   |             |           63 |     45510.65

These null values represent the subtotals.

postgres rollup.png

You can probably see that this is a really handy function for reporting. Rollup will give you big batches of things, including things with NULL values. If you want to do a quick survey of your products or data by certain categories, rollup is a great tool. You can combine that with date_trunc to get a rollup of categories by any date bin.

GROUP BY CUBE

The CUBE function takes the rollup one step further and does subtotals and grand totals across all the dimensions you’ve queried. Very similar to ROLLUP, we can look at both dates and categories of sales. Again, in this example, I just add CUBE to the GROUP BY statement.

SELECT
    to_char(date_trunc('month', order_date), 'FMMonth YYYY') AS month,
    category,
    COUNT(*) AS total_orders,
    SUM(total_amount) AS total_amount
FROM
    orders
GROUP BY
    CUBE (date_trunc('month', order_date), category)
ORDER BY
    date_trunc('month', order_date), category;
      month      |  category   | total_orders | total_amount
----------------+-------------+--------------+--------------
 October 2024   | Books       |            9 |      5574.92
 October 2024   | Clothing    |           19 |     11856.80
 October 2024   | Computers   |           22 |     13002.10
 October 2024   | Electronics |           50 |     34251.83
 October 2024   |             |          100 |     64685.65
                | Books       |          521 |    328242.79
                | Clothing    |         1133 |    739866.25
                | Computers   |         1069 |    680817.70
                | Electronics |         2709 |   1707713.80
                |             |         5432 |   3456640.54

Like ROLLUP these sub-totals aren’t labeled, they have null values representing the totals, like this:

postgres cube

Label ROLLUP and CUBE totals with COALESCE

I find these null values for the sub-totals kind of strange. If you’re like more, or sharing these raw reports with several people, you might want labels instead of these null values. You can use the COALESCE function to do some basic renaming of the null values. COALESCE is commonly used in cases like these when you want to handle NULL values in queries.

Here’s a sample where COALESCE comes before each category and time bin so when we add the group by cube below, the sub-totals are labeled.

SELECT
    COALESCE(to_char(date_trunc('month', order_date), 'FMMonth YYYY'), 'Grand Total') AS month,
    COALESCE(category, 'Subtotal') AS category,
    COUNT(*) AS total_orders,
    SUM(total_amount) AS total_amount
FROM
    orders
GROUP BY
    CUBE (date_trunc('month', order_date), category)
ORDER BY
    date_trunc('month', order_date), category;
     month      |  category   | total_orders | total_amount
----------------+-------------+--------------+--------------
 October 2024   | Books       |            9 |      5574.92
 October 2024   | Clothing    |           19 |     11856.80
 October 2024   | Computers   |           22 |     13002.10
 October 2024   | Electronics |           50 |     34251.83
 October 2024   | Subtotal    |          100 |     64685.65
 Grand Total    | Books       |          521 |    328242.79
 Grand Total    | Clothing    |         1133 |    739866.25
 Grand Total    | Computers   |         1069 |    680817.70
 Grand Total    | Electronics |         2709 |   1707713.80
 Grand Total    | Subtotal    |         5432 |   3456640.54

Summary

If you need to do date binning or rollups for your data by date, check out rollup and cube. They’re super easy additions to the GROUP BY function that will do your subtotals and grand totals.