How to Solve Advent of Code 2022 Using Postgres...

Time for the annual Advent of Code challenge for 2022! If you are not familiar with AOC, every December a new puzzle is added each day, getting harder as it goes along, and the challenge is to solve it any way you can by transforming an input file into an answer. I'm going to use PostgreSQL and solve things solely with the use of SQL.

Spoiler Alert

This article will contain spoilers both on how I solved Day 1's challenge "Calorie Counting", as well as general ideas on how to approach the problem. I recommend trying to solve it yourself first, using your favorite language. This article is delayed from the actual puzzle's release. Also note that my solutions are seldom going to be the "best" solutions - they were solved as quickly as possible, and these articles will show my first solutions, with some minor reformatting and cleaning up.

AOC Day 1

I am going to attempt to solve these challenges using SQL. Specifically, using Postgres and all its wonderful tricks to get as far as I can. Let's jump right into Day 1. The first challenge is to find some information about a text file that looks like this:

We need to find the group of items (an empty line starts a new group) with the highest number. The bits of Postgres tech used to solve this puzzle are:

file_fdw, a built-in foreign data wrapper for Postgres
The CASE command
A sequence to help us separate the rows into groups
The aggregate function sum inside a window to allow advanced counting

Hands on Tutorial

We've also loaded a tutorial for Day's 1 challenge if you want to try it with a pre-loaded data set.

Reading the input

The first step is to download the input and save it to a local file. In this case, it is 2,250 lines long and stored as /tmp/aoc2022.day1.input

While we could use the COPY command to slurp it in, this is an excellent case for the file_fdw extension, which allows you to read in external files as if they were Postgres tables! This is done with the amazing FDW (foreign data wrapper) system. So, from inside the psql prompt, we need to install the extension, create a foreign server, and then create a foreign table that points to our text file:

CREATE EXTENSION file_fdw;

CREATE SERVER aoc2022 foreign data wrapper file_fdw;

CREATE FOREIGN TABLE aoc_day1 (calorie int)
  SERVER aoc2022 options(filename '/tmp/aoc2022.day1.input', null '');

Note that we also needed to explicitly tell Postgres that an empty line should become a null. Without that option, the table would throw an error when we tried to read it. Now that we have a table with one row per line, how can we group it? Grouping similar rows together is a job for window functions. In this case, we need to somehow put all rows together until we hit a null, then we need to start a new group. Let's create a pseudo-column and use a sequence to increment it only when we find a null value in our "calorie" column:

CREATE SEQUENCE aoc;

-- Call it once so that `currval` works. The starting number can be anything.
SELECT setval('aoc', 1);

SELECT calorie, CASE WHEN calorie is null THEN nextval('aoc') ELSE currval('aoc') END
FROM aoc_day1;

 calorie | currval
---------+---------
    4428 |       1
    3773 |       1
    1076 |       1
       ☃ |       2
    8063 |       2
   10386 |       2
       ☃ |       3
    5705 |       3
    8397 |       3
    1084 |       3
    7661 |       3

Now we can group those together, and find out the sum of each one by using the SUM OVER() syntax. To keep it simple, we will use an ugly subselect:

SELECT SUM(calorie) OVER(partition by currval) FROM
  (SELECT calorie, CASE WHEN calorie is null THEN nextval('aoc') ELSE currval('aoc') END
   FROM aoc_day1
  ) x;

  sum
-------
 9277
 9277
 9277
 18449
 18449
 18449
 22847
 22847
 22847
 22847
 22847

Note that we don't care which row it came from, nor do we care about the duplicate entries. All we care about is the maximum value in this new table. Rather than a subselect, let's use a CTE to make things look better. Since one of the goals is to do this in as few SQL statements as possible, let's put the setval() into the top of the CTE:

WITH
setup AS (SELECT setval('aoc',1)),
    x AS (SELECT calorie, CASE WHEN calorie is null THEN nextval('aoc') ELSE currval('aoc') END
          FROM setup, aoc_day1),
    y AS (SELECT sum(calorie) OVER(partition by currval) from x)
SELECT max(sum) FROM y;

  max
-------
 22847

Voila! This returns the highest combined number for all the groups. The actual puzzle had 2,250 lines of input, but that was no problem for Postgres. Onwards to Day 2 of the Advent of Code challenge.

Latest Articles

How to Solve Advent of Code 2022 Using Postgres - Day 1

Spoiler Alert

AOC Day 1

Hands on Tutorial

Reading the input

Related Articles