Fun with PostgreSQL Puzzles: Recursive Functions with Animations
Hands on Tutorial
We've also loaded a tutorial for Day 19's challenge if you want to try it with a pre-loaded data set.
Spoiler Alert
This article will contain spoilers both on how I solved 2022 Day 19's challenge "Not Enough Minerals" using SQL, as well as general ideas on how to approach the problem. I recommend trying to solve it yourself first, using your favorite language.
Not enough materials
Day 19 tasks us with creating lots and lots of mini robots to gather resources and feed our herd of elephants (rescued a few days back). We'll do this in SQL per usual, with the help of some items such as:
- More CTEs
- The handy regexp_split_to_array function
- Recursive plpgsql functions
- The greatest function
- Fun with exp / sum / ln
- More ANSI color codes and Unicode!
First, our usual setup to allow us to read the input file:
CREATE EXTENSION if not exists file_fdw;
CREATE SERVER if not exists aoc2022 foreign data wrapper file_fdw;
DROP SCHEMA if exists aoc_robot_rampage CASCADE;
CREATE SCHEMA aoc_robot_rampage;
SET search_path = aoc_robot_rampage;
CREATE FOREIGN TABLE aoc_day19 (line text)
SERVER aoc2022 options(filename '/tmp/aoc2022.day19.input');
AOC 2022 Day 19 Part 1
For this part of the challenge, we need to figure out the optimal number of geodes collected for each of the given blueprints. Geode robots gather geodes, and building a geode robot requires both ore and obsidian. There are also clay robots and ore robots. Starting with a single ore robot, and a finite number of rounds (24), we need to determine the optimal action of each step (build a type of robot, or do nothing). The input is multiple lines describing the costs for that particular blueprint, e.g.
Blueprint 1: Each ore robot costs 4 ore. Each clay robot costs 2 ore. Each obsidian robot costs 3 ore \
and 14 clay. Each geode robot costs 2 ore and 7 obsidian.
We will use SQL to extract that information, but first we will need a place to store it. Let's create a quick unlogged table and then populate it:
CREATE unlogged TABLE blueprint (
minute SMALLINT NOT NULL DEFAULT 0,
id INT NOT NULL PRIMARY KEY,
ore SMALLINT,
clay SMALLINT,
obsidian SMALLINT,
geodes SMALLINT,
ore_robot_cost SMALLINT,
clay_robot_cost SMALLINT,
obsidian_robot_cost SMALLINT, obsidian_robot_cost2 SMALLINT,
geode_robot_cost SMALLINT, geode_robot_cost2 SMALLINT,
ore_robots SMALLINT,
clay_robots SMALLINT,
obsidian_robots SMALLINT,
geode_robots SMALLINT
);
Every line of the input file follows the exact same format, so we can use the handy regexp_split_to_array
function to extract all the numbers, and then insert those into our table:
WITH x AS (SELECT regexp_split_to_array(line, '(\D+)') AS b FROM aoc_day19)
,y AS (SELECT 0 AS minute, b[2]::INT AS blueprint_id,
0 AS ore, 0 AS clay, 0 AS obsidian, 0 AS geodes,
b[3]::INT AS ore_robot_cost,
b[4]::INT AS clay_robot_cost,
b[5]::INT AS obs_robot_cost, b[6]::INT AS obs_robot_cost2,
b[7]::INT AS geode_robot_cost, b[8]::INT AS geode_robot_cost2,
1 AS ore_robots, 0 AS clay_robots, 0 AS obsidian_robots, 0 AS geode_robots
FROM x)
INSERT INTO blueprint SELECT * FROM y;
Now we will need a function to compute the best path for a given blueprint. The best way to do this is with a recursive function, in which we investigate the effects of each possible choice. During each round, we can either build one specific type of robot, or build no robot at all. With only four different types of robots, this may sound easy, but the number of choices available over those 24 rounds gets very large. A naïve approach with no other optimizations will actually cause an exception of ERROR: stack depth limit exceeded
which indicates we had too many calls to our recursive function. The basic function will look like this (pseudo-code):
myfunc:
ore_cost = myfunc(ore_robots++)
clay_cost = myfunc(clay_robots++)
obsidian_cost = myfunc(obsidian_robot++)
geode_cost = myfunc(geode_robots++)
nothing_cost = myfunc()
return greatest(ore_cost, clay_cost, obsidian_cost, geode_cost, nothing_cost)
During each round, the robots also gather resources, which are used (except for geodes) to build other robots, so the code needs to check for that as well:
myfunc:
if no_more_minutes
return geodes
if ore >= ore_robot_cost
ore_cost = myfunc(ore_robots++)
if ore >= clay_robot_cost
clay_cost = myfunc(clay_robots++)
if ore >= obsidian_robot_cost and clay >= obsidian_robot_cost2
obsidian_cost = myfunc(obsidian_robot++)
if ore >= geode_robot_cost and obsidian >= geode_robot_cost2
geode_cost = myfunc(geode_robots++)
nothing_cost = myfunc()
return greatest(ore_cost, clay_cost, obsidian_cost, geode_cost, nothing_cost)
That's the basic idea: walk through deeper and deeper until we run out of minutes, and then bubble up our results through the recursive callers. The "cost" is our ultimate goal: the number of geodes produced. We compare each of the five scenarios and return whichever has produced the greatest number of geodes for that round.
A big part of this puzzle was figuring out ways to reduce the number of times the function runs. For example, if we have 10 ore robots, which produce 10 ore each round, then there is no point in ever producing more ore robots if the most expensive robot costs less than 10 ore. Here are some other rules that I added (see how many you can think of first - I'll put a hint first, then the actual rule)
- Because our ultimate goal is to produce geodes ... we always want to create geode robots, so that choice always wins
- If we already have enough of a certain robot type to satisfy demand ... we never build any more of that robot type
- If we are nearing the end ... only build robots that could possibly lead to more geode robots
- If we did nothing last round, but we COULD HAVE built a robot type ... never build it this round: always build "greedy"
- If there are no geode robots yet ... building an obsidian robot is always the best choice
There were a few other rules that got considered but thrown out, as the above rules had a lot more bang for the buck. By far the most important one is the "don't build if we did nothing last round but could have built" as that saves over 1.4 million calls. Here is the final function, annotated with the savings for each of the added rules:
CREATE or replace FUNCTION give_me_the_remote(
ore INT, oldore INT, clay INT, oldclay INT, obsidian INT, geodes INT,
ore_robots INT, clay_robots INT, obsidian_robots INT, geode_robots INT,
ore_robot_cost INT,
clay_robot_cost INT,
obsidian_robot_cost INT, obsidian_robot_cost2 INT,
geode_robot_cost INT, geode_robot_cost2 INT,
minute INT, maxminute INT, last_action TEXT = ''
)
RETURNS INT language plpgsql AS $$
DECLARE
ore_score INT = 0;
clay_score INT = 0;
obsidian_score INT = 0;
geode_score INT = 0;
nobody_score INT = 0;
BEGIN
minute = minute + 1;
/* In the final minute, all we care about is gathering geodes */
IF minute >= maxminute THEN RETURN geodes + geode_robots; END IF;
/* If we can afford a geode robot, make it */
IF ore >= geode_robot_cost AND obsidian >= geode_robot_cost2 THEN
geode_score = give_me_the_remote(
ore + ore_robots - geode_robot_cost, ore,
clay + clay_robots, clay,
obsidian + obsidian_robots - geode_robot_cost2,
geodes + geode_robots,
ore_robots, clay_robots, obsidian_robots, geode_robots + 1,
ore_robot_cost, clay_robot_cost, obsidian_robot_cost,
obsidian_robot_cost2, geode_robot_cost, geode_robot_cost2,
minute, maxminute, ''
);
/* This always wins, so no need to consider anything else */
RETURN geode_score; /* Improves by 290ms and 149,649 calls */
END IF;
/* If we have the option of making an obsidian robot, find out the cost */
IF ore >= obsidian_robot_cost AND clay >= obsidian_robot_cost2
/* Do not build if can never lead to a new geode robot by penultimate turn */
AND minute < maxminute-1 /* Improves by 34ms and -- 12,056 calls */
/* Do not build if we already have enough obsidians */
AND obsidian_robots < geode_robot_cost2 /* Minimal improvement */
/* If we could have built this last turn, but decided to do nothing, bail */
AND (last_action <> 'nada' OR oldore < obsidian_robot_cost OR oldclay < obsidian_robot_cost2)
/* Minimal improvement */
THEN
obsidian_score = give_me_the_remote(
ore + ore_robots - obsidian_robot_cost, ore,
clay + clay_robots - obsidian_robot_cost2, clay,
obsidian + obsidian_robots,
geodes + geode_robots,
ore_robots, clay_robots, obsidian_robots + 1, geode_robots,
ore_robot_cost, clay_robot_cost, obsidian_robot_cost,
obsidian_robot_cost2, geode_robot_cost, geode_robot_cost2,
minute, maxminute, ''
);
/* If we don't have any geode robots yet, this is always the best choice */
IF geode_robots < 1 THEN return geodes + obsidian_score; END IF;
/* Improves by 3187ms and 1,217,889 calls! */
END IF;
/* We might want to build a clay robot instead */
IF ore >= clay_robot_cost
/* Do not build if we failed to build it last time but could have */
AND (last_action <> 'nada' OR oldore < clay_robot_cost)
/* Improves by 85,000ms and 926,637 calls */
/* No point in building clay if we are close to the end */
AND minute < maxminute-8
/* Improves by 433ms and 161,726 calls */
THEN
clay_score = give_me_the_remote(
ore + ore_robots - clay_robot_cost, ore,
clay + clay_robots, clay,
obsidian + obsidian_robots,
geodes + geode_robots,
ore_robots, clay_robots + 1, obsidian_robots, geode_robots,
ore_robot_cost, clay_robot_cost, obsidian_robot_cost,
obsidian_robot_cost2, geode_robot_cost, geode_robot_cost2,
minute, maxminute, ''
);
END IF;
/* Perhaps building an ore robot? */
IF ore >= ore_robot_cost
/* Stop making ore_robots once we have enough to build most expensive robot */
AND ore_robots < GREATEST(
ore_robot_cost, clay_robot_cost, obsidian_robot_cost, geode_robot_cost)
/* Improves by 4656ms and 966,916 calls */
/* Do not build ore robots close to the end */
AND minute < maxminute-10
/* Improves by 104ms and 18,851 calls */
/* Do not build if we failed to build it last time but could have */
AND (last_action <> 'nada' OR oldore < ore_robot_cost)
/* Improves by 4236ms and 1,443,256 calls! */
THEN
ore_score = give_me_the_remote(
ore + ore_robots - ore_robot_cost, ore,
clay + clay_robots, clay,
obsidian + obsidian_robots,
geodes + geode_robots,
ore_robots + 1, clay_robots, obsidian_robots, geode_robots,
ore_robot_cost, clay_robot_cost, obsidian_robot_cost,
obsidian_robot_cost2, geode_robot_cost, geode_robot_cost2,
minute, maxminute, ''
);
END IF;
/* Always consider the case of building nothing */
nobody_score = give_me_the_remote(
ore + ore_robots, ore, clay + clay_robots, clay,
obsidian + obsidian_robots, geodes + geode_robots,
ore_robots, clay_robots, obsidian_robots, geode_robots,
ore_robot_cost, clay_robot_cost, obsidian_robot_cost,
obsidian_robot_cost2, geode_robot_cost, geode_robot_cost2,
minute, maxminute, 'nada'
);
RETURN GREATEST(nobody_score, ore_score, clay_score, obsidian_score);
END
$$;
This ran in 115ms and took around 30,000 calls! Not too shabby. The puzzle asked us to sum up the "quality" of each blueprint by finding the maximum number of geodes each can open, then multiplying that by the blueprint number. Summing all the quality numbers produces the final answer, which is simple in SQL:
SELECT SUM(id * give_me_the_remote(
ore,0,clay,0,obsidian,geodes,
ore_robots,clay_robots,obsidian_robots,geode_robots,
ore_robot_cost, clay_robot_cost, obsidian_robot_cost,
obsidian_robot_cost2, geode_robot_cost, geode_robot_cost2,
0, 24, '')
) FROM blueprint;
AOC 2022 Day 19 Part 2
Part 2 tries to increase the difficulty by changing the number of rounds from 24 to 32, while reducing the total blueprints to the first three. While 32 rounds is exponentially more difficult than 24, all our optimizations and rules above made it easy! The solution we need is the value of each of the first three blueprints multiplied together.
WITH x AS (SELECT give_me_the_remote(ore,ore,clay,clay,obsidian,geodes,
ore_robots,clay_robots,obsidian_robots,geode_robots,
ore_robot_cost, clay_robot_cost, obsidian_robot_cost, obsidian_robot_cost2,
geode_robot_cost, geode_robot_cost2, 0, 32) FROM blueprint WHERE id <= 3)
SELECT round( exp( sum( ln(give_me_the_remote) ) ) ) FROM x;
Postgres has no direct function to compute the product (i.e. the multiplication analog to sum), so we use some ln
/exp
math to do the equivalent of
AOC 2022 Day 19 Bonus Round
Our approach resulted in a lot of robots, so I wanted to find a way to visualize not only how many of each type were being made each round, but how often each robot "won" its respective rounds. For that, I made another table:
CREATE UNLOGGED TABLE robot_tracker (
action TEXT,
counter INT,
created time NOT NULL DEFAULT timeofday()::timestamptz::time
);
Then we sprinkle some INSERT
s into strategic points in our give_me_the_remote
function. Because there is a lot of recursion and multiple calls per transaction, we use the timeofday()
function rather than now()
, to get the exact time of insert, not the time the current transaction started. Once populated, we walk through the table and generate a graphic showing how many of each type of robot had been made at each point in time, as well as (in parenthesis) how many times that type of robot "won" by being the highest number in the GREATEST()
function at the end. By watching the output, you can see some of the rules in play. Note how the total number of ore robots stays very low, how important and popular the "nothing" choice is, and how the obsidian robots end up being a very important tipping point compared to the others. Here's the function to create our animation.
CREATE or replace FUNCTION like_a_robot()
RETURNS void language plpgsql AS $$
DECLARE
myrec RECORD; x INT = 0;
/* Total number of each type of robot created up to this point */
ores INT = 0; geodes INT = 0; obsidian INT = 0; clay INT = 0; nothing INT = 0;
/* Host many times each of these was the "best" */
bores INT = 0; bobsidian INT = 0; bclay INT = 0; bnothing INT = 0;
/* Various ANSI escape codes for coloring */
mytext TEXT = ''; myclear TEXT = E'\033[0m';
red TEXT = E'\x1b[38;5;196m';
orange TEXT = E'\x1b[38;5;214m';
yellow TEXT = E'\x1b[38;5;227m';
blue TEXT = E'\x1b[38;5;21m';
purplish TEXT = E'\x1b[38;5;165m';
green TEXT = E'\x1b[38;5;77m';
robotq TEXT; boxtop TEXT; boxbottom TEXT;
BEGIN
/* Parts of our image that do not change */
robotq = format('%s֍֍֍֍֍֍֍%s', red, myclear);
boxtop = format('%s╭───╩───╮%s', blue, myclear);
boxbottom = format('%s╰───╦───╯%s', blue, myclear);
/* Walk through by time and pull out what robot (if any) was built */
FOR myrec IN SELECT action,level FROM robot_tracker ORDER BY created LOOP
x = x + 1;
IF myrec.action = 'ore' THEN ores = ores+1; END IF;
IF myrec.action = 'best=o' THEN bores = bores+1; END IF;
IF myrec.action = 'clay' THEN clay = clay+1; END IF;
IF myrec.action = 'best=c' THEN bclay = bclay+1; END IF;
IF myrec.action = 'obsidian' THEN obsidian = obsidian+1; END IF;
IF myrec.action = 'best=ob' THEN bobsidian = bobsidian+1; END IF;
IF myrec.action = 'geode' THEN geodes = geodes+1; END IF;
IF myrec.action = '-' THEN nothing = nothing+1; END IF;
IF myrec.action = 'best=n' THEN bnothing = bnothing+1; END IF;
mytext = FORMAT('
%s%s (%s)
obsidian robots
%s
%s
%s
%s
%s
%s│%s%s│%s
%sore robots %s %s %s %s %s╣%s%s╠ %s%s %s %s %s clay robots%s
%s%3s %4s)%s %s│%s%7s%s│%s %s (%s)
%s
%s%s
%s
%s
%s
geode robots
%s (%s)%s
%sNOTHING: %s (%s)%s
',
/* For these forma arguments, we break our indenting rules a bit */
orange,
to_char(obsidian, 'FM999,999'), to_char(bobsidian, 'FM999,999'),
/* These CASEs are just to provide a little "movement" */
CASE WHEN obsidian%2=0 THEN U&'\058D' ELSE U&'\263C' END,
CASE WHEN obsidian%4=0 THEN U&'\058D' ELSE U&'\263C' END,
CASE WHEN obsidian%5=0 THEN U&'\058D' ELSE U&'\263C' END,
CASE WHEN obsidian%9=0 THEN U&'\058D' ELSE U&'\263C' END,
boxtop,
blue, robotq, blue, myclear,
yellow,
CASE WHEN ores%2=0 THEN U&'\058D' ELSE U&'\263C' END,
CASE WHEN ores%4=0 THEN U&'\058D' ELSE U&'\263C' END,
CASE WHEN ores%5=0 THEN U&'\058D' ELSE U&'\263C' END,
CASE WHEN ores%9=0 THEN U&'\058D' ELSE U&'\263C' END,
blue,
robotq,
blue,
purplish,
CASE WHEN clay%2=0 THEN U&'\058D' ELSE U&'\263C' END,
CASE WHEN clay%4=0 THEN U&'\058D' ELSE U&'\263C' END,
CASE WHEN clay%5=0 THEN U&'\058D' ELSE U&'\263C' END,
CASE WHEN clay%9=0 THEN U&'\058D' ELSE U&'\263C' END,
myclear,
/* The FM removes excess whitspace, and the 999,999 formats large numbers nicely */
yellow, to_char(ores, 'FM999,999'), to_char(bores, 'FM(999,999'), myclear,
blue, myclear,
to_char(x, 'FM999,999'),
blue,myclear,
to_char(clay, 'FM999,999'), to_char(bclay, 'FM999,999'),
boxbottom,
green,
CASE WHEN geodes%2=0 THEN U&'\058D' ELSE U&'\263C' END,
CASE WHEN geodes%4=0 THEN U&'\058D' ELSE U&'\263C' END,
CASE WHEN geodes%5=0 THEN U&'\058D' ELSE U&'\263C' END,
CASE WHEN geodes%9=0 THEN U&'\058D' ELSE U&'\263C' END,
--U&'\263C',U&'\263C',U&'\263C', U&'\058D',
to_char(geodes, 'FM999,999'), to_char(geodes, 'FM999,999'),
myclear,
red,to_char(nothing, 'FM999,999'), to_char(bnothing, 'FM999,999'), myclear
); /* This is the end of the FORMAT way above */
/* Every 100 rows, we redraw the screen and introduce a small pause */
IF x%100 < 1 THEN
RAISE NOTICE '% %', E'\033c', mytext;
PERFORM pg_sleep(0.05);
END IF;
END LOOP;
END
$$;
Related Articles
- Postgres Tuning & Performance for Analytics Data
19 min read
- Running an Async Web Query Queue with Procedures and pg_cron
6 min read
- Name Collision of the Year: Vector
9 min read
- Sidecar Service Meshes with Crunchy Postgres for Kubernetes
12 min read
- pg_incremental: Incremental Data Processing in Postgres
11 min read