Fun with Postgres Floats, Positioning, and Sequencing
Disclaimer
This article will contain spoilers both on how I solved 2022 Day 20's challenge "Grove Positioning System" using SQL, as well as general ideas on how to approach the problem. I recommend trying to solve it yourself first, using your favorite language. Will I get these all posted before next year's AOC starts? Consider it a bonus challenge! :)
AOC Day 20
Tech used:
- CTEs (Common Table Expressions)
- Using a non-integer type to help simulate a linked list
- The ever useful file_fdw extension
- sequences
- The built-in mod(https://www.postgresql.org/docs/current/functions-math.html) function and a custom implementation!
- Using CALL to implement our stored procedures
As with the other days, there is some general setup to get a FDW to read the file.
CREATE EXTENSION if not exists file_fdw;
CREATE SERVER if not exists aoc2022 foreign data wrapper file_fdw;
DROP SCHEMA if exists aoc2022_day20_decrypt CASCADE;
CREATE SCHEMA aoc2022_day20_decrypt;
SET search_path = aoc2022_day20_decrypt;
/* I found this commented line was the easiest way to toggle test/real data: */
CREATE FOREIGN TABLE aoc_day20 (val bigint)
SERVER aoc2022 options(filename '/tmp/aoc2022.day20.input');
-- SERVER aoc2022 options(filename '/tmp/aoc2022.day20.testinput');
For this challenge, we have an encrypted string that we need to decrypt with some very particular rules:
The encrypted file is a list of numbers. To mix the file, move each number
forward or backward in the file a number of positions equal to the value
of the number being moved. The list is circular, so moving a number off one
end of the list wraps back around to the other end as if the ends were connected.
To start, let's create a table to hold our information. We need to keep track of where each number starts, and where it will end up. Because the virtual table created by file_fdw is strictly read-only, we will keep that as the "initial position" table and create a new one to track changes. To speed things up, we should use an unlogged table, and to prevent autovacuum from firing, we disable it for this table. We will use a sequence to insert the items in the order they first appear:
CREATE SEQUENCE aoc;
CREATE UNLOGGED TABLE puzzle (
val BIGINT,
slot FLOAT /* Why a float? Keep reading... */
) WITH (autovacuum_enabled = off);
INSERT INTO puzzle SELECT val, nextval('aoc') FROM aoc_day20;
CREATE UNIQUE INDEX puzzle_slot ON puzzle(slot);
Next, we need a procedure to do the actual decryption via "mixing" according to the rules of the contest. We will walk through all the numbers in order, and shift them to their new location based on their value. The number needs to get moved somewhere between two other numbers. In many languages, a linked list is the obvious solution. Since we are doing this in SQL, we will instead set the position of the number to something between the two other number's positions. Hence the use of the data type float above, which allows us to subdivide numbers. Note: float is actually a synonym for double precision, but quicker to type.
In other words, if we need to move something between 18 and 20, we can assign it a slot of 19. If we need to stick it between 18 and 19, we assign it a slot of 18.5. If we need something between 18 and 18.5, we assign it a slot of 18.25, and so on. This trick allows us to still keep things in order, while maintaining a very large pool of potential values. Here is the complete procedure:
CREATE or replace PROCEDURE mixit()
language plpgsql AS $$
DECLARE
maxslots SMALLINT; slotcount SMALLINT; myrec RECORD;
y FLOAT; z FLOAT; halfval FLOAT;
BEGIN
/* We need to know when to "roll over" to the other side */
SELECT INTO maxslots count(*)-1 FROM puzzle;
/* Always set this back to 1 to be safe, so we can re-run this function at will */
PERFORM setval('aoc', 1, false);
/* This is using file_fdw, so unlike a regular table, we don't need to worry about an ORDER BY! */
FOR myrec IN SELECT val, nextval('aoc') AS slot FROM aoc_day20 LOOP
/* A value of 0 moves no spaces, so we simply ignore it */
IF myrec.val = 0 THEN CONTINUE; END IF;
/*
Postgres does truncated division for mod, which is not the approach we need here,
so we do it ourselves for negative numbers!
*/
myrec.val = CASE WHEN myrec.val < 0
THEN myrec.val - (maxslots * floor( myrec.val::float / maxslots))
ELSE mod(myrec.val, maxslots)
END;
/* Find the slot that is X more than our current position */
SELECT INTO y slot FROM puzzle WHERE slot >= myrec.slot
ORDER BY slot LIMIT 1 OFFSET myrec.val;
IF y IS NULL THEN
/* No slot found, so we fell off the right end. Circle to the front */
SELECT INTO slotcount count(*) FROM puzzle WHERE slot > myrec.slot;
myrec.val = myrec.val - slotcount;
/* Grab our new left and right boundaries */
SELECT INTO y slot FROM puzzle WHERE slot <> myrec.slot
ORDER BY slot ASC LIMIT 1 OFFSET (myrec.val)-1;
SELECT INTO z slot FROM puzzle WHERE slot <> myrec.slot
ORDER BY slot ASC LIMIT 1 OFFSET (myrec.val);
ELSE
/* We found a left boundary, can we find a matching right one? */
SELECT INTO z slot FROM puzzle WHERE slot >= myrec.slot
ORDER BY slot LIMIT 1 OFFSET myrec.val+1;
IF z IS NULL THEN
/* We ran off the right edge - simply move this to the end */
SELECT INTO slotcount max(slot) from puzzle;
UPDATE puzzle SET slot = slotcount+1 WHERE slot = myrec.slot;
CONTINUE;
END IF;
END IF;
/*
Create a value that is halfway between our left and right slots.
Because this is a float, we can always find a unique number.
Technically, it just needs to be between, not halfway...
*/
SELECT INTO halfval (z+y)/2;
/* Finally, we can set the value to the new position in the read-write table: */
UPDATE puzzle SET slot=halfval WHERE slot = myrec.slot;
END LOOP;
END
$$;
Because there is no data to return, we made this a procedure instead of a function. The way to run a procedure is:
CALL mixit();
After the decryption of the file, there is still one more step to generate the answer. The rules say:
the grove coordinates can be found by looking at the
1000th, 2000th, and 3000th numbers after the value 0,
wrapping around the list as necessary.
While we could solve this programmatically, the lazy way is to make a table that has at least 3000 rows after the last possible row in the original table. We'll create a quick temp table for that by doubling the original table (which has 5000 rows), making sure we maintain the order:
CREATE TABLE bigpuzzle AS SELECT * FROM puzzle ORDER BY slot;
INSERT INTO bigpuzzle SELECT * FROM puzzle ORDER BY slot;
Finally, we can solve it with a quick CTE. At the top level, we find the row that has a value of 0, and store the CTID for that row. Then, we make three more sections to grab the values at x+1000, x+2000, and x+3000. Once we have all those, we sum them together to get our final answer:
WITH x AS (SELECT ctid AS c FROM bigpuzzle WHERE val=0 ORDER BY ctid ASC LIMIT 1)
,y1 AS (SELECT val FROM bigpuzzle WHERE ctid >= (SELECT c FROM x)
ORDER BY ctid LIMIT 1 OFFSET 1000)
,y2 AS (SELECT val FROM bigpuzzle WHERE ctid >= (SELECT c FROM x)
ORDER BY ctid LIMIT 1 OFFSET 2000)
,y3 AS (SELECT val FROM bigpuzzle WHERE ctid >= (SELECT c FROM x)
ORDER BY ctid LIMIT 1 OFFSET 3000)
SELECT y1.val+y2.val+y3.val AS aoc2022_day20_part1 FROM y1,y2,y3;
aoc2022_day20_part1
----------------------
19070
AOC Day 20 - Part Two
Part Two adds two new rules: a base number multiplier, and a process multiplier:
First, you need to apply the decryption key, 811589153.
Multiply each number by the decryption key before you begin;
this will produce the actual list of numbers to mix.
Second, you need to mix the list of numbers ten times.
The order in which the numbers are mixed does not change
during mixing; the numbers are still moved in the order
they appeared in the original, pre-mixed list
We can re-use our original puzzle table for this. The first step is to wipe it clean and repopulate, but multiply each number by that "decryption key" as it goes in:
SET aoc.decryption_key = 811589153;
TRUNCATE TABLE puzzle;
ALTER TABLE puzzle ADD COLUMN id SMALLSERIAL;
CREATE INDEX puzzle_id ON puzzle(id);
SELECT setval('aoc',1,false);
INSERT INTO puzzle(val,slot)
SELECT val * current_setting('aoc.decryption_key')::int, nextval('aoc') FROM aoc_day20;
Next we need a new procedure. This one is almost identical to the previous one, with the change that any values coming from our file_fdw table aoc_day20 get multiplied by the decryption key:
CREATE or replace PROCEDURE sir_mix_a_slot()
language plpgsql AS $$
DECLARE
maxslots SMALLINT; slotcount SMALLINT; myrec RECORD;
y FLOAT; z FLOAT; halfval FLOAT; currslot FLOAT;
BEGIN
SELECT INTO maxslots count(*)-1 FROM puzzle;
PERFORM setval('aoc',1,false);
FOR myrec IN SELECT val, nextval('aoc') AS id FROM aoc_day20 LOOP
IF myrec.val = 0 THEN CONTINUE; END IF;
/* Make our numbers much much bigger, because they said so */
myrec.val = myrec.val * current_setting('aoc.decryption_key')::bigint;
myrec.val = CASE WHEN myrec.val < 0
/* Special case as Postgres does a weird mod(-X,Y) */
THEN myrec.val - (maxslots * floor( myrec.val::float / maxslots))
ELSE mod(myrec.val, maxslots)
END;
/* Find the slot value of the one we are adjusting */
SELECT INTO currslot slot FROM puzzle WHERE id = myrec.id;
/* Find the slot that is X more than our current position */
SELECT INTO y slot FROM puzzle WHERE slot >= currslot
ORDER BY slot LIMIT 1 OFFSET myrec.val;
IF y IS NULL THEN
/* No slot found, so we fell off the right end. Circle to the front */
SELECT INTO slotcount count(*) FROM puzzle WHERE slot > currslot;
myrec.val = myrec.val - slotcount;
/* Grab our new left and right boundaries */
SELECT INTO y slot FROM puzzle WHERE slot <> currslot
ORDER BY slot ASC LIMIT 1 OFFSET (myrec.val)-1;
SELECT INTO z slot FROM puzzle WHERE slot <> currslot
ORDER BY slot ASC LIMIT 1 OFFSET (myrec.val);
ELSE
/* We found a left boundary, can we find a matching right one? */
SELECT INTO z slot FROM puzzle WHERE slot >= currslot
ORDER BY slot LIMIT 1 OFFSET myrec.val+1;
IF z IS NULL THEN
/* We ran off the right edge - simply move this to the end */
SELECT INTO slotcount max(slot) FROM puzzle;
UPDATE puzzle SET slot = slotcount+1 WHERE id = myrec.id;
CONTINUE;
END IF;
END IF;
SELECT INTO halfval (z+y)/2;
UPDATE puzzle SET slot=halfval WHERE id = myrec.id;
END LOOP;
END
$$;
Let's call it ten times in a row. Again, sometimes lazy is best. No loops, just literally run it ten times in a row. We also vacuum between each run, as we are doing a lot of updating:
vacuum puzzle; CALL sir_mix_a_slot(); /* One */
vacuum puzzle; CALL sir_mix_a_slot();
vacuum puzzle; CALL sir_mix_a_slot();
vacuum puzzle; CALL sir_mix_a_slot();
vacuum puzzle; CALL sir_mix_a_slot(); /* Five */
vacuum puzzle; CALL sir_mix_a_slot();
vacuum puzzle; CALL sir_mix_a_slot();
vacuum puzzle; CALL sir_mix_a_slot();
vacuum puzzle; CALL sir_mix_a_slot();
vacuum puzzle; CALL sir_mix_a_slot(); /* Ten! */
As before, we'll create a giant table we can OFFSET into without worrying about wrapping:
DROP TABLE IF EXISTS bigpuzzle;
CREATE TABLE bigpuzzle AS SELECT * FROM puzzle ORDER BY slot;
INSERT INTO bigpuzzle SELECT * FROM puzzle ORDER BY slot;
Then we can use our exact same CTE from above to get the solution:
WITH x AS (SELECT ctid AS c FROM bigpuzzle WHERE val=0 ORDER BY ctid ASC LIMIT 1)
,y1 AS (SELECT val FROM bigpuzzle WHERE ctid >= (SELECT c FROM x)
ORDER BY ctid LIMIT 1 OFFSET 1000)
,y2 AS (SELECT val FROM bigpuzzle WHERE ctid >= (SELECT c FROM x)
ORDER BY ctid LIMIT 1 OFFSET 2000)
,y3 AS (SELECT val FROM bigpuzzle WHERE ctid >= (SELECT c FROM x)
ORDER BY ctid LIMIT 1 OFFSET 3000)
SELECT y1.val+y2.val+y3.val AS aoc2022_day20_part2 FROM y1,y2,y3;
aoc2022_day20_part2
---------------------
14773357352059
This was one of the easier days, and as such, there is no real ASCII animation to provide this time. Only five more days to go (and yes, they get a lot harder!)
Related Articles
- Postgres Tuning & Performance for Analytics Data
19 min read
- Running an Async Web Query Queue with Procedures and pg_cron
6 min read
- Name Collision of the Year: Vector
9 min read
- Sidecar Service Meshes with Crunchy Postgres for Kubernetes
12 min read
- pg_incremental: Incremental Data Processing in Postgres
11 min read