Time Partitioning and Custom Time Intervals in Postgres with pg_partman
Whether you are managing a large table or setting up automatic archiving, time based partitioning in Postgres is incredibly powerful. pg_partman’s newest versions support a huge variety of custom time internals. Marco just published a post on using pg_partman with our new database product for doing analytics with Postgres, Crunchy Bridge for Analytics. So I thought this would be a great time to review the basic and complex options for the time based partitioning.
Time partitioning intervals
When I first started designing pg_partman for time-based partitioning, it only had preset intervals that users could choose. Currently, pg_partman supports all of Postgres’ time base interval values. The partitioning interval is set during the initial parent creation, in the p_interval
field.
SELECT partman.create_parent(
p_parent_table := 'partman_test.time_taptest_table'
, p_control := 'col3'
, p_interval := '1 day'
, p_template_table := 'partman_test.time_taptest_table_template'
);
Additional examples like:
p_interval := '1 month'
p_interval := '1 year'
Even with allowing these intervals, some common intervals used in business, like weekly and quarterly, can be a little tricky. But thankfully pg_partman still has options to make these intervals easy and now even more flexible. So let’s dig into these examples.
Weekly partitioning
Weekly partitioning was and still is a fairly popular partitioning interval. When I started working on it, I’d thankfully found the ISO week date standard to allow me to tackle the more difficult issues of handling weeks (leap years, starting days, 53 week years) when I’d wanted to label the children with the week number . However with declarative partitioning I found an opportunity to allow this to be more flexible when redesigning things for version 5 of partman. While the result did get rid of the nice weekly numbering pattern I had liked for this interval (IYYYwIW which came out to something like “2024w15”), the new method lets people start their week on whichever day they desired. However, with flexibility always comes a little more complexity.
When you set your partitioning interval to 1 week in pg_partman, the day that starts that weekly pattern will be whatever day of the week it is when you run create_parent()
. So today being a Wednesday when I’m writing this blog post, my partition naming pattern AND constraints for the child tables would be as follows:
CREATE TABLE time_stuff(id int GENERATED ALWAYS AS IDENTITY, created_at timestamptz NOT NULL) PARTITION BY RANGE (created_at);
SELECT partman.create_parent('public.time_stuff', 'created_at', '1 week');
create_parent
---------------
t
(1 row)
\d+ time_stuff
Partitioned table "public.time_stuff"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
------------+--------------------------+-----------+----------+------------------------------+---------+-------------+--------------+-------------
id | integer | | not null | generated always as identity | plain | | |
created_at | timestamp with time zone | | not null | | plain | | |
Partition key: RANGE (created_at)
Partitions: time_stuff_p20240327 FOR VALUES FROM ('2024-03-27 00:00:00-04') TO ('2024-04-03 00:00:00-04'),
time_stuff_p20240403 FOR VALUES FROM ('2024-04-03 00:00:00-04') TO ('2024-04-10 00:00:00-04'),
time_stuff_p20240410 FOR VALUES FROM ('2024-04-10 00:00:00-04') TO ('2024-04-17 00:00:00-04'),
time_stuff_p20240417 FOR VALUES FROM ('2024-04-17 00:00:00-04') TO ('2024-04-24 00:00:00-04'),
time_stuff_p20240424 FOR VALUES FROM ('2024-04-24 00:00:00-04') TO ('2024-05-01 00:00:00-04'),
time_stuff_p20240501 FOR VALUES FROM ('2024-05-01 00:00:00-04') TO ('2024-05-08 00:00:00-04'),
time_stuff_p20240508 FOR VALUES FROM ('2024-05-08 00:00:00-04') TO ('2024-05-15 00:00:00-04'),
time_stuff_p20240515 FOR VALUES FROM ('2024-05-15 00:00:00-04') TO ('2024-05-22 00:00:00-04'),
time_stuff_p20240522 FOR VALUES FROM ('2024-05-22 00:00:00-04') TO ('2024-05-29 00:00:00-04'),
time_stuff_default DEFAULT
I ran these statements on Wednesday, April 24, 2024 so you can see the partition time_stuff_p20240424
with the minimum value of that same day. And each subsequent child table is 7 days later, starting on every Wednesday. So while we’ve accomplished our weekly partitioning goal, this is not a common day to start the week. There is thankfully a very easy solution with pg_partman: we tell it the date to start making partitions. Say we wanted our weeks to start on Sunday. Just pick any Sunday date that would work for child tables we’d like to have initially created
SELECT partman.create_parent('public.time_stuff', 'created_at', '1 week', p_start_partition => '2024-04-17');
create_parent
---------------
t
(1 row)
\d+ time_stuff
Partitioned table "public.time_stuff"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
------------+--------------------------+-----------+----------+------------------------------+---------+-------------+--------------+-------------
id | integer | | not null | generated always as identity | plain | | |
created_at | timestamp with time zone | | not null | | plain | | |
Partition key: RANGE (created_at)
Partitions: time_stuff_p20240417 FOR VALUES FROM ('2024-04-17 00:00:00-04') TO ('2024-04-24 00:00:00-04'),
time_stuff_p20240424 FOR VALUES FROM ('2024-04-24 00:00:00-04') TO ('2024-05-01 00:00:00-04'),
time_stuff_p20240501 FOR VALUES FROM ('2024-05-01 00:00:00-04') TO ('2024-05-08 00:00:00-04'),
time_stuff_p20240508 FOR VALUES FROM ('2024-05-08 00:00:00-04') TO ('2024-05-15 00:00:00-04'),
time_stuff_p20240515 FOR VALUES FROM ('2024-05-15 00:00:00-04') TO ('2024-05-22 00:00:00-04'),
time_stuff_p20240522 FOR VALUES FROM ('2024-05-22 00:00:00-04') TO ('2024-05-29 00:00:00-04'),
time_stuff_default DEFAULT
We don’t have the partitions prior to our starting date created, but you can just pick an even earlier Sunday if you need more older partitions to start with. You can see that April 17, 2024 is a Sunday and every subsequent child table has its lower boundary on a Sunday as well. So using this method you can start on any day of the week you desire. And simply using the day of the lower boundary for the suffix name got rid of the complexities of trying to use week numbers that previously required ISO weeks to solve.
Quarterly partitioning
I’d always liked the idea of quarterly partitioning since it seemed to be a nice balance between larger and smaller partitioning intervals. PostgreSQL does have some limited quarterly timestamp formatting options, but if you go back and look at the partman source code for older versions, you’ll see it was way more complex than I’d expected it to be. And it pretty much locked the quarters into 4 pre-defined month blocks. With version 5.x of pg_partman, I decided to do the same as I did with weekly and simply allow any arbitrary 3 month interval people may want. So while it lost the nicer quarterly suffix pattern (YYYYq#, 2024q2), it’s now much more flexible.
The problem and solution for quarterly is the same as weekly. It’s not quite as bad of a problem in that the child lower boundaries are always rounded to the first of the month, but the quarter will default to start in the month that create_parent()
runs. So running in April 2024 results in:
keith=# SELECT partman.create_parent('public.time_stuff', 'created_at', '3 months');
create_parent
---------------
t
(1 row)
keith=# \d+ time_stuff
Partitioned table "public.time_stuff"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
------------+--------------------------+-----------+----------+------------------------------+---------+-------------+--------------+-------------
id | integer | | not null | generated always as identity | plain | | |
created_at | timestamp with time zone | | not null | | plain | | |
Partition key: RANGE (created_at)
Partitions: time_stuff_p20230401 FOR VALUES FROM ('2023-04-01 00:00:00-04') TO ('2023-07-01 00:00:00-04'),
time_stuff_p20230701 FOR VALUES FROM ('2023-07-01 00:00:00-04') TO ('2023-10-01 00:00:00-04'),
time_stuff_p20231001 FOR VALUES FROM ('2023-10-01 00:00:00-04') TO ('2024-01-01 00:00:00-05'),
time_stuff_p20240101 FOR VALUES FROM ('2024-01-01 00:00:00-05') TO ('2024-04-01 00:00:00-04'),
time_stuff_p20240401 FOR VALUES FROM ('2024-04-01 00:00:00-04') TO ('2024-07-01 00:00:00-04'),
time_stuff_p20240701 FOR VALUES FROM ('2024-07-01 00:00:00-04') TO ('2024-10-01 00:00:00-04'),
time_stuff_p20241001 FOR VALUES FROM ('2024-10-01 00:00:00-04') TO ('2025-01-01 00:00:00-05'),
time_stuff_p20250101 FOR VALUES FROM ('2025-01-01 00:00:00-05') TO ('2025-04-01 00:00:00-04'),
time_stuff_p20250401 FOR VALUES FROM ('2025-04-01 00:00:00-04') TO ('2025-07-01 00:00:00-04'),
time_stuff_default DEFAULT
This does start on the quarterly months people typically expect but only by coincidence. To start your quarters in whichever month you’d like, simply set the starting partition as we did with weekly. The day doesn’t really matter, just the month.
SELECT partman.create_parent('public.time_stuff', 'created_at', '3 months', p_start_partition => '2024-03-15');
create_parent
---------------
t
(1 row)
\d+ time_stuff
Partitioned table "public.time_stuff"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
------------+--------------------------+-----------+----------+------------------------------+---------+-------------+--------------+-------------
id | integer | | not null | generated always as identity | plain | | |
created_at | timestamp with time zone | | not null | | plain | | |
Partition key: RANGE (created_at)
Partitions: time_stuff_p20240301 FOR VALUES FROM ('2024-03-01 00:00:00-05') TO ('2024-06-01 00:00:00-04'),
time_stuff_p20240601 FOR VALUES FROM ('2024-06-01 00:00:00-04') TO ('2024-09-01 00:00:00-04'),
time_stuff_p20240901 FOR VALUES FROM ('2024-09-01 00:00:00-04') TO ('2024-12-01 00:00:00-05'),
time_stuff_p20241201 FOR VALUES FROM ('2024-12-01 00:00:00-05') TO ('2025-03-01 00:00:00-05'),
time_stuff_p20250301 FOR VALUES FROM ('2025-03-01 00:00:00-05') TO ('2025-06-01 00:00:00-04'),
time_stuff_default DEFAULT
Any Arbitrary Interval
While solving for these two custom intervals isn’t too complicated, I did run into issues with allowing any arbitrary custom interval before 5.0. The issue is with how partman rounds the intervals to give the normally expected lower boundaries depending on the length of the interval: daily rounds to midnight, monthly rounds to the first of the month, etc. What if we wanted to partition by 9 week intervals and we wanted it to start on Mondays beginning with April 22, 2024?
keith=# SELECT partman.create_parent('public.time_stuff', 'created_at', '9 weeks', p_start_partition => '2024-04-22');
create_parent
---------------
t
(1 row)
keith=# \d+ time_stuff
Partitioned table "public.time_stuff"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
------------+--------------------------+-----------+----------+------------------------------+---------+-------------+--------------+-------------
id | integer | | not null | generated always as identity | plain | | |
created_at | timestamp with time zone | | not null | | plain | | |
Partition key: RANGE (created_at)
Partitions: time_stuff_p20240401 FOR VALUES FROM ('2024-04-01 00:00:00-04') TO ('2024-06-03 00:00:00-04'),
time_stuff_p20240603 FOR VALUES FROM ('2024-06-03 00:00:00-04') TO ('2024-08-05 00:00:00-04'),
time_stuff_p20240805 FOR VALUES FROM ('2024-08-05 00:00:00-04') TO ('2024-10-07 00:00:00-04'),
time_stuff_p20241007 FOR VALUES FROM ('2024-10-07 00:00:00-04') TO ('2024-12-09 00:00:00-05'),
time_stuff_p20241209 FOR VALUES FROM ('2024-12-09 00:00:00-05') TO ('2025-02-10 00:00:00-05'),
time_stuff_default DEFAULT
That doesn’t look right! The issue here is that since the interval is less than 1 year but greater than or equal to 1 month, partman always tries to round to the first day of the month. What we want partman to do is round to the nearest week instead since our interval is based on an arbitrary amount of weeks. As I said, this was an issue before 5.0 but fixed only fairly recently thanks to a bug report from a user. This was solved in 4.6.0 by adding another option to create_parent()
.
keith=# SELECT partman.create_parent('public.time_stuff', 'created_at', '9 weeks', p_start_partition => '2024-04-22', p_date_trunc_interval => 'week');
create_parent
---------------
t
(1 row)
keith=# \d+ time_stuff
Partitioned table "public.time_stuff"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
------------+--------------------------+-----------+----------+------------------------------+---------+-------------+--------------+-------------
id | integer | | not null | generated always as identity | plain | | |
created_at | timestamp with time zone | | not null | | plain | | |
Partition key: RANGE (created_at)
Partitions: time_stuff_p20240422 FOR VALUES FROM ('2024-04-22 00:00:00-04') TO ('2024-06-24 00:00:00-04'),
time_stuff_p20240624 FOR VALUES FROM ('2024-06-24 00:00:00-04') TO ('2024-08-26 00:00:00-04'),
time_stuff_p20240826 FOR VALUES FROM ('2024-08-26 00:00:00-04') TO ('2024-10-28 00:00:00-04'),
time_stuff_p20241028 FOR VALUES FROM ('2024-10-28 00:00:00-04') TO ('2024-12-30 00:00:00-05'),
time_stuff_p20241230 FOR VALUES FROM ('2024-12-30 00:00:00-05') TO ('2025-03-03 00:00:00-05'),
time_stuff_default DEFAULT
The p_date_trunc_interval
parameter takes values that are valid for the PostgreSQL built-in function date_trunc. This tells partman how to round the boundaries to get the values you’re more likely expecting. One unfortunate thing that is unique for the weekly option here is that date_trunc('week', <timetamptz>)
always rounds to a Monday. So in this case you wouldn’t be able to have an arbitrary amount of weeks that start on a Sunday or any other day of the week.
If you cannot use more common partition intervals (daily, monthly, etc), you’ll likely have to experiment with this feature to see if it allows you do do what you need. I would personally recommend trying to stick with more common intervals if at all possible, but business requirements sometimes require the uncommon.
Conclusion
pg_partman provides comprehensive support of time based intervals to serve a wide variety of needs. Hopefully this blog post has helped to show both basic and advanced features and how to handle some more complex partitioning requirements.
Related Articles
- Postgres Tuning & Performance for Analytics Data
19 min read
- Running an Async Web Query Queue with Procedures and pg_cron
6 min read
- Name Collision of the Year: Vector
9 min read
- Sidecar Service Meshes with Crunchy Postgres for Kubernetes
12 min read
- pg_incremental: Incremental Data Processing in Postgres
11 min read