Introducing Crunchy Data Warehouse: A next-generation Postgres-native data warehouse. Crunchy Data Warehouse Learn more
Tutorial Instructions
The main benefit of working with partitions is helping with lifecycle management of data. Big data gets really expensive so archiving off data you no longer need can be really important to managing costs. Using partitioning to manage your data lifecycle means you’ll be rolling off data to an archive frequently, allowing you to drop/archive tables easily as that data no longer needs to exist in a certain database.
Performance: Another benefit the people often look for in terms of partitioning is query performance. Especially when your queries use the indexes or partitioning keys specifically. You can really speed up query times by having things go straight to individual date ranges instead of the entire dataset.
There are quite a few different kinds of partitioning based on how you want to subdivide your data.
We have loaded sample data set for an IoT thermostat with a table containing these fields: date time, thermostat_id, current_temperature, and thermostat_status. Check it out:
SELECT * FROM thermostat LIMIT 10;
Now lets create a new table that will be partitioned
CREATE TABLE iot_thermostat (
thetime timestamptz,
sensor_id int,
current_temperature numeric (3,1),
thermostat_status text
)
PARTITION BY RANGE (thetime);
And then create an index on thetime
field
CREATE INDEX ON iot_thermostat(thetime);
CREATE TABLE iot_thermostat07232022 PARTITION OF iot_thermostat
FOR VALUES FROM ('2022-07-23 00:00:000') TO ('2022-07-24 00:00:000');
CREATE TABLE iot_thermostat07242022 PARTITION OF iot_thermostat
FOR VALUES FROM ('2022-07-24 00:00:000') TO ('2022-07-25 00:00:000');
CREATE TABLE iot_thermostat07252022 PARTITION OF iot_thermostat
FOR VALUES FROM ('2022-07-25 00:00:000') TO ('2022-07-26 00:00:000');
CREATE TABLE iot_thermostat07262022 PARTITION OF iot_thermostat
FOR VALUES FROM ('2022-07-26 00:00:000') TO ('2022-07-27 00:00:000');
CREATE TABLE iot_thermostat07272022 PARTITION OF iot_thermostat
FOR VALUES FROM ('2022-07-27 00:00:000') TO ('2022-07-28 00:00:000');
CREATE TABLE iot_thermostat07282022 PARTITION OF iot_thermostat
FOR VALUES FROM ('2022-07-28 00:00:000') TO ('2022-07-29 00:00:000');
CREATE TABLE iot_thermostat07292022 PARTITION OF iot_thermostat
FOR VALUES FROM ('2022-07-29 00:00:000') TO ('2022-07-30 00:00:000');
CREATE TABLE iot_thermostat07302022 PARTITION OF iot_thermostat
FOR VALUES FROM ('2022-07-30 00:00:000') TO ('2022-07-31 00:00:000');
CREATE TABLE iot_thermosta07312022 PARTITION OF iot_thermostat
FOR VALUES FROM ('2022-07-31 00:00:000') TO ('2022-08-01 00:00:000');
CREATE TABLE iot_thermostat08012022 PARTITION OF iot_thermostat
FOR VALUES FROM ('2022-08-01 00:00:000') TO ('2022-08-02 00:00:000');
CREATE TABLE iot_thermostat08022022 PARTITION OF iot_thermostat
FOR VALUES FROM ('2022-08-02 00:00:000') TO ('2022-08-03 00:00:000');
CREATE TABLE iot_thermostat08032022 PARTITION OF iot_thermostat
FOR VALUES FROM ('2022-08-03 00:00:000') TO ('2022-08-04 00:00:000');
Now we’ll move data from our original ‘thermostat’ dataset into the iot_thermostat and data will automatically go into the correct partitions
INSERT INTO iot_thermostat SELECT * FROM thermostat;
You just need to insert data once, Postgres will take care of moving data to the correct partitions for you.
Quick check on one of the partitions to make sure you got it all right:
SELECT * FROM iot_thermostat07242022 LIMIT 10;
Ok, so let’s say that we only care about recent data , so tomorrow we want to put data from iot_thermostat07232022 in a different table and archive it off. This is done by a DETACH
.
ALTER TABLE iot_thermostat DETACH PARTITION iot_thermostat07232022;
And that’s now a standalone table.
We need to make a new one for incoming data as well:
CREATE TABLE iot_thermostat0842022 PARTITION OF iot_thermostat
FOR VALUES FROM ('2022-08-04 00:00:000') TO ('2022-08-05 00:00:000');
If you’re doing this daily, you’ll store these in a cron job somewhere so they happen automatically.
Loading terminal...
Loading terminal...