Intro to Postgres Custom Data Types
Custom data types is one of those many features that makes PostgreSQL flexible for development of a huge variety of business and application use cases. Data types will help you primarily for data integrity, ensuring your data is stored in the database as you want it to be. A common surprise gift of using strict data types is that you can end up reducing your long term data maintenance.
There’s two main ways to customize data types in Postgres:
- Create
DOMAIN
s, which are value constraints added to built-in data types - Creating user-defined custom data types
Before you go crazy with custom data types, make sure that one of the 43 existing data types won’t work for you ;). There’s so many data types and existing extensions, most people will never even need to touch custom data types. That doesn't mean it's not fun to learn about!
Hands on tutorials
My wonderful colleague, Jean-Paul Argudo wrote two hands-on tutorials for these topics, one for DOMAINS and the other for User Defined Custom Data Types.
Using CREATE DOMAIN
DOMAINs let you create a specific value check. For example if I want to make sure my birthdays are all greater than Jan 1st, 1930 and my emails are valid, I could create this:
CREATE DOMAIN date_of_birth AS date
CHECK (value > '1930-01-01'::date)
;
CREATE DOMAIN valid_email AS text
NOT NULL
CHECK (value ~* '^[A-Za-z0-9._%-]+@[A-Za-z0-9.-]+[.][A-Za-z]+$')
;
When creating the table, assign the date_of_birth
and valid_email
data types to their respective columns. Now, if a database has multiple fields representing data of birth or valid email, the logic for those fields is portable to those additional columns.
CREATE TABLE person_using_domains
(
id INTEGER GENERATED always AS IDENTITY PRIMARY KEY,
firstname TEXT NOT NULL,
lastname TEXT NOT NULL,
birth_date DATE_OF_BIRTH,
email VALID_EMAIL
);
When using psql
, all domains will be listed by running \dD
.
DOMAIN vs CHECK CONSTRAINT
So you might be wondering why you would use a domain when you could just use a check constraint on the data itself? The simple answer here is that check constraints are not easily altered. They have to be dropped and re-added.
Additionally, DOMAIN
can be created at the schema level and there may be several tables with email address or birth dates. Use one DOMAIN
to control several fields, thus centralizing the logic.
Using ALTER DOMAIN
Let’s say the application has now evolved, and we only want to store persons born after 1980. You’ll run:
ALTER DOMAIN date_of_birth ADD CHECK (value > '1980-01-01'::date);
Using CREATE TYPE for Custom Data Types
Composite type
When using they CREATE TYPE
functionality, there are a few options:
- composite type - few data fields are rolled into a single type
- enumerated type - set of values
- range type - a range of values
Below, we walk through using each of these.
Composite types are used when you’re composing several different values into a single field.
If an application manages a delivery system for package delivery, then this application will use the ID of the package most of the time. In addition, characteristics of the package's dimensions and weight are something we want to store together.
Below, CREATE TYPE
will create a new data type allowing for storage of height, width, and, weight in a single field:
CREATE TYPE physical_package AS (
height numeric
, width numeric
, weight numeric
);
Then, assign that type to a table definition:
CREATE TABLE packages
(
id BIGINT GENERATED always AS IDENTITY PRIMARY KEY,
properties PHYSICAL_PACKAGE
);
Casting Data Types
To insert existing data into this field you’ll want to use a cast()
function using the ::
operator. We have a tutorial on casting as well.
INSERT INTO packages
(properties)
VALUES ( '(10.3,4.0,0.5)' :: physical_package ),
( '(5,3.0,0.2)' :: physical_package );
Querying composite data types
To access the values inside a composite data type, use dot-notation, similar to JSON: (properties).weight
. The composite data type name must be set into parenthesis when extracting values:
SELECT id, (properties).weight FROM packages;
id | weight
----+--------
1 | 0.5
2 | 0.2
Composite types vs JSON
You’re probably thinking composite types look just like JSON, and you are right. Composite types are more formally structured and JSON is notoriously less structured. Postgres is likely to be more performant for composite types. So if you care about structure and performance, this might be the solution for you. However, JSON is hugely popular for a variety of reasons so that might be more preferable for ease of future development.
Enumerated types
When you want to create a set of static values in PostgreSQL, you can use the CREATE TYPE AS ENUM
.
If we wanted a new field for package category as either box or letter, we could use:
CREATE TYPE package_cat AS ENUM ('box','letter');
Later, to add a new category:
ALTER TYPE package_cat ADD VALUE 'postcard';
Working with ENUMs
Craig Kerstiens has a great primer on enums vs check constraints and the conclusion is CHECK
constraints are a little better. Postgres allows adding values to an enum, but cannot remove a value from the enum without removing the type and re-adding.
Range types
Numerous range data types exist in PostgreSQL. The existing range data types are limited to, integers, big integers, numerics, timestamps with or without time zone and dates. The common usage of creating a new type of range, is to have it handle other range subtypes.
In our delivery scenario, customers accept some price discounts for delays in delivery (e.g. 3 day, overnight, etc). Let’s create that new range type, composed of intervals:
CREATE TYPE delay AS RANGE (
subtype = interval
);
Now we create a new table, with previous custom data types, and this this new one:
CREATE TABLE packages_with_delay
(
id BIGINT GENERATED always AS IDENTITY PRIMARY KEY,
properties PHYSICAL_PACKAGE NOT NULL,
category PACKAGE_CAT NOT NULL,
acceptable_delay DELAY NOT NULL
);
Recent versions of Postgres have additional options for out of the box range types.
TL;DR
Postgres is awesome. With 43 existing datatypes so you probably do not need a custom one, but if you do, you can add a DOMAIN
to an existing type or create completely custom types. DOMAIN
s are awesome, perhaps better than constraints. There are custom types for composites, enums, and ranges. Composites can be helpful, but you might just use JSON instead. Enums aren’t as good as constraints. Range types can be good and are usually subtype definitions.
Experiment with the functionality in our playground and you will have a new tool in the box.
Co-authored with Jean-Paul Argudo.
Related Articles
- Name Collision of the Year: Vector
9 min read
- Sidecar Service Meshes with Crunchy Postgres for Kubernetes
12 min read
- pg_incremental: Incremental Data Processing in Postgres
11 min read
- Smarter Postgres LLM with Retrieval Augmented Generation
6 min read
- Postgres Partitioning with a Default Partition
16 min read