Introducing Crunchy Data Warehouse: A next-generation Postgres-native data warehouse. Crunchy Data Warehouse Learn more

Intro to Postgres Custom Data Types

Avatar for Elizabeth Christensen

Elizabeth Christensen

6 min read

Custom data types is one of those many features that makes PostgreSQL flexible for development of a huge variety of business and application use cases. Data types will help you primarily for data integrity, ensuring your data is stored in the database as you want it to be. A common surprise gift of using strict data types is that you can end up reducing your long term data maintenance.

There’s two main ways to customize data types in Postgres:

  • Create DOMAINs, which are value constraints added to built-in data types
  • Creating user-defined custom data types

Before you go crazy with custom data types, make sure that one of the 43 existing data types won’t work for you ;). There’s so many data types and existing extensions, most people will never even need to touch custom data types. That doesn't mean it's not fun to learn about!

Hands on tutorials

My wonderful colleague, Jean-Paul Argudo wrote two hands-on tutorials for these topics, one for DOMAINS and the other for User Defined Custom Data Types.

Using CREATE DOMAIN

DOMAINs let you create a specific value check. For example if I want to make sure my birthdays are all greater than Jan 1st, 1930 and my emails are valid, I could create this:

CREATE DOMAIN date_of_birth AS date
	CHECK (value > '1930-01-01'::date)
;
CREATE DOMAIN valid_email AS text
	NOT NULL
	CHECK (value ~* '^[A-Za-z0-9._%-]+@[A-Za-z0-9.-]+[.][A-Za-z]+$')
;

When creating the table, assign the date_of_birth and valid_email data types to their respective columns. Now, if a database has multiple fields representing data of birth or valid email, the logic for those fields is portable to those additional columns.

CREATE TABLE person_using_domains
  (
     id         INTEGER GENERATED always AS IDENTITY PRIMARY KEY,
     firstname  TEXT NOT NULL,
     lastname   TEXT NOT NULL,
     birth_date DATE_OF_BIRTH,
     email      VALID_EMAIL
  );

When using psql, all domains will be listed by running \dD.

DOMAIN vs CHECK CONSTRAINT

So you might be wondering why you would use a domain when you could just use a check constraint on the data itself? The simple answer here is that check constraints are not easily altered. They have to be dropped and re-added.

Additionally, DOMAIN can be created at the schema level and there may be several tables with email address or birth dates. Use one DOMAIN to control several fields, thus centralizing the logic.

Using ALTER DOMAIN

Let’s say the application has now evolved, and we only want to store persons born after 1980. You’ll run:

ALTER DOMAIN date_of_birth ADD CHECK (value > '1980-01-01'::date);

Using CREATE TYPE for Custom Data Types

Composite type

When using they CREATE TYPE functionality, there are a few options:

  1. composite type - few data fields are rolled into a single type
  2. enumerated type - set of values
  3. range type - a range of values

Below, we walk through using each of these.

Composite types are used when you’re composing several different values into a single field.

If an application manages a delivery system for package delivery, then this application will use the ID of the package most of the time. In addition, characteristics of the package's dimensions and weight are something we want to store together.

Below, CREATE TYPE will create a new data type allowing for storage of height, width, and, weight in a single field:

CREATE TYPE physical_package AS (
    height    numeric
  , width     numeric
  , weight    numeric
);

Then, assign that type to a table definition:

CREATE TABLE packages
  (
     id         BIGINT GENERATED always AS IDENTITY PRIMARY KEY,
     properties PHYSICAL_PACKAGE
  );

Casting Data Types

To insert existing data into this field you’ll want to use a cast() function using the :: operator. We have a tutorial on casting as well.

INSERT INTO packages
            (properties)
VALUES      ( '(10.3,4.0,0.5)' :: physical_package ),
            ( '(5,3.0,0.2)' :: physical_package );

Querying composite data types

To access the values inside a composite data type, use dot-notation, similar to JSON: (properties).weight. The composite data type name must be set into parenthesis when extracting values:

SELECT id, (properties).weight FROM packages;

 id | weight
----+--------
  1 |    0.5
  2 |    0.2

Composite types vs JSON

You’re probably thinking composite types look just like JSON, and you are right. Composite types are more formally structured and JSON is notoriously less structured. Postgres is likely to be more performant for composite types. So if you care about structure and performance, this might be the solution for you. However, JSON is hugely popular for a variety of reasons so that might be more preferable for ease of future development.

Enumerated types

When you want to create a set of static values in PostgreSQL, you can use the CREATE TYPE AS ENUM.

If we wanted a new field for package category as either box or letter, we could use:

CREATE TYPE package_cat AS ENUM ('box','letter');

Later, to add a new category:

ALTER TYPE package_cat ADD VALUE 'postcard';

Working with ENUMs

Craig Kerstiens has a great primer on enums vs check constraints and the conclusion is CHECK constraints are a little better. Postgres allows adding values to an enum, but cannot remove a value from the enum without removing the type and re-adding.

Range types

Numerous range data types exist in PostgreSQL. The existing range data types are limited to, integers, big integers, numerics, timestamps with or without time zone and dates. The common usage of creating a new type of range, is to have it handle other range subtypes.

In our delivery scenario, customers accept some price discounts for delays in delivery (e.g. 3 day, overnight, etc). Let’s create that new range type, composed of intervals:

CREATE TYPE delay AS RANGE (
	subtype = interval
);

Now we create a new table, with previous custom data types, and this this new one:

CREATE TABLE packages_with_delay
  (
     id               BIGINT GENERATED always AS IDENTITY PRIMARY KEY,
     properties       PHYSICAL_PACKAGE NOT NULL,
     category         PACKAGE_CAT NOT NULL,
     acceptable_delay DELAY NOT NULL
  );

Recent versions of Postgres have additional options for out of the box range types.

TL;DR

Postgres is awesome. With 43 existing datatypes so you probably do not need a custom one, but if you do, you can add a DOMAIN to an existing type or create completely custom types. DOMAINs are awesome, perhaps better than constraints. There are custom types for composites, enums, and ranges. Composites can be helpful, but you might just use JSON instead. Enums aren’t as good as constraints. Range types can be good and are usually subtype definitions.

Experiment with the functionality in our playground and you will have a new tool in the box.

Co-authored with Jean-Paul Argudo.