The Integer at the End of the Universe: Integer Overflow in Postgres

Jesse Soyland

7 min read

Integer overflow occurs when a program tries to store an integer value that exceeds the maximum its data type can represent. We have helped a few Crunchy Data clients navigate this recently and wanted to write up some notes.

In Postgres, there are three integer types:

  • smallint - A 2-byte integer, -32768 to 32767
  • integer - A 4-byte integer, -2147483648 to 2147483647
  • bigint - An 8-byte integer, -9223372036854775808 to +9223372036854775807
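
For example, trying to store a value just past the top of the integer range fails immediately:

postgres=# SELECT 2147483648::integer;
ERROR:  integer out of range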

It is not uncommon to use a 4-byte integer as a primary key when defining a new table. This can cause problems if the value to be represented is more than 4 bytes can hold. If a sequence’s limit is reached you might see an error in your logs that looks like this:

ERROR:  nextval: reached maximum value of sequence "test_id_seq" (2147483647)

Don’t Panic! We have some helpful and intelligible PostgreSQL solutions.

How do you know if you are close to overflowing an integer?

The following query identifies any auto-incrementing columns, the SEQUENCE object owned by each, the data types of both the column and the SEQUENCE, and how far (as a percentage) the current sequence value has progressed through each data type’s range:

SELECT
    seqs.relname AS sequence,
    format_type(s.seqtypid, NULL) sequence_datatype,
    CONCAT(tbls.relname, '.', attrs.attname) AS owned_by,
    format_type(attrs.atttypid, atttypmod) AS column_datatype,
    pg_sequence_last_value(seqs.oid::regclass) AS last_sequence_value,
    TO_CHAR((
        CASE WHEN format_type(s.seqtypid, NULL) = 'smallint' THEN
            (pg_sequence_last_value(seqs.oid::regclass) / 32767::float)
        WHEN format_type(s.seqtypid, NULL) = 'integer' THEN
            (pg_sequence_last_value(seqs.oid::regclass) / 2147483647::float)
        WHEN format_type(s.seqtypid, NULL) = 'bigint' THEN
            (pg_sequence_last_value(seqs.oid::regclass) / 9223372036854775807::float)
        END) * 100, 'fm9999999999999999999990D00%') AS sequence_percent,
    TO_CHAR((
        CASE WHEN format_type(attrs.atttypid, NULL) = 'smallint' THEN
            (pg_sequence_last_value(seqs.oid::regclass) / 32767::float)
        WHEN format_type(attrs.atttypid, NULL) = 'integer' THEN
            (pg_sequence_last_value(seqs.oid::regclass) / 2147483647::float)
        WHEN format_type(attrs.atttypid, NULL) = 'bigint' THEN
            (pg_sequence_last_value(seqs.oid::regclass) / 9223372036854775807::float)
        END) * 100, 'fm9999999999999999999990D00%') AS column_percent
FROM
    pg_depend d
    JOIN pg_class AS seqs ON seqs.relkind = 'S'
        AND seqs.oid = d.objid
    JOIN pg_class AS tbls ON tbls.relkind = 'r'
        AND tbls.oid = d.refobjid
    JOIN pg_attribute AS attrs ON attrs.attrelid = d.refobjid
        AND attrs.attnum = d.refobjsubid
    JOIN pg_sequence s ON s.seqrelid = seqs.oid
WHERE
    d.deptype = 'a'
    AND d.classid = 1259;

To show this query in action, let me set up a test table with an integer primary key, where the sequence has been artificially advanced to 2 billion:

postgres=# create table test(id serial primary key, value integer);
CREATE TABLE
postgres=# select setval('test_id_seq', 2000000000);
   setval
------------
 2000000000
(1 row)

postgres=# \d test
                            Table "public.test"
 Column |  Type   | Collation | Nullable |             Default
--------+---------+-----------+----------+----------------------------------
 id     | integer |           | not null | nextval('test_id_seq'::regclass)
 value  | integer |           |          |
Indexes:
    "test_pkey" PRIMARY KEY, btree (id)

Now when running the query above to find the integer overflow percent, I can see that the data types for both the column and the sequence are integer, and since the sequence’s current value is around 2 billion, it is 93% of the way through the acceptable range:

  sequence   | sequence_datatype | owned_by | column_datatype | last_sequence_value | sequence_percent | column_percent
-------------+-------------------+----------+-----------------+---------------------+------------------+----------------
 test_id_seq | integer           | test.id  | integer         |          2000000001 | 93.13%           | 93.13%
(1 row)

Changing to negative number sequencing

Since the integer types in Postgres include negative numbers, a simple way to deal with integer overflow is to flip to sequencing with negative numbers. This can be done by giving the sequence a new start value of -1 and converting to a descending sequence by giving it a negative INCREMENT value:

alter sequence test_id_seq no minvalue start with -1 increment -1 restart;
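
After this change, nextval() counts down from -1, so newly inserted rows receive negative IDs:

postgres=# select nextval('test_id_seq');
 nextval
---------
      -1
(1 row)

postgres=# select nextval('test_id_seq');
 nextval
---------
      -2
(1 row)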

If the purpose of the generated key is purely to create uniqueness, negative values are perfectly acceptable, but in some application frameworks or other use cases negative numbers may be undesirable or not work at all. In those cases we can change the field type entirely.

Keep in mind that the data type will need to be changed for any fields that reference this ID as well, or else they will also be out of bounds. Also, any foreign key constraints will need to be dropped and reapplied after both fields’ types have been updated.
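
As a rough sketch (not a lock-safe recipe), suppose a hypothetical orders table references test.id through an orders.test_id column; the constraint and column names here are made up for illustration:

-- drop the foreign key, widen the referencing column, then reattach it
ALTER TABLE orders DROP CONSTRAINT orders_test_id_fkey;
ALTER TABLE orders ALTER COLUMN test_id TYPE bigint;
-- ...migrate test.id to bigint, e.g. with the process under "Changing to bigint" below...
ALTER TABLE orders ADD CONSTRAINT orders_test_id_fkey
    FOREIGN KEY (test_id) REFERENCES test (id);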

Benefits of the negative number approach:

  • No change to the column structure
  • Very fast: just change the sequence start number

Drawbacks:

  • Negative numbers might not work with your application framework
  • You only buy yourself double the amount of IDs. You could be in this situation again soon

In general, this is a buy-yourself-some-time approach and is best seen as a short-term fix.

Changing to bigint

The more complete fix to your sequence exhaustion is changing to the bigint data type.

In order to change the field type of the above test table, we will first create a new ID column of type bigint that will eventually replace the current id, and build a unique index on it concurrently so writes are not blocked:

ALTER TABLE test ADD COLUMN id_new bigint;
CREATE UNIQUE INDEX CONCURRENTLY test_id_new ON test (id_new);

The new column will also need a new sequence of type bigint. The sequence needs to start at some point after the latest value that had been recorded.
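
One quick way to see where the old sequence currently stands before picking that start value (not strictly required, but helpful):

select pg_sequence_last_value('test_id_seq'::regclass);

In this walkthrough the new sequence starts at 2147483776, comfortably past both the last recorded value and the old integer maximum, so old and new IDs can never collide.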

CREATE SEQUENCE test_id_new_seq START 2147483776 AS bigint;
ALTER TABLE test ALTER COLUMN id_new SET DEFAULT nextval('test_id_new_seq');
ALTER SEQUENCE test_id_new_seq OWNED BY test.id_new;

Now new values can be added to the table, but there are two different sequences being incremented - the old and the new. For example:

postgres=# select * from test;
     id     | value |   id_new
------------+-------+------------
 2000000007 |       |
 2000000008 |       |
 2000000009 |       |
 2000000010 |       |
 2000000011 |       | 2147483776
 2000000012 |       | 2147483777
 2000000013 |       | 2147483778
 2000000014 |       | 2147483779

In a single transaction, we will drop the old ID’s primary key constraint and default, rename the columns, and add a NOT VALID “not null” check constraint on the new ID column:

BEGIN;
ALTER TABLE test DROP CONSTRAINT test_pkey;
ALTER TABLE test ALTER COLUMN id DROP DEFAULT;
ALTER TABLE test RENAME COLUMN id TO id_old;
ALTER TABLE test RENAME COLUMN id_new TO id;
ALTER TABLE test ALTER COLUMN id_old DROP NOT NULL;
ALTER TABLE test ADD CONSTRAINT id_not_null CHECK (id IS NOT NULL) NOT VALID;
COMMIT;

Now new IDs are being added to the table. Thanks to the check constraint on id, new NULL values cannot be added, but since the constraint is NOT VALID the existing NULL values are still permitted. In order to make id a primary key again, those NULL values must be backfilled from id_old so that the constraint can be made valid. This can be done in batches, for example:

WITH unset_values AS (
    SELECT
        id_old
    FROM
        test
    WHERE
        id IS NULL
    LIMIT 1000)
UPDATE
    test
SET
    id = unset_values.id_old
FROM
    unset_values
WHERE
    unset_values.id_old = test.id_old;
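
If there are many rows to backfill, one way to repeat that batch until nothing is left is a small PL/pgSQL loop like the sketch below (it assumes Postgres 11 or newer, since it commits between batches inside the DO block, and must be run outside an explicit transaction):

DO $$
DECLARE
    rows_updated bigint;
BEGIN
    LOOP
        -- same batched UPDATE as above
        WITH unset_values AS (
            SELECT id_old FROM test WHERE id IS NULL LIMIT 1000)
        UPDATE test
        SET id = unset_values.id_old
        FROM unset_values
        WHERE unset_values.id_old = test.id_old;

        GET DIAGNOSTICS rows_updated = ROW_COUNT;
        EXIT WHEN rows_updated = 0;

        COMMIT;  -- keep each batch in its own transaction
    END LOOP;
END;
$$;

Simply re-running the UPDATE by hand until it reports UPDATE 0 works just as well.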

Once all rows have been backfilled, the NOT NULL constraint can be validated, the UNIQUE index on id can be converted to a primary key, and finally the standalone NOT NULL constraint can be dropped:

ALTER TABLE test VALIDATE CONSTRAINT id_not_null;
ALTER TABLE test ADD CONSTRAINT test_pkey PRIMARY KEY USING INDEX test_id_new;
ALTER TABLE test DROP CONSTRAINT id_not_null;

At any point now the 4-byte id_old column can be dropped, as the bigint has taken its place:

postgres=# ALTER table test drop column id_old;
ALTER TABLE
postgres=# \d test
                              Table "public.test"
 Column |  Type   | Collation | Nullable |               Default
--------+---------+-----------+----------+--------------------------------------
 value  | integer |           |          |
 id     | bigint  |           | not null | nextval('test_id_new_seq'::regclass)
Indexes:
    "test_pkey" PRIMARY KEY, btree (id)

The new 8-byte bigint id should be sufficient for a very, very, very long time:

    sequence     | sequence_datatype | owned_by | column_datatype | last_sequence_value | sequence_percent | column_percent
-----------------+-------------------+----------+-----------------+---------------------+------------------+----------------
 test_id_new_seq | bigint            | test.id  | bigint          |          2147483788 | 0.00%            | 0.00%
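
To put “very long time” in perspective, here is some back-of-the-envelope arithmetic assuming a made-up sustained rate of 10,000 new IDs per second:

SELECT 9223372036854775807::numeric
    / 10000                 -- assumed IDs per second
    / (60 * 60 * 24 * 365)  -- seconds per year
    AS years_until_exhaustion;
-- roughly 29 million years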

Benefits of the bigint:

  • This is a long term fix and you won't have to worry about running out of sequence numbers for a very long time.

Drawbacks:

  • You probably need to update a lot of other things to larger integers
  • Takes coordination with the entire database. In our experience, this is a large project.

SERIAL types

In Postgres, the SERIAL data types (smallserial, serial, and bigserial) are shortcuts for creating auto-incrementing identifier columns whose values are assigned the next value from a Postgres SEQUENCE object.

Creating a column of type SERIAL creates an integer column, simultaneously creating an integer sequence object owned by that column and making the sequence’s nextval() the default value for the column.

For new tables, consider using BIGSERIAL.
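
For reference, a serial column is roughly shorthand for the following; bigserial behaves the same way but with bigint (the table and sequence names here are just for illustration):

-- "id serial PRIMARY KEY" expands to approximately:
CREATE SEQUENCE items_id_seq AS integer;
CREATE TABLE items (
    id integer PRIMARY KEY DEFAULT nextval('items_id_seq')
);
ALTER SEQUENCE items_id_seq OWNED BY items.id;

-- the bigserial equivalent starts with an 8-byte key:
CREATE TABLE items_big (
    id bigserial PRIMARY KEY
);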

Summary

  • You can check with a query if you’re running out of sequence numbers.
  • Changing to negative numbers can be a short term fix.
  • Changing to bigint is the recommended long term fix.
  • When you are setting up a new database that’s likely to have a lot of data in it, reach for BIGSERIAL instead of SERIAL.

Integer overflow may appear at a glance to be insanely complicated. I have written this to keep Postgres DBAs and intergalactic travelers from panicking.