5 Ways to Get Table Creation Information in Postgres
A question I hear from time to time with Crunchy Data clients and the Postgres community is:
When was my Postgres database table created?
Postgres does not store the creation date of tables, or any other database object. But fear not, there are a plethora of direct and indirect ways to find out when your table creation happened. Let's go through some ways to do this, ranging from easy to somewhat hard. All these solutions apply to indexes and other database objects, but tables are by far the most common request.
1. Logging
The easiest solution is to look in your Postgres logs and see exactly when the CREATE TABLE
command ran. Unfortunately, this does require that the log_statement
parameter be set before you create the table! Changing log_statement
to ddl
is a great idea already, and has many benefits other than seeing object creation times. Once you have it set, Postgres will log nearly everything you do to a table that is not a SELECT
, INSERT
, UPDATE
, or DELETE
. So yes, that covers CREATE TABLE
as well as ALTER TABLE
and DROP TABLE
. Of course, this also requires that your log_line_prefix
includes a timestamp, but that is already enabled by default. Let's see what it looks like in action:
## Change our logging, then ask Postgres to reload the configuration files:
$ psql -c 'alter system set log_statement = ddl' -c 'select pg_reload_conf()'
## Create some test tables with the pgbench program:
$ pgbench -i
## From the current logfile, grab all the lines, split them into rows,
## use a regular expression to find all the CREATE TABLE statements,
## and pull back the latest four
$ psql -Atc 'select * from regexp_split_to_table( pg_read_file( pg_current_logfile() ), chr(10) )
as x where x ~* $$create table$$' | tail -4 | cut -c1-90
2017-03-03 16:17:10 EDT [13] LOG: statement: create table pgbench_history(tid int
2017-03-03 16:17:10 EDT [13] LOG: statement: create table pgbench_tellers(tid int
2017-03-03 16:17:10 EDT [13] LOG: statement: create table pgbench_accounts(aid int
2017-03-03 16:17:10 EDT [13] LOG: statement: create table pgbench_branches(bid int
2. Operating system file
If you are not logging create statements, you may need more indirect methods. When you create a table in Postgres, it creates a file on disk representing that table. So it becomes possible to see the file creation time as a proxy for the creation time in the database. Be warned that there are a few actions that will rewrite the entire table, which will create a completely new file on disk. The two most common are the VACUUM FULL
and CLUSTER
commands, actions that are rarely, if ever, done. So how can we view the table creation time on disk? Perhaps we can use some of the built in file tools Postgres has?
$ psql -c 'CREATE TABLE daruk(id int)'
$ psql -xc "select * from pg_stat_file( pg_relation_filepath( 'daruk' ) )"
-[ RECORD 1 ]+-----------------------
size | 0
access | 2017-03-03 16:56:40-04
modification | 2017-03-03 16:56:40-04
change | 2017-03-03 16:56:40-04
creation | ☃
isdir | f
## Verify the above by checking the actual timestamp:
$ psql -c 'select * from regexp_split_to_table( pg_read_file(
pg_current_logfile()),chr(10)) as x where x ~* $$daruk$$'
x
--------------------------------------------------------------------------
2017-03-03 16:56:40 EDT [146] LOG: statement: CREATE TABLE daruk(id int)
That appears to work, as everything agrees on "2017-03-03 16:56:40", but unfortunately, those times returned by pg_stat_file
are not useful. Let's see what happens when we make changes to the table:
$ psql -c 'insert into daruk select 123' -c checkpoint
$ psql -xc "select * from pg_stat_file( pg_relation_filepath( 'daruk' ) )"
-[ RECORD 1 ]+-----------------------
size | 8192
access | 2017-03-03 16:56:40-04
modification | 2017-03-03 17:00:19-04
change | 2017-03-03 17:00:19-04
creation | ☃
isdir | f
As expected, the modification
and change
fields have changed, but the access
one has not. This is because Postgres has already opened it. If we were to force Postgres to reopen the file (e.g. by restarting the server), the field will change and no longer be a useful proxy for the table creation time:
$ pg_ctl -D $(psql -Atc 'show data_directory') restart
## Still the same access time, because Postgres has not looked at it yet:
$ psql -xc "select access,modification,change from pg_stat_file(pg_relation_filepath('daruk'))"
-[ RECORD 1 ]+-----------------------
access | 2017-03-03 16:56:40-04
modification | 2017-03-03 17:00:19-04
change | 2017-03-03 17:00:19-04
## This causes Postgres to open the file and load it into shared buffers:
$ psql -c 'explain analyze select id from daruk'
$ psql -xc "select access,modification,change from pg_stat_file(pg_relation_filepath('daruk'))"
-[ RECORD 1 ]+-----------------------
access | 2017-03-03 17:05:14-04
modification | 2017-03-03 17:04:01-04
change | 2017-03-03 17:04:01-04
The only thing we know here is that the table creation happened no later than March 3 at 17:04 local time. So what can we do? Well, while pg_stat_file()
reads some of the metadata for the underlying file, it does not read all the metadata. We can ask the operating system about the file and get its "birth date":
$ cd $( psql -Atc 'show data_directory' )
## Check out the "Birth" field:
$ stat $( psql -Atc "select pg_relation_filepath('daruk')" )
File: base/5/1034735
Size: 8192 Blocks: 16 IO Block: 4096 regular file
Device: 820h/2080d Inode: 366999 Links: 1
Access: (0600/-rw-------) Uid: ( 1000/ greg) Gid: ( 1000/ greg)
Access: 2017-03-03 17:05:14.946563872 -0400
Modify: 2017-03-03 17:04:01.726563832 -0400
Change: 2017-03-03 17:04:01.726563832 -0400
Birth: 2017-03-03 16:56:40.926567946 -0400
What if we suspect the table has been fully vacuumed, and thus rewritten on disk? One way is to look at what other objects the database created around the same time. If you know that those have not changed, then you can be sure that the timestamp indicated by "Birth" above is accurate:
## Create four tables, urbosa_1, urbosa_2, etc.
$ for x in 1 2 3 4; do psql -c "create table urbosa_$x(id int)"; done
$ cd $( psql -Atc 'show data_directory' )
$ for x in 1 2 3 4; do stat `psql -Atc "select pg_relation_filepath('urbosa_$x')"`
done | grep Birth
Birth: 2023-05-12 21:49:35.523869139 -0400
Birth: 2023-05-12 21:49:35.543869139 -0400
Birth: 2023-05-12 21:49:35.563869139 -0400
Birth: 2023-05-12 21:49:35.593869139 -0400
## Wait a few hours, then force one of them to get recreated on disk:
$ psql -c 'vacuum full urbosa_2'
$ for x in 1 2 3 4; do stat `psql -Atc "select pg_relation_filepath('urbosa_$x')"`;
done | grep Birth
Birth: 2023-05-12 21:49:35.523869139 -0400
Birth: 2023-05-13 01:18:24.867530999 -0400
Birth: 2023-05-12 21:49:35.563869139 -0400
Birth: 2023-05-12 21:49:35.593869139 -0400
## One of those is no longer like the others!
It may be that we don't know the names of the tables in advance. We can also check the directory which stores the tables and find all objects created around the same time as our target table:
$ ls -r --time=birth /home/greg/pg/15/data/base/5 \
| grep -B4 -A4 `psql -Atc "select pg_relation_filenode('urbosa_3')"` \
| grep -E '^[0-9]+$' \
| xargs -IZ psql -Atc "select pg_filenode_relation(0,Z)"
pgbench_accounts_pkey
daruk
urbosa_1
urbosa_3
urbosa_4
revali
dbd_pg_testschema.dbd_pg_testsequence
pg_toast.pg_toast_1060929_index
pg_toast.pg_toast_1060929
urbosa_2
3. System catalogs
Scenario: you have a bunch of tables, but you do not know the creation time of all of them. But you do have information on other tables created before or after your table of interest. All tables have been recreated via VACUUM FULL
on the database. We can use information in the system catalogs to see when the files were created relative to other ones. While a full vacuum changes the file on disk, things like the pg_type table, and the oid column of pg_class remain the same. Let's create five tables, shuffle them up, and use the pg_class.oid
column to restore the original creation order:
## Create five tables in a specific order (yes, a table can have zero columns!)
$ for x in 5 4 3 2 1 ; do psql -c "create table hestu_$x()"; done
## Rewrite them and shuffle up their order:
$ for x in 3 5 2 1 4 ; do psql -c "vacuum full hestu_$x"; done
## Ordering by relfilenode shows the order most recently recreated:
$ psql -c "select relfilenode, oid, relname from pg_class where relname ~ 'hestu' order by 1"
relfilenode | oid | relname
-------------+---------+---------
1061009 | 1061000 | hestu_3
1061012 | 1060994 | hestu_5
1061015 | 1061003 | hestu_2
1061018 | 1061006 | hestu_1
1061021 | 1060997 | hestu_4
(5 rows)
## Ordering by the oid column reveals the original creation order:
$ psql -c "select relfilenode, oid, relname from pg_class where relname ~ 'hestu' order by 2"
relfilenode | oid | relname
-------------+---------+---------
1061012 | 1060994 | hestu_5
1061021 | 1060997 | hestu_4
1061009 | 1061000 | hestu_3
1061015 | 1061003 | hestu_2
1061018 | 1061006 | hestu_1
(5 rows)
4. Parsing WAL files
As the Write Ahead Log (aka WAL) contains a complete log of every change made to the database, we can scan it to determine the approximate table creation time. Let's say it has been about a week since you created a table named "sales". While it has undergone many rewrites via vacuum full, you need to determine its original creation time.
The first step is to find the WAL files in question, based on their timestamps. WAL files get created, written, and then never changed again, so the modification timestamps on them are immutable. If the WAL file is no longer around because it has been removed/recycled, you may need to grab them from your pgBackRest archive directory (such an action is beyond the scope of this article, but once copied/unzipped, the process is the same).
Once you have the WAL files, you can use the pg_waldump program to view the information in the WAL stream, and grep for the relfilenode
- the name of the file on disk storing that table. After finding the right one, you can see the WAL creation and last modification times to get a ballpark estimate of the table's creation time.
# To find a table named 'daruk', first find and store its filenode:
$ FILENODE=`psql -Atc "select pg_relation_filenode('daruk')"`; echo $FILENODE
1061087
# Find the place where WAL is stored:
$ WALDIR=`psql -Atc 'show data_directory'`/pg_wal; echo $WALDIR
/home/greg/pg/15/data/pg_wal
# Scan all the WAL files to find a WAL file referencing that table
$ find $WALDIR -size 16M -exec sh -c "pg_waldump {} 2>/dev/null | grep 'CREATE.*/$FILENODE' && stat {} " \;
rmgr: Storage len (rec/tot): 42/ 42, tx: 0, lsn: 15/1AA89C28, prev 15/1AA89BF0, desc: CREATE base/5/1061087
File: /home/greg/pg/15/data/pg_wal/00000001000000150000001A
Size: 16777216 Blocks: 32768 IO Block: 4096 regular file
Device: 820h/2080d Inode: 1008177 Links: 1
Access: (0600/-rw-------) Uid: ( 1000/ greg) Gid: ( 1000/ greg)
Access: 2023-05-12 10:47:31.931105494 -0400
Modify: 2023-05-12 10:46:33.371107499 -0400
Change: 2023-05-12 10:46:33.371107499 -0400
Birth: 2023-05-12 05:45:33.941842312 -0400
## Do the same, but look at the timestamps in the WAL itself:
$ find $WALDIR -size 16M -exec sh -c "pg_waldump {} 2>/dev/null | grep -E '(CREATE|COMMIT).*$FILENODE' " \;
rmgr: Storage len (rec/tot): 42/ 42, tx: 0, lsn: 15/1B0000F8, prev 15/1B0000D8, desc: CREATE base/5/1061087
rmgr: Transaction len (rec/tot): 373/ 373, tx: 565623, lsn: 15/1B0151C8, prev 15/1B015198, desc: COMMIT 2023-05-12 00:59:25.01634 EDT; inval msgs: catcache 80 catcache 79 ... snapshot 2608 relcache 1061087
5. Backups
If you have a good backup system, like pgBackRest, you can look at the saved files to find out the approximate file creation time. It is possible that the table creation occurred before your earliest backup, but if the underlying file does NOT exist in a particular full backup, and then exists in a subsequent backup, you can narrow down the creation time quite a bit. First, you will need to know the relfilenode
of the table in question:
$ psql -c "SELECT relname, relfilenode FROM pg_class WHERE relname = 'mipha'"
OR:
$ psql -c "SELECT pg_relation_filenode('mipha')"
pg_relation_filenode
----------------------
1055179
This is the file on disk inside the backups, underneath the base
directory. pgBackRest will compress it by default, but it will have the same name, so we can grep our pgBackRest repository:
$ pgbackrest info --stanza=mydemo | grep backup:
full backup: 20180517-123005F
full backup: 20180518-123005F
incr backup: 20180518-123005F_20180518-163002I
incr backup: 20180518-123005F_20180519-020002I
incr backup: 20180518-123005F_20180518-080003I
full backup: 20180519-123005F
$ find /var/lib/pgbackrest -printf "%c %p\n" | grep 1055132
Fri May 18 20:42:33 2018 /var/lib/pgbackrest/backup/mydemo/20180518-123005F/pg_data/base/5/1055132.gz
Sat May 19 20:33:56 2018 /var/lib/pgbackrest/backup/mydemo/20180518-123005F_20180518-163002I/pg_data/base/5/1055132.gz
We can tell from the above that the first time we saw the file appear was in the 20180518-123005F
backup, which the name tells us is a full (complete) backup from May 18, 2018 at 12:30. It also appeared in a subsequent incremental backup, which means that at some point the file changed (e.g. via insert/update/delete of rows). We also note that it did NOT appear in the 20180517-123005F
backup. Note that this is not foolproof, as a VACUUM FULL could have changed the relfilenode.
Final ways to tell the table creation time
There are other methods to find the table creation time, but we listed some of the easiest, common ones. For example, if you have log_autovacuum
set, you may spot an entry in your Postgres logs referencing when the table was first autovacuumed. The data inside the table itself may provide a clue, or your application log might reveal it as well.
For the easiest Postgres troubleshooting, make sure you are logging all of your DDL additions, modification, or all of your actions with one of these:
log_statement = 'ddl'
log_satetment = 'mod'
log_statement = 'all'
Related Articles
- Postgres Tuning & Performance for Analytics Data
19 min read
- Running an Async Web Query Queue with Procedures and pg_cron
6 min read
- Name Collision of the Year: Vector
9 min read
- Sidecar Service Meshes with Crunchy Postgres for Kubernetes
12 min read
- pg_incremental: Incremental Data Processing in Postgres
11 min read