How to Perform a Major Version Upgrade Using pg_upgrade in PostgreSQL
Odds are you've been tasked with upgrading software from one major release to another at some point. Trust me, I understand how cumbersome and difficult these upgrades can be! Luckily, Crunchy Data has some tested methods to help get you upgraded with the least amount of headache possible! For this use case, we’ll be using pg_upgrade. Let’s get started!
This part is critical to a successful and healthy upgrade: read the release notes. Sometimes, even within minor upgrades, additional steps may be required and you won’t know if this is the case unless you read the release notes or risk running into an issue during the upgrade. Usually, you won’t have any extra steps to perform, but sometimes things like rebuilding indexes or changing paths may be needed. Again, be sure to read the release notes to avoid issues when implementing the upgrade.
In this instance, we’ll be using pg_upgrade to upgrade from Crunchy Certified PostgreSQL 11 to Crunchy Certified PostgreSQL 12 on CentOS 7. I choose pg_upgrade due to the upgrade speed. It’s literally done in minutes most of the time. If you’ve already tried to guess where the pg_upgrade binary is located and guessed /usr/pgsql-##/bin, then congratulations! You’re right! pg_upgrade can be found in the same default location that you’ll find initdb, pg_ctl, etc in.
After the appropriate release notes have been read thoroughly, it’s time to get upgrading!
Before we get underway with the actual pg_upgrade command and implementation, we should take a backup using our favorite backup tool. As a pgBackRest advocate, I would suggest using pgBackRest. Some details on getting pgBackRest setup can be found here.
Now that a backup has been taken, we'll install the PostgreSQL 12 binaries.
yum -y install postgresql12-server postgresql12-contrib
Let us initialize the new cluster as the postgres user.
/usr/pgsql-12/bin/initdb -D /var/lib/pgsql/12/data
Stop all connections from the application (or elsewhere) to the database. To do so, become the postgres user and run the following command; Note: if you’ve changed the location of the data directory or binaries for PostgreSQL, you’ll need to substitute the paths below for your custom paths.
/usr/pgsql-11/bin/pg_ctl -D /var/lib/pgsql/11/data/ -mf stop
Once the new cluster is initialized, and the old cluster is halted we’ll now run the pg_upgrade command with the -c (check) flag.
time /usr/pgsql-12/bin/pg_upgrade --old-bindir /usr/pgsql-11/bin --new-bindir /usr/pgsql-12/bin --old-datadir /var/lib/pgsql/11/data --new-datadir /var/lib/pgsql/12/data --link --check
Some information regarding the flags used above:
-b (--old-bindir=bindir)
is the old cluster executable directory
-B (--new-bindir=bindir)
is the new cluster executable directory
-d (--old-datadir=configdir)
is the old cluster data directory
-D (--new-datadir=configdir)
is the new cluster data directory
-k (--link)links
instead of copying files to the new cluster NOTE: remove this option if you need to keep a local copy of the old cluster's data files. This will increase the time it takes the upgrade to run as well as the size of the cluster on disk. You can NOT go back after using -k
. Please be sure that you have enough disk space for 2 whole copies of the cluster if you do NOT use the -k
flag.
-c (--check)
does a "check" only, and doesn't change any data. It will run through numerous consistency checks. Always run your pg_upgrade command with this first.
Additional information regarding pg_upgrade can be found in the documentation.
We’ve trained for this. Let us remove the -c (--check)
flag and run the upgrade!
time /usr/pgsql-12/bin/pg_upgrade --old-bindir /usr/pgsql-11/bin --new-bindir /usr/pgsql-12/bin --old-datadir /var/lib/pgsql/11/data --new-datadir /var/lib/pgsql/12/data --link
After the above is done you should get an output congratulating your success! At this point it will be worthwhile to run the ./analyze_new_cluster.sh script in order to generate stats so the system is usable. You can also wait and vacuum will eventually generate this for you. The former option is typically the best route.
Lastly, if you wish to remove the old cluster’s data files, you can run ./delete_old_cluster.sh
as the postgres user.
Related Articles
- Postgres Tuning & Performance for Analytics Data
19 min read
- Running an Async Web Query Queue with Procedures and pg_cron
6 min read
- Name Collision of the Year: Vector
9 min read
- Sidecar Service Meshes with Crunchy Postgres for Kubernetes
12 min read
- pg_incremental: Incremental Data Processing in Postgres
11 min read