Introducing Crunchy Data Warehouse: A next-generation Postgres-native data warehouse. Crunchy Data Warehouse Learn more

How To Get Started with pgBackRest and PostgreSQL 12

Avatar for Tom Swartz

Tom Swartz

10 min read

pgBackRest is a reliable and simple to configure backup and restore solution for PostgreSQL, which provides a powerful solution for any PostgreSQL database; be it a small project, or scaled up to enterprise-level use cases.

Many powerful features are included in pgBackRest, including parallel backup and restore, local or remote operation, full, incremental, and differential backup types, backup rotation, archive expiration, backup integrity, page checksums, backup resume, streaming compression and checksums, delta restore, and much more.

With the recent release of PostgreSQL 12 (and more recently 12.1), pgBackRest also received a number of updates and changes to take advantage of the latest features of Postgres.

On October 1st 2019, pgBackRest released version 2.18 which is the first release of pgBackRest to support PostgreSQL 12. As such, any deployment using PostgreSQL 12 where pgBackRest will be used requires version 2.18 or greater. At the time of this post, the latest version of pgBackRest is version 2.19.

In the following guide, we will explore the steps involved in configuring pgBackRest on a PostgreSQL 12 database, followed by simulating a disaster where the database files have been destroyed, and restoring a backup to regain the database.

Setting up the Demo

If following this guide for tutorial purposes, it is useful to have a similar working environment to ensure that the same settings, commands, and processes are used while performing the exercise.

For this example, we will be performing the install and configuration of PostgreSQL 12.1 and pgBackRest 2.19 on CentOS 7.

For simplicity, Hashicorp's Vagrant to start and manage the Official CentOS 7 Vagrant Box is recommended.

If using Vagrant, simply run:

cd temp_work_dir
vagrant init centos/7
vagrant up
vagrant ssh

Alternately, you can set up and configure a virtual machine manually, using the following
settings:

  • 12 GB single partition ext4 disk
  • 2 GB RAM
  • 2 CPU

Install PostgreSQL and pgBackRest

There are several methods for installing PostgreSQL on a CentOS 7 server which
are detailed on the PostgreSQL wiki.

For the purposes of this guide, we will install both PostgreSQL 12 and the latest
version of pgBackRest using the PGDG Yum repository.

At the time of this writing, the latest versions of PostgreSQL was version 12.1 and
pgBackRest was version 2.19.

sudo yum -y install https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
sudo yum -y install postgresql12-server postgresql12-contrib pgbackrest

Next, initialize the PostgreSQL instance with the following commands:

sudo /usr/pgsql-12/bin/postgresql-12-setup initdb
sudo systemctl enable postgresql-12.service
sudo systemctl start postgresql-12.service

The first command is only necessary to run once, and is responsible for
initializing the database in $PGDATA directory.
The second command ensures PostgreSQL will start automatically when the
operating system is started, and the third command will start the database at
the present time.

To verify PostgreSQL has started correctly, the following command will confirm:

sudo -iu postgres
psql --version psql (PostgreSQL) 12.1

Next, verify pgBackRest was installed correctly by running the following command
either as the default user, or as the postgres user:

$ pgbackrest
pgBackRest 2.19 - General help
Usage:
    pgbackrest [options] [command]

Commands:
     archive-get     Get a WAL segment from the archive.
     archive-push    Push a WAL segment to the archive.
     backup          Backup a database cluster.
     check           Check the configuration.
     expire          Expire backups that exceed retention.
     help            Get help.
     info            Retrieve information about backups.
     restore         Restore a database cluster.
     stanza-create   Create the required stanza data.
     stanza-delete   Delete a stanza.
     stanza-upgrade  Upgrade a stanza.
     start           Allow pgBackRest processes to run.
     stop            Stop pgBackRest processes from running.
     version         Get version.

 Use 'pgbackrest help [command]' for more information.

Ensuring that the basic commands return valid responses and expected versions
is generally a good practice to follow, as it ensures that the software is functioning
properly and that the versions are compatible with one another.

Configure PostgreSQL

Depending on your specific use case, you may need to configure PostgreSQL's
options to meet your specific environment needs.

Possible considerations would be:

  • Using replication or configuring a cluster
  • How your backups will be stored (using the 3-2-1 Method?)
  • Security concerns for your database

For the purposes of this exercise, we will be following a very simplistic model
which will do best in demonstrating the process, but is not ideal for production
environments.
If you wish to configure your environment to have replica PostgreSQL instances,
for example, further information may be found in these Crunchy Blog Posts.

For the simple purposes of this demonstration, we will configure PostgreSQL
with the least number of changes to make a minimum working model.

If you’re unaware of where your configuration files are on the PostgreSQL host, you can run the following:

$ sudo -iu postgres psql -U postgres -c 'SHOW config_file'
config_file
----------------------------------------
/var/lib/pgsql/12/data/postgresql.conf
(1 row)

Edit the
postgresql.conf
file with root privileges in your preferred text editor.
The following parameters will need to be defined:

listen_addresses = '*'
# Optionally, define the address as the host IP:
listen_addresses = '10.0.1.1'
password_encryption=’scram-sha-256’
archive_mode = on

A brief aside about configured PostgreSQL settings

It is always best practice to have an understanding of the configuration changes made to a database.

For the purposes of this demonstration, three options were changed from their defaults, and it is important
to know why this has happened.

listen_addresses:
While PostgreSQL's pg_hba.conf is the file responsible for restricting
connections, when listen_addresses is set to * (wildcard), it is possible
to discover the open port on 5432 using nmap and learn the database
exists, thereby possibly opening the server up for an exploit. Setting it
to the an IP address prevents PostgreSQL from listening on an unintended
interface, preventing this potential exploit. More information on this
specific attack vector and how to avoid it can be found in this blog
post.

password_encryption:
Starting with the release of PostgreSQL 10, SCRAM-SHA-256 authentication was available for use, and
is used in this example for the explicit purpose of encouraging secure connections to the database.
Specifically, this method of authentication prevents password sniffing on
untrusted connections and offers support for cryptographically hashing
passwords on the server in a secure manner.
More detailed information on this authentication method can be found
here.

archive_mode:
Another addition to recent versions of PostgreSQL, starting with the
major release for PostgreSQL 10, a change was introduced to reduce the
number of configuration edits that were necessary to perform streaming
backup and replication (specifically affecting the parameters wal_level,
max_wal_senders, max_replication_slots, and hot_standby - these are now
all set by default). The release notes regarding this change can be found
here.


Once the changes have been made to the PostgreSQL configuration file, restart
the service to allow the changes to take effect.
A restart is necessary in this specific case, as all of the values changed
particularly require the PostgreSQL service to stop and start again.

sudo systemctl restart postgresql-12.service

It is possible to check if the configuration values have been correctly
applied to the database by running the following command:

sudo -iu postgres psql
SELECT name,setting,context,source FROM pg_settings WHERE NAME IN ('listen_addresses','archive_mode','password_encryption');
      name           |  setting          |    context   |    source
---------------------+---------------+------------+--------------------
archive_mode         |   on              |   postmaster | configuration file
listen_addresses     |   *               | postmaster   | configuration file
password_encryption  | scram-sha-256     | user         | configuration file
(3 rows)

Use \q to exit from the psql prompt.

Configure pgBackRest

Configure a location for the pgBackRest backup repository:

sudo mkdir -p /var/lib/pgbackrest
sudo chmod 0750 /var/lib/pgbackrest
sudo chown -R postgres:postgres /var/lib/pgbackrest

Configure the location and permissions on the pgbackrest log location:

sudo chown -R postgres:postgres /var/log/pgbackrest

Next, modify pgBackRest's configuration files to meet the needs of the environment.

As best practice, first create a backup of any existing pgbackrest.conf file:

sudo cp /etc/pgbackrest.conf /etc/pgbackrest.conf.backup

Generate a secure, long, and random passphrase to encrypt the repository:

openssl rand -base64 48

This generated value will be used as the repo1-cipher-pass option.
NOTE: Once the repository has been configured and the stanza created and
checked, the repository encryption settings cannot be changed.

Next, edit the pgbackrest.conf file as root, entering the following parameters:

[global]
repo1-cipher-pass=uUQsaa7+CCFaqXVagFzNUix3XuLe9e2uqVskqfI6wcKf8BX8y5b+8bL3oimRpV1N
repo1-cipher-type=aes-256-cbc
repo1-path=/var/lib/pgbackrest
repo1-retention-full=2
log-level-console=info
log-level-file=debug

[demo]
pg1-path=/var/lib/pgsql/12/data

The [global] section defines the location of backups, logging settings, and encryption settings.
The [demo] section defines a stanza for the demo backup repository, which we will configure.

As with the PostgreSQL settings, best practices encourage an understanding of the configuration options.
More information can be found about these configuration options within the pgBackRest Configuration Guide.

Finally, initialize the pgBackRest stanza, which contains the definitions for the
location, archiving options, backup settings, and other similar configurations
for the PostgreSQL database cluster.
There is generally one stanza defined for each database cluster that needs to have backups.
The stanza-create command must be run on the primary host after pgbackrest.conf has been configured.

$ sudo -u postgres pgbackrest --stanza=demo stanza-create

2019-11-15 18:08:57.158 P00      INFO: stanza-create command begin 2.19: --log-level-console=info --log-level-file=debug --pg1-path=/var/lib/pgsql/12/data --repo1-cipher-pass=<redacted> --repo1-cipher-type=aes-256-cbc --repo1-path=/var/lib/pgbackrest --stanza=demo
2019-11-15 18:08:57.609 P00      INFO: stanza-create command end: completed successfully (455ms)

Pulling It All Together, Performing First Backup

Now that PostgreSQL and pgBackRest have been configured individually, a few final steps must be
performed to tie them together and perform the backup process.

First, edit the postgresql.conf file once more, and configure the archive_command:

archive_command = 'pgbackrest --stanza=demo archive-push %p'

This configuration option informs PostgreSQL to use pgBackRest to handle the WAL
segments, pushing them immediately to the archive.

Following this change to the configuration file, reload the PostgreSQL service:

sudo systemctl reload postgresql-12.service

Next, we will check the cluster with pgBackRest. This validates that pgBackRest
and the archive_command settings are both accurately configured and performing
as expected.

$ sudo -iu postgres pgbackrest --stanza=demo check
2019-11-15 18:10:03.637 P00      INFO: check command begin 2.19: --log-level-console=info --log-level-file=debug --pg1-path=/var/lib/pgsql/12/data --repo1-cipher-pass=<redacted> --repo1-cipher-type=aes-256-cbc --repo1-path=/var/lib/pgbackrest --stanza=demo
2019-11-15 18:10:04.757 P00      INFO: WAL segment 000000010000000000000001 successfully archived to '/var/lib/pgbackrest/archive/demo/12-1/0000000100000000/000000010000000000000001-bddaecf52ba8c3dd83e6157fea6a4dbeb6476010.gz'
2019-11-15 18:10:04.757 P00      INFO: check command end: completed successfully (1120ms)

If any errors are produced by this command, read and inspect the output for
recommendations on how to resolve the specific issue.

Now, after much ado; perform a full backup:

$ sudo -u postgres pgbackrest --stanza=demo --type=full backup

2019-11-15 18:10:32.421 P00      INFO: backup command begin 2.19: --log-level-console=info --log-level-file=debug --pg1-path=/var/lib/pgsql/12/data --repo1-cipher-pass=<redacted> --repo1-cipher-type=aes-256-cbc --repo1-path=/var/lib/pgbackrest --repo1-retention-full=2 --stanz a=demo --type=full
2019-11-15 18:10:33.555 P00      INFO: execute non-exclusive pg_start_backup() with label "pgBackRest backup started at 2019-11-15 18:10:32": backup begins after the next regular checkpoint completes 2019-11-15 18:10:33.758 P00 INFO: backup start archive = 000000010000000000000003, lsn = 0/3000028
2019-11-15 18:10:35.930 P01      INFO: backup file /var/lib/pgsql/12/data/base/14187/1255 (632KB, 2%) checksum 60325e5cd07379af0ffe91eea27cfd4f2f07af69 [...] 2019-11-15 18:10:38.818 P00 INFO: full backup size = 24.2MB 2019-11-15 18:10:38.818 P00           INFO: execute non-exclusive pg_stop_backup() and wait for all WAL segments to archive
2019-11-15 18:10:38.920 P00      INFO: backup stop archive = 000000010000000000000003, lsn = 0/3000138
2019-11-15 18:10:39.235 P00      INFO: new backup label = 20191115-181032F
2019-11-15 18:10:39.286 P00      INFO: backup command end: completed successfully (6866ms)

And finally, confirm the backup is working:

$ sudo -u postgres pgbackrest info

stanza: demo
    status: ok
    cipher: aes-256-cbc

    db (current)
        wal archive min/max (12-1): 000000010000000000000003/000000010000000000000003

        full backup: 20191115-181032F
            timestamp start/stop: 2019-11-15 18:10:32 / 2019-11-15 18:10:39
            wal start/stop: 000000010000000000000003 / 000000010000000000000003
            database size: 24.2MB, backup size: 24.2MB
            repository size: 2.9MB, repository backup size: 2.9MB

Restore a Backup

Now that a full backup is performed on a fresh database, it might be useful to test restoring from the full backup.

To do this, stop the PostgreSQL instance, and delete its data files, simulating a system administration disaster.

sudo systemctl stop postgresql-12.service
sudo find /var/lib/pgsql/12/data -mindepth 1 -delete

At this point, trying to start the database will result in a failure:

$ sudo systemctl start postgresql-12.service
## THIS WILL FAIL

Job for postgresql-12.service failed because the control process exited with error code. See "systemctl status postgresql-12.service" and "journalctl -xe" for details.

Perform a restore on the database:

sudo -iu postgres pgbackrest --stanza=demo --delta restore

Once the restore has completed, the database will start as expected:

sudo systemctl start postgresql-12.service

You can verify that pgBackRest is still working:

$ sudo -u postgres pgbackrest --stanza=demo check

2019-11-15 18:13:56.707 P00      INFO: check command begin 2.19: --log-level-console=info --log-level-file=debug --pg1-path=/var/lib/pgsql/12/data --repo1-cipher-pass=<redacted> --repo1-cipher-type=aes-256-cbc --repo1-path=/var/lib/pgbackrest --stanza=demo
2019-11-15 18:13:57.594 P00      INFO: WAL segment 000000020000000000000005 successfully archived to '/var/lib/pgbackrest/archive/demo/12-1/0000000200000000/000000020000000000000005-bd01dc079338748cd9772a7c324eed0d68d45a9c.gz'
2019-11-15 18:13:57.594 P00      INFO: check command end: completed successfully (887ms)

After any sort of disaster instance, it is always best practice to follow up any restore with a fresh backup:

sudo -u postgres pgbackrest --stanza=demo --type=full backup

In Conclusion

In conclusion, pgBackRest offers a large amount of possibilities and use-cases.
It is quite simple to install, configure, and use, simplifying Point-in-time recovery through WAL archiving.

Ensuring that backups are working and valid allows for peace of mind, should any disaster strike.