Streaming Replication Across Kubernetes Postgres Clusters
UPDATE TO THIS CONTENT: Since releasing this article, newer versions of Crunchy Postgres for Kubernetes have additional features for streaming replication across clusters. See our post that accompanied the release: Multi-Cloud Strategies with Crunchy Postgres for Kubernetes.
Read the disclaimer above, as this content is now out of date.
A common need when running PostgreSQL in Kubernetes is to establish a standby database in a different Kubernetes cluster. In a typical configuration there is a shared storage that both Kubernetes clusters have access to. Using the pgBackRest tool, this can be S3 compatible, GCS, Azure, or NFS. You can create a standby from an initial backup from the shared storage. As WAL segments are archived, pgBackRest loads the segments to the shared storage. The standby instance then pulls archived WAL segments from the shared storage and applies them.
The challenge created in this configuration is the potential for lag based on how frequently WAL segments are archived, which is dependent on database load and/or the archive_timeout parameter. A second challenge is that once a switchover occurs the source database (old primary) must be recreated before a switch back can occur.
A solution to these challenges is to use streaming replication for the standby. For the streaming replication to function, the Postgres primary database must be exposed externally to the cluster. In this example, this can be accomplished by using the Load Balancer service type.
Creating Environments
In this article a S3 bucket is configured for the shared storage. The S3 storage is in a separate data center than the production and standby Kubernetes cluster. This is the optimal configuration to provide the highest level of data protection allowing for the loss of two complete environments with limited loss of data. When this is not possible, the second best option is to host the shared storage in the standby location.
Process overview:
- Create production Postgres cluster.
- Perform full backup to S3 via pgBackRest
- Create standby Postgres cluster.
- Enable streaming replication on standby.
The Postgres Operator Examples repository will be used to configure the environment. To follow along, fork the repository and clone the forked repository to your workstation. The postgres-operator-examples/kustomize directory will serve as the working directory for the rest of the article.
Production
To get started, the production Postgres cluster will be created. This cluster will have one pgBackRest repository that leverages an S3 compatible platform. To create the necessary manifest and supporting files, the following steps are performed:
- Copy the kustomize/s3 example to kustomize/xkube-prod and then edit the kustomize/xkube/postgres.yaml and change all occurrences of hippo-s3 to xkube.
cp -r s3 xkube-prod
- Rename the s3.conf.example file to s3.conf.
mv xkube-prod/s3.conf.example xkube-prod/s3.conf
- Edit the s3.conf and enter the S3 key and key secret. Leave the values unquoted.
[global]
repo1-s3-key=<YOUR_AWS_S3_KEY>
repo1-s3-key-secret=<YOUR_AWS_S3_KEY_SECRET>
- Edit the postgres.yaml file and specify the correct values for the s3 storage under spec.backups.pgbackrest.repos.0.s3.
repos:
- name: repo1
s3:
bucket: '<YOUR_AWS_S3_BUCKET_NAME>'
endpoint: '<YOUR_AWS_S3_ENDPOINT>'
region: '<YOUR_AWS_S3_REGION>'
- In preparation for streaming replication, the last edit that is needed for the postgres.yaml is to add the service section to the spec as shown below. The new spec.service section will expose the Postgres database externally to the Kubernetes cluster.
spec:
service:
type: LoadBalancer
- Add the standby section to the spec in preparation and set spec.standby.enabled to false since this is the production and current primary site.
spec:
standby:
enabled: false
repoName: repo1
- Change the PostgreSQL cluster name to xkube by modifying metdata.name as shown below.
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
name: xkube
- Last edit to the postgres.yaml is to add the patroni section to spec. In this section one Postgres parameter, archive_timeout, will be set to ensure that during times when processing is to law to force a WAL segment switch, WAL segments will be archived at least every 'n' seconds. In addition, a pg_hba rule will be added to allow the replication user to connect from the standby.
spec:
patroni:
dynamicConfiguration:
postgresql:
pg_hba:
- 'hostssl replication replicator all scram-sha-256'
- 'hostssl all all all scram-sha-256'
parameters:
archive_timeout: 120
- Using kubectl, deploy the production Postgres cluster.
kubectl apply -k xkube-prod
Once the Postgres pod is to full ready state, the Operator will automatically perform a full backup. The end state should be one pod running Postgres and a completed backup pod as seen below.
kubectl get pods
NAME READY STATUS RESTARTS AGE
xkube-00-8fkl-0 2/2 Running 0 2m18s
xkube-backup-bfn4-zwc87 0/1 Completed 0 111s
Capture the external ip that has been assigned to the xkube-ha service as this will be used to establish streaming replication later in the article.
Standby
Next, the standby cluster will be created. To bootstrap the standby Postgres instance, the Operator will use the pgBackRest repo in the S3 bucket. The following steps create the standby instance using the S3 bucket as the source for WAL segments to be replayed on the standby cluster.
- Copy the xkube-prod directory to xkube-standby.
cp -r xkube-prod xkube-standby
- Set the spec.standby.enabled key to true in the xkube-standby/postgres.yaml.
standby:
enabled: true
repoName: repo1
- Using kubectl, deploy the standby Postgres cluster to the standby Kubernetes cluster.
kubectl apply -k xkube-standby
Depending on the network and storage capabilities, it could take several minutes to bootstrap the standby instance and for the pod to come to full ready state.
Verify Replication
With both production and standby clusters running, verify replication by creating a temporary table and populating it with some data. After two minutes (value of archive_timeout) the new table and data should be visible on the standby. To force the replication, execute 'select pg_switch_wal();' on the production instance to force a WAL segment switch and archive.
Production
kubectl exec -c database $(kubectl get pod --selector="postgres-operator.crunchydata.com/cluster=xkube,postgres-operator.crunchydata.com/role=master" -o name) -- psql -c "create table test (chkdate timestamp); insert into test values (current_timestamp)"
Standby
kubectl exec -c database $(kubectl get pod --selector="postgres-operator.crunchydata.com/cluster=xkube,postgres-operator.crunchydata.com/role=master" -o name) -- psql -c "select * from test"
As WAL segments are archived, pgBackRest copies them to the S3 bucket. At regular intervals, the standby instance performs an archive get to pull down any available archived WAL segments and replays the WAL segments in the standby instance.
Standby (Streaming Configuration)
The desired configuration for streaming replication is to leverage the default replication user _crunchyrepl
with certificate authentication. This requires that both the production and standby certificates be signed by the same certificate authority. For details on how to integrate the Operator with a central certificate manager, see the "Using Cert Manager to Deploy TLS for Postgres on Kubernetes" blog post.
To keep this example simple, a new user will be created to use for streaming replication. The following steps are followed to enable streaming replication.
- On the production cluster, create a replication user.
kubectl exec -c database $(kubectl get pod --selector="postgres-operator.crunchydata.com/cluster=xkube,postgres-operator.crunchydata.com/role=master" -o name) -- psql -c "create role replicator with login replication password 'Welcome1'"
- Exec into the database container on the standby leader pod. Once in, append the following to the /pgdata/pg13/postgresql.base.conf file. The IP address specified for the host parameter should be the external ip given to the xkube-ha service on the production Kubernetes cluster.
primary_conninfo='host=10.1.1.100 user=replicator sslmode=require password=Welcome1'
- Instruct the operator to restart the pod for the parameter change to take effect. On the standby side, execute the following to trigger a restart of the Postgres pods.
kubectl patch postgrescluster/xkube --type merge --patch '{"spec":{"metadata":{"annotations":{"restarted":"'"$(date)"'"}}}}'
Verify Streaming Replication
One the restart is complete and the standby pods are full ready state, verify streaming replication using the commands below. If streaming replication setup was successful, the new rows will show up in the standby within seconds of the insert.
Production
kubectl exec -c database $(kubectl get pod --selector="postgres-operator.crunchydata.com/cluster=xkube,postgres-operator.crunchydata.com/role=master" -o name) -- psql -c " insert into test values (current_timestamp)"
Standby
kubectl exec -c database $(kubectl get pod --selector="postgres-operator.crunchydata.com/cluster=xkube,postgres-operator.crunchydata.com/role=master" -o name) -- psql -c "select * from test"
Conclusion
Setting up streaming replication can reduce lag time between the primary and standby Postgres instances running on different Kubernetes clusters. Do note that with this approach you're manually manipulating the postgresql.base.conf. At some point Patroni, the Operator, or other processes could overwrite this file and the primary_conninfo lost. In addition, if there are multiple Postgres instances on the standby cluster, the change to the postgresql.base.conf has to be performed to all instances.
Related Articles
- Postgres Tuning & Performance for Analytics Data
19 min read
- Running an Async Web Query Queue with Procedures and pg_cron
6 min read
- Name Collision of the Year: Vector
9 min read
- Sidecar Service Meshes with Crunchy Postgres for Kubernetes
12 min read
- pg_incremental: Incremental Data Processing in Postgres
11 min read