Using the PostgreSQL Operator with Rook Ceph Storage
The Crunchy PostgreSQL Operator supports various forms of storage for provisioning PostgreSQL clusters in a Kubernetes environment. One such provider is Rook, which provides an abstraction layer around multiple storage systems available in Kubernetes, making it convenient to choose between storage engines. One storage engine that Rook supports is Ceph, which provides several types of distributed storage, including block-level storage that is especially helpful for scaling cloud-based workloads.
This post explores the use of the Rook storage engine with the PostgreSQL Operator, specifically demonstrating how the PostgreSQL Operator can be utilized to create a PostgreSQL cluster that is backed by Rook Ceph block storage.
For this example, the rook-ceph-block storage class will be created and utilized in conjunction with the PostgreSQL Operator to dynamically provision Ceph block storage for use by a PostgreSQL cluster and its supporting services. This effectively demonstrates how Rook can be utilized to deploy a Ceph cluster in your Kubernetes environment, allowing you to leverage the power of Ceph storage, e.g. highly-available and scalable block storage, in your PostgreSQL clusters.
Many thanks to Andrew L'Ecuyer for helping with the methodology and testing that this post presents. For more information about PostgreSQL Operator storage configuration, please see the documentation.
Software Versions
The following software versions will be utilized for this example:
- PostgreSQL Operator 4.0.1
- Kubernetes v1.15
- Rook v1.0.2
- CentOS 7 (used for all cluster and client machines)
Initial Setup
This post uses a Kubernetes v1.15 cluster containing two nodes. As you can see in the following output, the cluster being utilized includes a single control-plane node named master, and a single worker node named node01:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
master Ready master 5h32m v1.15.0
node01 Ready <none> 5h27m v1.15.0
Additionally, a user with cluster-admin privileges will be needed for running the various kubectl commands below. For this post the kubernetes-admin user will be utilized, as can be seen by checking the context currently set for kubectl:
$ kubectl config current-context
kubernetes-admin@kubernetes
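If a different context happens to be active on your client machine, you can switch to the one needed for this example using kubectl config use-context:
kubectl config use-context kubernetes-admin@kubernetes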
Installing Rook
In order to utilize Rook with the PostgreSQL Operator, it is necessary to install the Rook operator in your Kubernetes cluster, which will be used to deploy a Ceph cluster. While Rook supports additional storage providers for provisioning storage in Kubernetes, this example will demonstrate the use of the ceph storage provider, specifically to provision block storage for use by any PostgreSQL clusters created using the PostgreSQL Operator.
In order to install Rook according to the steps below, it is first necessary to clone the Rook GitHub repository and check out v1.0.2 of the Rook project:
git clone https://github.com/rook/rook.git
cd rook
git checkout v1.0.2
This repository includes files that define the various Kubernetes resources needed to deploy a Rook cluster to a Kubernetes cluster. For this example, the files in the cluster/examples/kubernetes/ceph/ directory will be used to deploy a Ceph cluster configured for block storage. We can therefore navigate to this directory for the steps that follow:
cd cluster/examples/kubernetes/ceph/
Installing the Rook Common Objects & Operator
The first step is to create the Rook common objects, i.e. the various Kubernetes resources needed to support the deployment of a Rook operator and cluster. This includes a namespace, roles, role bindings and custom resource definitions:
kubectl create -f common.yaml
With the common objects created, the Rook Operator can then be deployed using the operator.yaml file:
kubectl create -f operator.yaml
We can then verify that the Rook operator is running by ensuring that the rook-ceph-operator, rook-ceph-agent, and rook-discover pods are running:
$ kubectl get pods -n rook-ceph
NAME READY STATUS RESTARTS AGE
rook-ceph-agent-chj4l 1/1 Running 0 84s
rook-ceph-operator-548b56f995-v4mvp 1/1 Running 0 4m5s
rook-discover-vkkvl 1/1 Running 0 84s
Creating the Rook Ceph Cluster
Next, the Ceph cluster can be created by running the command below. Please note that this is a modified version of the YAML defined in the cluster-test.yaml file in the Rook repository, with the following changes made to support the Kubernetes cluster architecture utilized for this example:
- The placement section defines node affinity to ensure any required Rook components are only deployed to certain nodes. For this example these components should only be deployed to our worker node (node01), so we specify that only nodes with the label kubernetes.io/hostname=node01 should be utilized when scheduling Rook components for deployment.
- For the storage section we specify that Rook should not attempt to use all nodes in the Kubernetes cluster for storage by setting useAllNodes to false. Additionally, we explicitly specify that node01 should be used for storage by listing node01 under the nodes section.
kubectl create -f - <<EOF
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: ceph/ceph:v14.2.1-20190430
    allowUnsupported: true
  dataDirHostPath: /var/lib/rook
  placement:
    all:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
              - "node01"
  mon:
    count: 1
    allowMultiplePerNode: true
  dashboard:
    enabled: true
  network:
    hostNetwork: false
  rbdMirroring:
    workers: 0
  storage:
    useAllNodes: false
    useAllDevices: false
    deviceFilter:
    config:
      databaseSizeMB: "1024" # this value can be removed for environments with normal sized disks (100 GB or larger)
      journalSizeMB: "1024" # this value can be removed for environments with normal sized disks (20 GB or larger)
      osdsPerDevice: "1" # this value can be overridden at the node or device level
    directories:
    - path: /var/lib/rook
    nodes:
    - name: "node01"
EOF
Once the command above has been run, you can verify that the cluster has been created by checking that the proper pods are running within the rook-ceph namespace. The following output shows the various pods in the rook-ceph namespace after the Ceph cluster was successfully deployed for this post:
$ kubectl get pods -n rook-ceph
NAME READY STATUS RESTARTS AGE
rook-ceph-agent-mqhj8 1/1 Running 0 2m21s
rook-ceph-mgr-a-5b747d89c6-4w8q2 1/1 Running 0 48s
rook-ceph-mon-a-57d58799b7-tpbdk 1/1 Running 0 66s
rook-ceph-operator-548b56f995-2mmlf 1/1 Running 0 2m23s
rook-ceph-osd-0-76fc8d587-q97tf 1/1 Running 0 16s
rook-ceph-osd-prepare-node01-ltm2g 0/2 Completed 0 23s
rook-discover-chhvl 1/1 Running 0 2m21s
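As an additional check, you can also look at the CephCluster resource itself; the exact columns and status fields shown will vary by Rook version, but the resource should be present and report the cluster state:
kubectl get cephcluster -n rook-ceph rook-ceph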
Creating the Storage Class
Lastly, a StorageClass needs to be created that will dynamically provision PersistentVolumes using the Ceph cluster we just created, along with a storage pool that can be utilized by that StorageClass. This is done using the storageclass-test.yaml file, which will create a StorageClass called rook-ceph-block and a CephBlockPool storage pool called replicapool, both of which are suitable for testing our Ceph cluster:
kubectl create -f storageclass-test.yaml
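To confirm that both resources were created, you can list them directly (the StorageClass is cluster-scoped, while the CephBlockPool lives in the rook-ceph namespace):
kubectl get storageclass rook-ceph-block
kubectl get cephblockpool -n rook-ceph replicapool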
Verifying & Analyzing the Ceph Cluster
With all required Rook resources now created, you can verify the health of the Ceph cluster by using the ceph command line tool built into the various Rook pods, e.g. the Rook operator pod:
# find the Rook operator pod
$ kubectl get pod -n rook-ceph --selector=app=rook-ceph-operator
NAME READY STATUS RESTARTS AGE
rook-ceph-operator-548b56f995-l7wtt 1/1 Running 0 15m
# view Ceph cluster health
$ kubectl exec -n rook-ceph -it rook-ceph-operator-548b56f995-l7wtt -- ceph status
cluster:
id: 9059353f-ba50-49c1-b8c9-acd334451627
health: HEALTH_OK
services:
mon: 1 daemons, quorum a (age 7m)
mgr: a(active, since 6m)
osds: 1 up (since 6m), 1 in (since 6m)
data:
pools: 1 pools, 100 pgs
objects: 0 objects, 0 B
usage: 8.0 GiB used, 31 GiB / 39 GiB avail
pgs: 100 active+clean
You can also show the free space statistics for the cluster using ceph df:
$ kubectl exec -n rook-ceph -it rook-ceph-operator-548b56f995-l7wtt -- ceph df
RAW STORAGE:
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 39 GiB 31 GiB 8.0 GiB 8.0 GiB 20.45
TOTAL 39 GiB 31 GiB 8.0 GiB 8.0 GiB 20.45
POOLS:
POOL ID STORED OBJECTS USED %USED MAX AVAIL
replicapool 1 0 B 0 0 B 0 29 GiB
As you can see, we have not yet used any storage from the replicapool storage pool.
Installing the PostgreSQL Operator
With a Rook cluster up and running, the PostgreSQL Operator can now be deployed. For this example, the PGO will be installed using Ansible. For additional information on installing the PostgreSQL Operator, as well as on any of the commands, configuration, etc. discussed below, please see the PostgreSQL Operator documentation.
In order to install the PGO according to the steps below, it is first necessary to clone the PGO GitHub repository and check out v4.0.1 of the PGO project:
git clone https://github.com/CrunchyData/postgres-operator.git
cd postgres-operator
git checkout 4.0.1
Next we can navigate to the ansible folder from the root of the PGO repository for the steps that follow:
cd ansible
Inventory Configuration
Before installing the PGO using Ansible, the inventory file (located in the ansible directory) must be configured as follows:
- Specify the proper context for connecting to the Kubernetes cluster, e.g. the result of kubectl config current-context:

kubernetes_context=kubernetes-admin@kubernetes

- Specify admin credentials for the PGO:

pgo_admin_username='pgouser'
pgo_admin_password='pgopass'

- Specify a namespace for deploying the PGO. For this example, the PGO will be deployed to the pgo namespace (the installer will create this namespace if it does not already exist):

pgo_operator_namespace='pgo'

- Specify the namespace(s) managed by the PGO. Please note that the PGO will only be able to deploy PG clusters to the namespaces defined using this setting. For this example, the PGO will manage a single namespace called pgouser1 (which will also be created by the installer if it does not already exist):

namespace='pgouser1'

- Specify that the storage3 storage configuration should be used for all storage required for a PG cluster:

backrest_storage='storage3'
backup_storage='storage3'
primary_storage='storage3'
replica_storage='storage3'

- Set rook-ceph-block as the storage class for the storage3 storage configuration. This will ensure the rook-ceph-block storage class, and therefore our Ceph cluster, is utilized by the PGO to provision any storage required for a PG cluster:

storage3_access_mode='ReadWriteOnce'
storage3_size='1Gi'
storage3_type='dynamic'
storage3_class='rook-ceph-block'
storage3_fs_group=26

Please note that storage3_size has also been changed from 1G to 1Gi. Depending on your version of Rook and Kubernetes, you might also need to make the same change in order to ensure any PersistentVolumeClaims created can successfully bind to a PersistentVolume.

- Ensure the PGO client installation is enabled:

pgo_client_install='true'
pgo_client_version='4.0.1'
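For convenience, here are all of the inventory settings changed for this example in one place (all other settings in the file are left at their defaults):
kubernetes_context=kubernetes-admin@kubernetes
pgo_admin_username='pgouser'
pgo_admin_password='pgopass'
pgo_operator_namespace='pgo'
namespace='pgouser1'
backrest_storage='storage3'
backup_storage='storage3'
primary_storage='storage3'
replica_storage='storage3'
storage3_access_mode='ReadWriteOnce'
storage3_size='1Gi'
storage3_type='dynamic'
storage3_class='rook-ceph-block'
storage3_fs_group=26
pgo_client_install='true'
pgo_client_version='4.0.1'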
Deploy the PGO
With the inventory file now configured, the Operator can be deployed as follows:
ansible-playbook -i inventory --tags=install --ask-become-pass main.yml
You can then verify that the PGO is running by checking that all containers within the postgres-operator pod are running:
$ kubectl get pods -n pgo
NAME READY STATUS RESTARTS AGE
postgres-operator-897cf98f8-ztqf7 3/3 Running 0 3m
Create a PostgreSQL Cluster With Rook Ceph Storage
With the PostgreSQL Operator deployed, it is now possible to provision a new PostgreSQL cluster that utilizes the Rook Ceph cluster currently deployed to the Kubernetes cluster. However, in order to use the PGO client, a few local environment variables must first be set:
export PGO_NAMESPACE=pgouser1
export PGOUSER="${HOME?}/.pgo/pgo/pgouser"
export PGO_CA_CERT="${HOME?}/.pgo/pgo/client.crt"
export PGO_CLIENT_CERT="${HOME?}/.pgo/pgo/client.crt"
export PGO_CLIENT_KEY="${HOME?}/.pgo/pgo/client.pem"
export PGO_APISERVER_URL='https://127.0.0.1:8443'
You will notice that the PGO_APISERVER_URL variable specifies the loopback address (127.0.0.1). This is because, by default, the postgres-operator service utilized by the PGO API server is of type ClusterIP, which means the service is only accessible from within the Kubernetes cluster. In order to access the API server from outside of the cluster (i.e. from a client machine), it is necessary to use the kubectl port-forward command to forward a local port to a pod (among other available options for exposing a service, which are out of scope for this example). Therefore, for this example we will simply forward local port 8443 to port 8443 in the postgres-operator pod, allowing it to be accessed using localhost:
kubectl port-forward $(kubectl get pod -n pgo -o name) -n pgo 8443:8443
Once the port is being forwarded, the connection to the PGO API server can be tested as follows:
$ pgo version
pgo client version 4.0.1
pgo-apiserver version 4.0.1
Create a PostgreSQL Cluster
With the proper pgo client configuration now in place, a PG cluster can be created with the following command:
pgo create cluster rookcluster1
You can then verify that the cluster is running by viewing the pods in the cluster:
$ kubectl get pods -n pgouser1
NAME READY STATUS RESTARTS AGE
rookcluster1-7b75b56455-rst5x 1/1 Running 0 3m3s
rookcluster1-backrest-shared-repo-7f5cd86c56-2fbnf 1/1 Running 0 3m3s
rookcluster1-stanza-create-szpqx 0/1 Completed 0 82s
As can be seen above, by default the PostgreSQL Operator creates a pod containing the PG database, as well as a pod for a pgBackRest repository. Additionally, a job is run to perform the initial stanza creation for the pgBackRest repository. Please note that pgBackRest is the default backup and restore solution for any PostgreSQL cluster created using the PostgreSQL Operator as of version 4.0.1, although other backup and restore solutions, e.g. pg_basebackup, pg_dump and pg_restore, are available as well.
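For example, since pgBackRest is the default, a pgBackRest backup of the cluster could be taken with a command along the following lines (the --backup-type flag shown here is optional in this case):
pgo backup rookcluster1 --backup-type=pgbackrest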
Looking at the PersistentVolumeClaims (PVCs) created for use by these various pods, you can see that two PVCs were created:
- rookcluster1 - the PVC for the PostgreSQL database itself
- rookcluster1-backrest-shared-repo - the PVC for the pgBackRest repository (used to store WAL archives and backups)
$ kubectl get pvc -n pgouser1
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
rookcluster1 Bound pvc-83a54676-6fe5-48f4-af97-860e690c1a23 1Gi RWO rook-ceph-block 4m19s
rookcluster1-backrest-shared-repo Bound pvc-4d88f9df-a62c-49a7-ad81-69dc0546e9f4 1Gi RWO rook-ceph-block 4m16s
Each of these PVCs is bound to a PersistentVolume (PV) provisioned using the rook-ceph-block storage class, with the PVs providing the Ceph block storage required to run the PG cluster and its associated pods:
$ kubectl get pv -n pgouser1
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-4d88f9df-a62c-49a7-ad81-69dc0546e9f4 1Gi RWO Delete Bound pgouser1/rookcluster1-backrest-shared-repo rook-ceph-block 6m51s
pvc-83a54676-6fe5-48f4-af97-860e690c1a23 1Gi RWO Delete Bound pgouser1/rookcluster1 rook-ceph-block 6m54s
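If you would like to confirm how a given PV maps back to Ceph, you can describe it and inspect the volume source details reported by Kubernetes (the exact attributes shown depend on the Rook version and provisioner in use):
kubectl describe pv pvc-83a54676-6fe5-48f4-af97-860e690c1a23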
Back Up the PostgreSQL Cluster
At this point, a PostgreSQL cluster that is fully backed by Ceph block storage has been successfully deployed. We can now perform a few additional actions to demonstrate the provisioning of additional Ceph block storage using the rook-ceph CephCluster. For instance, we can create a pg_basebackup of the cluster using the pgo backup command:
$ # backup the cluster
$ pgo backup rookcluster1 --backup-type=pgbasebackup
created backup Job for rookcluster1
workflow id 935acbba-a233-4dfb-a7a5-85a602fd9098
$ # wait for the backup job to complete
$ kubectl get job -n pgouser1 --selector=pgbackup=true
NAME COMPLETIONS DURATION AGE
backup-rookcluster1-ndml 1/1 42s 3m5s
This will create a new PVC named rookcluster1-backup, which will be bound to a new PV (again provisioned from the Ceph cluster) that will be used to store the backup:
$ kubectl get pvc -n pgouser1 rookcluster1-backup
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
rookcluster1-backup Bound pvc-1c74042a-81cd-49cf-83ad-84e644546842 1Gi RWO rook-ceph-block 7m
Restore the PostgreSQL Cluster
Using the rookcluster1-backup PVC created when taking the pg_basebackup above, a restored database can now be created using the pgo restore command. This command will again create a new PVC, this time to store the restored database. Additionally, the PVC will be named rookcluster1-restored, as specified by the --restore-to-pvc flag:
$ # restore the cluster
$ pgo restore rookcluster1 --backup-type=pgbasebackup --backup-pvc=rookcluster1-backup --restore-to-pvc=rookcluster1-restored
$ # wait for the restore job to complete
$ kubectl get job -n pgouser1 --selector=pgo-pgbasebackup-restore=true
NAME COMPLETIONS DURATION AGE
pgbasebackup-restore-rookcluster1-jncm 1/1 24s 6m26s
$ # view the pvc used to store the restored db
$ kubectl get pvc -n pgouser1 rookcluster1-restored
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
rookcluster1-restored Bound pvc-7afa9bbf-f634-44b7-abc4-8ac70c832051 1Gi RWO rook-ceph-block 7m48s
Finally, a new cluster can be created using this PVC, effectively creating the restored database. Please ensure the name of the new cluster matches the name of the PVC containing the restored database (rookcluster1-restored) when running the pgo create command:
pgo create cluster rookcluster1-restored --pgbackrest=false
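Once the command completes, the restored cluster can be verified in the same way as the original, e.g.:
pgo show cluster rookcluster1-restored
kubectl get pods -n pgouser1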
Ceph Disk Stats
At this point we have two PG clusters running, both the original cluster (rookcluster1) and the restored cluster (rookcluster1-restored), and we also still have the pg_basebackup. Therefore, we should now expect to see storage consumed in the Ceph cluster when we view the free space statistics for the cluster:
$ kubectl exec -it rook-ceph-operator-548b56f995-l7wtt -n rook-ceph -- ceph df
RAW STORAGE:
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 39 GiB 29 GiB 10 GiB 10 GiB 25.45
TOTAL 39 GiB 29 GiB 10 GiB 10 GiB 25.45
POOLS:
POOL ID STORED OBJECTS USED %USED MAX AVAIL
replicapool 1 274 MiB 124 274 MiB 0.97 27 GiB
As the above output shows, due to the various PVCs created above for PG databases, backups and restores, storage has indeed been consumed in the replicapool storage pool for our rook-ceph CephCluster.
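If you want to see the individual RBD images that back each PV, you can list the contents of the storage pool. This assumes the rbd client is available in the operator image; if it is not, the Rook toolbox (defined in toolbox.yaml in the same Rook directory used earlier) provides the same tooling:
kubectl exec -it rook-ceph-operator-548b56f995-l7wtt -n rook-ceph -- rbd ls replicapool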
Conclusion
In conclusion, this post demonstrates how the Crunchy PostgreSQL Operator can leverage Rook to provide an effective storage solution to meet the storage requirements for a PostgreSQL cluster. Specifically, the example above shows how the PostgreSQL Operator can use Rook to leverage the power of Ceph to provide storage for all aspects of running and managing a PG cluster, whether it be storage for the PG database itself, or storage for supporting activities and services such as backups, restores and WAL archiving.