Using the PostgreSQL Operator with Rook Ceph Storage
The Crunchy PostgreSQL Operator supports various forms of storage for provisioning PostgreSQL clusters in a Kubernetes environment. One such provider is Rook, which provides an abstraction layer around multiple storage systems available in Kubernetes, making it convenient to choose between storage engines. One storage engine that Rook supports is Ceph, which provides several types of distributed storage, including block-level storage that is especially helpful for scaling cloud-based workloads.
This post explores the use of the Rook storage engine with the PostgreSQL Operator, specifically demonstrating how the PostgreSQL Operator can be utilized to create a PostgreSQL cluster that is backed by Rook Ceph block storage.
For this example, the rook-ceph-block storage class will be created and utilized in conjunction with the PostgreSQL Operator to dynamically provision Ceph block storage for use by a PostgreSQL cluster and its supporting services. This effectively demonstrates how Rook can be utilized to deploy a Ceph cluster in your Kubernetes environment, allowing you to leverage the power of Ceph storage, e.g. highly-available and scalable block storage, in your PostgreSQL clusters.
Many thanks to Andrew L'Ecuyer for helping with the methodology and testing that this post presents. For more information about PostgreSQL Operator storage configuration, please see the documentation.
Software Versions
The following software versions will be utilized for this example:
- PostgreSQL Operator 4.0.1
- Kubernetes v1.15
- Rook v1.0.2
- CentOS 7 (used for all cluster and client machines)
Initial Setup
This post uses a Kubernetes v1.15 cluster containing two nodes. As you can see in the following output, the cluster being utilized includes a single control-plane node named master, and a single worker node named node01:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
master Ready master 5h32m v1.15.0
node01 Ready <none> 5h27m v1.15.0
Additionally, a user with cluster-admin privileges will be needed for running the various kubectl commands below. For this post the kubernetes-admin user will be utilized, as can be seen by checking the context currently set for kubectl:
$ kubectl config current-context
kubernetes-admin@kubernetes
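If a different context happens to be active on your client machine, you can switch to the one needed for this example using kubectl config use-context:
kubectl config use-context kubernetes-admin@kubernetes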
Installing Rook
In order to utilize Rook with the PostgreSQL Operator, it is necessary to install the Rook operator in your Kubernetes cluster, which will be used to deploy a Ceph cluster. While Rook supports additional storage providers for provisioning storage in Kubernetes, this example will demonstrate the use of the ceph storage provider, specifically to provision block storage for use by any PostgreSQL clusters created using the PostgreSQL Operator.
In order to install Rook according to the steps below, it is first necessary to clone the Rook GitHub repository and check out v1.0.2 of the Rook project:
git clone https://github.com/rook/rook.git
cd rook
git checkout v1.0.2
This repository includes files that define the various Kubernetes resources needed to deploy a Rook cluster to a Kubernetes cluster. For this example, the files in the cluster/examples/kubernetes/ceph/ directory will be used to deploy a Ceph cluster configured for block storage. We can therefore navigate to this directory for the steps that follow:
cd cluster/examples/kubernetes/ceph/
Installing the Rook Common Objects & Operator
The first step is to create the Rook common objects, i.e. the various Kubernetes resources needed to support the deployment of a Rook operator and cluster. This includes a namespace, roles, role bindings and custom resource definitions:
kubectl create -f common.yaml
With the common objects created, the Rook Operator can then be deployed using the operator.yaml file:
kubectl create -f operator.yaml
We can then verify that the Rook operator is running by ensuring that the rook-ceph-operator, rook-ceph-agent, and rook-discover pods are running:
$ kubectl get pods -n rook-ceph
NAME READY STATUS RESTARTS AGE
rook-ceph-agent-chj4l 1/1 Running 0 84s
rook-ceph-operator-548b56f995-v4mvp 1/1 Running 0 4m5s
rook-discover-vkkvl 1/1 Running 0 84s
Creating the Rook Ceph Cluster
Next, the Ceph cluster can be created by running the command below. Please note that this is a modified version of the YAML defined in the cluster-test.yaml file in the Rook repository, with the following changes made to support the Kubernetes cluster architecture utilized for this example:
- The placement section defines node affinity to ensure any required Rook components are only deployed to certain nodes. For this example these components should only be deployed to our worker node (node01), so we specify that only nodes with the label kubernetes.io/hostname=node01 should be utilized when scheduling Rook components for deployment.
- For the storage section we specify that Rook should not attempt to use all nodes in the Kubernetes cluster for storage by setting useAllNodes to false. Additionally, we explicitly specify that node01 should be used for storage by listing node01 under the nodes section.
kubectl create -f - <<EOF
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: ceph/ceph:v14.2.1-20190430
    allowUnsupported: true
  dataDirHostPath: /var/lib/rook
  placement:
    all:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
              - "node01"
  mon:
    count: 1
    allowMultiplePerNode: true
  dashboard:
    enabled: true
  network:
    hostNetwork: false
  rbdMirroring:
    workers: 0
  storage:
    useAllNodes: false
    useAllDevices: false
    deviceFilter:
    config:
      databaseSizeMB: "1024" # this value can be removed for environments with normal sized disks (100 GB or larger)
      journalSizeMB: "1024" # this value can be removed for environments with normal sized disks (20 GB or larger)
      osdsPerDevice: "1" # this value can be overridden at the node or device level
    directories:
    - path: /var/lib/rook
    nodes:
    - name: "node01"
EOF
Once the command above has been run, you can verify that the cluster has been created by checking that the proper pods are running within the rook-ceph namespace. The following output shows the various pods in the rook-ceph namespace after the Ceph cluster was successfully deployed for this post:
$ kubectl get pods -n rook-ceph
NAME READY STATUS RESTARTS AGE
rook-ceph-agent-mqhj8 1/1 Running 0 2m21s
rook-ceph-mgr-a-5b747d89c6-4w8q2 1/1 Running 0 48s
rook-ceph-mon-a-57d58799b7-tpbdk 1/1 Running 0 66s
rook-ceph-operator-548b56f995-2mmlf 1/1 Running 0 2m23s
rook-ceph-osd-0-76fc8d587-q97tf 1/1 Running 0 16s
rook-ceph-osd-prepare-node01-ltm2g 0/2 Completed 0 23s
rook-discover-chhvl 1/1 Running 0 2m21s
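As an additional check, you can also look at the CephCluster resource itself; the exact columns and status fields shown will vary by Rook version, but the resource should be present and report the cluster state:
kubectl get cephcluster -n rook-ceph rook-ceph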
Creating the Storage Class
Lastly, a StorageClass needs to be created that will dynamically provision PersistentVolumes using the Ceph cluster we just created, along with a storage pool that can be utilized by that StorageClass. This is done using the storageclass-test.yaml file, which will create a StorageClass called rook-ceph-block and a CephBlockPool storage pool called replicapool, both of which are suitable for testing our Ceph cluster:
kubectl create -f storageclass-test.yaml
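To confirm that both resources were created, you can list them directly (the StorageClass is cluster-scoped, while the CephBlockPool lives in the rook-ceph namespace):
kubectl get storageclass rook-ceph-block
kubectl get cephblockpool -n rook-ceph replicapool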
Verifying & Analyzing the Ceph Cluster
With all required Rook resources now created, you can verify the health of the Ceph cluster by using the ceph command line tool built into the various Rook pods, e.g. the Rook operator pod:
# find the Rook operator pod
$ kubectl get pod -n rook-ceph --selector=app=rook-ceph-operator
NAME READY STATUS RESTARTS AGE
rook-ceph-operator-548b56f995-l7wtt 1/1 Running 0 15m
# view Ceph cluster health
$ kubectl exec -n rook-ceph -it rook-ceph-operator-548b56f995-l7wtt -- ceph status
cluster:
id: 9059353f-ba50-49c1-b8c9-acd334451627
health: HEALTH_OK
services:
mon: 1 daemons, quorum a (age 7m)
mgr: a(active, since 6m)
osds: 1 up (since 6m), 1 in (since 6m)
data:
pools: 1 pools, 100 pgs
objects: 0 objects, 0 B
usage: 8.0 GiB used, 31 GiB / 39 GiB avail
pgs: 100 active+clean
You can also show the free space statistics for the cluster using ceph df:
$ kubectl exec -n rook-ceph -it rook-ceph-operator-548b56f995-l7wtt -- ceph df
RAW STORAGE:
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 39 GiB 31 GiB 8.0 GiB 8.0 GiB 20.45
TOTAL 39 GiB 31 GiB 8.0 GiB 8.0 GiB 20.45
POOLS:
POOL ID STORED OBJECTS USED %USED MAX AVAIL
replicapool 1 0 B 0 0 B 0 29 GiB
As you can see, we have not yet used any storage from the replicapool storage pool.
Installing the PostgreSQL Operator
With a Rook cluster up and running, the PostgreSQL Operator can now be deployed. For this example, the PGO will be installed using Ansible. For additional information on installing the PostgreSQL Operator, as well as on any of the commands, configuration, etc. discussed below, please see the PostgreSQL Operator documentation.
In order to install the PGO according to the steps below, it is first necessary to clone the PGO GitHub repository and check out v4.0.1 of the PGO project:
git clone https://github.com/CrunchyData/postgres-operator.git
cd postgres-operator
git checkout 4.0.1
Next we can navigate to the ansible folder from the root of the PGO repository for the steps that follow:
cd ansible
Inventory Configuration
Before installing the PGO using Ansible, the inventory file (located in the ansible directory) must be configured as follows:
- Specify the proper context for connecting to the Kubernetes cluster, e.g. the result of kubectl config current-context:

kubernetes_context=kubernetes-admin@kubernetes

- Specify admin credentials for the PGO:

pgo_admin_username='pgouser'
pgo_admin_password='pgopass'

- Specify a namespace for deploying the PGO. For this example, the PGO will be deployed to the pgo namespace (the installer will create this namespace if it does not already exist):

pgo_operator_namespace='pgo'

- Specify the namespace(s) managed by the PGO. Please note that the PGO will only be able to deploy PG clusters to the namespaces defined using this setting. For this example, the PGO will manage a single namespace called pgouser1 (which will also be created by the installer if it does not already exist):

namespace='pgouser1'

- Specify that the storage3 storage configuration should be used for all storage required for a PG cluster:

backrest_storage='storage3'
backup_storage='storage3'
primary_storage='storage3'
replica_storage='storage3'

- Set rook-ceph-block as the storage class for the storage3 storage configuration. This will ensure the rook-ceph-block storage class, and therefore our Ceph cluster, is utilized by the PGO to provision any storage required for a PG cluster:

storage3_access_mode='ReadWriteOnce'
storage3_size='1Gi'
storage3_type='dynamic'
storage3_class='rook-ceph-block'
storage3_fs_group=26

Please note that storage3_size has also been changed from 1G to 1Gi. Depending on your version of Rook and Kubernetes, you might also need to make the same change in order to ensure any PersistentVolumeClaims created can successfully bind to a PersistentVolume.

- Ensure the PGO client installation is enabled:

pgo_client_install='true'
pgo_client_version='4.0.1'
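For convenience, here are all of the inventory settings changed for this example in one place (all other settings in the file are left at their defaults):
kubernetes_context=kubernetes-admin@kubernetes
pgo_admin_username='pgouser'
pgo_admin_password='pgopass'
pgo_operator_namespace='pgo'
namespace='pgouser1'
backrest_storage='storage3'
backup_storage='storage3'
primary_storage='storage3'
replica_storage='storage3'
storage3_access_mode='ReadWriteOnce'
storage3_size='1Gi'
storage3_type='dynamic'
storage3_class='rook-ceph-block'
storage3_fs_group=26
pgo_client_install='true'
pgo_client_version='4.0.1'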
Deploy the PGO
With the inventory file now configured, the Operator can be deployed as follows:
ansible-playbook -i inventory --tags=install --ask-become-pass main.yml
You can then verify that the PGO is running by checking that all containers within the postgres-operator pod are running:
$ kubectl get pods -n pgo
NAME READY STATUS RESTARTS AGE
postgres-operator-897cf98f8-ztqf7 3/3 Running 0 3m
Create a PostgreSQL Cluster With Rook Ceph Storage
With the PostgreSQL Operator deployed, it is now possible to provision a new PostgreSQL cluster that utilizes the Rook Ceph cluster currently deployed to the Kubernetes cluster. However, in order to use the PGO client, a few local environment variables must first be set:
export PGO_NAMESPACE=pgouser1
export PGOUSER="${HOME?}/.pgo/pgo/pgouser"
export PGO_CA_CERT="${HOME?}/.pgo/pgo/client.crt"
export PGO_CLIENT_CERT="${HOME?}/.pgo/pgo/client.crt"
export PGO_CLIENT_KEY="${HOME?}/.pgo/pgo/client.pem"
export PGO_APISERVER_URL='https://127.0.0.1:8443'
You will notice that the PGO_APISERVER_URL variable specifies the loopback address (127.0.0.1). This is because, by default, the postgres-operator service utilized by the PGO API server is of type ClusterIP, which means the service is only accessible from within the Kubernetes cluster. In order to access the API server from outside of the cluster (i.e. from a client machine), it is necessary to use the kubectl port-forward command to forward a local port to a pod (among other available options for exposing a service, which are out of scope for this example). Therefore, for this example we will simply forward local port 8443 to port 8443 in the postgres-operator pod, allowing it to be accessed using localhost:
kubectl port-forward $(kubectl get pod -n pgo -o name) -n pgo 8443:8443
Once the port is being forwarded, the connection to the PGO API server can be tested as follows:
$ pgo version
pgo client version 4.0.1
pgo-apiserver version 4.0.1
Create a PostgreSQL Cluster
With the proper pgo client configuration now in place, a PG cluster can be created with the following command:
pgo create cluster rookcluster1
You can then verify that the cluster is running by viewing the pods in the cluster:
$ kubectl get pods -n pgouser1
NAME READY STATUS RESTARTS AGE
rookcluster1-7b75b56455-rst5x 1/1 Running 0 3m3s
rookcluster1-backrest-shared-repo-7f5cd86c56-2fbnf 1/1 Running 0 3m3s
rookcluster1-stanza-create-szpqx 0/1 Completed 0 82s
As can be seen above, by default the PostgreSQL Operator creates a pod containing the PG database, as well as a pod for a pgBackRest repository. Additionally, a job is run to perform the initial stanza creation for the pgBackRest repository. Please note that pgBackRest is the default backup and restore solution for any PostgreSQL cluster created using the PostgreSQL Operator as of version 4.0.1, although other backup and restore solutions, e.g. pg_basebackup, pg_dump and pg_restore, are available as well.
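For example, since pgBackRest is the default, a pgBackRest backup of the cluster could be taken with a command along the following lines (the --backup-type flag shown here is optional in this case):
pgo backup rookcluster1 --backup-type=pgbackrest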
Looking at the PersistentVolumeClaims (PVCs) created for use by these various pods, you can see that two PVCs were created:
- rookcluster1 - the PVC for the PostgreSQL database itself
- rookcluster1-backrest-shared-repo - the PVC for the pgBackRest repository (used to store WAL archives and backups)
$ kubectl get pvc -n pgouser1
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
rookcluster1 Bound pvc-83a54676-6fe5-48f4-af97-860e690c1a23 1Gi RWO rook-ceph-block 4m19s
rookcluster1-backrest-shared-repo Bound pvc-4d88f9df-a62c-49a7-ad81-69dc0546e9f4 1Gi RWO rook-ceph-block 4m16s
Each of these PVCs is bound to a PersistentVolume (PV) provisioned using the rook-ceph-block storage class, with the PVs providing the Ceph block storage required to run the PG cluster and its associated pods:
$ kubectl get pv -n pgouser1
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-4d88f9df-a62c-49a7-ad81-69dc0546e9f4 1Gi RWO Delete Bound pgouser1/rookcluster1-backrest-shared-repo rook-ceph-block 6m51s
pvc-83a54676-6fe5-48f4-af97-860e690c1a23 1Gi RWO Delete Bound pgouser1/rookcluster1 rook-ceph-block 6m54s
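If you would like to confirm how a given PV maps back to Ceph, you can describe it and inspect the volume source details reported by Kubernetes (the exact attributes shown depend on the Rook version and provisioner in use):
kubectl describe pv pvc-83a54676-6fe5-48f4-af97-860e690c1a23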
Back Up the PostgreSQL Cluster
At this point, a PostgreSQL cluster that is fully backed by Ceph block storage has been successfully deployed. We can now perform a few additional actions to demonstrate the provisioning of additional Ceph block storage using the rook-ceph CephCluster. For instance, we can create a pg_basebackup of the cluster using the pgo backup command:
$ # backup the cluster
$ pgo backup rookcluster1 --backup-type=pgbasebackup
created backup Job for rookcluster1
workflow id 935acbba-a233-4dfb-a7a5-85a602fd9098
$ # wait for the backup job to complete
$ kubectl get job -n pgouser1 --selector=pgbackup=true
NAME COMPLETIONS DURATION AGE
backup-rookcluster1-ndml 1/1 42s 3m5s
This will create a new PVC named rookcluster1-backup, which will be bound to a new PV (again provisioned from the Ceph cluster) that will be used to store the backup:
$ kubectl get pvc -n pgouser1 rookcluster1-backup
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
rookcluster1-backup Bound pvc-1c74042a-81cd-49cf-83ad-84e644546842 1Gi RWO rook-ceph-block 7m
Restore the PostgreSQL Cluster
Using the rookcluster1-backup PVC created when taking the pg_basebackup above, a restored database can now be created using the pgo restore command. This command will again create a new PVC, this time to store the restored database. Additionally, the PVC will be named rookcluster1-restored, as specified by the --restore-to-pvc flag:
$ # restore the cluster
$ pgo restore rookcluster1 --backup-type=pgbasebackup --backup-pvc=rookcluster1-backup --restore-to-pvc=rookcluster1-restored
$ # wait for the restore job to complete
$ kubectl get job -n pgouser1 --selector=pgo-pgbasebackup-restore=true
NAME COMPLETIONS DURATION AGE
pgbasebackup-restore-rookcluster1-jncm 1/1 24s 6m26s
$ # view the pvc used to store the restored db
$ kubectl get pvc -n pgouser1 rookcluster1-restored
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
rookcluster1-restored Bound pvc-7afa9bbf-f634-44b7-abc4-8ac70c832051 1Gi RWO rook-ceph-block 7m48s
Finally, a new cluster can be created using this PVC, effectively creating the restored database. Please ensure the name of the new cluster matches the name of the PVC containing the restored database (rookcluster1-restored) when running the pgo create command:
pgo create cluster rookcluster1-restored --pgbackrest=false
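Once the command completes, the restored cluster can be verified in the same way as the original, e.g.:
pgo show cluster rookcluster1-restored
kubectl get pods -n pgouser1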
Ceph Disk Stats
At this point we have two PG clusters running, both the original cluster (rookcluster1) and the restored cluster (rookcluster1-restored), and we also still have the pg_basebackup. Therefore, we should now expect to see storage consumed in the Ceph cluster when we view the free space statistics for the cluster:
$ kubectl exec -it rook-ceph-operator-548b56f995-l7wtt -n rook-ceph -- ceph df
RAW STORAGE:
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 39 GiB 29 GiB 10 GiB 10 GiB 25.45
TOTAL 39 GiB 29 GiB 10 GiB 10 GiB 25.45
POOLS:
POOL ID STORED OBJECTS USED %USED MAX AVAIL
replicapool 1 274 MiB 124 274 MiB 0.97 27 GiB
As the above output shows, due to the various PVCs created above for PG databases, backups and restores, storage has indeed been consumed in the replicapool storage pool for our rook-ceph CephCluster.
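If you want to see the individual RBD images that back each PV, you can list the contents of the storage pool. This assumes the rbd client is available in the operator image; if it is not, the Rook toolbox (defined in toolbox.yaml in the same Rook directory used earlier) provides the same tooling:
kubectl exec -it rook-ceph-operator-548b56f995-l7wtt -n rook-ceph -- rbd ls replicapool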
Conclusion
In conclusion, this post demonstrates how the Crunchy PostgreSQL Operator can leverage Rook to provide an effective storage solution to meet the storage requirements for a PostgreSQL cluster. Specifically, the example above shows how the PostgreSQL Operator can use Rook to leverage the power of Ceph to provide storage for all aspects of running and managing a PG cluster, whether it be storage for the PG database itself, or storage for supporting activities and services such as backups, restores and WAL archiving.