A technical troubleshooting blog about Oracle, other databases, and cloud technologies.

Deploy CNPG Postgres on Kubernetes Cluster !!

Config
Master Nodes: 1
Worker Nodes: 3
DB Cluster Nodes: 2
Postgres Operator: CNPG
Cluster Mode: 1 primary, 1 standby

Cluster Details:

Hostname             IP               Node Name    Role    External Storage
master.localdomain   192.168.115.190  k3s-master   master  NA
worker1.localdomain  192.168.115.195  k3s-db1      worker  200GiB
worker2.localdomain  192.168.115.196  k3s-db2      worker  200GiB
worker3.localdomain  192.168.115.197  k3s-db3      worker  200GiB

Entry in /etc/hosts:

192.168.115.190         master.localdomain
192.168.115.195         worker1.localdomain
192.168.115.196         worker2.localdomain
192.168.115.197         worker3.localdomain
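
The same entries should exist in /etc/hosts on every node. A quick sanity check of name resolution and reachability from any one of the hosts (a minimal sketch using the hostnames above):

#Verify each host resolves and responds
for h in master worker1 worker2 worker3; do
    ping -c 1 ${h}.localdomain
done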

Create a VG from the external disk on each of the 3 worker nodes:

###Worker1
#Check for the disk name
[avs@worker1 ~]$ lsblk
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
nvme0n1     259:0    0  512G  0 disk
├─nvme0n1p1 259:1    0    1G  0 part /boot
└─nvme0n1p2 259:2    0  511G  0 part
  ├─rl-root 253:0    0   70G  0 lvm  /
  ├─rl-swap 253:1    0    2G  0 lvm  [SWAP]
  └─rl-home 253:2    0  439G  0 lvm  /home
nvme0n2     259:3    0  200G  0 disk
#Create LVM Linux Partition:
[avs@worker1 ~]$ sudo fdisk /dev/nvme0n2

Welcome to fdisk (util-linux 2.37.4).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Device does not contain a recognized partition table.
Created a new DOS disklabel with disk identifier 0x0c0bf187.

Command (m for help): n
Partition type
   p   primary (0 primary, 0 extended, 4 free)
   e   extended (container for logical partitions)
Select (default p): p
Partition number (1-4, default 1):
First sector (2048-419430399, default 2048):
Last sector, +/-sectors or +/-size{K,M,G,T,P} (2048-419430399, default 419430399):

Created a new partition 1 of type 'Linux' and of size 200 GiB.

Command (m for help): t
Selected partition 1
Hex code or alias (type L to list all): 8e
Changed type of partition 'Linux' to 'Linux LVM'.

Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
#Check partition name 
[avs@worker1 ~]$ lsblk
NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
nvme0n1     259:0    0  512G  0 disk
├─nvme0n1p1 259:1    0    1G  0 part /boot
└─nvme0n1p2 259:2    0  511G  0 part
  ├─rl-root 253:0    0   70G  0 lvm  /
  ├─rl-swap 253:1    0    2G  0 lvm  [SWAP]
  └─rl-home 253:2    0  439G  0 lvm  /home
nvme0n2     259:3    0  200G  0 disk
└─nvme0n2p1 259:5    0  200G  0 part

#Create PV 
[avs@worker1 ~]$ sudo pvcreate /dev/nvme0n2p1
  Physical volume "/dev/nvme0n2p1" successfully created.

[avs@worker1 ~]$ sudo pvs
  PV             VG Fmt  Attr PSize    PFree
  /dev/nvme0n1p2 rl lvm2 a--  <511.00g       0
  /dev/nvme0n2p1    lvm2 ---  <200.00g <200.00g
#Create VG with name data_vg
[avs@worker1 ~]$ sudo vgcreate data_vg /dev/nvme0n2p1
  Volume group "data_vg" successfully created
  
[avs@worker1 ~]$ sudo vgs
  VG      #PV #LV #SN Attr   VSize    VFree
  data_vg   1   0   0 wz--n- <200.00g <200.00g
  rl        1   3   0 wz--n- <511.00g       0
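
Repeat the same partition/PV/VG steps on worker2 and worker3. A condensed sketch, assuming the 200GiB external disk also appears as /dev/nvme0n2 on those nodes:

###Worker2 and Worker3 (run on each)
sudo fdisk /dev/nvme0n2            #create one partition and set its type to 8e (Linux LVM), as shown above
sudo pvcreate /dev/nvme0n2p1
sudo vgcreate data_vg /dev/nvme0n2p1
sudo vgs                           #data_vg should show ~200GiB free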

Install k3s on all the nodes:

########################################
#               Master                 #
########################################
#!/bin/bash

IPV4=192.168.115.190
TOKEN=uqUvReRK3Fjkc7YZYWF2eKJ9elbyEoE1Lh3mkzXfMukg4v0I0idWRMQUXXEVI5vG

curl -sfL https://get.k3s.io | sh -s - server \
    --node-ip=${IPV4} \
    --node-name=k3s-master \
    --tls-san=${IPV4} \
    --token=${TOKEN} \
    --cluster-cidr=10.44.0.0/16 \
    --service-cidr=10.45.0.0/16 \
    --write-kubeconfig-mode "0644" \
    --disable servicelb \
    --disable traefik

########################################
#               Worker1                #
########################################
#!/bin/bash

IP4=192.168.115.195
TOKEN=uqUvReRK3Fjkc7YZYWF2eKJ9elbyEoE1Lh3mkzXfMukg4v0I0idWRMQUXXEVI5vG

curl -sfL https://get.k3s.io | sh -s - agent \
    --token=${TOKEN} \
    --server "https://192.168.115.190:6443" \
    --node-ip=${IP4} \
    --node-name=k3s-db1


########################################
#               Worker2                #
########################################
#!/bin/bash

IP4=192.168.115.196
TOKEN=uqUvReRK3Fjkc7YZYWF2eKJ9elbyEoE1Lh3mkzXfMukg4v0I0idWRMQUXXEVI5vG

curl -sfL https://get.k3s.io | sh -s - agent \
    --token=${TOKEN} \
    --server "https://192.168.115.190:6443" \
    --node-ip=${IP4} \
    --node-name=k3s-db2


########################################
#               Worker3                #
########################################
#!/bin/bash

IP4=192.168.115.197
TOKEN=uqUvReRK3Fjkc7YZYWF2eKJ9elbyEoE1Lh3mkzXfMukg4v0I0idWRMQUXXEVI5vG

curl -sfL https://get.k3s.io | sh -s - agent \
    --token=${TOKEN} \
    --server "https://192.168.115.190:6443" \
    --node-ip=${IP4} \
    --node-name=k3s-db3

Run the corresponding script on each node.
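
The installer sets up a systemd service on each node (k3s on the master, k3s-agent on the workers), so a quick way to confirm the install succeeded is to check the service status, for example:

#On the master
sudo systemctl status k3s
#On each worker
sudo systemctl status k3s-agent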

Install helm:

[avs@master ~]$ curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
[avs@master ~]$ sh get_helm.sh
[WARNING] Could not find git. It is required for plugin installation.
Downloading https://get.helm.sh/helm-v3.15.2-linux-amd64.tar.gz
Verifying checksum... Done.
Preparing to install helm into /usr/local/bin
helm installed into /usr/local/bin/helm
[avs@master ~]$ export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
[avs@master ~]$ helm ls
WARNING: Kubernetes configuration file is group-readable. This is insecure. Location: /etc/rancher/k3s/k3s.yaml
WARNING: Kubernetes configuration file is world-readable. This is insecure. Location: /etc/rancher/k3s/k3s.yaml
NAME    NAMESPACE       REVISION        UPDATED STATUS  CHART   APP VERSION
[avs@master ~]$
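
The KUBECONFIG export above lasts only for the current shell, and helm warns that the k3s kubeconfig is world-readable (we asked for mode 0644 at install time). One way to address both, sketched for the avs user:

#Copy the k3s kubeconfig to the default location and tighten permissions
mkdir -p ~/.kube
sudo cp /etc/rancher/k3s/k3s.yaml ~/.kube/config
sudo chown $(id -u):$(id -g) ~/.kube/config
chmod 600 ~/.kube/config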

Check that all the nodes have joined the cluster:

[avs@master ~]$ kubectl get nodes
NAME         STATUS   ROLES                  AGE     VERSION
k3s-master   Ready    control-plane,master   3m27s   v1.29.5+k3s1
k3s-db3      Ready    <none>                 37s     v1.29.5+k3s1
k3s-db2      Ready    <none>                 21s     v1.29.5+k3s1
k3s-db1      Ready    <none>                 14s     v1.29.5+k3s1

Taint and Label the Nodes:

#Taint the master node so that no workload pods are scheduled on it

kubectl taint nodes k3s-master node-role.kubernetes.io/master=true:NoSchedule

#Taint 2 nodes so they only run pods that have a toleration for db

kubectl taint nodes k3s-db1 db=true:NoSchedule
kubectl taint nodes k3s-db2 db=true:NoSchedule

#Label the Worker Nodes

kubectl label nodes k3s-db1 app=db
kubectl label nodes k3s-db2 app=db
kubectl label nodes k3s-db3 app=zonos

The master node won't host any workload pods. k3s-db1 and k3s-db2 are tainted so that only pods with a toleration for db will be scheduled on them; k3s-db3 is left untainted for application pods (labeled app=zonos).
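
To verify that the taints and labels took effect:

#Verify taints and labels
kubectl describe node k3s-db1 | grep -i taint
kubectl describe node k3s-db2 | grep -i taint
kubectl get nodes --show-labels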

Install OpenEBS and create StorageClasses for worker pod storage:

deployment/
└── openebs
    ├── dep_ebs.yaml
    ├── helm_upgrade_ebs.yaml
    └── openebs-4.0.1.tgz

Install the Helm chart:

High level steps

• helm repo add openebs https://openebs.github.io/openebs

#Adds the OpenEBS repository to Helm to make OpenEBS charts available for installation.

• helm repo update

#Updates the list of charts from all added Helm repositories, ensuring you have the latest versions.

• helm install openebs --namespace openebs openebs/openebs --create-namespace

#Installs OpenEBS using Helm in the specified namespace, creating the namespace if it doesn't exist.

#If the chart can't be pulled from the repository, install from the downloaded chart archive instead:

• helm install openebs --namespace openebs /path/to/deployment/openebs/openebs-4.0.1.tgz --create-namespace

Create the StorageClass deployment file and the Helm upgrade values file:

dep_ebs.yaml

#vi dep_ebs.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-database
allowedTopologies:
  - matchLabelExpressions:
    - key: openebs.io/nodename
      values:
        - k3s-db1
        - k3s-db2
allowVolumeExpansion: true
provisioner: local.csi.openebs.io
parameters:
  storage: "lvm"
  volgroup: "data_vg"
  fsType: xfs
#  thinProvision: "yes"
volumeBindingMode: WaitForFirstConsumer

---

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-app
allowedTopologies:
  - matchLabelExpressions:
    - key: openebs.io/nodename
      values:
        - k3s-db3
allowVolumeExpansion: true
provisioner: local.csi.openebs.io
parameters:
  storage: "lvm"
  volgroup: "data_vg"
  fsType: xfs
#  thinProvision: "yes"
volumeBindingMode: WaitForFirstConsumer

helm_upgrade_ebs.yaml

#vi helm_upgrade_ebs.yaml
lvm-localpv:
  lvmNode:
    tolerations:
      - key: "db"
        operator: "Equal"
        value: "true"
        effect: "NoSchedule"

engines:
  local:
    zfs:
      enabled: false
  replicated:
    mayastor:
      enabled: false
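
The toleration lets the OpenEBS lvm-localpv node agent (a DaemonSet) run on the tainted k3s-db1 and k3s-db2 nodes, and the engines section disables the ZFS and Mayastor backends we don't use. After the helm upgrade below, a quick check that the LVM node pods actually landed on the db nodes:

#Confirm where the OpenEBS pods are scheduled
kubectl get pods -n openebs -o wide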

Execution:

[avs@master ~]$ helm  repo add openebs https://openebs.github.io/openebs
"openebs" has been added to your repositories

[avs@master ~]$ helm install openebs --namespace openebs openebs/openebs --create-namespace
NAME: openebs
LAST DEPLOYED: Sat Jun 15 14:23:43 2024
NAMESPACE: openebs
STATUS: deployed
REVISION: 1
NOTES:
Successfully installed OpenEBS.
Check the status by running: kubectl get pods -n openebs
The default values will install both Local PV and Replicated PV. However,
the Replicated PV will require additional configuration to be fuctional.
The Local PV offers non-replicated local storage using 3 different storage
backends i.e HostPath, LVM and ZFS, while the Replicated PV provides one replicated highly-available
storage backend i.e Mayastor.
For more information,
- view the online documentation at https://openebs.io/docs
- connect with an active community on our Kubernetes slack channel.
        - Sign up to Kubernetes slack: https://slack.k8s.io
        - #openebs channel: https://kubernetes.slack.com/messages/openebs

[avs@master ~]$ kubectl apply -f /home/avs/deployment/openebs/dep_ebs.yaml
storageclass.storage.k8s.io/openebs-database created
storageclass.storage.k8s.io/openebs-app created

[avs@master ~]$ helm upgrade openebs --namespace openebs /home/avs/deployment/openebs/openebs-4.0.1.tgz -f /home/avs/deployment/openebs/helm_upgrade_ebs.yaml
Release "openebs" has been upgraded. Happy Helming!
NAME: openebs
LAST DEPLOYED: Sat Jun 15 14:28:06 2024
NAMESPACE: openebs
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
Successfully installed OpenEBS.

Check the status by running: kubectl get pods -n openebs

The default values will install both Local PV and Replicated PV. However,
the Replicated PV will require additional configuration to be fuctional.
The Local PV offers non-replicated local storage using 3 different storage
backends i.e HostPath, LVM and ZFS, while the Replicated PV provides one replicated highly-available
storage backend i.e Mayastor.

For more information,
- view the online documentation at https://openebs.io/docs
- connect with an active community on our Kubernetes slack channel.
        - Sign up to Kubernetes slack: https://slack.k8s.io
        - #openebs channel: https://kubernetes.slack.com/messages/openebs

[avs@master ~]$ kubectl get sc
NAME                   PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  54m
openebs-hostpath       openebs.io/local        Delete          WaitForFirstConsumer   false                  13m
openebs-database       local.csi.openebs.io    Delete          WaitForFirstConsumer   true                   9m46s
openebs-app            local.csi.openebs.io    Delete          WaitForFirstConsumer   true                   9m46s
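
Optionally, sanity-check the new StorageClass with a throwaway PVC before deploying the database. A minimal sketch (the PVC name and size are arbitrary); since the class uses volumeBindingMode: WaitForFirstConsumer, the claim will stay Pending until a pod consumes it, which is expected:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-db-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: openebs-database
EOF

kubectl get pvc test-db-pvc
#Clean up once verified
kubectl delete pvc test-db-pvc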

Install CNPG Operator and Create Database:

helm repo add cnpg https://cloudnative-pg.github.io/charts

helm upgrade --install cnpg --namespace cnpg-system --create-namespace cnpg/cloudnative-pg

kubectl apply -n ugvcl-prod -f /home/avs/deployment/cnpg/db_deployment.yaml

[avs@master ~]$ tree deployment/
deployment/
├── cnpg
│   └── db_deployment.yaml
└── openebs
    ├── dep_ebs.yaml
    ├── helm_upgrade_ebs.yaml
    └── openebs-4.0.1.tgz

db_deployment.yaml

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: database
  namespace: ugvcl-prod
spec:
  instances: 2
  imageName: ghcr.io/cloudnative-pg/postgresql:16.2
  affinity:
    nodeSelector:
      app: "db"
    tolerations:
      - key: "db"
        operator: "Equal"
        value: "true"
        effect: "NoSchedule"
  bootstrap:
    initdb:
      database: zonos_oss
      owner: zonos_oss
      postInitSQL:
        - create database customer with owner zonos_oss
      dataChecksums: true
      encoding: 'UTF8'
  enableSuperuserAccess: true
  # monitoring:  # enable only after kube-prometheus-stack is installed
  #   enablePodMonitor: true
  postgresql:
    parameters:
      checkpoint_timeout: "15min"
      max_connections: "1024"
      max_locks_per_transaction: "1024"
      max_wal_size: "64GB"
      random_page_cost: "1"
      shared_buffers: "40GB"
  resources:
    requests:
      cpu: "15"
      memory: "56Gi"
    limits:
      cpu: "15"
      memory: "56Gi"
  storage:
    size: 720Gi
    storageClass: openebs-database

Execution:

[avs@master ~]$ helm repo add cnpg https://cloudnative-pg.github.io/charts
"cnpg" has been added to your repositories
[avs@master ~]$ helm upgrade --install cnpg --namespace cnpg-system --create-namespace cnpg/cloudnative-pg
Release "cnpg" does not exist. Installing it now.
NAME: cnpg
LAST DEPLOYED: Sat Jun 15 14:44:50 2024
NAMESPACE: cnpg-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
CloudNativePG operator should be installed in namespace "cnpg-system".
You can now create a PostgreSQL cluster with 3 nodes in the current namespace as follows:

cat <<EOF | kubectl apply -f -
# Example of PostgreSQL cluster
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cluster-example
spec:
  instances: 3
  storage:
    size: 1Gi
EOF

kubectl get cluster

[avs@master ~]$ kubectl create ns ugvcl-prod
namespace/ugvcl-prod created
[avs@master ~]$ kubectl apply -f /home/avs/deployment/cnpg/db_deployment.yaml
cluster.postgresql.cnpg.io/database created

[avs@master ~]$ kubectl get pods -n ugvcl-prod
NAME         READY   STATUS    RESTARTS   AGE
database-1   1/1     Running   0          87s
database-2   1/1     Running   0          13s

We can see the database is deployed.
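
The Cluster resource and its PVCs (carved out of data_vg via the openebs-database StorageClass) can be checked as well:

#Cluster status as reported by the CNPG operator
kubectl get cluster -n ugvcl-prod
#PVCs should be Bound and using the openebs-database StorageClass
kubectl get pvc -n ugvcl-prod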

#Check which pod is primary:

[avs@master ~]$ kubectl exec -it database-1 -n ugvcl-prod -- psql -U postgres -c "SELECT pg_is_in_recovery();"
Defaulted container "postgres" out of: postgres, bootstrap-controller (init)
 pg_is_in_recovery
-------------------
 f
(1 row)

[avs@master ~]$ kubectl exec -it database-2 -n ugvcl-prod -- psql -U postgres -c "SELECT pg_is_in_recovery();"
Defaulted container "postgres" out of: postgres, bootstrap-controller (init)
 pg_is_in_recovery
-------------------
 t
(1 row)
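
Instead of querying pg_is_in_recovery() on each pod, the operator also labels each instance pod with its role, so the primary can be spotted directly (this is the same role label our NodePort service selects on later):

#Show the role label for each database pod
kubectl get pods -n ugvcl-prod -L role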

Let’s check services:

[avs@master ~]$ kubectl get svc -n ugvcl-prod
NAME          TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
database-r    ClusterIP   10.45.114.176   <none>        5432/TCP   9m52s
database-ro   ClusterIP   10.45.172.227   <none>        5432/TCP   9m52s
database-rw   ClusterIP   10.45.249.138   <none>        5432/TCP   9m52s

database-rw is the service with read-write access (it always points to the primary), but it is a ClusterIP service and isn't reachable from outside the cluster, so let's create a NodePort service that exposes Postgres port 5432 on node port 32000:

external_np.yaml

apiVersion: v1
kind: Service
metadata:
  name: database-rw-external
  namespace: ugvcl-prod
spec:
  type: NodePort
  selector:
    cnpg.io/cluster: database
    role: primary
  ports:
    - protocol: TCP
      port: 5432
      targetPort: 5432
      nodePort: 32000  # Choose an available port in the range 30000-32767

Apply and check if the service is created:

[avs@master ~]$ kubectl apply -f deployment/cnpg/external_np.yaml -n ugvcl-prod
service/database-rw-external created
[avs@master ~]$ kubectl get svc
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.45.0.1    <none>        443/TCP   76m
[avs@master ~]$ kubectl get svc -n ugvcl-prod
NAME                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
database-r             ClusterIP   10.45.114.176   <none>        5432/TCP         13m
database-ro            ClusterIP   10.45.172.227   <none>        5432/TCP         13m
database-rw            ClusterIP   10.45.249.138   <none>        5432/TCP         13m
database-rw-external   NodePort    10.45.8.145     <none>        5432:32000/TCP   10s

Let’s check if the postgres service is accessible from outside:

[avs@master ~]$ hostname -i
192.168.115.190
[avs@master ~]$ telnet 192.168.115.190 32000
Trying 192.168.115.190...
Connected to 192.168.115.190.
Escape character is '^]'.
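
From outside the cluster we should also be able to log in with psql. CloudNativePG stores the generated credentials for the application owner in the database-app secret (and the superuser credentials in database-superuser, since enableSuperuserAccess is true). A sketch, assuming a psql client is available:

#Fetch the password generated for the zonos_oss owner
kubectl get secret database-app -n ugvcl-prod -o jsonpath='{.data.password}' | base64 -d; echo

#Connect through the NodePort using any node IP
psql "host=192.168.115.190 port=32000 dbname=zonos_oss user=zonos_oss"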

Hope it helped !! 🙂