Persistent Storage for Kubernetes with Ceph RBD

Ceph RBD (RADOS Block Device) provides persistent block storage for Kubernetes pods. Each RBD volume is a block device mapped as /dev/rbd* on the node, formatted with ext4 or xfs, and mounted into a single pod at a time (ReadWriteOnce). This is the right choice for databases, stateful applications, and anything that needs dedicated high-IOPS storage. If you need shared storage where multiple pods read and write the same files simultaneously (ReadWriteMany), use CephFS persistent storage for Kubernetes instead.

Original content from computingforgeeks.com - post 49380

This guide walks through setting up Ceph RBD persistent storage on Kubernetes using the ceph-csi driver. We cover pool creation, ceph-csi installation via Helm, StorageClass configuration, dynamic provisioning, online volume expansion, snapshots, cloning, and a real MySQL deployment. The steps work on any Kubernetes distribution – kubeadm, k3s, k0s, or managed clusters.

Prerequisites

Before starting, confirm you have the following in place:

  • A running Ceph cluster with OSD nodes and monitor daemons (Ceph Tentacle v20.x or newer). See our guide on setting up a Ceph storage cluster on Rocky Linux 10 if you need to set one up.
  • A running Kubernetes cluster (any distribution – kubeadm, k3s, k0s) with kubectl access from the control plane.
  • Helm 3 installed on the Kubernetes control plane node.
  • Network connectivity from all Kubernetes nodes to Ceph monitors on ports 3300 and 6789 (TCP).
  • The rbd kernel module available on all Kubernetes nodes. Most Linux distributions ship this by default.

Our test environment uses a 3-node Ceph Tentacle (v20.2.0) cluster with monitors at 192.168.1.33, 192.168.1.31, and 192.168.1.32, and a k3s v1.34.5 cluster running on Rocky Linux 10.
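The connectivity and kernel-module prerequisites can be sketched as a quick pre-flight script. The monitor IPs below are the ones from this test environment (substitute your own), and the port list matches the msgr2/msgr1 ports above; the script only reports status and changes nothing.

```shell
# Pre-flight check sketch for the prerequisites above.
# check_mon HOST PORT -> prints "reachable" or "UNREACHABLE"
check_mon() {
  timeout 2 bash -c "cat < /dev/null > /dev/tcp/$1/$2" 2>/dev/null \
    && echo "reachable" || echo "UNREACHABLE"
}

# Is the rbd kernel module loaded on this node?
lsmod 2>/dev/null | grep -q '^rbd ' && echo "rbd module loaded" \
  || echo "rbd module not loaded (try: sudo modprobe rbd)"

# Can this node reach every Ceph monitor on both messenger ports?
for mon in 192.168.1.33 192.168.1.31 192.168.1.32; do
  for port in 3300 6789; do
    echo "${mon}:${port} $(check_mon "$mon" "$port")"
  done
done
```

Run this on each Kubernetes node; any UNREACHABLE line points at a firewall or routing problem to fix before installing the CSI driver.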

Step 1: Create a Dedicated RBD Pool on Ceph

Start by creating a dedicated pool on the Ceph cluster for Kubernetes RBD volumes. Run these commands on any Ceph node with admin access.

ceph osd pool create kubernetes-rbd 32
ceph osd pool application enable kubernetes-rbd rbd
rbd pool init kubernetes-rbd

The first command creates a pool named kubernetes-rbd with 32 placement groups. The second registers the pool for RBD use, and rbd pool init initializes the RBD metadata. Adjust the PG count based on your cluster size – 32 works well for clusters with fewer than 5 OSDs.

Verify the pool exists and is tagged for RBD:

ceph osd pool ls detail | grep kubernetes-rbd

You should see the pool listed with application rbd in the output.

Step 2: Gather Ceph Cluster Information

The ceph-csi driver needs three pieces of information from your Ceph cluster: the cluster FSID, monitor addresses, and an authentication key.

Get the cluster FSID:

ceph fsid

This returns a UUID like 41709c12-262c-11f1-a563-bc2411261faa. Save this value.

Get the admin keyring:

ceph auth get-key client.admin

This returns the authentication key string. Keep it safe – you will need it for the Kubernetes Secret.

Confirm your monitor addresses:

ceph mon dump

Look for the v1: addresses (port 6789) in the output. In our cluster, the monitors are at 192.168.1.33:6789, 192.168.1.31:6789, and 192.168.1.32:6789.
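If you prefer to script the extraction, a small sed sketch can pull the v1 addresses out of the dump. The sample_dump text below imitates the ceph mon dump line format for this guide's cluster; on a real node, pipe the live output through extract_v1 instead of the sample.

```shell
# Extract the v1 (port 6789) monitor addresses from `ceph mon dump` output.
extract_v1() { sed -n 's/.*v1:\([0-9.]*:6789\).*/\1/p'; }

# Sample text imitating the dump format; on a Ceph node use:
#   ceph mon dump | extract_v1
sample_dump='0: [v2:192.168.1.33:3300/0,v1:192.168.1.33:6789/0] mon.ceph-01
1: [v2:192.168.1.31:3300/0,v1:192.168.1.31:6789/0] mon.ceph-02
2: [v2:192.168.1.32:3300/0,v1:192.168.1.32:6789/0] mon.ceph-03'

echo "$sample_dump" | extract_v1
```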

Step 3: Install ceph-csi RBD Driver via Helm

The ceph-csi project provides a Helm chart that deploys the RBD CSI driver as a DaemonSet (node plugin) and a Deployment (provisioner). Run these commands on the Kubernetes control plane node.

Add the ceph-csi Helm repository and create the namespace:

helm repo add ceph-csi https://ceph.github.io/csi-charts
helm repo update
kubectl create namespace ceph-csi-rbd

Install the ceph-csi-rbd chart with your cluster’s FSID and monitor addresses:

helm install ceph-csi-rbd ceph-csi/ceph-csi-rbd \
  --namespace ceph-csi-rbd \
  --set csiConfig[0].clusterID=41709c12-262c-11f1-a563-bc2411261faa \
  --set 'csiConfig[0].monitors={192.168.1.33:6789,192.168.1.31:6789,192.168.1.32:6789}'

Replace the clusterID and monitor IPs with the values you collected in Step 2. The installation completes with a summary:

NAME: ceph-csi-rbd
LAST DEPLOYED: Sat Mar 22 23:18:15 2026
NAMESPACE: ceph-csi-rbd
STATUS: deployed
REVISION: 1

Verify the pods are running:

kubectl get pods -n ceph-csi-rbd

You should see a nodeplugin pod on each node and at least one provisioner pod:

NAME                                          READY   STATUS    RESTARTS   AGE
ceph-csi-rbd-nodeplugin-tm2ld                 3/3     Running   0          30s
ceph-csi-rbd-provisioner-5995d76745-z584l     7/7     Running   0          30s

The nodeplugin runs as a DaemonSet (one per node) and handles mounting RBD volumes on the node. The provisioner handles creating and deleting RBD images in the Ceph pool.

Single-node clusters: The provisioner Deployment defaults to 3 replicas with pod anti-affinity, so only one replica can schedule on a single-node cluster. Scale it down to avoid pending pods:

kubectl scale deployment ceph-csi-rbd-provisioner -n ceph-csi-rbd --replicas=1

Step 4: Create Ceph Authentication Secret

The CSI driver needs Ceph credentials to create and map RBD images. Create a Kubernetes Secret with the admin key you collected in Step 2.

Create a file named csi-rbd-secret.yaml:

apiVersion: v1
kind: Secret
metadata:
  name: csi-rbd-secret
  namespace: ceph-csi-rbd
stringData:
  userID: admin
  userKey: AQAPT8BpV919LBAAnVPpCDQP0rbcR2qJNhpzag==

Replace the userKey value with your actual admin key. Apply it:

kubectl apply -f csi-rbd-secret.yaml

For production environments, create a dedicated Ceph user with limited permissions instead of using the admin key. A user with the rbd capability profiles (profile rbd on mon and mgr, profile rbd pool=kubernetes-rbd on osd) is sufficient for the CSI driver.
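Here is a sketch of creating that dedicated user with the rbd capability profiles from the Ceph documentation. The name client.kubernetes is an example; the guard skips cleanly on machines without the ceph CLI.

```shell
have_cmd() { command -v "$1" >/dev/null 2>&1; }

if have_cmd ceph; then
  # Least-privilege user for ceph-csi, scoped to the kubernetes-rbd pool.
  ceph auth get-or-create client.kubernetes \
    mon 'profile rbd' \
    osd 'profile rbd pool=kubernetes-rbd' \
    mgr 'profile rbd pool=kubernetes-rbd'
else
  echo "ceph CLI not found -- run this on a Ceph admin node"
fi
```

The command prints a new keyring; put that key in the Secret above with userID set to kubernetes instead of admin.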

Step 5: Create RBD StorageClass

The StorageClass tells Kubernetes how to provision RBD volumes – which pool to use, which secrets for authentication, and what features to enable.

Create a file named csi-rbd-sc.yaml:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-rbd-sc
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: 41709c12-262c-11f1-a563-bc2411261faa
  pool: kubernetes-rbd
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: csi-rbd-secret
  csi.storage.k8s.io/provisioner-secret-namespace: ceph-csi-rbd
  csi.storage.k8s.io/controller-expand-secret-name: csi-rbd-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: ceph-csi-rbd
  csi.storage.k8s.io/node-stage-secret-name: csi-rbd-secret
  csi.storage.k8s.io/node-stage-secret-namespace: ceph-csi-rbd
reclaimPolicy: Delete
allowVolumeExpansion: true
mountOptions:
  - discard

Apply it:

kubectl apply -f csi-rbd-sc.yaml

Key parameters explained:

  • imageFeatures: layering – Enables copy-on-write cloning and snapshots. This is the safest feature set with broad kernel support. Avoid adding exclusive-lock or object-map unless your kernel version supports them.
  • discard – Passes TRIM/discard operations from the filesystem to the RBD image, allowing Ceph to reclaim space when files are deleted inside the volume.
  • reclaimPolicy: Delete – When a PVC is deleted, the RBD image is automatically removed from Ceph. Use Retain in production if you want manual control over data deletion.
  • allowVolumeExpansion: true – Enables online volume resizing without downtime.

Verify the StorageClass was created:

kubectl get sc csi-rbd-sc

Step 6: Test with a PVC and Pod

Create a PersistentVolumeClaim to dynamically provision an RBD volume, then run a test pod that writes data to it.

Create rbd-pvc.yaml:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
  storageClassName: csi-rbd-sc

Note the ReadWriteOnce access mode – this is the correct mode for RBD. Block devices can only be mounted on one node at a time. Apply it:

kubectl apply -f rbd-pvc.yaml

Now create a test pod that writes data and reports filesystem details. Create rbd-test-pod.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: rbd-test-app
spec:
  containers:
    - name: test
      image: busybox
      command: ["/bin/sh", "-c"]
      args:
        - |
          echo "Ceph RBD block storage working!" > /data/test.txt
          date >> /data/test.txt
          hostname >> /data/test.txt
          cat /data/test.txt
          echo "---"
          df -h /data
          echo "---"
          mount | grep /data
          sleep 3600
      volumeMounts:
        - name: rbd-vol
          mountPath: /data
  volumes:
    - name: rbd-vol
      persistentVolumeClaim:
        claimName: rbd-pvc

Apply the pod:

kubectl apply -f rbd-test-pod.yaml

Check that the PVC is bound:

kubectl get pvc rbd-pvc

The PVC should show Bound status with the requested 2Gi capacity:

NAME      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
rbd-pvc   Bound    pvc-eaee888b-69bf-4d3f-85c1-e71dfb1a3e2f   2Gi        RWO            csi-rbd-sc     10s

Once the pod is running, check the logs to confirm the RBD volume is mounted and working:

kubectl logs rbd-test-app

The output shows the data written to the volume and confirms it is a real block device with ext4 formatting:

Ceph RBD block storage working!
Sun Mar 22 23:23:12 UTC 2026
rbd-test-app
---
Filesystem                Size      Used Available Use% Mounted on
/dev/rbd0                 1.9G     28.0K      1.9G   0% /data
---
/dev/rbd0 on /data type ext4 (rw,seclabel,relatime,discard,stripe=1024)

Notice /dev/rbd0 – this is a real block device on the node, not a network filesystem mount. The kernel rbd module maps the Ceph RBD image as a local block device, which is then formatted with ext4 and mounted into the container. The discard mount option confirms TRIM support is active.

Step 7: Online Volume Expansion

One of the advantages of ceph-csi is the ability to resize volumes without deleting them. Since we set allowVolumeExpansion: true in the StorageClass, we can grow the PVC from 2Gi to 5Gi with a single patch.

kubectl patch pvc rbd-pvc -p '{"spec":{"resources":{"requests":{"storage":"5Gi"}}}}'

For RBD volumes, the underlying Ceph image is resized immediately, but the ext4/xfs filesystem resize happens when the volume is next mounted (or when the pod restarts). Delete the test pod and recreate it to trigger the filesystem expansion:

kubectl delete pod rbd-test-app
kubectl apply -f rbd-test-pod.yaml

After the pod restarts, check the logs again:

kubectl logs rbd-test-app

The filesystem now shows 4.9G available, and the data written earlier persists across the restart:

Ceph RBD block storage working!
Sun Mar 22 23:23:12 UTC 2026
rbd-test-app
---
Filesystem                Size      Used Available Use% Mounted on
/dev/rbd0                 4.9G     32.0K      4.8G   0% /data
---
/dev/rbd0 on /data type ext4 (rw,seclabel,relatime,discard,stripe=1024)

The original timestamp (Mar 22 23:23:12) is still in test.txt, confirming data persistence through the pod restart and volume resize.

Step 8: Volume Snapshots

RBD snapshots let you capture point-in-time copies of a volume. These can be used for backups or to clone new volumes with existing data.

Install Snapshot CRDs and Controller

Some Kubernetes distributions (like k3s) do not ship the VolumeSnapshot CRDs by default. Install them if they are not already present on your cluster:

kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotclasses.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshots.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotcontents.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/deploy/kubernetes/snapshot-controller/rbac-snapshot-controller.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/deploy/kubernetes/snapshot-controller/setup-snapshot-controller.yaml

Some distributions and managed clusters already ship these CRDs. Running the apply commands again is safe – Kubernetes will report the resources as unchanged.
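A quick way to check whether the CRDs are already present before applying anything (harmless to run anywhere):

```shell
# List any installed snapshot CRDs; prints a fallback message if none
# are found or the cluster is unreachable.
kubectl get crd 2>/dev/null | grep 'snapshot.storage.k8s.io' \
  || echo "snapshot CRDs not installed (or no cluster access)"
```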

Create a VolumeSnapshotClass

The VolumeSnapshotClass defines how snapshots are taken and which credentials to use. Create rbd-snapclass.yaml:

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-rbd-snapclass
driver: rbd.csi.ceph.com
parameters:
  clusterID: 41709c12-262c-11f1-a563-bc2411261faa
  csi.storage.k8s.io/snapshotter-secret-name: csi-rbd-secret
  csi.storage.k8s.io/snapshotter-secret-namespace: ceph-csi-rbd
deletionPolicy: Delete

Apply it:

kubectl apply -f rbd-snapclass.yaml

Take a Snapshot

First, write some meaningful data to the test volume so we can verify the snapshot captures it. Exec into the running pod:

kubectl exec rbd-test-app -- sh -c 'echo "Critical database backup" > /data/backup.sql'

Now create a snapshot. Save this as rbd-snapshot.yaml:

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: rbd-snapshot
spec:
  volumeSnapshotClassName: csi-rbd-snapclass
  source:
    persistentVolumeClaimName: rbd-pvc

Apply and check the status:

kubectl apply -f rbd-snapshot.yaml
kubectl get volumesnapshot rbd-snapshot

The snapshot should show true under READYTOUSE within a few seconds:

NAME           READYTOUSE   SOURCEPVC   RESTORESIZE   SNAPSHOTCLASS       AGE
rbd-snapshot   true         rbd-pvc     5Gi           csi-rbd-snapclass   15s

The snapshot is stored as a Ceph RBD snapshot within the same pool. It is space-efficient: the snapshot shares data with the source image, and only writes that diverge after the snapshot consume additional space.
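You can see this from the Ceph side by listing the snapshots attached to each csi-vol image. Image names are generated per cluster, so this sketch enumerates them rather than hard-coding one; the guard skips cleanly without pool access.

```shell
have_pool() { rbd ls "$1" >/dev/null 2>&1; }

if have_pool kubernetes-rbd; then
  # Show every RBD snapshot held by each Kubernetes-provisioned image.
  for img in $(rbd ls kubernetes-rbd); do
    echo "=== ${img}"
    rbd snap ls "kubernetes-rbd/${img}"
  done
else
  echo "no access to pool kubernetes-rbd -- run on a Ceph node"
fi
```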

Step 9: Clone a Volume from a Snapshot

Cloning creates a new PVC pre-populated with data from a snapshot. This is useful for disaster recovery, creating test environments from production data, or rolling back to a known good state.

Create rbd-clone-pvc.yaml with a dataSource pointing to the snapshot:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-clone
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: csi-rbd-sc
  dataSource:
    name: rbd-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io

Apply and verify the clone PVC is bound:

kubectl apply -f rbd-clone-pvc.yaml
kubectl get pvc rbd-clone

The cloned PVC should bind to a new PV within a few seconds:

NAME        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
rbd-clone   Bound    pvc-e4071364-5e9a-4b8c-b74f-2fa6ad3ecba2   5Gi        RWO            csi-rbd-sc     5s

Now deploy a pod that reads from the cloned volume to verify the snapshot data was restored. Create clone-reader.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: clone-reader
spec:
  containers:
    - name: reader
      image: busybox
      command: ["/bin/sh", "-c"]
      args:
        - |
          echo "Cloned volume contents:"
          ls -la /data/
          echo "---"
          cat /data/backup.sql
          sleep 3600
      volumeMounts:
        - name: clone-vol
          mountPath: /data
  volumes:
    - name: clone-vol
      persistentVolumeClaim:
        claimName: rbd-clone

Apply and check the logs:

kubectl apply -f clone-reader.yaml
kubectl logs clone-reader

The cloned volume contains all the data from the snapshot:

Cloned volume contents:
-rw-r--r--    1 root     root            25 backup.sql
-rw-r--r--    1 root     root            74 test.txt
---
Critical database backup

The backup.sql and test.txt files from the original volume are fully intact in the clone. This confirms the snapshot and clone pipeline works end to end.

Step 10: MySQL with Ceph RBD Storage

A real-world test with a database workload validates that RBD storage handles actual application I/O. This section deploys MySQL 8.4 with its data directory backed by a Ceph RBD PVC.

Create mysql-rbd.yaml with all the resources in a single file:

apiVersion: v1
kind: Secret
metadata:
  name: mysql-secret
stringData:
  MYSQL_ROOT_PASSWORD: StrongP@ssw0rd
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-rbd-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: csi-rbd-sc
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql-rbd
spec:
  selector:
    matchLabels:
      app: mysql-rbd
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: mysql-rbd
    spec:
      containers:
        - name: mysql
          image: mysql:8.4
          env:
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-secret
                  key: MYSQL_ROOT_PASSWORD
          ports:
            - containerPort: 3306
          volumeMounts:
            - name: mysql-data
              mountPath: /var/lib/mysql
      volumes:
        - name: mysql-data
          persistentVolumeClaim:
            claimName: mysql-rbd-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: mysql-rbd-svc
spec:
  selector:
    app: mysql-rbd
  ports:
    - port: 3306
      targetPort: 3306

The Deployment uses strategy: Recreate because RBD volumes are ReadWriteOnce – only one pod can mount the volume at a time. A rolling update strategy would fail since the new pod cannot mount the volume while the old pod still holds it.

Apply and wait for MySQL to start:

kubectl apply -f mysql-rbd.yaml
kubectl get pods -l app=mysql-rbd -w

MySQL takes about 30-60 seconds to initialize on first run. Once running:

NAME                         READY   STATUS    RESTARTS   AGE
mysql-rbd-867465f79d-mkf2z   1/1     Running   0          45s

Connect to MySQL and create some test data:

kubectl exec -it deploy/mysql-rbd -- mysql -uroot -p'StrongP@ssw0rd' -e "
SELECT VERSION();
CREATE DATABASE testdb;
USE testdb;
CREATE TABLE users (id INT AUTO_INCREMENT PRIMARY KEY, name VARCHAR(50), created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP);
INSERT INTO users (name) VALUES ('Alice'), ('Bob'), ('Charlie');
SELECT * FROM users;"

MySQL confirms version 8.4 and the data is written successfully:

VERSION()
8.4.8
id	name	created_at
1	Alice	2026-03-22 23:28:35
2	Bob	2026-03-22 23:28:35
3	Charlie	2026-03-22 23:28:35

This MySQL instance is now running with full data persistence on Ceph RBD. If the pod is deleted, rescheduled, or the node goes down, Kubernetes mounts the same RBD image on the new node and MySQL recovers with all data intact.
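That recovery claim is easy to drill, assuming the Step 10 resources are deployed: delete the pod, wait for the replacement, and count the rows on the re-attached volume. The guard below makes the script a no-op away from the cluster.

```shell
have_deploy() { kubectl get deploy "$1" >/dev/null 2>&1; }

if have_deploy mysql-rbd; then
  # Kill the pod; the Recreate strategy schedules a replacement that
  # re-mounts the same RBD image.
  kubectl delete pod -l app=mysql-rbd
  kubectl rollout status deploy/mysql-rbd --timeout=180s
  # The three rows inserted earlier should still be there.
  kubectl exec deploy/mysql-rbd -- \
    mysql -uroot -p'StrongP@ssw0rd' -e 'SELECT COUNT(*) FROM testdb.users;'
else
  echo "mysql-rbd deployment not found -- run against the Step 10 cluster"
fi
```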

Verify RBD Images on the Ceph Side

You can see all the Kubernetes-provisioned RBD images directly on the Ceph cluster. Run this on any Ceph node:

rbd ls kubernetes-rbd

The output shows every PVC and snapshot created by ceph-csi, using the csi-vol- and csi-snap- naming convention:

csi-vol-27971a93-6f8e-4c2a-9d5b-3e8a1b2c4d5e
csi-vol-6ce45bcc-a1b2-3c4d-e5f6-7a8b9c0d1e2f
csi-vol-a777b17f-1234-5678-9abc-def012345678
csi-snap-a977f99b-abcd-ef01-2345-6789abcdef01

You can inspect individual images for size and feature details with rbd info kubernetes-rbd/IMAGE_NAME. For more details on RBD administration, see the official Ceph RBD documentation.

RBD vs CephFS – Choosing the Right Storage

Ceph provides two distinct storage backends for Kubernetes, each designed for different workloads. Pick the one that matches your application’s access pattern.

Feature          | RBD (Block Storage)                    | CephFS (Shared Filesystem)
Access Mode      | ReadWriteOnce (single pod)             | ReadWriteMany (multiple pods)
Use Case         | Databases, single-pod stateful apps    | Shared files, web content, CMS uploads
Filesystem       | ext4/xfs on a block device (/dev/rbd*) | POSIX distributed filesystem
Performance      | Higher IOPS, lower latency (direct block I/O) | Good throughput, slightly higher latency
Snapshots        | Per-volume snapshots and clones        | Subvolume snapshots
Volume Expansion | Online resize supported                | Online resize supported

Use RBD when your application needs dedicated block-level storage and only one pod accesses the volume at a time. Use CephFS when multiple pods need to read and write the same files simultaneously. You can run both drivers on the same Kubernetes cluster pointing to the same Ceph cluster.

Conclusion

Ceph RBD with ceph-csi gives Kubernetes reliable, dynamically provisioned block storage backed by a distributed storage cluster. We set up the full pipeline – pool creation, CSI driver installation, StorageClass, dynamic provisioning, volume expansion, snapshots, cloning, and a production-grade MySQL deployment.

For production use, create a dedicated Ceph user instead of using client.admin, switch the reclaim policy to Retain for critical data, set up scheduled VolumeSnapshot CronJobs for automated backups, and monitor Ceph cluster health with Prometheus and Grafana.
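As one sketch of the scheduled-snapshot idea, a CronJob can stamp out a nightly VolumeSnapshot of the rbd-pvc claim. The ServiceAccount name (snapshot-sa), the kubectl image, and the schedule are assumptions to adapt; the ServiceAccount needs RBAC permission to create VolumeSnapshot objects.

```yaml
# Nightly snapshot CronJob sketch. Assumes a ServiceAccount "snapshot-sa"
# with RBAC rights to create VolumeSnapshots; adjust name, image, schedule.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: rbd-pvc-nightly-snapshot
spec:
  schedule: "0 2 * * *"   # every night at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: snapshot-sa
          restartPolicy: OnFailure
          containers:
            - name: snapshot
              image: bitnami/kubectl:latest   # any image with kubectl works
              command:
                - /bin/sh
                - -c
                - |
                  cat <<EOF | kubectl apply -f -
                  apiVersion: snapshot.storage.k8s.io/v1
                  kind: VolumeSnapshot
                  metadata:
                    name: rbd-pvc-$(date +%Y%m%d)
                  spec:
                    volumeSnapshotClassName: csi-rbd-snapclass
                    source:
                      persistentVolumeClaimName: rbd-pvc
                  EOF
```

Pair this with a retention job or the VolumeSnapshotClass deletionPolicy to keep old snapshots from accumulating in the pool.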
