CephFS Persistent Storage for Kubernetes with ceph-csi

CephFS is the POSIX-compliant filesystem built on top of Ceph’s distributed object store. Paired with the ceph-csi driver, it gives Kubernetes pods shared persistent storage with ReadWriteMany access, something RBD block storage cannot offer for filesystem volumes. This makes CephFS the right choice for workloads where multiple pods need to read and write the same files simultaneously.


This guide walks through setting up CephFS persistent storage on Kubernetes using the ceph-csi driver installed via Helm. We cover driver installation, StorageClass configuration, dynamic provisioning, ReadWriteMany sharing between pods, online volume expansion, and volume snapshots. The setup uses Ceph Tentacle (v20.x) and works with any Kubernetes distribution – k3s, kubeadm, RKE2, or managed clusters.

Prerequisites

Before starting, confirm the following are in place:

  • A running Ceph cluster with at least one CephFS filesystem created. See our guide on setting up Ceph on Rocky Linux 10 / AlmaLinux 10
  • A Kubernetes cluster (k3s, kubeadm, or any distribution) with kubectl access
  • Helm 3 installed on the machine where you run kubectl
  • Network connectivity between all Kubernetes nodes and Ceph monitor nodes on ports 3300 (v2 messenger) and 6789 (v1 messenger)
  • Ceph admin credentials (or a dedicated CephFS user with appropriate caps)

Step 1: Gather Ceph Cluster Information

You need four pieces of information from your Ceph cluster: the cluster FSID, admin key, monitor addresses, and CephFS filesystem name. Run these commands on any Ceph node.

Get the cluster FSID:

ceph fsid

This returns the unique cluster identifier:

41709c12-262c-11f1-a563-bc2411261faa

Get the admin authentication key:

ceph auth get-key client.admin

The output is the base64-encoded key you will use in the Kubernetes Secret:

AQAPT8BpV919LBAAnVPpCDQP0rbcR2qJNhpzag==

List the monitor addresses:

ceph mon dump

Look for the mon lines showing each monitor’s IP and port. In our cluster the monitors are at 192.168.1.33, 192.168.1.31, and 192.168.1.32 on port 6789.
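If you want to pull the monitor list out of this output programmatically, for example to feed the Helm install later, the v1 addresses can be extracted with standard text tools. A minimal sketch against a captured sample of the dump (the epoch and monitor host names are illustrative):

```shell
# Sample `ceph mon dump` output (hypothetical epoch and mon names).
# In practice, pipe `ceph mon dump` directly into the pipeline below.
cat > /tmp/mon-dump.txt <<'EOF'
epoch 3
fsid 41709c12-262c-11f1-a563-bc2411261faa
0: [v2:192.168.1.33:3300/0,v1:192.168.1.33:6789/0] mon.ceph01
1: [v2:192.168.1.31:3300/0,v1:192.168.1.31:6789/0] mon.ceph02
2: [v2:192.168.1.32:3300/0,v1:192.168.1.32:6789/0] mon.ceph03
EOF

# Extract each v1 address (IP:6789) and join them with commas,
# which is the shape the Helm csiConfig monitors list expects.
MONITORS=$(grep -o 'v1:[0-9.]*:6789' /tmp/mon-dump.txt | sed 's/^v1://' | paste -sd, -)
echo "$MONITORS"
```

The resulting comma-separated string drops straight into the `csiConfig[0].monitors` value in Step 3.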

Verify the CephFS filesystem exists:

ceph fs ls

You should see your filesystem listed:

name: cephfs, metadata pool: cephfs.cephfs.meta, data pools: [cephfs.cephfs.data ]

Save these values – you will use them throughout this guide.

Step 2: Create CephFS Subvolume Group

The ceph-csi driver manages storage through CephFS subvolumes. It needs a subvolume group to organize them. Create one called csi under your CephFS filesystem:

ceph fs subvolumegroup create cephfs csi

This is a required step. Without this subvolume group, the ceph-csi provisioner will fail when creating PersistentVolumes.

Step 3: Install ceph-csi CephFS Driver via Helm

The ceph-csi project provides Helm charts for both RBD and CephFS drivers. Add the chart repository and install the CephFS driver.

Add the ceph-csi Helm repository and update:

helm repo add ceph-csi https://ceph.github.io/csi-charts
helm repo update

Create a dedicated namespace for the CephFS driver:

kubectl create namespace ceph-csi-cephfs

Install the chart with your Ceph cluster details. Replace the FSID, subvolume group, and monitor IPs with the values from Step 1:

helm install ceph-csi-cephfs ceph-csi/ceph-csi-cephfs \
  --namespace ceph-csi-cephfs \
  --set 'csiConfig[0].clusterID=41709c12-262c-11f1-a563-bc2411261faa' \
  --set 'csiConfig[0].cephFS.subvolumeGroup=csi' \
  --set 'csiConfig[0].monitors={192.168.1.33:6789,192.168.1.31:6789,192.168.1.32:6789}'

Helm confirms the deployment:

NAME: ceph-csi-cephfs
LAST DEPLOYED: Sat Mar 22 2026 22:37:38
NAMESPACE: ceph-csi-cephfs
STATUS: deployed
REVISION: 1
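As an alternative to the bracketed --set flags, the same settings can be passed as a Helm values file. A sketch following the chart’s csiConfig layout, with the values from Step 1:

```yaml
# values.yaml for the ceph-csi-cephfs chart
csiConfig:
  - clusterID: 41709c12-262c-11f1-a563-bc2411261faa
    monitors:
      - "192.168.1.33:6789"
      - "192.168.1.31:6789"
      - "192.168.1.32:6789"
    cephFS:
      subvolumeGroup: csi
```

Install it with helm install ceph-csi-cephfs ceph-csi/ceph-csi-cephfs --namespace ceph-csi-cephfs -f values.yaml. This avoids shell quoting issues with the bracketed paths entirely.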

Wait a moment for the pods to start, then verify the driver is running:

kubectl get pods -n ceph-csi-cephfs

You should see the nodeplugin DaemonSet pod and the provisioner Deployment pod both running:

NAME                                                READY   STATUS    RESTARTS   AGE
ceph-csi-cephfs-nodeplugin-cpz47                    3/3     Running   0          45s
ceph-csi-cephfs-provisioner-6647499dd6-ff67p        6/6     Running   0          45s

The nodeplugin runs on every node (DaemonSet) and handles mounting volumes to pods. The provisioner handles creating and deleting CephFS subvolumes when PVCs are created.

Single-node clusters: The provisioner Deployment runs 3 replicas by default, using leader election to pick one active instance. On a single-node cluster (such as standalone k3s), scale it down to 1 replica:

kubectl scale deployment ceph-csi-cephfs-provisioner -n ceph-csi-cephfs --replicas=1

Step 4: Create Ceph Admin Secret

The ceph-csi driver needs Ceph credentials to authenticate with the cluster. Create a Kubernetes Secret with the admin key obtained in Step 1.

Create a file called csi-cephfs-secret.yaml:

apiVersion: v1
kind: Secret
metadata:
  name: csi-cephfs-secret
  namespace: ceph-csi-cephfs
stringData:
  adminID: admin
  adminKey: AQAPT8BpV919LBAAnVPpCDQP0rbcR2qJNhpzag==

Apply the Secret:

kubectl apply -f csi-cephfs-secret.yaml

The Secret should be created in the same namespace as the ceph-csi driver:

secret/csi-cephfs-secret created
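A note on encoding: the Secret above uses stringData, which takes the Ceph key verbatim. If you use the data: field instead, the value must be base64-encoded once more, even though the Ceph key is itself already base64 (a common source of confusion). A quick sketch:

```shell
# The Ceph key as returned by `ceph auth get-key` (already base64).
KEY='AQAPT8BpV919LBAAnVPpCDQP0rbcR2qJNhpzag=='

# For a Secret's data: field, base64-encode the whole string once more.
ENCODED=$(printf '%s' "$KEY" | base64 -w0)
echo "$ENCODED"

# Decoding recovers exactly the original key string.
printf '%s' "$ENCODED" | base64 -d
```

If the driver logs authentication failures despite a correct key, a double- or un-encoded Secret value is the first thing to check.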

Production note: For production clusters, create a dedicated CephFS user with limited capabilities instead of using client.admin. The admin key has full cluster access, which violates the principle of least privilege.
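As a sketch of that approach, a dedicated client can be created on the Ceph side with ceph auth get-or-create. The capability set below is a plausible starting point, not a verified minimum; check the ceph-csi capabilities documentation for the exact caps your driver version requires:

```shell
# Hypothetical restricted user for ceph-csi (the client name and caps are
# assumptions; verify against the ceph-csi capabilities docs).
ceph auth get-or-create client.cephfs-csi \
  mon 'allow r' \
  mgr 'allow rw' \
  osd 'allow rw tag cephfs metadata=cephfs, allow rw tag cephfs data=cephfs' \
  mds 'allow rw fsname=cephfs'

# Print the key, then reference it in the Secret as adminID: cephfs-csi
# with this value as adminKey.
ceph auth get-key client.cephfs-csi
```

These commands run on a Ceph node with admin access, just like the commands in Step 1.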

Step 5: Create CephFS StorageClass

The StorageClass tells Kubernetes how to dynamically provision CephFS volumes. It references the ceph-csi provisioner, your cluster FSID, and the Secret created above.

Create a file called csi-cephfs-sc.yaml:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-cephfs-sc
provisioner: cephfs.csi.ceph.com
parameters:
  clusterID: 41709c12-262c-11f1-a563-bc2411261faa
  fsName: cephfs
  mounter: fuse
  csi.storage.k8s.io/provisioner-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: ceph-csi-cephfs
  csi.storage.k8s.io/controller-expand-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: ceph-csi-cephfs
  csi.storage.k8s.io/node-stage-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/node-stage-secret-namespace: ceph-csi-cephfs
reclaimPolicy: Delete
allowVolumeExpansion: true

Apply the StorageClass:

kubectl apply -f csi-cephfs-sc.yaml

Verify it was created:

kubectl get storageclass csi-cephfs-sc

The StorageClass should appear with the ceph-csi provisioner:

NAME            PROVISIONER           RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
csi-cephfs-sc   cephfs.csi.ceph.com   Delete          Immediate           true                   5s

About the mounter parameter: Setting mounter: fuse tells ceph-csi to use ceph-fuse for mounting. This is the safer default because ceph-fuse runs inside the CSI container and does not depend on what is installed on the Kubernetes host nodes. The kernel CephFS driver (mounter: kernel) offers better performance but needs a host kernel with CephFS support, and on some distributions the supporting Ceph packages are not yet available (for example, EL10 does not ship ceph-common packages at the time of writing). To use the kernel mount, remove the mounter line entirely or set it to kernel.
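If you are considering mounter: kernel, you can first check on each node whether the kernel exposes CephFS support (a quick probe, not part of the original setup):

```shell
# Try to load the CephFS module, then confirm the kernel registered
# the "ceph" filesystem type.
sudo modprobe ceph
grep -w ceph /proc/filesystems
```

If the grep prints a line containing ceph, the node can handle kernel CephFS mounts.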

Step 6: Test with a PersistentVolumeClaim

Create a PVC to verify that dynamic provisioning works. The ceph-csi driver will create a CephFS subvolume on your Ceph cluster and bind it to this PVC.

Create a file called cephfs-pvc.yaml:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: csi-cephfs-sc
  resources:
    requests:
      storage: 1Gi

Apply and check the PVC status:

kubectl apply -f cephfs-pvc.yaml
kubectl get pvc cephfs-pvc

The PVC should transition to Bound status within a few seconds:

NAME         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS    AGE
cephfs-pvc   Bound    pvc-75f312be-77c7-4f7e-9c0d-bf2c7d23551e   1Gi        RWX            csi-cephfs-sc   5s

Notice the access mode is RWX (ReadWriteMany), meaning multiple pods can mount this volume simultaneously for both reading and writing. This is one of the key advantages of CephFS over Ceph RBD, which is limited to ReadWriteOnce for filesystem volumes (RBD offers ReadWriteMany only for raw block volumes).

Step 7: Deploy a Test Pod with CephFS Volume

Mount the CephFS PVC into a pod and verify it works. This test pod writes a file to the volume, displays it, and shows the filesystem mount details.

Create a file called cephfs-test-app.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: cephfs-test-app
spec:
  containers:
  - name: app
    image: busybox:latest
    command: ['/bin/sh', '-c']
    args:
      - |
        echo 'CephFS is working from Kubernetes!' > /data/test.txt
        date >> /data/test.txt
        hostname >> /data/test.txt
        cat /data/test.txt
        echo "---"
        df -h /data
        echo "---"
        mount | grep /data
        sleep 3600
    volumeMounts:
    - mountPath: /data
      name: cephfs-vol
  volumes:
  - name: cephfs-vol
    persistentVolumeClaim:
      claimName: cephfs-pvc

Apply the pod and wait for it to start:

kubectl apply -f cephfs-test-app.yaml
kubectl wait --for=condition=Ready pod/cephfs-test-app --timeout=60s

Check the pod logs to confirm CephFS is mounted and working:

kubectl logs cephfs-test-app

The output confirms the file was written and the ceph-fuse mount is active:

CephFS is working from Kubernetes!
Sun Mar 22 23:01:46 UTC 2026
cephfs-test-app
---
Filesystem                Size      Used Available Use% Mounted on
ceph-fuse                 1.0G         0      1.0G   0% /data
---
ceph-fuse on /data type fuse.ceph-fuse (rw,relatime,user_id=0,group_id=0,allow_other)

The mount type fuse.ceph-fuse confirms the volume is using the FUSE mounter as configured in the StorageClass. The quota is correctly enforced at 1 GiB.

Step 8: ReadWriteMany – Multiple Pods Sharing One Volume

The real power of CephFS is ReadWriteMany (RWX) access. Multiple pods can mount the same PVC and read/write files simultaneously. This is essential for workloads like web servers sharing content, shared log directories, or collaborative data processing.

Create a second pod that writes to the same PVC. Create a file called cephfs-writer.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: cephfs-writer
spec:
  containers:
  - name: writer
    image: busybox:latest
    command: ['/bin/sh', '-c']
    args:
      - |
        echo "Written by cephfs-writer pod" > /data/from-writer.txt
        date >> /data/from-writer.txt
        echo "Writer done. Files in /data:"
        ls -la /data/
        sleep 3600
    volumeMounts:
    - mountPath: /data
      name: cephfs-vol
  volumes:
  - name: cephfs-vol
    persistentVolumeClaim:
      claimName: cephfs-pvc

Apply the writer pod:

kubectl apply -f cephfs-writer.yaml
kubectl wait --for=condition=Ready pod/cephfs-writer --timeout=60s

Check that the writer pod can see the file created by the first pod, and has written its own file:

kubectl logs cephfs-writer

The writer sees both files – its own and the one from the test app pod:

Writer done. Files in /data:
total 2
drwxrwxrwx    1 root     root             2 Mar 22 23:06 .
drwxr-xr-x    1 root     root          4096 Mar 22 23:06 ..
-rw-r--r--    1 root     root            56 Mar 22 23:06 from-writer.txt
-rw-r--r--    1 root     root            65 Mar 22 23:01 test.txt

Now verify the first pod can read the writer’s file:

kubectl exec cephfs-test-app -- cat /data/from-writer.txt

The first pod reads the file written by the second pod:

Written by cephfs-writer pod
Sun Mar 22 23:06:44 UTC 2026

Both pods share the same CephFS subvolume and see each other’s changes in real time. This is exactly what makes CephFS valuable for Kubernetes workloads that need shared filesystem access.

Step 9: Online Volume Expansion

The StorageClass has allowVolumeExpansion: true, which means you can grow CephFS volumes without downtime. The quota is updated on the Ceph side and the mount reflects the new size immediately.

Check the current size from inside the pod:

kubectl exec cephfs-test-app -- df -h /data

Current volume is 1 GiB:

Filesystem                Size      Used Available Use% Mounted on
ceph-fuse                 1.0G         0      1.0G   0% /data

Patch the PVC to expand from 1 GiB to 2 GiB:

kubectl patch pvc cephfs-pvc -p '{"spec":{"resources":{"requests":{"storage":"2Gi"}}}}'

The PVC is patched immediately:

persistentvolumeclaim/cephfs-pvc patched

Wait a few seconds, then verify the volume now shows 2 GiB:

kubectl exec cephfs-test-app -- df -h /data

The volume reflects the expanded size without any pod restart:

Filesystem                Size      Used Available Use% Mounted on
ceph-fuse                 2.0G         0      2.0G   0% /data

Online expansion works because CephFS uses directory quotas rather than fixed block allocations. The ceph-csi driver simply updates the quota on the subvolume.
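You can confirm the resize on the Ceph side as well: each PVC maps to a subvolume whose bytes_quota field should now reflect 2 GiB. Run this on a Ceph node, substituting your own subvolume name from ceph fs subvolume ls cephfs --group_name csi:

```shell
# Show metadata for the subvolume backing cephfs-pvc; look for bytes_quota.
# The csi-vol-... name corresponds to the PV bound to the PVC.
ceph fs subvolume info cephfs csi-vol-75f312be-77c7-4f7e-9c0d-bf2c7d23551e --group_name csi
```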

Step 10: Volume Snapshots and Restore

CephFS snapshots let you capture a point-in-time copy of a volume and restore from it later. This requires the Kubernetes snapshot CRDs and snapshot controller to be installed.

Install Snapshot CRDs and Controller

If your cluster does not already have the snapshot CRDs, install them. These are cluster-wide resources required once per cluster.

kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotclasses.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshotcontents.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/client/config/crd/snapshot.storage.k8s.io_volumesnapshots.yaml

Install the snapshot controller in the kube-system namespace:

kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/deploy/kubernetes/snapshot-controller/rbac-snapshot-controller.yaml
kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/master/deploy/kubernetes/snapshot-controller/setup-snapshot-controller.yaml
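Before moving on, it is worth confirming the CRDs registered and the controller came up. The grep patterns below are loose matches, since the exact pod name depends on the Deployment:

```shell
# The three snapshot.storage.k8s.io CRDs should be listed.
kubectl get crd | grep snapshot.storage.k8s.io

# The snapshot-controller pod should be Running in kube-system.
kubectl get pods -n kube-system | grep snapshot-controller
```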

Create a VolumeSnapshotClass

The VolumeSnapshotClass defines how snapshots are taken for CephFS volumes. Create a file called cephfs-snapclass.yaml:

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-cephfs-snapclass
driver: cephfs.csi.ceph.com
parameters:
  clusterID: 41709c12-262c-11f1-a563-bc2411261faa
  csi.storage.k8s.io/snapshotter-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/snapshotter-secret-namespace: ceph-csi-cephfs
deletionPolicy: Delete

Apply it:

kubectl apply -f cephfs-snapclass.yaml

Take a Snapshot

First, write a file that we will later delete and recover from the snapshot:

kubectl exec cephfs-test-app -- sh -c 'echo "Important data - do not lose" > /data/critical.txt'

Now create a VolumeSnapshot. Create a file called cephfs-snapshot.yaml:

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: cephfs-snapshot
spec:
  volumeSnapshotClassName: csi-cephfs-snapclass
  source:
    persistentVolumeClaimName: cephfs-pvc

Apply and check the snapshot status:

kubectl apply -f cephfs-snapshot.yaml
kubectl get volumesnapshot cephfs-snapshot

The snapshot should show readyToUse as true:

NAME              READYTOUSE   SOURCEPVC    RESTORESIZE   SNAPSHOTCLASS          AGE
cephfs-snapshot   true         cephfs-pvc   2Gi           csi-cephfs-snapclass   15s

Restore from Snapshot

Simulate data loss by deleting the critical file:

kubectl exec cephfs-test-app -- rm /data/critical.txt
kubectl exec cephfs-test-app -- ls /data/critical.txt

The file is gone:

ls: /data/critical.txt: No such file or directory

Create a new PVC restored from the snapshot. Create a file called cephfs-restored-pvc.yaml:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cephfs-restored-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: csi-cephfs-sc
  resources:
    requests:
      storage: 2Gi
  dataSource:
    name: cephfs-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io

Apply and verify the restored PVC is bound:

kubectl apply -f cephfs-restored-pvc.yaml
kubectl get pvc cephfs-restored-pvc

The restored PVC binds to a new PersistentVolume:

NAME                  STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS    AGE
cephfs-restored-pvc   Bound    pvc-a3b912f4-8820-4e6c-a7d1-c9123fe87d12   2Gi        RWX            csi-cephfs-sc   8s

Mount the restored volume in a recovery pod to confirm the deleted file is there:

kubectl run restore-check --image=busybox:latest --restart=Never --overrides='
{
  "spec": {
    "containers": [{
      "name": "restore-check",
      "image": "busybox:latest",
      "command": ["cat", "/data/critical.txt"],
      "volumeMounts": [{"mountPath": "/data", "name": "restored"}]
    }],
    "volumes": [{
      "name": "restored",
      "persistentVolumeClaim": {"claimName": "cephfs-restored-pvc"}
    }]
  }
}'

Check the output:

kubectl logs restore-check

The deleted file is recovered from the snapshot:

Important data - do not lose

Snapshots are fast because CephFS uses copy-on-write. They consume minimal extra space initially and only grow as the original data diverges from the snapshot point.
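On the Ceph side, the snapshot appears against the backing subvolume under a driver-generated csi-snap-... name. Run this on a Ceph node, substituting your own subvolume name:

```shell
# List snapshots of the subvolume backing cephfs-pvc.
ceph fs subvolume snapshot ls cephfs csi-vol-75f312be-77c7-4f7e-9c0d-bf2c7d23551e --group_name csi
```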

Step 11: Nginx Deployment with Shared CephFS Storage

For a more realistic example, deploy Nginx with multiple replicas sharing a CephFS volume for web content. This demonstrates how RWX storage enables stateful, horizontally scaled web applications.

Create a file called nginx-cephfs.yaml with a PVC and Deployment:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nginx-cephfs-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: csi-cephfs-sc
  resources:
    requests:
      storage: 5Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-cephfs
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-cephfs
  template:
    metadata:
      labels:
        app: nginx-cephfs
    spec:
      containers:
      - name: nginx
        image: nginx:alpine
        ports:
        - containerPort: 80
        volumeMounts:
        - mountPath: /usr/share/nginx/html
          name: web-content
      volumes:
      - name: web-content
        persistentVolumeClaim:
          claimName: nginx-cephfs-pvc

Apply the manifest:

kubectl apply -f nginx-cephfs.yaml

Wait for the pods to be ready:

kubectl get pods -l app=nginx-cephfs

Both replicas should be running and sharing the same CephFS volume:

NAME                            READY   STATUS    RESTARTS   AGE
nginx-cephfs-6d4b9bd465-s8tsg   1/1     Running   0          25s
nginx-cephfs-6d4b9bd465-w6lt5   1/1     Running   0          25s

Write an HTML file from one pod – both replicas will serve the same content because they share the CephFS volume:

POD1=$(kubectl get pods -l app=nginx-cephfs -o jsonpath='{.items[0].metadata.name}')
kubectl exec $POD1 -- sh -c 'echo "Served from CephFS!" > /usr/share/nginx/html/index.html'

Verify both pods serve the content by curling each one:

for POD in $(kubectl get pods -l app=nginx-cephfs -o jsonpath='{.items[*].metadata.name}'); do
  echo "--- $POD ---"
  kubectl exec $POD -- curl -s localhost
done

Both replicas return the same content from the shared CephFS volume:

--- nginx-cephfs-6d4b9bd465-s8tsg ---
Served from CephFS!
--- nginx-cephfs-6d4b9bd465-w6lt5 ---
Served from CephFS!

This pattern works well for any application where multiple pods need access to the same files – content management systems, shared upload directories, machine learning datasets, or log aggregation.
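To put a single endpoint in front of the replicas, add a ClusterIP Service (not part of the manifest above; a minimal sketch):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-cephfs
spec:
  selector:
    app: nginx-cephfs
  ports:
  - port: 80
    targetPort: 80
```

With this applied, kubectl port-forward svc/nginx-cephfs 8080:80 lets you hit the replicas from your workstation; requests are balanced across both pods, which serve identical content from the shared volume.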

Verify CephFS Subvolumes on the Ceph Side

You can verify what ceph-csi has created by listing subvolumes on the Ceph cluster. Run this on any Ceph node:

ceph fs subvolume ls cephfs --group_name csi

This lists all subvolumes managed by the ceph-csi driver. You should see one subvolume per bound PVC:

[
    {
        "name": "csi-vol-75f312be-77c7-4f7e-9c0d-bf2c7d23551e"
    },
    {
        "name": "csi-vol-a3b912f4-8820-4e6c-a7d1-c9123fe87d12"
    },
    {
        "name": "csi-vol-e9f4a821-3c1d-4b8f-a912-5d7c8e6f1234"
    }
]

Each subvolume name maps to a PersistentVolume in Kubernetes. The ceph-csi driver handles the full lifecycle – creation, mounting, expansion, snapshotting, and deletion – all driven by standard Kubernetes APIs.
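To map a specific PersistentVolume to its subvolume, inspect the PV’s CSI volumeHandle; its trailing UUID matches the csi-vol-<uuid> name on the Ceph side. A sketch using the PV name from Step 6:

```shell
# Print the CSI volume handle; the final UUID segment corresponds to the
# csi-vol-<uuid> subvolume listed by `ceph fs subvolume ls`.
kubectl get pv pvc-75f312be-77c7-4f7e-9c0d-bf2c7d23551e \
  -o jsonpath='{.spec.csi.volumeHandle}'
```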

Firewall and Network Requirements

For ceph-csi to work, every Kubernetes node must be able to reach the Ceph monitors. Open these ports on any firewalls between K8s nodes and Ceph nodes:

Port        Protocol   Purpose
6789        TCP        Ceph Monitor (v1 messenger)
3300        TCP        Ceph Monitor (v2 messenger)
6800-7300   TCP        Ceph OSD data transfer

If you use firewalld on the Kubernetes nodes, open the required ports:

sudo firewall-cmd --permanent --add-port=6789/tcp --add-port=3300/tcp --add-port=6800-7300/tcp
sudo firewall-cmd --reload

On the Ceph side, ensure the monitor and OSD ports are accessible from the Kubernetes node subnet. If both clusters are on the same network (like our 192.168.1.0/24 setup), no additional firewall changes may be needed.
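A quick way to verify reachability from a Kubernetes node without installing extra tools is bash’s /dev/tcp pseudo-device. A sketch using the monitor IPs from this guide:

```shell
# Probe each monitor on both messenger ports; records open/closed per pair.
RESULTS=$(
  for mon in 192.168.1.33 192.168.1.31 192.168.1.32; do
    for port in 3300 6789; do
      if timeout 2 bash -c "exec 3<>/dev/tcp/$mon/$port" 2>/dev/null; then
        echo "$mon:$port open"
      else
        echo "$mon:$port closed"
      fi
    done
  done
)
echo "$RESULTS"
```

Any port reported closed points at a firewall or routing problem to fix before the CSI driver will work.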

Conclusion

CephFS with ceph-csi gives Kubernetes workloads a production-grade shared filesystem. You now have dynamic provisioning, ReadWriteMany access across pods, online volume expansion, and point-in-time snapshots with restore – all managed through standard Kubernetes resources.

For production deployments, create a dedicated CephFS client user with minimal capabilities instead of using the admin key, set the reclaim policy to Retain for critical data, schedule regular snapshot backups, and monitor subvolume usage through Ceph’s built-in metrics. If your Kubernetes nodes have ceph-common installed, switch from fuse to kernel mounter for better performance.
