etcd Backup and Restore for Kubernetes Disaster Recovery

Your etcd cluster just lost quorum at 3am. The Kubernetes API server won’t respond, kubectl hangs, and your on-call engineer is staring at a terminal with no backup to restore from. Every pod definition, every secret, every configmap, every RBAC policy exists only in etcd. Without a snapshot, you’re rebuilding the entire cluster from scratch.

This guide covers the complete etcd backup and restore workflow for Kubernetes disaster recovery. We’ll take snapshots, verify them, automate the process with systemd timers, store backups off-site, and walk through a full multi-node restore on a 3-member stacked etcd cluster. If you run a highly available Kubernetes cluster, this is not optional reading.

Current as of April 2026. Verified on Ubuntu 24.04.4 LTS with Kubernetes 1.35.3, etcd 3.6.10, containerd 2.2.2

How etcd Stores Kubernetes State

etcd is a distributed key-value store that holds the entire state of your Kubernetes cluster. Every object you create with kubectl (pods, deployments, services, secrets, configmaps, namespaces, RBAC rules) gets serialized and stored as a key in etcd. The API server is essentially a CRUD interface sitting in front of etcd.

etcd uses Multi-Version Concurrency Control (MVCC), which means it keeps a history of every change as numbered revisions. A cluster with 4,820 revisions has had that many write operations since it was created. This revision history is what makes etcd’s database grow over time and why periodic defragmentation matters. The official Kubernetes etcd administration docs recommend regular backups as part of any production setup.

Lose etcd, lose everything. That’s the short version.

Prerequisites

This guide assumes:

  • A running Kubernetes HA cluster with stacked etcd (3 control plane nodes). If you don’t have one yet, follow our kubeadm HA cluster setup guide
  • Tested on: Kubernetes v1.35.3, etcd 3.6.10, Ubuntu 24.04.4 LTS (kernel 6.8.0-101-generic)
  • etcdctl and etcdutl installed on all control plane nodes
  • Root or sudo access on all control plane nodes
  • Familiarity with kubectl basics

The etcdctl and etcdutl binaries ship with etcd but aren’t always on the host PATH when etcd runs inside a static pod. Install them on each control plane node:

ETCD_VER=v3.6.10 #https://github.com/etcd-io/etcd/releases
curl -sL https://github.com/etcd-io/etcd/releases/download/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz | sudo tar xz -C /usr/local/bin --strip-components=1 etcd-${ETCD_VER}-linux-amd64/etcdctl etcd-${ETCD_VER}-linux-amd64/etcdutl

Confirm both tools are available:

etcdctl version
etcdutl version

Both tools should report a version:

etcdctl version: 3.6.10
API version: 3.6

etcdutl version: 3.6.10
API version: 3.6

Throughout this guide, we use a 3-node control plane cluster with these IPs:

  • cp1: 10.0.1.10
  • cp2: 10.0.1.11
  • cp3: 10.0.1.12

Set an alias to avoid typing certificate paths repeatedly. The certs live in /etc/kubernetes/pki/etcd/ on kubeadm clusters:

export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key

Add these to your shell profile on each control plane node so they persist across sessions. Keep in mind that plain sudo resets the environment by default, so these exported variables don't survive a bare sudo etcdctl; run the commands from a root shell, or use sudo -E to preserve them.
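If you'd rather not type sudo -E everywhere, a small wrapper function keeps the TLS settings intact across the sudo boundary. This is a sketch; the etcds name is our invention, and --preserve-env with a variable list needs a reasonably recent sudo (shipped on Ubuntu 24.04):

```shell
# Hypothetical wrapper: plain `sudo` drops exported variables (env_reset),
# so pass the ETCDCTL_* settings through explicitly.
etcds() {
  sudo --preserve-env=ETCDCTL_API,ETCDCTL_ENDPOINTS,ETCDCTL_CACERT,ETCDCTL_CERT,ETCDCTL_KEY \
    etcdctl "$@"
}
```

After defining it (for example in the same profile file), `etcds endpoint health` behaves like the sudo etcdctl commands in this guide without losing the environment.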

Take an etcd Snapshot

An etcd snapshot captures the entire key-value store at a point in time. Run this on any single control plane node (etcd replicates data across all members, so one snapshot contains everything):

sudo etcdctl snapshot save /tmp/etcd-snapshot-$(date +%Y%m%d-%H%M%S).db

The save operation completes in under a second on most clusters:

Snapshot saved at /tmp/etcd-snapshot-20260403-091542.db

Now verify the snapshot’s integrity with etcdutl. Note that snapshot verification uses etcdutl, not etcdctl, because it operates on the file directly rather than through the etcd API:

sudo etcdutl snapshot status /tmp/etcd-snapshot-20260403-091542.db --write-out=table

The output shows the snapshot metadata:

+---------+----------+------------+------------+------------------+
|  HASH   | REVISION | TOTAL KEYS | TOTAL SIZE | STORAGE VERSION  |
+---------+----------+------------+------------+------------------+
| 8356a85e|     4820 |        795 |    11 MB   |      3.6.0       |
+---------+----------+------------+------------+------------------+

That’s 795 keys across 4,820 revisions in an 11 MB snapshot. The hash value (8356a85e) is what you’ll use to verify the snapshot hasn’t been corrupted during transfer to off-site storage.
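The etcdutl hash covers the snapshot's contents, but a plain checksum file travels better alongside the backup and can be verified without etcd tooling. A minimal sketch (the function names are ours; the verify step assumes the copy sits at the same path the checksum was recorded for):

```shell
# Record a .sha256 file next to the snapshot; verify a downloaded copy later.
snapshot_checksum() { sha256sum "$1" > "$1.sha256"; }
snapshot_verify()   { sha256sum -c "$1.sha256"; }
```

Call snapshot_checksum right after the save, ship both files to object storage, and run snapshot_verify on anything you pull back down.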

Verify Cluster Health Before Backup

Before you trust a snapshot, confirm that the etcd cluster is healthy. A snapshot from an unhealthy member might contain inconsistent data.

sudo etcdctl endpoint health --cluster --write-out=table

All three members should report healthy:

+---------------------------+--------+------------+-------+
|         ENDPOINT          | HEALTH |    TOOK    | ERROR |
+---------------------------+--------+------------+-------+
| https://10.0.1.10:2379    |   true |  8.4211ms  |       |
| https://10.0.1.11:2379    |   true |  9.1035ms  |       |
| https://10.0.1.12:2379    |   true |  8.7628ms  |       |
+---------------------------+--------+------------+-------+

Check the detailed endpoint status to see database size, leader election, and raft index:

sudo etcdctl endpoint status --cluster --write-out=table

This shows the current state of each member:

+---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|         ENDPOINT          |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| https://10.0.1.10:2379    | 4e205a1847c6c633 |  3.6.10 |   11 MB |      true |      false |         4 |       5091 |               5091 |        |
| https://10.0.1.11:2379    | 7a3c98f2b10d4a1e |  3.6.10 |   11 MB |     false |      false |         4 |       5091 |               5091 |        |
| https://10.0.1.12:2379    | 2d9f6c41e8a35b72 |  3.6.10 |   11 MB |     false |      false |         4 |       5091 |               5091 |        |
+---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

All members show the same raft index (5091), which means they’re fully synchronized. The leader is 10.0.1.10 in this cluster. If any member shows a significantly lower raft index or reports errors, investigate before taking a backup.
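That eyeball check can be scripted for use in a backup pipeline. A sketch that parses the JSON output of etcdctl endpoint status (the Status/raftIndex field names match what the 3.5/3.6 line emits; the helper name is ours):

```shell
# Exit non-zero when members report different raft indexes.
check_raft_sync() {
  etcdctl endpoint status --cluster --write-out=json | python3 -c '
import json, sys
statuses = json.load(sys.stdin)
indexes = {s["Status"]["raftIndex"] for s in statuses}
sys.exit(0 if len(indexes) == 1 else 1)
'
}
```

Run it (with the ETCDCTL_* variables set) before taking a snapshot and abort the backup if it fails.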

Defragment etcd

etcd’s MVCC engine keeps old revisions until compaction removes them, but compaction only marks space as free. It doesn’t return that space to the filesystem. Defragmentation reclaims it. On a production cluster running for weeks, the database file can be significantly larger than the actual live data. This matters because etcd has a default space quota of 2 GB (8 GB max), and exceeding it makes the cluster read-only.

Run defrag across all members:

sudo etcdctl defrag --cluster

Each member is defragmented sequentially:

Finished defragmenting etcd member[https://10.0.1.10:2379] (125ms)
Finished defragmenting etcd member[https://10.0.1.11:2379] (119ms)
Finished defragmenting etcd member[https://10.0.1.12:2379] (122ms)

On our test cluster the defrag completed in roughly 125ms per member. A heavily used production cluster with a larger database will take longer. The operation briefly blocks writes on each member as it processes, so schedule it during low-traffic windows.
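If even a brief simultaneous stall across members is a concern, you can defragment one member at a time instead of using --cluster, so only one member blocks writes at any moment. A sketch (the endpoint list matches this guide's example cluster; the function name is ours):

```shell
# Defrag members sequentially, pausing briefly between them.
defrag_each() {
  local ep
  for ep in "$@"; do
    etcdctl --endpoints="${ep}" defrag || return 1
    sleep 2  # let the member settle before moving on
  done
}
# defrag_each https://10.0.1.10:2379 https://10.0.1.11:2379 https://10.0.1.12:2379
```

A common refinement is to defragment the followers first and the leader last, to keep write disruption on the leader as short as possible.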

Store Backups Off-Site

A snapshot sitting on the same server as etcd isn’t a real backup. If the node’s disk fails, you lose both. Push snapshots to object storage immediately after creation.

AWS S3

Upload the snapshot to an S3 bucket with server-side encryption:

aws s3 cp /tmp/etcd-snapshot-20260403-091542.db \
  s3://my-k8s-backups/etcd/etcd-snapshot-20260403-091542.db \
  --sse AES256

Verify the backup by downloading it to a different location and checking the hash:

aws s3 cp s3://my-k8s-backups/etcd/etcd-snapshot-20260403-091542.db /tmp/etcd-verify.db
sudo etcdutl snapshot status /tmp/etcd-verify.db --write-out=table

The hash in the output must match 8356a85e from the original snapshot. If it differs, the file was corrupted during transfer.

Google Cloud Storage

For GCP environments, use gsutil:

gsutil cp /tmp/etcd-snapshot-20260403-091542.db \
  gs://my-k8s-backups/etcd/etcd-snapshot-20260403-091542.db

Same verification applies. Download, check the hash, confirm it matches. Trust but verify.

Automate Backups with systemd

Manual backups don’t happen. The one time you forget is the time you’ll need a restore. Use a systemd timer to take snapshots every 6 hours and push them to object storage.

Create the backup script:

sudo vi /usr/local/bin/etcd-backup.sh

Add the following:

#!/bin/bash
set -euo pipefail

BACKUP_DIR="/var/backups/etcd"
TIMESTAMP=$(date +%Y%m%d-%H%M%S)
SNAPSHOT="${BACKUP_DIR}/etcd-snapshot-${TIMESTAMP}.db"
S3_BUCKET="s3://my-k8s-backups/etcd"
RETENTION_DAYS=7

export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key

mkdir -p "${BACKUP_DIR}"

# Take snapshot
etcdctl snapshot save "${SNAPSHOT}"

# Verify snapshot integrity
etcdutl snapshot status "${SNAPSHOT}" --write-out=json | python3 -c "
import sys, json
data = json.load(sys.stdin)
print(f'Snapshot OK: {data[\"totalKey\"]} keys, revision {data[\"revision\"]}')
"

# Upload to S3
aws s3 cp "${SNAPSHOT}" "${S3_BUCKET}/$(basename "${SNAPSHOT}")" --sse AES256

# Clean up old local snapshots
find "${BACKUP_DIR}" -name "etcd-snapshot-*.db" -mtime +${RETENTION_DAYS} -delete

echo "Backup complete: ${SNAPSHOT}"

Make it executable:

sudo chmod +x /usr/local/bin/etcd-backup.sh

Create the systemd service unit:

sudo vi /etc/systemd/system/etcd-backup.service

Add this configuration:

[Unit]
Description=etcd snapshot backup
After=network.target

[Service]
Type=oneshot
ExecStart=/usr/local/bin/etcd-backup.sh
User=root
StandardOutput=journal
StandardError=journal

Create the timer to trigger every 6 hours:

sudo vi /etc/systemd/system/etcd-backup.timer

Timer configuration:

[Unit]
Description=Run etcd backup every 6 hours

[Timer]
OnBootSec=15min
OnUnitActiveSec=6h
RandomizedDelaySec=5min
Persistent=true

[Install]
WantedBy=timers.target

Enable and start the timer:

sudo systemctl daemon-reload
sudo systemctl enable --now etcd-backup.timer

Verify the timer is active and check when the next run is scheduled:

sudo systemctl list-timers etcd-backup.timer

The output confirms the schedule:

NEXT                         LEFT          LAST   PASSED   UNIT               ACTIVATES
Thu 2026-04-03 15:18:42 UTC  5h 47min left  n/a    n/a     etcd-backup.timer  etcd-backup.service

Test a manual run to confirm everything works end to end:

sudo systemctl start etcd-backup.service
sudo journalctl -u etcd-backup.service --no-pager -l

You should see the snapshot creation, verification, and S3 upload complete without errors.

Full Disaster Recovery Restore

This is the part nobody wants to practice, which is exactly why you should. A full etcd restore means stopping etcd on every control plane node, wiping the existing data directory, restoring from snapshot, and restarting the cluster. On a 3-node stacked etcd cluster, every node must be restored with matching --initial-cluster configuration or the members won’t find each other.

The etcd recovery documentation covers the theory. Here’s the exact procedure we tested.

Step 1: Copy the Snapshot to All Control Plane Nodes

Place the snapshot file on all three nodes. From wherever the backup is stored:

for node in 10.0.1.10 10.0.1.11 10.0.1.12; do
  scp /tmp/etcd-snapshot-20260403-091542.db root@${node}:/tmp/etcd-restore.db
done

Step 2: Stop kube-apiserver and etcd on All Nodes

On kubeadm clusters, both kube-apiserver and etcd run as static pods managed by the kubelet. Moving their manifests out of /etc/kubernetes/manifests/ stops them immediately. Run this on each control plane node:

sudo mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/
sudo mv /etc/kubernetes/manifests/etcd.yaml /tmp/

Wait a few seconds, then confirm both containers have stopped:

sudo crictl ps | grep -E "etcd|kube-apiserver"

No output means both are down. If containers are still listed, wait 10 more seconds and check again.

Step 3: Back Up and Clean the Data Directory

On each node, move the existing etcd data aside (never delete it until the restore is verified):

sudo mv /var/lib/etcd/member /var/lib/etcd/member.bak.$(date +%Y%m%d)

Step 4: Restore the Snapshot on Each Node

This is where most people get it wrong. Each node must run etcdutl snapshot restore with its own identity and the complete initial cluster configuration. The --name and --initial-advertise-peer-urls must match that specific node, while --initial-cluster is the same on all three.

On cp1 (10.0.1.10):

sudo etcdutl snapshot restore /tmp/etcd-restore.db \
  --name cp1 \
  --data-dir /var/lib/etcd \
  --initial-cluster cp1=https://10.0.1.10:2380,cp2=https://10.0.1.11:2380,cp3=https://10.0.1.12:2380 \
  --initial-cluster-token etcd-cluster-1 \
  --initial-advertise-peer-urls https://10.0.1.10:2380

On cp2 (10.0.1.11):

sudo etcdutl snapshot restore /tmp/etcd-restore.db \
  --name cp2 \
  --data-dir /var/lib/etcd \
  --initial-cluster cp1=https://10.0.1.10:2380,cp2=https://10.0.1.11:2380,cp3=https://10.0.1.12:2380 \
  --initial-cluster-token etcd-cluster-1 \
  --initial-advertise-peer-urls https://10.0.1.11:2380

On cp3 (10.0.1.12):

sudo etcdutl snapshot restore /tmp/etcd-restore.db \
  --name cp3 \
  --data-dir /var/lib/etcd \
  --initial-cluster cp1=https://10.0.1.10:2380,cp2=https://10.0.1.11:2380,cp3=https://10.0.1.12:2380 \
  --initial-cluster-token etcd-cluster-1 \
  --initial-advertise-peer-urls https://10.0.1.12:2380

Each restore should output a confirmation message with the member name and data directory. The --initial-cluster-token value must be different from the original cluster’s token to prevent the restored nodes from accidentally joining the old cluster.
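The three commands above differ only in --name and the advertised peer URL, so during a real incident it can help to drive them from one admin host. A sketch assuming root SSH access to each node (node names and IPs are this guide's examples):

```shell
# Run the per-node restore remotely; the cluster string is identical everywhere.
restore_member() {
  local name=$1 ip=$2
  local cluster="cp1=https://10.0.1.10:2380,cp2=https://10.0.1.11:2380,cp3=https://10.0.1.12:2380"
  ssh "root@${ip}" etcdutl snapshot restore /tmp/etcd-restore.db \
    --name "${name}" \
    --data-dir /var/lib/etcd \
    --initial-cluster "${cluster}" \
    --initial-cluster-token etcd-cluster-1 \
    --initial-advertise-peer-urls "https://${ip}:2380"
}
# restore_member cp1 10.0.1.10
# restore_member cp2 10.0.1.11
# restore_member cp3 10.0.1.12
```

Keeping the cluster string in one place removes the most common restore mistake: a typo in one node's --initial-cluster that leaves the members unable to find each other.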

Fix ownership on the restored data directory (etcd runs as root in kubeadm static pods, but verify your setup):

sudo chown -R root:root /var/lib/etcd

Step 5: Update etcd Configuration

If the etcd static pod manifest sets --initial-cluster-token, update it to match the token you used during restore before moving the manifest back. Kubeadm-generated manifests often omit this flag; the restored data directory already carries the token, so skip this step if the flag isn't present:

sudo sed -i 's/--initial-cluster-token=.*/--initial-cluster-token=etcd-cluster-1/' /tmp/etcd.yaml

Step 6: Restore Manifests and Start Services

Move the manifests back on each node. Start with etcd, wait for it to form quorum, then restore the API server:

sudo mv /tmp/etcd.yaml /etc/kubernetes/manifests/etcd.yaml

Wait 30 seconds for etcd to start and elect a leader. Then restore the API server:

sudo mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/kube-apiserver.yaml

Check that both containers are running:

sudo crictl ps | grep -E "etcd|kube-apiserver"

Both should appear with STATUS “Running” and an age of less than a minute.

Verify Restored Data

This is the moment of truth. The cluster is back, but does it actually have your data?

Check that all namespaces are present:

kubectl get ns

Both production and staging namespaces should be back:

NAME              STATUS   AGE
default           Active   47d
kube-node-lease   Active   47d
kube-public       Active   47d
kube-system       Active   47d
production        Active   47d
staging           Active   47d

List the workloads in the production namespace:

kubectl get deploy,svc,configmap,secret -n production

All resources should match what existed at snapshot time:

NAME                        READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/web-app     3/3     3            3           47d
deployment.apps/api-server  2/2     2            2           47d
deployment.apps/worker      2/2     2            2           47d

NAME                  TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/web-app       ClusterIP   10.96.142.18    <none>        80/TCP     47d
service/api-server    ClusterIP   10.96.88.205    <none>        8080/TCP   47d

NAME                          DATA   AGE
configmap/app-config          4      47d
configmap/kube-root-ca.crt    1      47d

NAME                          TYPE     DATA   AGE
secret/db-credentials         Opaque   3      47d
secret/api-keys               Opaque   2      47d

Verify that ConfigMap data is intact, not just that the object exists:

kubectl get configmap app-config -n production -o yaml

The actual key-value data should be present, not empty. Same for Secrets (they’ll be base64-encoded but present):

kubectl get secret db-credentials -n production -o jsonpath='{.data}' | python3 -m json.tool

If the Secret data is there, the restore is complete. Every key that was in etcd at snapshot time is back.

Confirm etcd cluster health post-restore:

sudo etcdctl endpoint health --cluster --write-out=table

All three members healthy:

+---------------------------+--------+------------+-------+
|         ENDPOINT          | HEALTH |    TOOK    | ERROR |
+---------------------------+--------+------------+-------+
| https://10.0.1.10:2379    |   true |  7.9104ms  |       |
| https://10.0.1.11:2379    |   true |  8.3521ms  |       |
| https://10.0.1.12:2379    |   true |  8.1089ms  |       |
+---------------------------+--------+------------+-------+

Clean up the old data directory backup once you’re satisfied:

sudo rm -rf /var/lib/etcd/member.bak.*

Velero as an Alternative Approach

etcd snapshots and Velero solve different problems, and most production environments benefit from running both.

etcd snapshots capture the entire cluster state at the storage layer. One snapshot, one restore, everything comes back: namespaces, RBAC, CRDs, network policies, the lot. The downside is that it’s all or nothing. You can’t restore a single namespace from an etcd snapshot without restoring the whole cluster.

Velero operates at the Kubernetes API level. It backs up individual namespaces, specific resource types, or label-selected objects. Need to restore just the production namespace after someone ran kubectl delete ns production? Velero handles that without touching etcd or disrupting other workloads. It also captures persistent volume snapshots through CSI, which etcd snapshots don’t include (etcd stores the PV/PVC objects but not the actual data on the volumes).

Use etcd snapshots for full cluster disaster recovery. Use Velero for namespace-level backup, migration between clusters, and PV data protection. Running both gives you coverage at two different layers.

Backup Strategy for Production

Having backups is step one. Having a strategy that actually protects you is the real work.

Frequency: Take snapshots every 2 to 6 hours depending on your cluster’s change rate. A cluster with frequent deployments and config changes needs more frequent snapshots. The systemd timer above runs every 6 hours, which is a reasonable default. For clusters processing hundreds of deployments daily, drop that to every 2 hours.

Retention: Keep 7 days of snapshots on local disk and 30 days in object storage. S3 lifecycle policies can automatically expire old snapshots. Set up versioning on your backup bucket as a safety net against accidental deletion.
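The 30-day object-storage retention can be enforced by S3 itself rather than a cleanup script. A lifecycle policy sketch for the example bucket used earlier (apply it with aws s3api put-bucket-lifecycle-configuration; bucket and prefix are this guide's examples):

```shell
# Expire objects under the etcd/ prefix after 30 days.
cat <<'JSON' > /tmp/etcd-backup-lifecycle.json
{
  "Rules": [
    {
      "ID": "expire-etcd-snapshots",
      "Filter": { "Prefix": "etcd/" },
      "Status": "Enabled",
      "Expiration": { "Days": 30 }
    }
  ]
}
JSON
# aws s3api put-bucket-lifecycle-configuration --bucket my-k8s-backups \
#   --lifecycle-configuration file:///tmp/etcd-backup-lifecycle.json
```

Pair this with bucket versioning so a lifecycle mistake or accidental delete is recoverable.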

Test your restores. Quarterly at minimum. A backup you’ve never tested is just a file that makes you feel safe. Spin up a separate set of VMs, restore the snapshot, and verify data integrity. Document the time it takes so you have realistic RTO expectations. Our test restore on a 3-node cluster with an 11 MB snapshot took under 5 minutes end to end.

Monitor backup freshness. If your backup job silently fails for a week, you won’t know until you need the backup. Export a metric from your backup script (last successful backup timestamp, snapshot size, key count) and alert on it. If you’re running Prometheus and Grafana on your cluster, create an alert rule that fires when the last backup is older than 12 hours.

A simple Prometheus metric from the backup script:

cat <<PROM > /var/lib/node_exporter/etcd_backup.prom
etcd_backup_last_success_timestamp $(date +%s)
etcd_backup_snapshot_size_bytes $(stat -c%s "${SNAPSHOT}")
etcd_backup_total_keys $(etcdutl snapshot status "${SNAPSHOT}" --write-out=json | python3 -c "import sys,json; print(json.load(sys.stdin)['totalKey'])")
PROM

Add those lines to the end of your backup script and point node_exporter at the directory with --collector.textfile.directory=/var/lib/node_exporter (or whatever path your setup uses); Prometheus then picks the metrics up through the textfile collector.
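With that metric exported, the 12-hour freshness alert can be expressed as a Prometheus rule. A sketch written to /tmp here; install it wherever your Prometheus instance loads rule files (the group and alert names are ours):

```shell
# Fire when the last successful backup is older than 12 hours (43200 s).
cat <<'YAML' > /tmp/etcd-backup-alerts.yml
groups:
  - name: etcd-backup
    rules:
      - alert: EtcdBackupStale
        expr: time() - etcd_backup_last_success_timestamp > 43200
        for: 15m
        labels:
          severity: critical
        annotations:
          summary: "No successful etcd backup in the last 12 hours"
YAML
```

The `for: 15m` clause keeps a brief scrape gap from paging anyone; only a sustained stale reading fires the alert.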

Migrating etcd Backup Practices Between Cluster Versions

If you’re upgrading from an older Kubernetes version, there are a few changes to be aware of regarding etcd backup tooling.

In etcd 3.5 and earlier, snapshot operations used etcdctl snapshot save and etcdctl snapshot restore. Starting with etcd 3.6 (which ships with Kubernetes 1.33+), the restore operation moved to etcdutl snapshot restore. The save operation still works with etcdctl, but restore is now etcdutl only. If you have existing backup scripts that use etcdctl snapshot restore, update them when you upgrade to Kubernetes 1.33 or later.

Snapshot format is forward-compatible. A snapshot taken with etcd 3.5 can be restored on etcd 3.6. However, you cannot take a 3.6 snapshot and restore it on a 3.5 cluster because of the storage version differences. Plan your backup retention accordingly during upgrades: keep pre-upgrade snapshots until you’re sure the upgrade is stable.

The --initial-cluster-token flag behavior also changed slightly in 3.6. It’s now mandatory during restore to prevent accidental cross-cluster communication. Older versions were more lenient about omitting it. Always include it explicitly.

For clusters using external etcd (not stacked with kubeadm), the restore procedure differs because the etcd members aren’t managed by the kubelet. You’ll stop the etcd systemd service directly instead of moving static pod manifests. The snapshot restore command and flags remain identical.

If you’re running lightweight distributions like K3s, note that K3s uses an embedded SQLite or Dqlite database by default, not etcd. K3s does support external etcd as a datastore, in which case the procedures in this guide apply. For the embedded database, K3s has its own k3s etcd-snapshot command.

Network policies, Cilium CNI configuration, and custom resource definitions are all stored in etcd and will be restored with the snapshot. However, any state that lives outside etcd (persistent volume data, container images in registries, external DNS records) will not be part of the restore. Make sure your disaster recovery plan covers those layers separately.
