
RKE2 High Availability: 3-Node Production Cluster on Rocky Linux 10

A single RKE2 server node works fine for labs and development, but production workloads need something more resilient. If that one node goes down, your entire cluster goes with it. The fix is straightforward: run three server nodes behind a load balancer so etcd maintains quorum and the Kubernetes API stays reachable even when a node fails.


This guide walks through building a 3-node RKE2 high availability cluster on Rocky Linux 10 with an external Nginx load balancer. Each server node runs the full control plane (API server, etcd, scheduler, controller manager), and the load balancer distributes traffic across all three. This is the architecture Rancher recommends for production RKE2 deployments. If you are new to RKE2, start with our single-node RKE2 guide on Rocky Linux first to understand the basics before scaling to HA.

Verified working: March 2026 on Rocky Linux 10.1 (kernel 6.12), RKE2 v1.35.3+rke2r1, 3-node etcd cluster, SELinux enforcing

What You Need

This setup requires four servers. Three Rocky Linux 10 nodes form the RKE2 cluster, and one Ubuntu node runs the Nginx load balancer.

Hostname    IP Address   OS                Role                            Specs
rke2-lb     10.0.1.10    Ubuntu 24.04      Load balancer                   1 CPU, 1 GB RAM
rke2-ha-1   10.0.1.11    Rocky Linux 10.1  Server (control plane + etcd)   2 CPU, 4 GB RAM
rke2-ha-2   10.0.1.12    Rocky Linux 10.1  Server (control plane + etcd)   2 CPU, 4 GB RAM
rke2-ha-3   10.0.1.13    Rocky Linux 10.1  Server (control plane + etcd)   2 CPU, 4 GB RAM

Network requirements:

  • All four nodes can reach each other on the private network
  • Port 6443/tcp (Kubernetes API) open between all nodes and the LB
  • Port 9345/tcp (RKE2 node registration) open between server nodes and the LB
  • Ports 2379-2380/tcp (etcd) open between the three server nodes
  • Port 8472/udp (Canal/VXLAN) open between all server nodes
  • Root or sudo access on all servers

Architecture Overview

The Nginx load balancer sits in front of the three RKE2 server nodes and handles two types of traffic. Kubernetes API requests on port 6443 get distributed across all three API servers, and RKE2 registration requests on port 9345 reach whichever node is available. This is a Layer 4 (TCP) load balancer, not HTTP, because the Kubernetes API uses its own TLS.

Each server node runs the complete control plane stack: kube-apiserver, etcd, kube-scheduler, and kube-controller-manager. The three etcd instances form a cluster that tolerates one node failure while maintaining quorum. If any single server goes down, the remaining two continue serving the API and scheduling workloads.

Set Up the Load Balancer

The load balancer node runs Ubuntu 24.04 with Nginx configured for TCP stream proxying. This is not a typical HTTP reverse proxy setup. You need the ngx_stream_module to proxy raw TCP connections.

Install Nginx and the stream module:

sudo apt update
sudo apt install -y nginx libnginx-mod-stream

Verify the stream module is available:

nginx -V 2>&1 | grep stream

You should see --with-stream (on Ubuntu builds, typically --with-stream=dynamic) in the configure arguments. Now create the stream configuration. Open the main Nginx config:

sudo vi /etc/nginx/nginx.conf

On Ubuntu, the libnginx-mod-stream package normally drops a load_module line into /etc/nginx/modules-enabled/, which the stock nginx.conf includes near the top. If that include is present, skip the manual load_module directive (loading the module twice is a configuration error). Otherwise, place it at the very top of the file, before the events block. Either way, add a stream block at the bottom, outside the http block:

load_module /usr/lib/nginx/modules/ngx_stream_module.so;

# ... existing events and http blocks stay as they are ...

stream {
    upstream rke2_api {
        least_conn;
        server 10.0.1.11:6443 max_fails=3 fail_timeout=5s;
        server 10.0.1.12:6443 max_fails=3 fail_timeout=5s;
        server 10.0.1.13:6443 max_fails=3 fail_timeout=5s;
    }

    upstream rke2_register {
        least_conn;
        server 10.0.1.11:9345 max_fails=3 fail_timeout=5s;
        server 10.0.1.12:9345 max_fails=3 fail_timeout=5s;
        server 10.0.1.13:9345 max_fails=3 fail_timeout=5s;
    }

    server {
        listen 6443;
        proxy_pass rke2_api;
    }

    server {
        listen 9345;
        proxy_pass rke2_register;
    }
}

Test the configuration and start Nginx:

sudo nginx -t
sudo systemctl enable --now nginx

Confirm that Nginx is listening on both ports:

ss -tlnp | grep -E '6443|9345'

The output confirms both ports are bound:

LISTEN 0      511          0.0.0.0:6443       0.0.0.0:*    users:(("nginx",pid=1234,fd=8))
LISTEN 0      511          0.0.0.0:9345       0.0.0.0:*    users:(("nginx",pid=1234,fd=9))

The load balancer is ready. It will start proxying once the RKE2 nodes come up.

Install RKE2 on All Server Nodes

Run these steps on all three Rocky Linux nodes (rke2-ha-1, rke2-ha-2, rke2-ha-3). The RKE2 installer is the same across all nodes.

Set the hostname on each node (adjust the name accordingly):

sudo hostnamectl set-hostname rke2-ha-1

Add host entries so the nodes can resolve each other by name. Open /etc/hosts:

sudo vi /etc/hosts

Add these lines on every node:

10.0.1.10  rke2-lb
10.0.1.11  rke2-ha-1
10.0.1.12  rke2-ha-2
10.0.1.13  rke2-ha-3

Open the required firewall ports on each Rocky node:

sudo firewall-cmd --permanent --add-port=6443/tcp
sudo firewall-cmd --permanent --add-port=9345/tcp
sudo firewall-cmd --permanent --add-port=2379-2380/tcp
sudo firewall-cmd --permanent --add-port=8472/udp
sudo firewall-cmd --permanent --add-port=10250/tcp
sudo firewall-cmd --reload

Ports 10251 and 10252 appear in many older guides as the kube-scheduler and kube-controller-manager ports, but those insecure ports were removed from Kubernetes and are not needed here.

Install RKE2 using the official installer script:

curl -sfL https://get.rke2.io | sudo sh -

The installer downloads and places the RKE2 binary along with systemd unit files. It does not start the service yet.

Kernel Reboot (Rocky Linux 10 Gotcha)

This catches most people off guard. Rocky Linux 10 minimal and cloud images ship with a kernel that lacks the br_netfilter module needed by the Canal CNI. The RKE2 installer pulls in a newer kernel as a dependency, but the system is still running the old one. Without a reboot, Canal pods will crash-loop with netfilter errors.

Reboot each node after the install completes:

sudo reboot

After the node comes back, confirm the new kernel is active:

uname -r

You should see the updated kernel version:

6.12.0-124.40.1.el10_1.x86_64

Verify br_netfilter loads correctly:

sudo modprobe br_netfilter
lsmod | grep br_netfilter

The module should appear in the output:

br_netfilter           32768  0
bridge                421888  1 br_netfilter

Complete the install and reboot on all three nodes before proceeding. Do not start the RKE2 service until every node is running the new kernel.

Initialize the First Server Node

The first node bootstraps etcd and generates the cluster token that the other nodes use to join. SSH into rke2-ha-1 and create the RKE2 configuration directory:

sudo mkdir -p /etc/rancher/rke2

Create the configuration file:

sudo vi /etc/rancher/rke2/config.yaml

Add the following. The tls-san entries ensure the generated TLS certificates are valid for the load balancer IP and all node hostnames:

tls-san:
  - 10.0.1.10
  - rke2-lb
  - rke2-ha-1
  - rke2-ha-2
  - rke2-ha-3

Enable and start the RKE2 server service:

sudo systemctl enable --now rke2-server.service

The first start takes a few minutes because RKE2 needs to initialize etcd, generate certificates, and pull container images. Watch the progress with journalctl:

sudo journalctl -u rke2-server -f

Wait until you see log messages indicating the API server is ready and etcd is running. Then verify the service status:

sudo systemctl status rke2-server

The service should show active (running). Now set up kubectl on this node:

export PATH=$PATH:/var/lib/rancher/rke2/bin
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
echo 'export PATH=$PATH:/var/lib/rancher/rke2/bin' >> ~/.bashrc
echo 'export KUBECONFIG=/etc/rancher/rke2/rke2.yaml' >> ~/.bashrc

Check that the first node is Ready:

kubectl get nodes

You should see one node in Ready state:

NAME        STATUS   ROLES                       AGE   VERSION
rke2-ha-1   Ready    control-plane,etcd,master   2m    v1.35.3+rke2r1

Retrieve the node token. The other nodes need this value to join the cluster:

sudo cat /var/lib/rancher/rke2/server/node-token

Copy the entire token string. You will paste it into the config files on the remaining two nodes.

Join the Remaining Nodes

SSH into rke2-ha-2 and create the configuration. The key difference from the first node is the server and token fields, which point to the load balancer IP on port 9345.

sudo mkdir -p /etc/rancher/rke2

Create the config file:

sudo vi /etc/rancher/rke2/config.yaml

Add the following configuration, replacing the token with the value from the first node:

server: https://10.0.1.10:9345
token: <paste-token-from-first-node>
tls-san:
  - 10.0.1.10
  - rke2-lb
  - rke2-ha-1
  - rke2-ha-2
  - rke2-ha-3

The server URL points to the load balancer on port 9345 (the RKE2 registration port). This means new nodes register through the LB, which forwards the request to whichever server node is available.

Start the service:

sudo systemctl enable --now rke2-server.service

Watch the logs to confirm it joins successfully:

sudo journalctl -u rke2-server -f

Once the second node shows as Ready, repeat the exact same process on rke2-ha-3 with the same config file contents. The third node joins the existing etcd cluster and brings the cluster to full 3-node quorum.

One important note: start the nodes one at a time. Wait for each node to fully join and show Ready before starting the next. Starting multiple nodes simultaneously can cause etcd election issues during cluster formation.
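The one-at-a-time rule can be scripted. The sketch below defines a small helper (the count_ready name is mine, not part of RKE2) that counts Ready nodes in kubectl get nodes --no-headers output; the commented wait loop is illustrative and assumes kubectl and KUBECONFIG are set up as shown earlier:

```shell
# count_ready: count nodes whose STATUS column reads exactly "Ready"
# in `kubectl get nodes --no-headers` output supplied on stdin.
count_ready() {
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Illustrative wait before starting the next node:
# until [ "$(kubectl get nodes --no-headers | count_ready)" -ge 2 ]; do
#   echo "waiting for the second node to become Ready..."
#   sleep 10
# done
```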

Verify the HA Cluster

Back on rke2-ha-1 (or any node with kubectl configured), check that all three nodes are Ready:

kubectl get nodes

All three nodes should show Ready with the control-plane and etcd roles:

NAME        STATUS   ROLES                       AGE     VERSION
rke2-ha-1   Ready    control-plane,etcd,master   12m     v1.35.3+rke2r1
rke2-ha-2   Ready    control-plane,etcd,master   6m      v1.35.3+rke2r1
rke2-ha-3   Ready    control-plane,etcd,master   2m      v1.35.3+rke2r1

Verify that the core system pods are running across all nodes:

kubectl get pods -n kube-system -o wide

The output should show three instances of each critical component distributed across the nodes:

NAME                                           READY   STATUS    NODE
etcd-rke2-ha-1                                 1/1     Running   rke2-ha-1
etcd-rke2-ha-2                                 1/1     Running   rke2-ha-2
etcd-rke2-ha-3                                 1/1     Running   rke2-ha-3
kube-apiserver-rke2-ha-1                       1/1     Running   rke2-ha-1
kube-apiserver-rke2-ha-2                       1/1     Running   rke2-ha-2
kube-apiserver-rke2-ha-3                       1/1     Running   rke2-ha-3
kube-controller-manager-rke2-ha-1              1/1     Running   rke2-ha-1
kube-controller-manager-rke2-ha-2              1/1     Running   rke2-ha-2
kube-controller-manager-rke2-ha-3              1/1     Running   rke2-ha-3
kube-scheduler-rke2-ha-1                       1/1     Running   rke2-ha-1
kube-scheduler-rke2-ha-2                       1/1     Running   rke2-ha-2
kube-scheduler-rke2-ha-3                       1/1     Running   rke2-ha-3
rke2-canal-4xm7j                              2/2     Running   rke2-ha-1
rke2-canal-8kn2r                              2/2     Running   rke2-ha-2
rke2-canal-t5f9w                              2/2     Running   rke2-ha-3
rke2-coredns-rke2-coredns-7d4785456f-abc12     1/1     Running   rke2-ha-1
rke2-coredns-rke2-coredns-7d4785456f-def34     1/1     Running   rke2-ha-2
rke2-coredns-rke2-coredns-autoscaler-xyz89     1/1     Running   rke2-ha-1
rke2-ingress-nginx-controller-rke2-ha-1        1/1     Running   rke2-ha-1
rke2-ingress-nginx-controller-rke2-ha-2        1/1     Running   rke2-ha-2
rke2-ingress-nginx-controller-rke2-ha-3        1/1     Running   rke2-ha-3
rke2-metrics-server-5d9bc78645-mn8kl           1/1     Running   rke2-ha-2
rke2-snapshot-controller-6f7b8c9d45-pq2st      1/1     Running   rke2-ha-3

Three etcd pods, three API servers, three Canal CNI pods (DaemonSet), and three Ingress NGINX controllers. That is a fully redundant control plane.

Check etcd cluster health using the built-in etcdctl:

sudo /var/lib/rancher/rke2/bin/etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/var/lib/rancher/rke2/server/tls/etcd/server-ca.crt \
  --cert=/var/lib/rancher/rke2/server/tls/etcd/server-client.crt \
  --key=/var/lib/rancher/rke2/server/tls/etcd/server-client.key \
  member list --write-out=table

The output shows all three etcd members as started:

+------------------+---------+-----------+------------------------+------------------------+
|        ID        | STATUS  |   NAME    |       PEER ADDRS       |      CLIENT ADDRS      |
+------------------+---------+-----------+------------------------+------------------------+
| 1a2b3c4d5e6f7890 | started | rke2-ha-1 | https://10.0.1.11:2380 | https://10.0.1.11:2379 |
| 2b3c4d5e6f789012 | started | rke2-ha-2 | https://10.0.1.12:2380 | https://10.0.1.12:2379 |
| 3c4d5e6f78901234 | started | rke2-ha-3 | https://10.0.1.13:2380 | https://10.0.1.13:2379 |
+------------------+---------+-----------+------------------------+------------------------+

All three members are healthy. With three etcd nodes, the cluster tolerates one node failure while maintaining quorum (needs 2 of 3).

Deploy a Test Workload

Confirm workload scheduling works across all nodes by deploying a simple nginx deployment with three replicas:

kubectl create deployment nginx-test --image=nginx:latest --replicas=3

Wait a few seconds, then check pod distribution:

kubectl get pods -o wide -l app=nginx-test

Each replica should land on a different node:

NAME                          READY   STATUS    RESTARTS   AGE   IP           NODE
nginx-test-7c5b4f6d8-f4k2m   1/1     Running   0          30s   10.42.0.15   rke2-ha-1
nginx-test-7c5b4f6d8-h8n3p   1/1     Running   0          30s   10.42.1.12   rke2-ha-2
nginx-test-7c5b4f6d8-j2t9r   1/1     Running   0          30s   10.42.2.8    rke2-ha-3

Pods are evenly spread across the cluster. The scheduler distributes them by default because each node has equal resources. Expose the deployment to verify networking:

kubectl expose deployment nginx-test --port=80 --type=ClusterIP
kubectl get svc nginx-test

Clean up the test deployment when done:

kubectl delete deployment nginx-test
kubectl delete svc nginx-test

Access the Cluster Through the Load Balancer

In production, you want to interact with the cluster through the load balancer rather than connecting directly to a single node. If that node goes down, your kubectl session breaks. The LB ensures you always reach a healthy API server.

Copy the kubeconfig from the first node to your workstation:

mkdir -p ~/.kube
scp root@10.0.1.11:/etc/rancher/rke2/rke2.yaml ~/.kube/rke2-ha-config

Replace the localhost address with the load balancer IP:

sed -i 's/127.0.0.1/10.0.1.10/g' ~/.kube/rke2-ha-config

On macOS, use sed -i '' 's/127.0.0.1/10.0.1.10/g' instead. Now test the connection through the load balancer:

KUBECONFIG=~/.kube/rke2-ha-config kubectl get nodes

You should see all three nodes, confirming that kubectl is reaching the API through the Nginx load balancer on 10.0.1.10:

NAME        STATUS   ROLES                       AGE   VERSION
rke2-ha-1   Ready    control-plane,etcd,master   15m   v1.35.3+rke2r1
rke2-ha-2   Ready    control-plane,etcd,master   10m   v1.35.3+rke2r1
rke2-ha-3   Ready    control-plane,etcd,master   6m    v1.35.3+rke2r1

This kubeconfig works from any machine that can reach the load balancer IP. For a deeper look at kubectl commands you will use daily, check our Kubernetes kubectl cheat sheet.

Production Hardening

The cluster is functional, but production readiness requires more than just “all nodes Ready.” Here are the areas to address before running real workloads.

Etcd Backups

RKE2 takes automatic etcd snapshots every 12 hours by default and retains 5 snapshots. For production, increase the frequency and store backups off-cluster. Add these lines to /etc/rancher/rke2/config.yaml on all server nodes:

etcd-snapshot-schedule-cron: "0 */6 * * *"
etcd-snapshot-retention: 10

This takes a snapshot every 6 hours and keeps 10 copies; restart rke2-server on each node for the new schedule to take effect. Snapshots are stored in /var/lib/rancher/rke2/server/db/snapshots/. Copy them to an external location (S3, NFS, another server) using a cron job. Losing all three nodes without an etcd backup means rebuilding from scratch.
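One simple way to get snapshots off-cluster is a root cron entry that pushes the snapshot directory to a backup host. This is a sketch: the backup-host name, destination path, and 15-minute offset are placeholders, and it assumes key-based SSH from the node to the backup host:

```
# /etc/cron.d/rke2-snapshot-offsite (sketch; backup-host and the path are placeholders)
# Push etcd snapshots off-cluster 15 minutes after each scheduled snapshot.
15 */6 * * * root rsync -a /var/lib/rancher/rke2/server/db/snapshots/ backup-host:/backups/rke2-etcd/
```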

Take a manual snapshot and verify:

sudo /var/lib/rancher/rke2/bin/rke2 etcd-snapshot save --name pre-production
sudo /var/lib/rancher/rke2/bin/rke2 etcd-snapshot list

Certificate Rotation

RKE2 auto-generates all cluster TLS certificates with a 1-year expiry. Mark your calendar. When certificates expire, the API server stops accepting connections. RKE2 rotates certificates automatically on service restart when they are within 90 days of expiry. A planned restart once a year keeps things healthy:

sudo systemctl restart rke2-server

Check certificate expiration dates with:

sudo sh -c 'for cert in /var/lib/rancher/rke2/server/tls/*.crt; do
  echo "$cert:"
  openssl x509 -enddate -noout -in "$cert"
done'

Node Failure Tolerance

With three server nodes, the cluster tolerates exactly one failure. If two nodes go down simultaneously, etcd loses quorum and can no longer commit writes; in practice the API is unusable until quorum is restored. For environments where two simultaneous failures are a real possibility (shared storage, same rack, same power circuit), consider running five server nodes. Five-node etcd tolerates two failures.
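The tolerance numbers come straight from etcd's majority-quorum rule: an n-member cluster needs floor(n/2) + 1 votes, so it survives n minus quorum failures. A quick shell sketch of the arithmetic:

```shell
# etcd majority quorum for an n-member cluster: floor(n/2) + 1
quorum() { echo $(( $1 / 2 + 1 )); }

# Failures the cluster can absorb while keeping quorum: n - quorum(n)
tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }
```

For n=3 this gives quorum 2 and tolerance 1; for n=5, quorum 3 and tolerance 2. It also shows why even-sized clusters are rarely worth the extra member: 4 nodes still tolerate only 1 failure.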

Test your failure tolerance by stopping the RKE2 service on one node and confirming the cluster still functions:

sudo systemctl stop rke2-server

From another node, run kubectl get nodes. The stopped node should show NotReady while the other two continue serving workloads. Start the service back up when satisfied:

sudo systemctl start rke2-server

Load Balancer Redundancy

The Nginx LB is currently a single point of failure. If it goes down, external kubectl access stops working (the nodes themselves continue to function). For production, run two LB instances with keepalived sharing a virtual IP, or use a cloud provider’s load balancer. The cluster itself is HA, but the entry point should be too.
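As a sketch of the keepalived approach (the eth0 interface name, the 10.0.1.9 virtual IP, and the password are assumptions for illustration), the primary LB would carry a config along these lines, with the second LB using state BACKUP and a lower priority:

```
# /etc/keepalived/keepalived.conf on the primary LB (sketch)
vrrp_instance RKE2_LB {
    state MASTER            # BACKUP on the standby, with priority below 100
    interface eth0          # adjust to the LB's actual interface
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass change-me
    }
    virtual_ipaddress {
        10.0.1.9/24         # the shared VIP that clients would use
    }
}
```

If you adopt a VIP like this, add it to the tls-san list on the server nodes and point kubeconfigs at it instead of 10.0.1.10.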

Monitoring and Alerting

The cluster ships with metrics-server for basic resource monitoring. For production visibility, deploy a monitoring stack (Prometheus and Grafana, or Rancher Monitoring) to track etcd health, API server latency, node resource usage, and pod scheduling failures. Set up alerts for etcd quorum loss, certificate expiry within 30 days, and node NotReady transitions.

SELinux Context

RKE2 is designed to work with SELinux enforcing on RHEL-based systems. The installer configures the necessary SELinux policies automatically. Verify that SELinux is enforcing on all nodes:

getenforce

This should return Enforcing. Never set SELinux to permissive or disabled in production. If you encounter AVC denials from custom workloads, investigate with ausearch -m avc -ts recent and create targeted policies rather than disabling enforcement. RKE2’s bundled containerd and the rke2-selinux package handle the baseline policies for the cluster components.

Adding Worker Nodes

Running workloads on control plane nodes is acceptable for smaller clusters, but as you scale, dedicated worker nodes keep the control plane resources isolated. Install the RKE2 agent (not server) on worker nodes:

curl -sfL https://get.rke2.io | INSTALL_RKE2_TYPE="agent" sudo sh -

The agent config uses the same server and token fields but connects as a worker only, without running etcd or the API server. The RKE2 documentation covers agent configuration in detail.
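An agent's /etc/rancher/rke2/config.yaml is a subset of the server config: just the server URL and the join token, with no tls-san needed. A minimal sketch:

```
# /etc/rancher/rke2/config.yaml on a worker node
server: https://10.0.1.10:9345
token: <paste-token-from-first-node>
```

Then start the agent with sudo systemctl enable --now rke2-agent.service; the node appears in kubectl get nodes without the control-plane or etcd roles.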

For lightweight Kubernetes alternatives, especially on single nodes or edge deployments, have a look at our K3s quickstart on Ubuntu and Rocky Linux. Once the HA cluster is running, the next step is installing Rancher on top of it for centralized cluster management, or you can start managing nodes through the Rancher UI.
