RKE2 (Rancher Kubernetes Engine 2) is a CNCF-certified Kubernetes distribution built by Rancher/SUSE. It combines K3s's ease of deployment with close alignment to upstream Kubernetes, and it does not depend on Docker: control plane components run as static pods managed by the kubelet. RKE2 is also known as “RKE Government” because of its focus on security and FIPS 140-2 compliance.
This guide walks through deploying a highly available RKE2 Kubernetes cluster on Rocky Linux 10 / AlmaLinux 10 with 3 server (control plane) nodes and 2 agent (worker) nodes. The latest stable RKE2 release at the time of writing is v1.35.1+rke2r1, which ships Kubernetes v1.35.1, etcd v3.6.7, and containerd v2.1.5.
Prerequisites
- 5 servers running Rocky Linux 10 or AlmaLinux 10 (3 server nodes + 2 agent nodes)
- Minimum 4 GB RAM and 2 vCPUs per node (8 GB / 4 vCPUs recommended for server nodes)
- Root or sudo access on all nodes
- Network connectivity between all nodes on required ports (see firewall section)
- A fixed registration address – either a load balancer, DNS round-robin, or VIP pointing to the server nodes
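Before installing anything, you can sanity-check each node against the minimums above. A rough sketch (Linux-only; reads /proc/meminfo, and the 3,800,000 kB threshold is an approximation of 4 GB):

```shell
# Preflight check: at least 2 vCPUs and roughly 4 GB of RAM.
cpus=$(nproc)
mem_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
if [ "$cpus" -ge 2 ] && [ "$mem_kb" -ge 3800000 ]; then
  echo "node meets minimum requirements"
else
  echo "node is below minimum requirements"
fi
```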
Cluster Node Layout
We use the following node layout for this HA RKE2 deployment on Rocky Linux 10:
| Role | Hostname | IP Address |
|---|---|---|
| Server Node 1 | server1 | 10.0.1.10 |
| Server Node 2 | server2 | 10.0.1.11 |
| Server Node 3 | server3 | 10.0.1.12 |
| Agent Node 1 | agent1 | 10.0.1.20 |
| Agent Node 2 | agent2 | 10.0.1.21 |
In this setup, server1 (10.0.1.10) serves as both the first server node and the fixed registration address. For production environments, place a load balancer (for example Nginx or HAProxy) in front of the server nodes on port 6443 (Kubernetes API) and port 9345 (RKE2 supervisor).
Step 1: Prepare All Nodes
Run the following steps on all 5 nodes. Set the hostname on each node accordingly:
sudo hostnamectl set-hostname server1
Repeat with server2, server3, agent1, and agent2 on the respective nodes.
Add all cluster nodes to /etc/hosts on every node:
sudo vi /etc/hosts
Add the following entries:
10.0.1.10 server1
10.0.1.11 server2
10.0.1.12 server3
10.0.1.20 agent1
10.0.1.21 agent2
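Alternatively, append all five entries in one step with a heredoc (adjust the IPs and hostnames to match your environment):

```shell
# Append the cluster's host entries to /etc/hosts on every node.
cat <<'EOF' | sudo tee -a /etc/hosts
10.0.1.10 server1
10.0.1.11 server2
10.0.1.12 server3
10.0.1.20 agent1
10.0.1.21 agent2
EOF
```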
Install required packages:
sudo dnf -y install curl vim wget
Load the required kernel modules and make them persistent across reboots:
sudo modprobe br_netfilter
sudo modprobe overlay
Create a modules load file to persist them:
sudo vi /etc/modules-load.d/rke2.conf
Add the following content:
br_netfilter
overlay
Set the required sysctl parameters:
sudo vi /etc/sysctl.d/99-rke2.conf
Add these lines:
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
Apply the sysctl settings:
sudo sysctl --system
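You can confirm that the modules and sysctl settings took effect with a quick check:

```shell
# Both modules should be listed, and ip_forward should print 1.
lsmod | grep -E '^(br_netfilter|overlay)'
sysctl -n net.ipv4.ip_forward
```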
Step 2: Configure Firewall Rules for RKE2
RKE2 requires several ports to be open between cluster nodes. On all server nodes, open the following ports:
sudo firewall-cmd --permanent --add-port=6443/tcp
sudo firewall-cmd --permanent --add-port=9345/tcp
sudo firewall-cmd --permanent --add-port=10250/tcp
sudo firewall-cmd --permanent --add-port=2379-2380/tcp
sudo firewall-cmd --permanent --add-port=8472/udp
sudo firewall-cmd --permanent --add-port=4789/udp
sudo firewall-cmd --permanent --add-port=51820-51821/udp
sudo firewall-cmd --permanent --add-port=30000-32767/tcp
sudo firewall-cmd --permanent --add-masquerade
sudo firewall-cmd --reload
On all agent nodes, open these ports:
sudo firewall-cmd --permanent --add-port=10250/tcp
sudo firewall-cmd --permanent --add-port=8472/udp
sudo firewall-cmd --permanent --add-port=4789/udp
sudo firewall-cmd --permanent --add-port=51820-51821/udp
sudo firewall-cmd --permanent --add-port=30000-32767/tcp
sudo firewall-cmd --permanent --add-masquerade
sudo firewall-cmd --reload
Here is a summary of the required ports:
| Port | Protocol | Purpose |
|---|---|---|
| 6443 | TCP | Kubernetes API server |
| 9345 | TCP | RKE2 supervisor API (node registration) |
| 10250 | TCP | Kubelet metrics |
| 2379-2380 | TCP | etcd client and peer communication |
| 8472 | UDP | Canal/Flannel VXLAN |
| 4789 | UDP | Calico VXLAN (if using Calico CNI) |
| 51820-51821 | UDP | Canal/Flannel WireGuard |
| 30000-32767 | TCP | NodePort services range |
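The per-port commands above can be condensed into a loop. A sketch for the server nodes; agents use the same pattern with their shorter port list:

```shell
# Open all server-node ports from the table in one pass.
server_ports="6443/tcp 9345/tcp 10250/tcp 2379-2380/tcp 8472/udp 4789/udp 51820-51821/udp 30000-32767/tcp"
for p in $server_ports; do
  sudo firewall-cmd --permanent --add-port="$p"
done
sudo firewall-cmd --permanent --add-masquerade
sudo firewall-cmd --reload
```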
Step 3: Install RKE2 on the First Server Node
SSH into server1 (10.0.1.10) and download the RKE2 installer script:
curl -sfL https://get.rke2.io --output install.sh
chmod +x install.sh
Run the installer in server mode:
sudo INSTALL_RKE2_TYPE=server ./install.sh
Create the RKE2 configuration directory and config file:
sudo mkdir -p /etc/rancher/rke2
sudo vi /etc/rancher/rke2/config.yaml
Add the following configuration. The tls-san entries ensure the Kubernetes API certificate is valid for both the hostname and IP of the registration address:
write-kubeconfig-mode: "0644"
tls-san:
  - server1
  - 10.0.1.10
Enable and start the RKE2 server service:
sudo systemctl enable --now rke2-server
The first server node takes a few minutes to start as it bootstraps etcd and the control plane. Verify the service is running:
sudo systemctl status rke2-server
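If you are scripting the install, you can wait for bootstrap to finish by polling for the node-token file instead of watching the service status. A sketch using the default RKE2 path:

```shell
# Poll for up to ~5 minutes until the server has generated its join token.
for i in $(seq 1 60); do
  if sudo test -f /var/lib/rancher/rke2/server/node-token; then
    echo "server bootstrap complete"
    break
  fi
  sleep 5
done
```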
Retrieve the node token – you will need this to join additional server and agent nodes to the cluster:
sudo cat /var/lib/rancher/rke2/server/node-token
Save the output. It looks similar to this:
K1079187d01ac73b1a17261a475cb1b8486144543fc59a189e0c4533ef252a26450::server:33f5c1a2b7721992be25e340ded19cac
Step 4: Configure kubectl on the First Server
RKE2 installs its own kubectl binary under /var/lib/rancher/rke2/bin/. Add this to your PATH and set the kubeconfig environment variable. If you are familiar with kubectl commands and shortcuts, you can also set up shell aliases at this point.
sudo vi ~/.bashrc
Add these lines at the end of the file:
export PATH=$PATH:/var/lib/rancher/rke2/bin
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
Source the file to apply changes in the current session:
source ~/.bashrc
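Optionally, enable shell completion and a short alias for kubectl. This assumes the bash-completion package is installed (sudo dnf -y install bash-completion):

```shell
# Completion and a 'k' alias for the current user.
echo 'source <(kubectl completion bash)' >> ~/.bashrc
echo 'alias k=kubectl' >> ~/.bashrc
echo 'complete -o default -F __start_kubectl k' >> ~/.bashrc
source ~/.bashrc
```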
Verify the first node is ready:
kubectl get nodes
Expected output:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
server1 Ready control-plane,etcd,master 2m v1.35.1+rke2r1
Check that all system pods are running:
kubectl get pods -A
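An easy way to spot anything unhealthy is to filter out the healthy states. This assumes the default output columns, where STATUS is the fourth field:

```shell
# Print only pods that are neither Running nor Completed
# (no output means everything is healthy).
kubectl get pods -A --no-headers | awk '$4 != "Running" && $4 != "Completed"'
```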
Step 5: Join Additional Server Nodes to the RKE2 Cluster
Run the following on server2 (10.0.1.11) and server3 (10.0.1.12). Download and run the RKE2 installer:
curl -sfL https://get.rke2.io --output install.sh
chmod +x install.sh
sudo INSTALL_RKE2_TYPE=server ./install.sh
Create the configuration file:
sudo mkdir -p /etc/rancher/rke2
sudo vi /etc/rancher/rke2/config.yaml
Add the following, replacing the token value with the one obtained from server1:
server: https://10.0.1.10:9345
token: K1079187d01ac73b1a17261a475cb1b8486144543fc59a189e0c4533ef252a26450::server:33f5c1a2b7721992be25e340ded19cac
write-kubeconfig-mode: "0644"
tls-san:
  - server1
  - 10.0.1.10
Enable and start the RKE2 server service. Join one server at a time – wait for each node to reach Ready status before starting the next one:
sudo systemctl enable --now rke2-server
After both additional server nodes have joined, verify from server1:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
server1 Ready control-plane,etcd,master 10m v1.35.1+rke2r1
server2 Ready control-plane,etcd,master 5m v1.35.1+rke2r1
server3 Ready control-plane,etcd,master 2m v1.35.1+rke2r1
All three server nodes are now part of the HA control plane with an embedded etcd cluster. The cluster can tolerate the loss of one server node and continue operating.
Step 6: Join Agent Nodes (Workers) to the Cluster
Run the following on agent1 (10.0.1.20) and agent2 (10.0.1.21). Download and run the installer in agent mode:
curl -sfL https://get.rke2.io --output install.sh
chmod +x install.sh
sudo INSTALL_RKE2_TYPE=agent ./install.sh
Create the agent configuration file:
sudo mkdir -p /etc/rancher/rke2
sudo vi /etc/rancher/rke2/config.yaml
Add the following configuration, using the same token from server1:
server: https://10.0.1.10:9345
token: K1079187d01ac73b1a17261a475cb1b8486144543fc59a189e0c4533ef252a26450::server:33f5c1a2b7721992be25e340ded19cac
Enable and start the RKE2 agent service:
sudo systemctl enable --now rke2-agent
Check the agent service status:
sudo systemctl status rke2-agent
After both agents join, verify the full cluster from server1:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
server1 Ready control-plane,etcd,master 15m v1.35.1+rke2r1
server2 Ready control-plane,etcd,master 10m v1.35.1+rke2r1
server3 Ready control-plane,etcd,master 7m v1.35.1+rke2r1
agent1 Ready <none> 3m v1.35.1+rke2r1
agent2 Ready <none> 1m v1.35.1+rke2r1
Step 7: Verify the RKE2 Kubernetes Cluster
Confirm all system pods are running across the cluster:
kubectl get pods -A
You should see pods for CoreDNS, the NGINX Ingress Controller, metrics-server, Canal CNI, and etcd running in the kube-system namespace. All pods should show Running or Completed status.
Check cluster component health. Note that the componentstatuses resource is deprecated in recent Kubernetes releases; the API server's readyz endpoint is the modern alternative:
kubectl cluster-info
kubectl get componentstatuses
kubectl get --raw='/readyz?verbose'
Step 8: Access the Cluster from Outside
To manage the cluster from a workstation outside the cluster, first install kubectl on your local machine if you have not already, then copy the kubeconfig file from server1:
mkdir -p ~/.kube
scp [email protected]:/etc/rancher/rke2/rke2.yaml ~/.kube/config
Edit the kubeconfig to point to the server’s external IP instead of 127.0.0.1:
sed -i 's/127.0.0.1/10.0.1.10/g' ~/.kube/config
Verify connectivity from your workstation:
kubectl get nodes
Step 9: Deploy a Test Application
Deploy a sample Nginx application to confirm the cluster is functioning correctly. Create the deployment manifest:
vi /tmp/nginx-test.yaml
Add the following content:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-test
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-test
  template:
    metadata:
      labels:
        app: nginx-test
    spec:
      containers:
        - name: nginx
          image: nginx:latest
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-test
spec:
  type: NodePort
  selector:
    app: nginx-test
  ports:
    - port: 80
      targetPort: 80
Apply the manifest:
kubectl apply -f /tmp/nginx-test.yaml
Verify the pods are running:
$ kubectl get pods -l app=nginx-test
NAME READY STATUS RESTARTS AGE
nginx-test-6d4cf56db6-2xk9m 1/1 Running 0 30s
nginx-test-6d4cf56db6-f7h3w 1/1 Running 0 30s
Get the NodePort assigned to the service:
$ kubectl get svc nginx-test
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
nginx-test NodePort 10.43.85.120 <none> 80:31042/TCP 15s
Access the application using any node IP with the assigned NodePort. For example, http://10.0.1.20:31042 should return the default Nginx welcome page.
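From any machine that can reach the nodes, you can verify the response with curl (replace 31042 with the NodePort your cluster actually assigned):

```shell
# Should print the title line of the default Nginx welcome page.
curl -s http://10.0.1.20:31042 | grep -i '<title>welcome to nginx'
```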
Clean up the test deployment when done:
kubectl delete -f /tmp/nginx-test.yaml
RKE2 Cluster Management Tips
Check the RKE2 server logs for troubleshooting:
sudo journalctl -u rke2-server -f
For agent nodes, check the agent logs:
sudo journalctl -u rke2-agent -f
To uninstall RKE2 from a server node:
sudo /usr/bin/rke2-uninstall.sh
To uninstall from an agent node:
sudo /usr/bin/rke2-agent-uninstall.sh
Once your cluster is operational, keep in mind that RKE2 already deploys the NGINX Ingress Controller in the kube-system namespace; install a separate ingress controller with Helm only if you need a customized setup for production traffic routing.
Conclusion
We have deployed a 5-node HA Kubernetes cluster using RKE2 on Rocky Linux 10 / AlmaLinux 10 with 3 control plane nodes and 2 worker nodes. The embedded etcd cluster provides fault tolerance – the cluster survives the loss of any single server node.
For production use, add a dedicated load balancer in front of the API server on port 6443, configure automated etcd backups, set up monitoring with Prometheus and Grafana, and enable RBAC policies to restrict cluster access.
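On the etcd backup point: RKE2 can take scheduled snapshots itself. A sketch of the relevant keys in /etc/rancher/rke2/config.yaml on the server nodes (the cron schedule and retention count below are illustrative values, not defaults you must use):

```yaml
# Snapshot every 6 hours, keep the last 10.
etcd-snapshot-schedule-cron: "0 */6 * * *"
etcd-snapshot-retention: 10
```

You can also take an on-demand snapshot at any time with `sudo rke2 etcd-snapshot save`; snapshots are written under /var/lib/rancher/rke2/server/db/snapshots by default.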