Running Kubernetes on Proxmox VE is one of the most practical ways to build a production-grade cluster without spending a fortune on cloud resources. Whether you are building a home lab, standing up a dev/test environment, or running a small production workload, Proxmox gives you a solid Type-1 hypervisor with an API you can automate against. This guide walks through every step: building a reusable Ubuntu 24.04 VM template, deploying a three-node cluster with kubeadm, and bolting on networking and storage so the cluster is actually useful.

Why Run Kubernetes on Proxmox VE

Proxmox VE is a Debian-based hypervisor that ships with KVM, LXC, and a full REST API out of the box. For Kubernetes, that combination solves several problems at once.

Home labs and learning. A single Proxmox host with 64 GB of RAM can run a three-node Kubernetes cluster and still have room for ancillary services like a registry, a database, or a monitoring stack. You get real VMs with real kernel isolation, which is closer to what you will see in production than running minikube on a laptop.

Dev/test environments. Proxmox templates and cloud-init let you spin up a fresh cluster in minutes. Tear it down, rebuild, repeat. If you add Terraform with the Proxmox provider, you can version-control the entire environment and hand it off to CI pipelines.

Small production workloads. For teams that do not need the scale of a managed Kubernetes service, a three-node Proxmox cluster with Ceph or ZFS storage can host Kubernetes with high availability at a fraction of the cost of EKS or GKE. You control the hardware, the networking, and the upgrade schedule.

The rest of this guide assumes you already have a working Proxmox VE 8.x installation. We will build everything from there.

Prerequisites

Before you start, confirm the following:

  • Proxmox VE 8.x installed and updated. Version 8.1 or later is recommended. Open a shell on the host (via SSH or the node's Shell tab in the web UI) and run pveversion to confirm.
  • Hardware resources: At least 8 CPU cores, 32 GB of RAM, and 100 GB of free storage. The three VMs we will create are allocated 12 vCPUs and 24 GB of RAM between them, so 24 GB total would leave the Proxmox host itself no headroom.
  • Network bridge: A Linux bridge (typically vmbr0) connected to your LAN. All VMs will attach to this bridge. If you use VLANs, make sure the bridge is VLAN-aware.
  • Ubuntu 24.04 cloud image: We will download the official cloud image in the template step below.
  • DNS or /etc/hosts entries: Each node needs a resolvable hostname. We will configure this with cloud-init, but make sure your local DNS or hosts file can resolve all three names.
  • SSH key pair: Generate one if you do not have one. Cloud-init will inject it into each VM so you can log in without a password.
  • Internet access on all VMs: Required to pull packages and container images during setup.
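If you do not have a key pair yet, here is a minimal, idempotent way to create one (the path and comment are just examples; adjust to taste):

```shell
# Create ~/.ssh if needed, then generate an ed25519 key pair --
# but only when one does not already exist, so reruns are safe.
mkdir -p "$HOME/.ssh"
KEY="$HOME/.ssh/id_ed25519"
[ -f "$KEY" ] || ssh-keygen -t ed25519 -f "$KEY" -N "" -C "proxmox-k8s-lab"
```

If you generate an ed25519 key here, substitute id_ed25519.pub wherever this guide references id_rsa.pub.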

Here is the IP plan we will use throughout this guide. Adjust the addresses to match your network:

Hostname       Role           IP Address     vCPUs   RAM    Disk
k8s-cp01       Control plane  192.168.1.50   4       8 GB   50 GB
k8s-worker01   Worker         192.168.1.51   4       8 GB   50 GB
k8s-worker02   Worker         192.168.1.52   4       8 GB   50 GB

Create an Ubuntu 24.04 VM Template with Cloud-Init

A VM template lets you clone identical machines in seconds. We will build one from the official Ubuntu 24.04 (Noble Numbat) cloud image, configure cloud-init support, and convert it to a template.

SSH into your Proxmox host and download the cloud image:

wget -O /var/lib/vz/template/iso/ubuntu-24.04-cloudimg-amd64.img \
  https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img

Create a new VM that will become the template. We use VM ID 9000 here, but pick any unused ID:

qm create 9000 --name ubuntu-2404-template --memory 4096 --cores 2 \
  --net0 virtio,bridge=vmbr0 --scsihw virtio-scsi-single --agent enabled=1

Import the cloud image as the primary disk and attach it to the VM:

qm set 9000 --scsi0 local-lvm:0,import-from=/var/lib/vz/template/iso/ubuntu-24.04-cloudimg-amd64.img

Add a cloud-init drive. This is where Proxmox injects user data, network config, and SSH keys at boot:

qm set 9000 --ide2 local-lvm:cloudinit

Set the boot order so the VM boots from the SCSI disk, and configure the serial console for cloud-init compatibility:

qm set 9000 --boot order=scsi0 --serial0 socket --vga serial0

Resize the disk so there is enough room for Kubernetes components, container images, and workloads:

qm disk resize 9000 scsi0 50G

Set default cloud-init values. These will be overridden per-clone, but it helps to have sane defaults in the template:

qm set 9000 --ciuser ubuntu --sshkeys ~/.ssh/id_rsa.pub --ipconfig0 ip=dhcp

Convert the VM to a template. After this, you cannot start it directly; you can only clone it:

qm template 9000

The template is ready. Every clone will inherit the 50 GB disk, cloud-init drive, virtio NIC, and QEMU guest agent configuration.

Clone Three VMs from the Template

Clone the template three times to create the control plane node and two workers. Full clones are independent of the template, so you can modify or delete the template later without affecting running VMs.

qm clone 9000 110 --name k8s-cp01 --full
qm clone 9000 111 --name k8s-worker01 --full
qm clone 9000 112 --name k8s-worker02 --full
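The same three clones can be scripted with a loop. This sketch prints the commands (a dry run) so you can eyeball them first; drop the echo to actually execute them:

```shell
# One "vmid name" pair per node; the leading echo makes this a dry run.
for spec in "110 k8s-cp01" "111 k8s-worker01" "112 k8s-worker02"; do
  set -- $spec                      # split into $1 (vmid) and $2 (name)
  echo qm clone 9000 "$1" --name "$2" --full
done
```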

Configure Static IPs and Hostnames via Cloud-Init

Each VM needs a static IP, a gateway, a DNS server, and a hostname. Set these through Proxmox cloud-init parameters so they take effect on first boot.

Configure the control plane node:

qm set 110 --ipconfig0 ip=192.168.1.50/24,gw=192.168.1.1 \
  --nameserver 192.168.1.1 --searchdomain lab.local

Configure the first worker:

qm set 111 --ipconfig0 ip=192.168.1.51/24,gw=192.168.1.1 \
  --nameserver 192.168.1.1 --searchdomain lab.local

Configure the second worker:

qm set 112 --ipconfig0 ip=192.168.1.52/24,gw=192.168.1.1 \
  --nameserver 192.168.1.1 --searchdomain lab.local

Now start all three VMs:

qm start 110
qm start 111
qm start 112

Give them 30 to 60 seconds to boot and apply cloud-init. Then verify SSH access from your workstation:

ssh ubuntu@192.168.1.50 hostname
ssh ubuntu@192.168.1.51 hostname
ssh ubuntu@192.168.1.52 hostname

Each command should return the hostname you assigned. If SSH hangs or is refused, check that the VM has the correct IP and that your SSH public key was injected through cloud-init.

Add entries to /etc/hosts on all three nodes so they can resolve each other by name. SSH into each node and append:

cat <<EOF | sudo tee -a /etc/hosts
192.168.1.50 k8s-cp01
192.168.1.51 k8s-worker01
192.168.1.52 k8s-worker02
EOF
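A quick way to confirm the entries took effect on each node (getent consults /etc/hosts as well as DNS, so it exercises the same lookup path the kubelet will use):

```shell
# Print the resolution for each peer, or flag the ones that fail.
for h in k8s-cp01 k8s-worker01 k8s-worker02; do
  getent hosts "$h" || echo "cannot resolve $h"
done
```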

Prepare All Nodes

The following steps must be performed on all three nodes. SSH into each one, or use a tool like tmux with synchronized panes to run commands in parallel.

Disable Swap

Kubernetes requires swap to be turned off. The kubelet will refuse to start if swap is active.

sudo swapoff -a
sudo sed -i '/ swap / s/^/#/' /etc/fstab

Verify swap is off:

free -h

The swap line should show all zeros.

Load Kernel Modules

Kubernetes networking and containerd require the overlay and br_netfilter kernel modules. Load them now and make them persistent across reboots:

cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter

Set Sysctl Parameters

Enable IP forwarding and bridge netfilter so that iptables can see bridged traffic. Without these, pod-to-pod communication will not work:

cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF

sudo sysctl --system

Confirm the values are applied:

sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward

All three should return 1.
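If you prefer a scripted check, this sketch reports each required key and tolerates keys that do not exist yet (for example, the net.bridge.* keys are absent until br_netfilter is loaded):

```shell
# Report each required sysctl; absent keys show up as "missing".
for key in net.bridge.bridge-nf-call-iptables \
           net.bridge.bridge-nf-call-ip6tables \
           net.ipv4.ip_forward; do
  val=$(sysctl -n "$key" 2>/dev/null || echo "missing")
  echo "$key = $val"
done
```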

Install containerd on All Nodes

containerd is the standard container runtime for Kubernetes. We will install it from the official Docker repository, which ships the latest stable release.

Install prerequisite packages and add the Docker GPG key:

sudo apt-get update
sudo apt-get install -y ca-certificates curl gnupg

sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg

Add the Docker repository. This gives us access to the containerd.io package:

echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

sudo apt-get update
sudo apt-get install -y containerd.io

Generate the default containerd configuration and enable the systemd cgroup driver. kubeadm configures the kubelet to use systemd as its cgroup driver by default, and a mismatch between the kubelet and containerd here causes the kubelet to crash-loop:

sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml > /dev/null
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml

Restart containerd so it picks up the new configuration:

sudo systemctl restart containerd
sudo systemctl enable containerd
sudo systemctl status containerd --no-pager

You should see active (running) in the output. If it is not running, check journalctl -u containerd for errors.

Install kubeadm, kubelet, and kubectl

These three binaries are all you need to bootstrap and manage a Kubernetes cluster. We will pin them to version 1.32 so that future apt upgrades do not accidentally bump your cluster to an untested release.

Add the Kubernetes apt repository. Starting with Kubernetes 1.28, the project moved to community-managed package repositories hosted at pkgs.k8s.io:

sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.32/deb/Release.key | \
  sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg

echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.32/deb/ /" | \
  sudo tee /etc/apt/sources.list.d/kubernetes.list

Install the packages and hold them at the current version:

sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl

Verify the installed versions:

kubeadm version
kubectl version --client
kubelet --version

All three should report a 1.32.x version. If they do not, double-check that the repository URL contains v1.32 and re-run the install.

Initialize the Kubernetes Cluster on the Control Plane

The remaining steps in this section run only on the control plane node (k8s-cp01). SSH into 192.168.1.50 to proceed.

Run kubeadm init with the pod network CIDR that Calico expects by default:

sudo kubeadm init \
  --control-plane-endpoint=192.168.1.50 \
  --pod-network-cidr=192.168.0.0/16 \
  --apiserver-advertise-address=192.168.1.50 \
  --kubernetes-version=1.32.0

This takes a few minutes. kubeadm pulls the control plane images, generates certificates, and writes static pod manifests. When it finishes, you will see a message with a kubeadm join command. Copy that command and save it somewhere safe. You will need it to join the worker nodes.

Set up kubectl access for your non-root user on the control plane:

mkdir -p $HOME/.kube
sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Verify the cluster is responding:

kubectl get nodes

You should see k8s-cp01 in NotReady status. That is expected. The node will not become Ready until a CNI plugin is installed.

Install Calico CNI

Calico provides both networking and network policy enforcement. It is one of the most widely deployed CNI plugins in production Kubernetes clusters. We will install it using the Calico operator and custom resource method, which is the recommended approach for Kubernetes 1.32.

Install the Tigera operator:

kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.28.0/manifests/tigera-operator.yaml

Download the custom resources manifest, verify the CIDR matches what you used in kubeadm init, and apply it:

curl -O https://raw.githubusercontent.com/projectcalico/calico/v3.28.0/manifests/custom-resources.yaml

Open custom-resources.yaml and confirm the cidr field is set to 192.168.0.0/16. If you used a different pod CIDR during kubeadm init, update it here to match. Then apply:

kubectl apply -f custom-resources.yaml

Watch the Calico pods come up:

watch kubectl get pods -n calico-system

Wait until all pods show Running status. This typically takes two to three minutes. Once Calico is healthy, check the node status again:

kubectl get nodes

The control plane node should now show Ready.

Join Worker Nodes to the Cluster

SSH into each worker node and run the kubeadm join command that was printed at the end of kubeadm init. The command looks like this:

sudo kubeadm join 192.168.1.50:6443 --token <your-token> \
  --discovery-token-ca-cert-hash sha256:<your-hash>

If you lost the join command or the token has expired (tokens expire after 24 hours), generate a new one from the control plane:

sudo kubeadm token create --print-join-command

Run the join command on both worker nodes. When each one finishes, go back to the control plane and check:

kubectl get nodes

After a minute or two, all three nodes should show Ready:

NAME            STATUS   ROLES           AGE     VERSION
k8s-cp01        Ready    control-plane   10m     v1.32.0
k8s-worker01    Ready    <none>          2m      v1.32.0
k8s-worker02    Ready    <none>          90s     v1.32.0

Optionally, label the worker nodes so you can target them with node selectors later:

kubectl label node k8s-worker01 node-role.kubernetes.io/worker=worker
kubectl label node k8s-worker02 node-role.kubernetes.io/worker=worker

Run a quick smoke test to verify the cluster is functional:

kubectl run nginx-test --image=nginx --restart=Never
kubectl get pod nginx-test -o wide

The pod should land on one of the worker nodes and reach Running status within 30 seconds. Clean it up when you are done:

kubectl delete pod nginx-test

Optional: Automate VM Creation with Terraform

Clicking through the Proxmox UI or running qm commands by hand works fine for a one-time setup, but if you plan to rebuild clusters regularly, Terraform is a better approach. The bpg/proxmox Terraform provider (a more actively maintained alternative to the older Telmate/proxmox provider) supports cloud-init, full clones, and all the VM settings we configured above.

Here is a minimal Terraform configuration that creates the three VMs from our template. Create a file named main.tf:

terraform {
  required_providers {
    proxmox = {
      source  = "bpg/proxmox"
      version = ">= 0.66.0"
    }
  }
}

provider "proxmox" {
  endpoint = "https://proxmox.lab.local:8006/"
  username = "root@pam"
  password = var.proxmox_password
  insecure = true
}

variable "proxmox_password" {
  type      = string
  sensitive = true
}

variable "nodes" {
  default = {
    "k8s-cp01"     = { vmid = 110, ip = "192.168.1.50/24" }
    "k8s-worker01" = { vmid = 111, ip = "192.168.1.51/24" }
    "k8s-worker02" = { vmid = 112, ip = "192.168.1.52/24" }
  }
}

resource "proxmox_virtual_environment_vm" "k8s" {
  for_each  = var.nodes
  name      = each.key
  node_name = "pve"
  vm_id     = each.value.vmid

  clone {
    vm_id = 9000
    full  = true
  }

  cpu {
    cores = 4
  }

  memory {
    dedicated = 8192
  }

  initialization {
    user_account {
      username = "ubuntu"
      keys     = [file("~/.ssh/id_rsa.pub")]
    }
    ip_config {
      ipv4 {
        address = each.value.ip
        gateway = "192.168.1.1"
      }
    }
    dns {
      servers = ["192.168.1.1"]
      domain  = "lab.local"
    }
  }
}

Initialize and apply:

terraform init
terraform plan
terraform apply

Terraform will create all three VMs in parallel, which is significantly faster than sequential qm clone commands. When you want to tear down the cluster and start fresh, run terraform destroy and the VMs are gone. Combine this with an Ansible playbook for the Kubernetes installation steps, and you have a fully automated pipeline from bare Proxmox to a running cluster.
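One note on the sensitive variable: rather than typing the password on every run, you can feed it through the environment. Terraform treats any variable named TF_VAR_<name> as the value of variable <name>, which keeps the secret out of -var flags and your shell history:

```shell
# Terraform maps TF_VAR_proxmox_password onto variable "proxmox_password",
# so terraform plan/apply will not prompt for it.
export TF_VAR_proxmox_password='changeme'   # placeholder value, not a real secret
```

With the variable exported, run terraform plan and terraform apply as usual.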

Storage: Longhorn or Local-Path for Persistent Volumes

A Kubernetes cluster without persistent storage is useful only for stateless workloads. For anything that needs to survive a pod restart (databases, message queues, monitoring data), you need a StorageClass and a CSI driver. Here are two solid options for Proxmox-based clusters.

Option 1: Rancher Local Path Provisioner

This is the simplest option. It creates PersistentVolumes backed by directories on the node’s local filesystem. There is no replication, so if a node dies, the data on that node is gone. That said, for dev/test and single-node workloads, it is perfectly adequate.

Install it with a single command:

kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.30/deploy/local-path-storage.yaml

Set it as the default StorageClass if you want PVCs to use it automatically:

kubectl patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
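To sanity-check the provisioner, you can create a throwaway claim (the names here are illustrative). Note that local-path uses WaitForFirstConsumer volume binding, so the PVC stays Pending until a pod actually mounts it; that is normal, not an error:

```shell
# Create a small test PVC against the local-path StorageClass.
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: local-path-test
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: local-path
  resources:
    requests:
      storage: 1Gi
EOF
kubectl get pvc local-path-test
kubectl delete pvc local-path-test   # clean up when done
```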

Option 2: Longhorn

Longhorn is a distributed block storage system built for Kubernetes. It replicates data across nodes, supports snapshots and backups, and has a web UI for management. This is the better choice if you need data resilience across your cluster.

Install the prerequisites on all nodes:

sudo apt-get install -y open-iscsi nfs-common
sudo systemctl enable iscsid --now

Install Longhorn using Helm:

helm repo add longhorn https://charts.longhorn.io
helm repo update
helm install longhorn longhorn/longhorn \
  --namespace longhorn-system \
  --create-namespace \
  --version 1.7.2

Monitor the installation:

kubectl -n longhorn-system get pods -w

Once all pods are running, Longhorn creates a default StorageClass called longhorn. You can access the Longhorn UI by port-forwarding the frontend service:

kubectl -n longhorn-system port-forward svc/longhorn-frontend 8080:80

Then open http://localhost:8080 in your browser to manage volumes, replicas, and backups.

Troubleshooting

Here are the most common issues you will run into and how to fix them.

Node stays in NotReady status

This almost always means the CNI plugin is not installed or not healthy. Check Calico pod status:

kubectl get pods -n calico-system
kubectl describe pods -n calico-system

If Calico pods are in CrashLoopBackOff, the most likely cause is a pod CIDR mismatch between kubeadm init and the Calico custom resources. Verify both use the same CIDR.

kubelet fails to start

Check the kubelet logs:

sudo journalctl -u kubelet -f

Common causes:

  • Swap is still on. Run free -h and verify the swap line shows zeros. Run sudo swapoff -a if needed.
  • Cgroup driver mismatch. The kubelet expects systemd, but containerd is set to cgroupfs. Verify SystemdCgroup = true in /etc/containerd/config.toml and restart containerd.
  • containerd is not running. Check with sudo systemctl status containerd.

kubeadm join fails with token errors

Tokens expire after 24 hours. Generate a new one from the control plane:

sudo kubeadm token create --print-join-command

Also verify that the worker node can reach the control plane on port 6443. Test with:

curl -k https://192.168.1.50:6443/healthz

You should get back ok. If the connection times out, check firewall rules on both the Proxmox host and the VM.
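If curl is not installed on the worker, bash can open the TCP connection itself through its /dev/tcp pseudo-device (a sketch; this only verifies that the port accepts connections, not that the API server is healthy):

```shell
# Try to open a TCP socket to the API server and report either way.
if timeout 3 bash -c 'exec 3<>/dev/tcp/192.168.1.50/6443' 2>/dev/null; then
  echo "port 6443 is reachable"
else
  echo "port 6443 is blocked or the host is down"
fi
```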

Pods stuck in Pending or ContainerCreating

Describe the pod to find out why:

kubectl describe pod <pod-name>

Common reasons:

  • Insufficient resources. The scheduler cannot find a node with enough CPU or memory. Check kubectl describe node and look at the Allocatable vs. Allocated sections.
  • Image pull errors. The node cannot reach the container registry. Verify DNS resolution and internet connectivity on the node.
  • PVC not bound. If the pod requests a PersistentVolumeClaim and no StorageClass is available, the PVC stays in Pending. Install one of the storage solutions described above.

Cloud-init did not apply on the VM

If a cloned VM boots with DHCP instead of the static IP you configured, cloud-init may not have run. Check inside the VM:

sudo cloud-init status
cat /var/log/cloud-init-output.log

Make sure the cloud-init drive (ide2) is present in the VM configuration. From the Proxmox host:

qm config 110 | grep ide2

If the cloud-init drive is missing, add it with qm set 110 --ide2 local-lvm:cloudinit, regenerate the cloud-init image with qm cloudinit update 110, and reboot the VM.

QEMU guest agent not responding

The QEMU guest agent lets Proxmox report the VM’s IP address and perform clean shutdowns. If it is not working, install it inside the VM:

sudo apt-get install -y qemu-guest-agent
sudo systemctl enable qemu-guest-agent --now

Verify in the Proxmox web UI that the VM shows an IP address on the Summary tab.

Summary

You now have a three-node Kubernetes 1.32 cluster running on Proxmox VE 8.x with Calico networking and your choice of persistent storage. The cloud-init template approach means you can destroy and recreate the cluster quickly, and the optional Terraform workflow takes that a step further by making the entire setup declarative and repeatable.

From here, consider adding an Ingress controller (Nginx or Traefik), a monitoring stack (Prometheus and Grafana), and a GitOps tool (Flux or ArgoCD) to round out the environment. If you plan to run this in production, look into setting up multiple control plane nodes for high availability and backing your Proxmox storage with Ceph for redundancy at the hypervisor level.
