This guide walks you through deploying a production-ready Kubernetes 1.32 cluster on Ubuntu 24.04 LTS using kubeadm. We will configure containerd as the container runtime, set up networking with Calico CNI, and verify the cluster with a test workload. Every command is tested on fresh Ubuntu 24.04 (Noble Numbat) minimal server installs.

Ubuntu 24.04 ships with a modern kernel (6.8+), systemd-resolved by default, and uses APT with signed-by keyrings for repository management. This guide is written specifically for the Debian/Ubuntu toolchain – if you are running Rocky Linux or AlmaLinux, refer to our Kubernetes cluster setup guide for RHEL-based systems instead.

Prerequisites

You need three Ubuntu 24.04 LTS servers with the following minimum specifications:

Hostname        Role           IP Address     RAM    CPU
k8s-master01    Control Plane  192.168.1.10   4 GB   2 vCPU
k8s-worker01    Worker         192.168.1.11   4 GB   2 vCPU
k8s-worker02    Worker         192.168.1.12   4 GB   2 vCPU

All three nodes must have unique hostnames, MAC addresses, and product_uuid values. Ensure each node can reach the others over the network and has unrestricted outbound internet access for pulling images and packages.

Set Hostnames on All Nodes

Run the appropriate command on each node:

sudo hostnamectl set-hostname k8s-master01   # On control plane
sudo hostnamectl set-hostname k8s-worker01   # On first worker
sudo hostnamectl set-hostname k8s-worker02   # On second worker

Add all nodes to /etc/hosts on every machine so they can resolve each other by name:

cat <<EOF | sudo tee -a /etc/hosts
192.168.1.10  k8s-master01
192.168.1.11  k8s-worker01
192.168.1.12  k8s-worker02
EOF

Disable Swap

By default, the kubelet refuses to start while swap is active. Ubuntu 24.04 may have a swap file enabled out of the box. Disable it permanently:

sudo swapoff -a
sudo sed -i '/\sswap\s/s/^/#/' /etc/fstab

Verify swap is off with free -h – the swap row should show all zeros.
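If you want to sanity-check the sed expression before it touches the real file, run it against a scratch copy first. The fstab lines below are illustrative samples, not your actual configuration:

```shell
# Build a throwaway fstab with one swap entry (sample content only)
printf '%s\n' \
  'UUID=abcd1234 / ext4 defaults 0 1' \
  '/swap.img none swap sw 0 0' > /tmp/fstab.demo

# Same sed expression as above, applied to the scratch copy
sed -i '/\sswap\s/s/^/#/' /tmp/fstab.demo

cat /tmp/fstab.demo
```

The swap line comes back prefixed with `#` while the root filesystem line is untouched, which is exactly what happens to /etc/fstab.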

Load Kernel Modules and Set Sysctl Parameters

Kubernetes networking requires the overlay and br_netfilter kernel modules. Create a configuration file so they load on every boot:

cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter

Next, configure the required sysctl parameters for packet forwarding and bridge traffic:

cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF

sudo sysctl --system

Confirm the values are applied:

sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward

All three should return 1.
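If you script your node preparation, the check can be automated. The loop below is shown against canned sysctl output so the parsing logic is visible without a freshly configured node; on a real node you would feed it the live `sysctl` output instead:

```shell
# Canned sysctl output illustrating one misconfigured key (sample data)
sample='net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 0'

# Print every key whose value is not 1; empty output means all good
bad=$(printf '%s\n' "$sample" | awk -F' = ' '$2 != 1 {print $1}')
echo "${bad:-all keys OK}"
```

Here the script flags net.ipv4.ip_forward because the sample sets it to 0; on a correctly configured node it prints "all keys OK".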

Step 1 – Install containerd Container Runtime

Perform these steps on all three nodes. We install containerd from Docker’s official apt repository because it ships a more current version than the Ubuntu universe repository and receives faster security patches.

Start by installing prerequisite packages and adding Docker’s GPG key:

sudo apt-get update
sudo apt-get install -y ca-certificates curl gnupg

sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg

Add the Docker repository. Note that Ubuntu 24.04 uses the noble codename:

echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

Install only the containerd.io package – you do not need the full Docker Engine for Kubernetes:

sudo apt-get update
sudo apt-get install -y containerd.io

Configure containerd to Use systemd Cgroup Driver

kubeadm-built clusters default the kubelet to the systemd cgroup driver, so containerd must be configured to match. Generate the default containerd configuration and switch the runc runtime to the systemd driver:

sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml > /dev/null
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
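To see exactly what that sed edit does, you can run it against a minimal fragment mimicking the relevant section of the default config (a demo file below; the real target is /etc/containerd/config.toml):

```shell
# Demo fragment resembling the runc options section of the default config
cat > /tmp/containerd-demo.toml <<'EOF'
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = false
EOF

# Same substitution as above, applied to the demo file
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /tmp/containerd-demo.toml
grep SystemdCgroup /tmp/containerd-demo.toml
```

The grep confirms the flag flipped to true, which is the single change Kubernetes needs in an otherwise default config.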

Restart and enable containerd:

sudo systemctl restart containerd
sudo systemctl enable containerd

Verify it is running:

sudo systemctl status containerd

The output should show active (running). If containerd fails to start, check the config file syntax with containerd config dump.

Step 2 – Install kubeadm, kubelet, and kubectl

Run these commands on all three nodes. The Kubernetes project hosts its own apt repository at pkgs.k8s.io with version-specific channels. We will use the v1.32 channel.

Add the Kubernetes apt signing key and repository:

sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.32/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg

echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.32/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list

Install the packages and hold them at the current version to prevent unintended upgrades breaking your cluster:

sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl

Enable the kubelet service so it starts automatically after a reboot:

sudo systemctl enable kubelet

At this point kubelet will be crash-looping – that is expected. It will stabilize once kubeadm init completes and provides the necessary configuration.

Step 3 – Configure UFW Firewall Rules

Ubuntu uses UFW (Uncomplicated Firewall) as its default firewall management tool, not firewalld. If UFW is active on your nodes, you need to open the required Kubernetes ports.

Control Plane Node Firewall Rules

Run these on k8s-master01 only:

sudo ufw allow 6443/tcp        # Kubernetes API server
sudo ufw allow 2379:2380/tcp   # etcd server client API
sudo ufw allow 10250/tcp       # Kubelet API
sudo ufw allow 10259/tcp       # kube-scheduler
sudo ufw allow 10257/tcp       # kube-controller-manager
sudo ufw allow 179/tcp         # Calico BGP
sudo ufw allow 4789/udp        # Calico VXLAN
sudo ufw reload

Worker Node Firewall Rules

Run these on k8s-worker01 and k8s-worker02:

sudo ufw allow 10250/tcp        # Kubelet API
sudo ufw allow 10256/tcp        # kube-proxy
sudo ufw allow 30000:32767/tcp  # NodePort Services
sudo ufw allow 179/tcp          # Calico BGP
sudo ufw allow 4789/udp         # Calico VXLAN
sudo ufw reload

If UFW is not enabled on your nodes (check with sudo ufw status), you can skip this section. Many lab environments run without a host firewall, but production deployments should have these rules in place.

Step 4 – Initialize the Control Plane with kubeadm

Run this on the control plane node only (k8s-master01). We specify --pod-network-cidr=192.168.0.0/16 because that is the default CIDR that Calico expects. If your node LAN already uses the 192.168.0.0/16 range, change the pod CIDR to something else like 10.244.0.0/16 and adjust the Calico configuration accordingly.

sudo kubeadm init --pod-network-cidr=192.168.0.0/16 --kubernetes-version=stable-1.32

The initialization takes a few minutes. When it finishes, you will see output containing a kubeadm join command with a token. Copy and save this entire command – you will need it to join the worker nodes.

Set up kubectl access for your regular user account:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Verify the control plane is responding:

kubectl cluster-info

You should see the Kubernetes control plane address and CoreDNS listed. The node will show as NotReady until the CNI plugin is installed in the next step.

Step 5 – Install Calico CNI Plugin

Without a CNI plugin, pods cannot communicate across nodes and CoreDNS pods will stay in Pending state. We use Calico because it provides both networking and network policy enforcement. For a deeper look at Kubernetes networking options, see our comparison of Kubernetes CNI plugins.

Install the Tigera Calico operator and custom resource definitions:

kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.29.2/manifests/tigera-operator.yaml

Apply the custom resources manifest. If you used the default pod CIDR (192.168.0.0/16), no edits are needed; if you chose a different CIDR, download the file first and update the cidr field before applying:

kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.29.2/manifests/custom-resources.yaml

Watch the Calico pods come up in the calico-system namespace:

kubectl get pods -n calico-system -w

Wait until all Calico pods show Running status. This usually takes two to three minutes as the images are pulled. Once Calico is operational, check that the control plane node is now Ready:

kubectl get nodes

Step 6 – Join Worker Nodes to the Cluster

Run the kubeadm join command that was printed at the end of kubeadm init on each worker node. The command looks like this:

sudo kubeadm join 192.168.1.10:6443 --token <your-token> --discovery-token-ca-cert-hash sha256:<your-hash>

If you lost the join command or the token has expired (tokens are valid for 24 hours), generate a new one from the control plane:

sudo kubeadm token create --print-join-command

After joining both workers, verify the full cluster from the control plane:

kubectl get nodes -o wide

All three nodes should show Ready status. The INTERNAL-IP column should display each node’s LAN address, and the CONTAINER-RUNTIME column should show containerd:// followed by the version.
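If you script cluster health checks, the STATUS value is the second field of `kubectl get nodes --no-headers`. The snippet below runs the parsing against canned output so the logic is clear without a live cluster; pipe in real `kubectl` output on the control plane:

```shell
# Canned `kubectl get nodes --no-headers` output (one node still joining;
# hostnames and versions are sample values)
sample='k8s-master01   Ready      control-plane   12m   v1.32.1
k8s-worker01   Ready      <none>          4m    v1.32.1
k8s-worker02   NotReady   <none>          30s   v1.32.1'

# Count nodes whose STATUS field is anything other than Ready
not_ready=$(printf '%s\n' "$sample" | awk '$2 != "Ready"' | wc -l)
echo "$not_ready node(s) not Ready"
```

A count of zero is the condition to wait for before moving on to workloads.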

Step 7 – Deploy a Test Application

Create an nginx deployment with two replicas and expose it through a NodePort service to verify cross-node pod communication:

kubectl create deployment nginx-test --image=nginx:latest --replicas=2
kubectl expose deployment nginx-test --type=NodePort --port=80

Check that the pods are scheduled across different worker nodes:

kubectl get pods -o wide

Find the assigned NodePort:

kubectl get svc nginx-test

The PORT(S) column will show something like 80:31234/TCP. You can access the nginx welcome page by opening http://192.168.1.11:31234 or any node IP with that port in a browser. If the page loads, your cluster networking is fully operational.
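The NodePort is the number between the colon and the slash in that column, and plain shell parameter expansion extracts it. The value below is a hypothetical example; substitute the PORT(S) value and node IP from your own cluster:

```shell
# Example PORT(S) value copied from `kubectl get svc` output (hypothetical)
ports='80:31234/TCP'

# Strip everything up to the colon, then the /TCP suffix
nodeport=${ports#*:}
nodeport=${nodeport%/*}
echo "http://192.168.1.11:$nodeport"
```

This is handy when wiring the smoke test into a script, where clicking through a browser is not an option.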

Clean up the test deployment when done:

kubectl delete deployment nginx-test
kubectl delete svc nginx-test

Step 8 – Install Kubernetes Dashboard (Optional)

The Kubernetes Dashboard provides a web-based UI for managing workloads, viewing logs, and monitoring cluster resources. It is entirely optional but helpful for teams that prefer a graphical interface. For a complete walkthrough, see our dedicated Kubernetes Dashboard installation guide.

Deploy the dashboard using Helm, the recommended method for Dashboard v7+ (install Helm first if it is not already on your workstation):

helm repo add kubernetes-dashboard https://kubernetes.github.io/dashboard/
helm upgrade --install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard --create-namespace --namespace kubernetes-dashboard

Create a service account and cluster role binding for dashboard access:

cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard
EOF

Generate a login token:

kubectl -n kubernetes-dashboard create token admin-user

Access the dashboard by running kubectl proxy from your workstation and opening the dashboard URL in your browser, or by exposing it via a NodePort or Ingress depending on your environment.

Install metrics-server

The kubectl top command and the Dashboard’s resource graphs require metrics-server. Install it with:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

In self-signed or internal certificate environments, you may need to add the --kubelet-insecure-tls argument to the metrics-server deployment:

kubectl -n kube-system patch deployment metrics-server --type='json' -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--kubelet-insecure-tls"}]'

After a minute or two, verify that metrics are available:

kubectl top nodes
kubectl top pods -A

Troubleshooting Common Issues

kubelet Fails to Start

Check the kubelet logs for specific errors:

sudo journalctl -u kubelet -f

The most common cause on Ubuntu 24.04 is swap still being active. Confirm with free -h and make sure the /etc/fstab swap entry is commented out.

Nodes Stuck in NotReady State

This almost always means the CNI plugin is not installed or not running correctly. Check the Calico pods:

kubectl get pods -n calico-system
kubectl get pods -n tigera-operator

If Calico pods are in CrashLoopBackOff or ImagePullBackOff, check that your nodes have internet access and can reach quay.io and docker.io container registries.

CoreDNS Pods Stuck in Pending

CoreDNS will not schedule until a CNI is installed. If you have Calico running and CoreDNS is still Pending, describe the pods to check for scheduling issues:

kubectl describe pods -n kube-system -l k8s-app=kube-dns

kubeadm init Fails with Port Already in Use

If you are re-running kubeadm init after a failed attempt, reset the node first:

sudo kubeadm reset -f
sudo rm -rf /etc/cni/net.d
sudo iptables -F && sudo iptables -t nat -F && sudo iptables -t mangle -F && sudo iptables -X

Then run kubeadm init again.

Token Expired When Joining Workers

Bootstrap tokens created during kubeadm init expire after 24 hours. Generate a fresh token and join command:

sudo kubeadm token create --print-join-command

containerd Fails After Config Changes

If you edited /etc/containerd/config.toml and containerd will not start, regenerate a clean config and reapply the systemd cgroup change:

containerd config default | sudo tee /etc/containerd/config.toml > /dev/null
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo systemctl restart containerd

UFW Blocking Cluster Communication

If nodes appear Ready but pods cannot communicate across nodes, UFW may be silently dropping traffic. Temporarily disable UFW to test:

sudo ufw disable

If the problem resolves, re-enable UFW and add the missing rules from Step 3. Pay attention to the Calico VXLAN port (4789/udp) and BGP port (179/tcp) – these are frequently forgotten.

Summary

You now have a fully functional three-node Kubernetes 1.32 cluster running on Ubuntu 24.04 LTS with containerd as the runtime and Calico handling pod networking. From here you can deploy workloads, configure Ingress controllers, set up persistent storage, or add more worker nodes by running the kubeadm join command on additional Ubuntu machines.

Key points to remember for ongoing maintenance:

  • Kubernetes packages are held with apt-mark hold – when you are ready to upgrade, unhold them, update the repository channel to the next minor version, and follow the official kubeadm upgrade procedure.
  • The containerd configuration at /etc/containerd/config.toml must keep SystemdCgroup = true – resetting this will break your cluster.
  • Calico’s operator-based deployment makes upgrades straightforward – update the operator manifest URL to the newer version and reapply.
  • Back up your etcd data regularly. On a single control plane setup, losing the master node means losing the entire cluster state.

82 COMMENTS

  1. Several issues in these steps. Trying to follow with containerd setup – an example error is no step to create /var/lib/kubelet/config.yaml. Another, it enters into a root bash but doesn’t exit. And kubeadm config images pull leads to ‘unknown service runtime.v1alpha2.ImageService’ error.

  2. This not working:
    sudo kubeadm config images pull
    failed to pull image “k8s.gcr.io/kube-apiserver:v1.20.5″: output: time=”2021-03-25T12:37:30Z” level=fatal msg=”pulling image failed: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService”
    , error: exit status 1
    To see the stack trace of this error execute with –v=5 or higher

    • Hi Guido, Because the command install kubelet doesn’t determine the version, if you follow command of this guide, it will install the last version of kubelet, it doesn’t support on Ubuntu 20.4.
      Replace command install kubelet, kubeadm, kubectl as below:
      apt update && apt install -y vim git curl wget kubeadm=1.22.1-00 kubelet=1.22.1-00 kubectl=1.22.1-00

      Hope it’s helpful for you.

    • Hi Guido, Because the command install kubelet doesn’t determine the version, if you follow command of this guide, it will install the last version of kubelet, it doesn’t support on Ubuntu 20.4.
      Replace command install kubelet, kubeadm, kubectl as below:
      apt update && apt install -y vim git curl wget kubeadm=1.22.1-00 kubelet=1.22.1-00 kubectl=1.22.1-00

      Hope it is helpful for you.

  3. after reproducing the steps mentioned I found out that kubectl get nodes does not show the nodes
    and shows the error The connection to the server localhost:8080 was refused – did you specify the right host or port?
    Any help asap would be highly appreciated

    • Follow below commands and execute the “kubectl get nodes” using non-root user
      mkdir -p $HOME/.kube
      sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
      sudo chown $(id -u):$(id -g) $HOME/.kube/config

  4. Some issues getting this to work. Here are the items I changed to get it going:

    First, there was an issue getting containerd to start. The issue was that the config was not saved properly. In the “Installing containerd” section above, there was a missing `>` in the following command (I really hope my pre and /pre tags work):

    containerd config default /etc/containerd/config.toml

    It needs to be:

    containerd config default > /etc/containerd/config.toml

    Then there was an issue with flannel but that resolved itself:
    https://github.com/flannel-io/flannel/issues/1482

    Then, once the control plane node was up, the worker nodes would not go to ready state. I resolved this by setting the containerd.default_runtime (Again, I really hope my pre and /pre tags work):

    [plugins.”io.containerd.grpc.v1.cri”.containerd.default_runtime]
    runtime_type = “io.containerd.runtime.v1.linux”

    Hopefully this helps someone else…

  5. Where does the IP of “cluster endpoint DNS name” come from? You’re using 172.29.20.5. My controlplane’s IP is 172.17.0.1. Should I use that?

  6. Hey Everyone,

    Great article, however I have a question where are the steps for installing kubenetes on the worker nodes do you follow the same procedure as the Controller Node? Were do you stop to get the worker node properly funtional?

    Sorry I am new to kubernetes
    Thanks,
    Michael

  7. Following your instructions completely using the CRIO install, no pods can access the internet. The master comes up fine and the nodes come up fine and I can do deployments and everything.

    pods just can not access the internet. I tried to setup jenkins and it was not able to access the internet to install the plugins

    Something is flawed in your instructions

    • Hi Chris.

      I noted the issue was with CRIO subnet defined in /etc/cni/net.d/100-crio-bridge.conf config file. It is different from Pod network defined when bootstrapping k8s cluster.

      You can retry with updated article. Basically change the subnet with command below:

      sudo sed -i ‘s/10.85.0.0/192.168.0.0/g’ /etc/cni/net.d/100-crio-bridge.conf

      Then restart crio service and bootstrap cluster. I hope this helps.

      • my home network is already running on 192.168.0 subnet. I thought you could not use a pod network the same as an already in use subnet?

      • sudo sed -i ‘s/10.85.0.0/192.168.0.0/g’ /etc/cni/net.d/100-crio-bridge.conf
        sed: can’t read /etc/cni/net.d/100-crio-bridge.conf: No such file or directory

        root@kube-master:~# more /etc/cni/net.d/100-crio-bridge.conf
        more: stat of /etc/cni/net.d/100-crio-bridge.conf failed: No such file or directory

  8. I tried all 3 container management types in this tutorial and none of the pods can access the internet. They can talk to each other but nothing in the outside world. I installed Jenkins in kubernetes and I need to install the plugins but can’t because if this issue. Please advise

  9. I am confused by setting the IP for cluster endpoint.
    The below is my network. What IP can i use for endpoint?

    4: docker0: mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:71:4d:87:43 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
    valid_lft forever preferred_lft forever

  10. Clear article!

    But when I run:
    sudo kubeadm config images pull –cri-socket /var/run/docker.sock

    Then I get this response:
    failed to pull image “k8s.gcr.io/kube-apiserver:v1.22.4″: output: time=”2021-12-02T20:00:52Z” level=fatal msg=”connect: connect endpoint ‘unix:///var/run/docker.sock’, make sure you are running as root and the endpoint has been started: context deadline exceeded”
    , error: exit status 1
    To see the stack trace of this error execute with –v=5 or higher

    Does anyone know how to fix this?

    All the previous steps were executed succesfully.

      • Hi Josphat, Thank for writing this article. But I am facing similiar issue. Docker service is up for me but still not able to connect to this CRI-socket

        user “test” has been added to group docker and changes applied with “newGrp docker” command.

        K8s Version : 1.22
        test@l2030017515:~$ docker version
        Client: Docker Engine – Community
        Version: 19.03.15
        API version: 1.40
        Go version: go1.13.15
        Git commit: 99e3ed8919
        Built: Sat Jan 30 03:17:01 2021
        OS/Arch: linux/amd64
        Experimental: false

        Server: Docker Engine – Community
        Engine:
        Version: 19.03.15
        API version: 1.40 (minimum version 1.12)
        Go version: go1.13.15
        Git commit: 99e3ed8919
        Built: Sat Jan 30 03:15:30 2021
        OS/Arch: linux/amd64
        Experimental: false
        containerd:
        Version: 1.4.13
        GitCommit: 9cc61520f4cd876b86e77edfeb88fbcd536d1f9d
        runc:
        Version: 1.0.3
        GitCommit: v1.0.3-0-gf46b6ba
        docker-init:
        Version: 0.18.0
        GitCommit: fec3683
        test@l2030017515:~$

        test@l2030017515:~$ sudo systemctl status docker
        ● docker.service – Docker Application Container Engine
        Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
        Active: active (running) since Tue 2022-03-08 13:23:10 IST; 18s ago
        TriggeredBy: ● docker.socket
        Docs: https://docs.docker.com
        Main PID: 589981 (dockerd)
        Tasks: 14
        Memory: 39.1M
        CGroup: /system.slice/docker.service
        └─589981 /usr/bin/dockerd -H fd:// –containerd=/run/containerd/containerd.sock

        Mar 08 13:23:09 l2030017515 dockerd[589981]: time=”2022-03-08T13:23:09.549821158+05:30″ level=warning msg=”Your kernel does not support cgroup rt runtime”
        Mar 08 13:23:09 l2030017515 dockerd[589981]: time=”2022-03-08T13:23:09.549833221+05:30″ level=warning msg=”Your kernel does not support cgroup blkio weight”
        Mar 08 13:23:09 l2030017515 dockerd[589981]: time=”2022-03-08T13:23:09.549844854+05:30″ level=warning msg=”Your kernel does not support cgroup blkio weight_device”
        Mar 08 13:23:09 l2030017515 dockerd[589981]: time=”2022-03-08T13:23:09.550067148+05:30″ level=info msg=”Loading containers: start.”
        Mar 08 13:23:09 l2030017515 dockerd[589981]: time=”2022-03-08T13:23:09.930384654+05:30″ level=info msg=”Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option>
        Mar 08 13:23:10 l2030017515 dockerd[589981]: time=”2022-03-08T13:23:10.065481959+05:30″ level=info msg=”Loading containers: done.”
        Mar 08 13:23:10 l2030017515 dockerd[589981]: time=”2022-03-08T13:23:10.431063768+05:30″ level=info msg=”Docker daemon” commit=99e3ed8919 graphdriver(s)=overlay2 version=19.03.15
        Mar 08 13:23:10 l2030017515 dockerd[589981]: time=”2022-03-08T13:23:10.431230178+05:30″ level=info msg=”Daemon has completed initialization”
        Mar 08 13:23:10 l2030017515 systemd[1]: Started Docker Application Container Engine.
        Mar 08 13:23:10 l2030017515 dockerd[589981]: time=”2022-03-08T13:23:10.602166953+05:30″ level=info msg=”API listen on /run/docker.sock”

        Error :

        test@l2030017515:~$ sudo kubeadm init –apiserver-advertise-address=$IP –pod-network-cidr=192.168.0.0/16 –cri-socket /run/docker.sock –upload-certs
        I0308 13:23:46.930188 590324 version.go:255] remote version is much newer: v1.23.4; falling back to: stable-1.22
        [init] Using Kubernetes version: v1.22.7
        [preflight] Running pre-flight checks
        error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR CRI]: container runtime is not running: output: time=”2022-03-08T13:23:52+05:30″ level=fatal msg=”connect: connect endpoint ‘unix:///run/docker.sock’, make sure you are running as root and the endpoint has been started: context deadline exceeded”
        , error: exit status 1
        [preflight] If you know what you are doing, you can make a check non-fatal with `–ignore-preflight-errors=…`
        To see the stack trace of this error execute with –v=5 or higher

        Would be great of you suggest a resolution for the same.

        br

      • sysadmin@sysadmin-virtual-machine:~$ systemctl status docker
        ● docker.service – Docker Application Container Engine
        Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
        Active: active (running) since Fri 2022-04-15 13:56:06 WIB; 22min ago
        TriggeredBy: ● docker.socket
        Docs: https://docs.docker.com
        Main PID: 8624 (dockerd)
        Tasks: 19
        Memory: 44.3M
        CGroup: /system.slice/docker.service
        └─8624 /usr/bin/dockerd -H fd:// –containerd=/run/containerd/containerd.sock

        Apr 15 13:56:18 sysadmin-virtual-machine dockerd[8624]: time=”2022-04-15T13:56:18.628776532+07:00″ level=info msg=”ignoring event” container=97a7af489a4131d6ccd4022ab9>
        Apr 15 13:56:21 sysadmin-virtual-machine dockerd[8624]: time=”2022-04-15T13:56:21.884168954+07:00″ level=info msg=”ignoring event” container=e98e6eefe307e1161257876fbf>
        Apr 15 13:56:21 sysadmin-virtual-machine dockerd[8624]: time=”2022-04-15T13:56:21.920915826+07:00″ level=info msg=”ignoring event” container=7b98046840f84ac9f6148e3670>
        Apr 15 13:56:25 sysadmin-virtual-machine dockerd[8624]: time=”2022-04-15T13:56:25.571995048+07:00″ level=info msg=”ignoring event” container=b27cd9e8045fdb6145b4afd634>
        Apr 15 13:56:25 sysadmin-virtual-machine dockerd[8624]: time=”2022-04-15T13:56:25.614309354+07:00″ level=info msg=”ignoring event” container=8a11505137a45fddf3df7c2cec>
        Apr 15 13:56:28 sysadmin-virtual-machine dockerd[8624]: time=”2022-04-15T13:56:28.878086757+07:00″ level=info msg=”ignoring event” container=4ec09c5edf4939b4f567b82d0d>
        Apr 15 13:56:29 sysadmin-virtual-machine dockerd[8624]: time=”2022-04-15T13:56:29.892410616+07:00″ level=info msg=”ignoring event” container=00207f98f2cb6276e58d9509bc>
        Apr 15 13:56:31 sysadmin-virtual-machine dockerd[8624]: time=”2022-04-15T13:56:31.952429432+07:00″ level=info msg=”ignoring event” container=f218022d2f2c50fb3a63401f29>
        Apr 15 13:56:32 sysadmin-virtual-machine dockerd[8624]: time=”2022-04-15T13:56:32.992268870+07:00″ level=info msg=”ignoring event” container=a29f524abb16b085cdbd4af591>
        Apr 15 13:56:35 sysadmin-virtual-machine dockerd[8624]: time=”2022-04-15T13:56:35.043274360+07:00″ level=info msg=”ignoring event” container=8c3a0f063ddcfc5e4f2cb606a3>
        lines 1-21/21 (END)
        ^C
        sysadmin@sysadmin-virtual-machine:~$ sudo kubeadm config images pull –cri-socket /var/run/docker.sock
        [sudo] password for sysadmin:
        failed to pull image “k8s.gcr.io/kube-apiserver:v1.23.5″: output: time=”2022-04-15T14:19:22+07:00″ level=fatal msg=”connect: connect endpoint ‘unix:///var/run/docker.sock’, make sure you are running as root and the endpoint has been started: context deadline exceeded”
        , error: exit status 1
        To see the stack trace of this error execute with –v=5 or higher
        sysadmin@sysadmin-virtual-machine:~$

        still same problem, docker status “running”

  11. The certificate created in the init command is invalid and flagged by Chrome/Safari when trying to progress through step 10 installing the dashboard.

  12. I ran into this problem. See console log, docker is installed and running:

    jacob@lazykub1:~$ sudo kubeadm config images pull –cri-socket /var/run/docker.sock
    failed to pull image “k8s.gcr.io/kube-apiserver:v1.23.1″: output: time=”2022-01-16T14:14:45Z” level=fatal msg=”connect: connect endpoint ‘unix:///var/run/docker.sock’, make sure you are running as root and the endpoint has been started: context deadline exceeded”
    , error: exit status 1
    To see the stack trace of this error execute with –v=5 or higher
    jacob@lazykub1:~$ sudo systemctl status docker
    ● docker.service – Docker Application Container Engine
    Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
    Active: active (running) since Sun 2022-01-16 14:14:21 UTC; 3min 23s ago
    TriggeredBy: ● docker.socket
    Docs: https://docs.docker.com
    Main PID: 4187 (dockerd)
    Tasks: 8
    Memory: 33.7M
    CGroup: /system.slice/docker.service
    └─4187 /usr/bin/dockerd -H fd:// –containerd=/run/containerd/containerd.sock

    Jan 16 14:14:20 lazykub1 dockerd[4187]: time=”2022-01-16T14:14:20.947413897Z” level=warning msg=”Your kernel does not support CPU realtime scheduler”
    Jan 16 14:14:20 lazykub1 dockerd[4187]: time=”2022-01-16T14:14:20.947421586Z” level=warning msg=”Your kernel does not support cgroup blkio weight”
    Jan 16 14:14:20 lazykub1 dockerd[4187]: time=”2022-01-16T14:14:20.947428108Z” level=warning msg=”Your kernel does not support cgroup blkio weight_device”
    Jan 16 14:14:20 lazykub1 dockerd[4187]: time=”2022-01-16T14:14:20.947609517Z” level=info msg=”Loading containers: start.”
    Jan 16 14:14:21 lazykub1 dockerd[4187]: time=”2022-01-16T14:14:21.077788214Z” level=info msg=”Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option –bip can be used to set a preferred IP address”
    Jan 16 14:14:21 lazykub1 dockerd[4187]: time=”2022-01-16T14:14:21.111888405Z” level=info msg=”Loading containers: done.”
    Jan 16 14:14:21 lazykub1 dockerd[4187]: time=”2022-01-16T14:14:21.133944179Z” level=info msg=”Docker daemon” commit=459d0df graphdriver(s)=overlay2 version=20.10.12
    Jan 16 14:14:21 lazykub1 dockerd[4187]: time=”2022-01-16T14:14:21.134263733Z” level=info msg=”Daemon has completed initialization”
    Jan 16 14:14:21 lazykub1 systemd[1]: Started Docker Application Container Engine.
    Jan 16 14:14:21 lazykub1 dockerd[4187]: time=”2022-01-16T14:14:21.154710559Z” level=info msg=”API listen on /run/docker.sock”
    jacob@lazykub1:~$

  13. swap statement is not working for me.
    “sudo sed -i ‘/ swap / s/^\(.*\)$/#\1/g’ /etc/fstab”
    It does not comment out the swap line

    # /etc/fstab: static file system information.
    #
    # Use ‘blkid’ to print the universally unique identifier for a
    # device; this may be used with UUID= as a more robust way to name devices
    # that works even if disks are added and removed. See fstab(5).
    #
    #
    # / was on /dev/ubuntu-vg/ubuntu-lv during curtin installation
    /dev/disk/by-id/dm-uuid-LVM-LMZ6lL63Lq08X4WdpxYG1q2BXTyEcLs2cS2oIdfaWHUALXM3rHgG4j26DG8Hys2M / ext4 defaults 0 1
    # /boot was on /dev/sda2 during curtin installation
    /dev/disk/by-uuid/d13a56c4-40f8-4994-8b23-06b8f2785f5d /boot ext4 defaults 0 1
    /swap.img none swap sw 0 0
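    A possible fix, sketched below. The `/swap.img none swap sw 0 0` entry does have a space on each side of the word `swap`, so the pattern itself should match; the usual culprit is typographic quotes picked up when the command is copied from a rendered web page. With plain ASCII quotes (and `\s` so tabs also match) it works, demonstrated here against a throwaway copy of the fstab rather than the real `/etc/fstab`:

    ```shell
    # Test fixture: the relevant lines from the fstab pasted above.
    cat > /tmp/fstab.test <<'EOF'
    /dev/disk/by-uuid/d13a56c4-40f8-4994-8b23-06b8f2785f5d /boot ext4 defaults 0 1
    /swap.img none swap sw 0 0
    EOF

    # Comment out any line whose "swap" field is surrounded by whitespace.
    # Note the plain single quotes; curly quotes break the sed expression.
    sed -i '/\sswap\s/ s/^/#/' /tmp/fstab.test

    cat /tmp/fstab.test
    ```

    After this, the `/swap.img` line is commented out while the `/boot` line is untouched.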

  14. Thank you for the tutorial!!!

    After some tries finally seems I got it working, now I’m looking how to integrate on GitLab.

    A quick hint for those whose coredns pods fail or never get ready:

    Don't change the IP range. The post says it's possible to customize it and use sed to change it after install, but every time I did that the pod never started. Reading the file:

    https://docs.projectcalico.org/manifests/custom-resources.yaml

    There’s the note:

    # Note: The ipPools section cannot be modified post-install.

    And the default range is 192.168.0.0/16

  15. sudo sed -i 's/10.85.0.0/192.168.0.0/g' /etc/cni/net.d/100-crio-bridge.conf
    sed: can’t read /etc/cni/net.d/100-crio-bridge.conf: No such file or directory

  16. Also seeing the following error which then stops anything further on working:

    sudo kubeadm config images pull --cri-socket /var/run/docker.sock
    failed to pull image "k8s.gcr.io/kube-apiserver:v1.23.3": output: time="2022-01-31T16:26:47Z" level=fatal msg="connect: connect endpoint 'unix:///var/run/docker.sock', make sure you are running as root and the endpoint has been started: context deadline exceeded"
    , error: exit status 1
    To see the stack trace of this error execute with --v=5 or higher

    any ideas?

  17. Hello
    I have an issue with this tutorial: my install is on 1.24.0. Everything works well, but I have a problem with service account tokens. No token is generated when I create a SA, and there is no token in the default SA. Can you help me?

    Thx

  18. Hello, I have an issue with this guide.
    I have successfully installed my cluster with containerd and Calico.
    Everything works well except one thing:
    I don't have any tokens generated for service accounts. When I type this command:
    kubectl get sa
    the default SA is present but with 0 secrets, and the token section is empty.
    It's the same when I try to create a new SA: no token.
    Can I have help please?

    Thanks

  19. If you need to specify a pod CIDR other than 192.168.0.0/16, you need to edit the custom-resources.yaml for Calico before applying it.
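    A sketch of that edit, assuming the Tigera operator's custom-resources.yaml with its stock `cidr: 192.168.0.0/16` field (the excerpt below mimics the manifest; 10.244.0.0/16 stands in for whatever range you passed to kubeadm init):

    ```shell
    # Excerpt of custom-resources.yaml (the ipPools section), as a stand-in
    # for the full manifest downloaded from the Calico docs.
    cat > /tmp/custom-resources.yaml <<'EOF'
    apiVersion: operator.tigera.io/v1
    kind: Installation
    metadata:
      name: default
    spec:
      calicoNetwork:
        ipPools:
        - blockSize: 26
          cidr: 192.168.0.0/16
          encapsulation: VXLANCrossSubnet
    EOF

    # Swap in the CIDR that was given to kubeadm init (example value).
    sed -i 's#cidr: 192.168.0.0/16#cidr: 10.244.0.0/16#' /tmp/custom-resources.yaml

    grep 'cidr:' /tmp/custom-resources.yaml   # verify before applying
    # then: kubectl create -f /tmp/custom-resources.yaml
    ```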

  20. Hi,
    A great post, which helped me a lot.
    I have been trying to host k8s on AWS EC2 instances (a task).

    I have encountered two issues.

    1# At the worker node:

    Found multiple CRI endpoints on the host. Please define which one do you wish to use by setting the 'criSocket' field in the kubeadm configuration file: unix:///var/run/containerd/containerd.sock, unix:///var/run/cri-dockerd.sock
    To see the stack trace of this error execute with --v=5 or higher

    For this I have tried
    sudo kubeadm join {ip address} --token xgr4i8.kokoqjxo6yvs3d19 --discovery-token-ca-cert-hash sha256:d1b4e8ea7a87ae6b91198a2c0ac85663c48b2f2f9181d7075d13294336684379
    --cri-socket /run/cri-dockerd.sock

    but my command is getting stuck.

    2#
    [preflight] Running pre-flight checks
    [WARNING SystemVerification]: missing optional cgroups: blkio

    With the above message it's stuck; help me please.

  21. I installed the K8s cluster (1 master, 2 workers) using this guide. I used Docker as the container runtime. However, when I create Pods, the Pods on different worker nodes get overlapping IPs. A Pod on one node can't talk to a Pod on another node.

  22. Thank you very much.
    If I could add something to make this perfect, it would be details on how to generate the join token for adding new workers.
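    In reply: on the control plane, `kubeadm token create --print-join-command` prints a complete join command with a fresh bootstrap token. The sha256 hash it embeds can also be derived by hand from the cluster CA; a sketch below, run against a throwaway self-signed certificate instead of the real /etc/kubernetes/pki/ca.crt so the recipe can run anywhere:

    ```shell
    # Normally you would hash the real cluster CA:
    #   /etc/kubernetes/pki/ca.crt
    # Here we generate a throwaway CA for demonstration.
    openssl req -x509 -newkey rsa:2048 -nodes -subj '/CN=kubernetes' \
      -keyout /tmp/ca.key -out /tmp/ca.crt -days 1 2>/dev/null

    # Standard recipe for --discovery-token-ca-cert-hash: sha256 of the
    # DER-encoded public key of the CA certificate.
    openssl x509 -pubkey -in /tmp/ca.crt \
      | openssl rsa -pubin -outform der 2>/dev/null \
      | openssl dgst -sha256 -hex \
      | sed 's/^.* //'
    ```

    On a real control plane, `sudo kubeadm token create --print-join-command` does both steps for you and prints a ready-to-paste `kubeadm join ...` line.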

  23. I followed this guide (docker runtime) with an additional step for Calico:

    kubectl apply -f https://projectcalico.docs.tigera.io/v3.23/manifests/calico.yaml

    An outstanding issue I have is that even though I specified a pod network with kubeadm init, the pods are getting assigned addresses from the docker0 network (172.17.0.0/16) instead of the pod network.

    Any ideas how to resolve? I found a few similar issues on Stackoverflow but haven’t been able to resolve mine.

  24. I have a problem where my Docker containers cannot make external calls.
    I built a simple application which makes an API call, and it says "resource temporarily unavailable". After exec'ing into the container and running apt-get update, it says it cannot connect to deb.
    Any help with this? I set up the cluster as described above.

  25. Very nice article!
    I did manage to set up k8s, but recently I started getting "resource temporarily unavailable" errors in my pods when making external HTTP calls. Does anyone have a solution?

    btw, I tried to exec into the container and run "apt-get update", but it says I cannot connect to deb.

  26. After installing the CRI-O runtime and using Calico as suggested, my calico-kube-controller is stuck in a Pending status.

    NAMESPACE         NAME                                       READY   STATUS    RESTARTS   AGE
    calico-system     calico-kube-controllers-657d56796-bvh25    0/1     Pending   0          30m
    calico-system     calico-node-tdkjf                          1/1     Running   0          30m
    calico-system     calico-typha-89d87cb5c-fn887               1/1     Running   0          30m
    kube-system       coredns-6d4b75cb6d-dbbmk                   1/1     Running   0          31m
    kube-system       coredns-6d4b75cb6d-twlkk                   1/1     Running   0          31m
    kube-system       etcd-res-u20-template                      1/1     Running   0          32m
    kube-system       kube-apiserver-res-u20-template            1/1     Running   0          32m
    kube-system       kube-controller-manager-res-u20-template   1/1     Running   0          32m
    kube-system       kube-proxy-cxmvt                           1/1     Running   0          31m
    kube-system       kube-scheduler-res-u20-template            1/1     Running   0          32m
    tigera-operator   tigera-operator-6995cc5df5-4w9rt           1/1     Running   0          30m

    The only thing in the log that drew my attention was
    "E0808 23:12:25.258702 1 disruption.go:534] Error syncing PodDisruptionBudget calico-system/calico-typha, requeuing: Operation cannot be fulfilled on poddisruptionbudgets.policy "calico-typha": the object has been modified; please apply your changes to the latest version and try again"

    Has anyone seen this issue? suggestions?

  27. After installing the CRI-O runtime and the Calico network plugin, the calico-kube-controllers stay in a `Pending` state.

    senften@res-k8s-master:~$ kubectl get pods --all-namespaces
    NAMESPACE         NAME                                      READY   STATUS    RESTARTS   AGE
    calico-system     calico-kube-controllers-657d56796-qzhxw   0/1     Pending   0          17h
    calico-system     calico-node-62k9g                         1/1     Running   0          17h
    calico-system     calico-typha-8fdfcb55c-k6zcq              1/1     Running   0          17h
    kube-system       coredns-6d4b75cb6d-bknbp                  1/1     Running   0          17h
    kube-system       coredns-6d4b75cb6d-jkfk7                  1/1     Running   0          17h
    kube-system       etcd-res-k8s-master                       1/1     Running   0          17h
    kube-system       kube-apiserver-res-k8s-master             1/1     Running   0          17h
    kube-system       kube-controller-manager-res-k8s-master    1/1     Running   0          17h
    kube-system       kube-proxy-jht5p                          1/1     Running   0          17h
    kube-system       kube-scheduler-res-k8s-master             1/1     Running   0          17h
    tigera-operator   tigera-operator-6995cc5df5-j64s6          1/1     Running   0          17h

    I think it may be related to this bug, [calico-kube-controllers pending since not tolerate the “node-role.kubernetes.io/control-plane” · Issue #6087 · projectcalico/calico](https://github.com/projectcalico/calico/issues/6087), but, after reading through the issue, I am still uncertain how to proceed or even sure this is the issue I’ve run into.

    Any advice? Anyone else seeing this?
    Thanks

  28. While running this command
    sudo kubeadm config images pull --cri-socket unix:///run/cri-dockerd.sock
    I'm getting this error:
    failed to pull image "registry.k8s.io/kube-apiserver:v1.25.2": output: time="2022-10-13T11:52:55+05:30" level=fatal msg="unable to determine image API version: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /run/cri-dockerd.sock: connect: no such file or directory\""
    , error: exit status 1
    To see the stack trace of this error execute with --v=5 or higher

  29. I don't know why I'm not able to create the file for Docker at the given path; I'm following all the steps.
    Whenever I run this command: sudo kubeadm config images pull --cri-socket unix:///run/cri-dockerd.sock
    I'm getting the following error: failed to pull image "registry.k8s.io/kube-apiserver:v1.25.2": output: time="2022-10-13T12:29:44+05:30" level=fatal msg="unable to determine image API version: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /run/cri-dockerd.sock: connect: no such file or directory\""
    , error: exit status 1
    To see the stack trace of this error execute with --v=5 or higher

    Any solution??
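    A quick preflight for this class of error (a sketch; assumes the stock cri-dockerd socket path, and that the usual remedy on systemd installs is `sudo systemctl enable --now cri-docker.service cri-docker.socket`):

    ```shell
    # Verify the CRI socket kubeadm is being pointed at actually exists
    # before running "kubeadm config images pull --cri-socket ...".
    SOCK=/run/cri-dockerd.sock
    if [ -S "$SOCK" ]; then
      echo "socket present: $SOCK"
    else
      echo "socket missing: $SOCK (is cri-docker.socket enabled and started?)"
    fi
    ```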

  30. Thanks for your blogs; they help a lot and relieve some stress.
    I'm a beginner to all this. What is Docker used for? Does Kubernetes run on top of Docker, or does a Docker container run inside a Pod? Please help me out, I'm confused about all of these pieces.

  31. I followed your guide :

    Host OS: Ubuntu 22.04
    CNI and version: Flannel latest
    CRI and version: CRI-O latest

    kubeadm init --v=5 --pod-network-cidr=10.244.0.0/16 --upload-certs --control-plane-endpoint=k8s.mydomain.com

    The cluster initialized successfully, but the coredns pods cannot start.

    kubectl version --short
    Client Version: v1.26.2
    Kustomize Version: v4.5.7
    Server Version: v1.26.2

    kubectl cluster-info
    Kubernetes control plane is running at https://k8s.mydomain.com:6443
    CoreDNS is running at https://k8s.mydomain.com:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

    kubectl get pod -n kube-system
    NAME                                       READY   STATUS                 RESTARTS   AGE
    coredns-787d4945fb-g8hhj                   0/1     CreateContainerError   0          7m1s
    coredns-787d4945fb-nkssp                   0/1     CreateContainerError   0          6m29s
    etcd-k8s-52ts-master1                      1/1     Running                1          12h
    kube-apiserver-k8s-52ts-master1            1/1     Running                1          12h
    kube-controller-manager-k8s-52ts-master1   1/1     Running                1          12h
    kube-proxy-m4mhc                           1/1     Running                1          12h
    kube-proxy-rrdpb                           1/1     Running                0          12h
    kube-scheduler-k8s-52ts-master1            1/1     Running                1          12h

    kubectl describe pod coredns-787d4945fb-g8hhj -n kube-system
    Events:
    Type     Reason     Age   From               Message
    ----     ------     ----  ----               -------
    Normal   Scheduled  65s   default-scheduler  Successfully assigned kube-system/coredns-787d4945fb-g8hhj to k8s-52ts-worker1
    Warning  Failed     64s   kubelet            Error: container create failed: time="2023-03-17T08:26:01+07:00" level=warning msg="unable to get oom kill count" error="openat2 /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podd45fe767_098b_401f_8da9_1a1760d804f6.slice/crio-105d630363803602c6bfe9c1516b128192b95c519ff321ed1177fe0cacdc9b42.scope/memory.events: no such file or directory"
    time="2023-03-17T08:26:01+07:00" level=error msg="container_linux.go:380: starting container process caused: exec: \"/coredns\": stat /coredns: no such file or directory"
    Warning  Failed     63s   kubelet            Error: container create failed: time="2023-03-17T08:26:02+07:00" level=warning msg="unable to get oom kill count" error="openat2 /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podd45fe767_098b_401f_8da9_1a1760d804f6.slice/crio-2bbebc0f6e74a3c5a8f5e460e758e322a94ffe97964f34e357d831ce3fced345.scope/memory.events: no such file or directory"
    time="2023-03-17T08:26:02+07:00" level=error msg="container_linux.go:380: starting container process caused: exec: \"/coredns\": stat /coredns: no such file or directory"
    Warning  Failed     47s   kubelet            Error: container create failed: time="2023-03-17T08:26:18+07:00" level=error msg="container_linux.go:380: starting container process caused: exec: \"/coredns\": stat /coredns: no such file or directory"
    Warning  Failed     34s   kubelet            Error: container create failed: time="2023-03-17T08:26:31+07:00" level=error msg="container_linux.go:380: starting container process caused: exec: \"/coredns\": stat /coredns: no such file or directory"
    Warning  Failed     19s   kubelet            Error: container create failed: time="2023-03-17T08:26:46+07:00" level=error msg="container_linux.go:380: starting container process caused: exec: \"/coredns\": stat /coredns: no such file or directory"
    Normal   Pulled     9s (x6 over 64s)  kubelet  Container image "registry.k8s.io/coredns/coredns:v1.9.3" already present on machine

  32. Hi,
    Thanks for this great tutorial. It's really useful and has helped me bootstrap multiple clusters since 2022.
    I want to mention two points and hope they will help someone.
    1 - For containerd, the old Debian package (1.4) does not work with v1.25 and higher; the version that works is 1.6. Here is one of the related errors:
    failed to pull image "registry.k8s.io/kube-apiserver:v1.25.10": output: time="2023-05-21T10:16:39Z" level=fatal msg="validate service connection: CRI v1 image API is not implemented for endpoint \"unix:///var/run/containerd/containerd.sock\": rpc error: code = Unimplemented desc = unknown service runtime.v1.ImageService"

    2 - The URLs in the Calico setup section are not working anymore; the updated versions can be found here:
    https://docs.tigera.io/calico/latest/getting-started/kubernetes/self-managed-onprem/onpremises

  33. Hi Josphat,
    Thanks for this guide, very detailed and helpful.
    Adding one more thing to the Calico installation part:
    curl https://raw.githubusercontent.com/projectcalico/calico/v3.25.1/manifests/custom-resources.yaml -O
    The custom-resources.yaml uses 192.168.0.0/16 as the default CIDR, but the cluster init command above uses --pod-network-cidr=172.24.0.0/16.
    If you want to deploy the Calico network plugin, you must manually update the CIDR in this YAML file to:
    cidr: 172.24.0.0/16
