This guide walks you through deploying a production-ready Kubernetes 1.32 cluster on Ubuntu 24.04 LTS using kubeadm. We will configure containerd as the container runtime, set up networking with Calico CNI, and verify the cluster with a test workload. Every command is tested on fresh Ubuntu 24.04 (Noble Numbat) minimal server installs.
Ubuntu 24.04 ships with a modern kernel (6.8+), systemd-resolved by default, and uses APT with signed-by keyrings for repository management. This guide is written specifically for the Debian/Ubuntu toolchain – if you are running Rocky Linux or AlmaLinux, refer to our Kubernetes cluster setup guide for RHEL-based systems instead.
Prerequisites
You need three Ubuntu 24.04 LTS servers with the following minimum specifications:
| Hostname | Role | IP Address | RAM | CPU |
|---|---|---|---|---|
| k8s-master01 | Control Plane | 192.168.1.10 | 4 GB | 2 vCPU |
| k8s-worker01 | Worker | 192.168.1.11 | 4 GB | 2 vCPU |
| k8s-worker02 | Worker | 192.168.1.12 | 4 GB | 2 vCPU |
All three nodes must have unique hostnames, MAC addresses, and product_uuid values. Ensure each node can reach the others over the network and has unrestricted outbound internet access for pulling images and packages.
Set Hostnames on All Nodes
Run the appropriate command on each node:
sudo hostnamectl set-hostname k8s-master01 # On control plane
sudo hostnamectl set-hostname k8s-worker01 # On first worker
sudo hostnamectl set-hostname k8s-worker02 # On second worker
Add all nodes to /etc/hosts on every machine so they can resolve each other by name:
cat <<EOF | sudo tee -a /etc/hosts
192.168.1.10 k8s-master01
192.168.1.11 k8s-worker01
192.168.1.12 k8s-worker02
EOF
Disable Swap
The kubelet refuses to start while swap is active, and kubeadm's preflight checks enforce this. Ubuntu 24.04 may ship with a swap file enabled by default. Disable it permanently:
sudo swapoff -a
sudo sed -i '/\sswap\s/s/^/#/' /etc/fstab
Verify swap is off with free -h – the swap row should show all zeros.
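If you want to see what the sed expression does before running it with -i, the sketch below exercises it against a sample fstab line (the /swap.img entry here is a stand-in, not read from your system):

```shell
# Demo: the same sed expression the guide applies in place with -i.
# Any line containing a whitespace-delimited "swap" field gets a leading "#".
sample='/swap.img none swap sw 0 0'
printf '%s\n' "$sample" | sed '/\sswap\s/s/^/#/'

# On the node itself, /proc/swaps contains only its header line once all
# swap is deactivated:
#   [ "$(wc -l < /proc/swaps)" -le 1 ] && echo "swap is off"
```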
Load Kernel Modules and Set Sysctl Parameters
Kubernetes networking requires the overlay and br_netfilter kernel modules. Create a configuration file so they load on every boot:
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
Next, configure the required sysctl parameters for packet forwarding and bridge traffic:
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
Confirm the values are applied:
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward
All three should return 1.
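If you would rather script the verification than eyeball three lines of output, here is a small sketch (the helper name check_sysctls is ours, not part of any tool):

```shell
# check_sysctls reads "key = value" lines on stdin and fails if any value
# is not 1. On a configured node, pipe the live values into it:
#   sysctl net.bridge.bridge-nf-call-iptables \
#          net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward | check_sysctls
check_sysctls() {
  local status=0 key val
  while IFS='=' read -r key val; do
    key="${key// /}"; val="${val// /}"
    [ -z "$key" ] && continue
    if [ "$val" != "1" ]; then
      echo "not set to 1: $key (got '$val')"
      status=1
    fi
  done
  return $status
}

# Demo on sample output, so the logic is visible without a configured node:
printf 'net.ipv4.ip_forward = 1\nnet.bridge.bridge-nf-call-iptables = 1\n' \
  | check_sysctls && echo "all required sysctls are 1"
```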
Step 1 – Install containerd Container Runtime
Perform these steps on all three nodes. We install containerd from Docker’s official apt repository because it typically ships a more current version than the stock Ubuntu archive and receives security patches faster.
Start by installing prerequisite packages and adding Docker’s GPG key:
sudo apt-get update
sudo apt-get install -y ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
Add the Docker repository. Note that Ubuntu 24.04 uses the noble codename:
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
Install only the containerd.io package – you do not need the full Docker Engine for Kubernetes:
sudo apt-get update
sudo apt-get install -y containerd.io
Configure containerd to Use systemd Cgroup Driver
kubeadm configures the kubelet with the systemd cgroup driver by default on Kubernetes 1.32, so containerd must use the same driver – a mismatch leaves pods restarting unpredictably. Generate the default containerd configuration and enable the driver:
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml > /dev/null
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
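To sanity-check the substitution before restarting containerd, the sketch below runs the same sed against a throwaway copy of the relevant TOML stanza (abbreviated here); on the node you would simply grep /etc/containerd/config.toml after the sed -i above:

```shell
# Run the substitution against a scratch copy of the runc options stanza,
# then confirm the flag actually flipped.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = false
EOF

sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' "$cfg"

grep -q 'SystemdCgroup = true' "$cfg" && echo "systemd cgroup driver enabled"
rm -f "$cfg"
```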
Restart and enable containerd:
sudo systemctl restart containerd
sudo systemctl enable containerd
Verify it is running:
sudo systemctl status containerd
The output should show active (running). If containerd fails to start, check the config file syntax with containerd config dump.
Step 2 – Install kubeadm, kubelet, and kubectl
Run these commands on all three nodes. The Kubernetes project hosts its own apt repository at pkgs.k8s.io with version-specific channels. We will use the v1.32 channel.
Add the Kubernetes apt signing key and repository:
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.32/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.32/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
Install the packages and hold them at the current version to prevent unintended upgrades from breaking your cluster:
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
Enable the kubelet service so it starts automatically after a reboot:
sudo systemctl enable kubelet
At this point kubelet will be crash-looping – that is expected. It will stabilize once kubeadm init completes and provides the necessary configuration.
Step 3 – Configure UFW Firewall Rules
Ubuntu uses UFW (Uncomplicated Firewall) as its default firewall management tool, not firewalld. If UFW is active on your nodes, you need to open the required Kubernetes ports.
Control Plane Node Firewall Rules
Run these on k8s-master01 only:
sudo ufw allow 6443/tcp # Kubernetes API server
sudo ufw allow 2379:2380/tcp # etcd server client API
sudo ufw allow 10250/tcp # Kubelet API
sudo ufw allow 10259/tcp # kube-scheduler
sudo ufw allow 10257/tcp # kube-controller-manager
sudo ufw allow 179/tcp # Calico BGP
sudo ufw allow 4789/udp # Calico VXLAN
sudo ufw reload
Worker Node Firewall Rules
Run these on k8s-worker01 and k8s-worker02:
sudo ufw allow 10250/tcp # Kubelet API
sudo ufw allow 10256/tcp # kube-proxy
sudo ufw allow 30000:32767/tcp # NodePort Services
sudo ufw allow 179/tcp # Calico BGP
sudo ufw allow 4789/udp # Calico VXLAN
sudo ufw reload
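The per-port calls above can also be driven from a single list, which makes it harder to forget one. A sketch for the worker nodes (the APPLY switch is our own convention, not a ufw feature):

```shell
# Open the worker-node ports from one list. With APPLY=1 the ufw commands
# actually run; otherwise they are only printed so you can review them.
ports=(10250/tcp 10256/tcp 30000:32767/tcp 179/tcp 4789/udp)

for p in "${ports[@]}"; do
  if [ "${APPLY:-0}" = "1" ]; then
    sudo ufw allow "$p"
  else
    echo "would run: sudo ufw allow $p"
  fi
done
```

Run it once without APPLY to check the list, then again with APPLY=1 followed by sudo ufw reload.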
If UFW is not enabled on your nodes (check with sudo ufw status), you can skip this section. Many lab environments run without a host firewall, but production deployments should have these rules in place.
Step 4 – Initialize the Control Plane with kubeadm
Run this on the control plane node only (k8s-master01). We specify --pod-network-cidr=192.168.0.0/16 because that is the default CIDR that Calico expects. Note, however, that this range overlaps the example node addresses used throughout this guide (192.168.1.x): if your node LAN sits inside 192.168.0.0/16, choose a non-overlapping pod CIDR such as 10.244.0.0/16 instead and adjust the Calico configuration in Step 5 accordingly.
sudo kubeadm init --pod-network-cidr=192.168.0.0/16 --kubernetes-version=stable-1.32
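If you prefer to keep the init settings in version control, kubeadm accepts the same options as a config file passed via sudo kubeadm init --config kubeadm-config.yaml. The fragment below is a sketch against the v1beta4 kubeadm API; verify the exact schema for your build with kubeadm config print init-defaults:

```yaml
# kubeadm-config.yaml -- sketch; field names per the v1beta4 kubeadm API
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
kubernetesVersion: stable-1.32
networking:
  podSubnet: 192.168.0.0/16
---
apiVersion: kubeadm.k8s.io/v1beta4
kind: InitConfiguration
nodeRegistration:
  criSocket: unix:///run/containerd/containerd.sock
```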
The initialization takes a few minutes. When it finishes, you will see output containing a kubeadm join command with a token. Copy and save this entire command – you will need it to join the worker nodes.
Set up kubectl access for your regular user account:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Verify the control plane is responding:
kubectl cluster-info
You should see the Kubernetes control plane address and CoreDNS listed. The nodes will show as NotReady until the CNI plugin is installed in the next step.
Step 5 – Install Calico CNI Plugin
Without a CNI plugin, pods cannot communicate across nodes and CoreDNS pods will stay in Pending state. We use Calico because it provides both networking and network policy enforcement. For a deeper look at Kubernetes networking options, see our comparison of Kubernetes CNI plugins.
Install the Tigera Calico operator and custom resource definitions:
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.29.2/manifests/tigera-operator.yaml
Download the custom resources manifest and apply it. If you used the default pod CIDR (192.168.0.0/16), no edits are needed:
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.29.2/manifests/custom-resources.yaml
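If you changed the pod CIDR at kubeadm init, download custom-resources.yaml first and edit the ipPools entry before creating it. The relevant stanza looks roughly like this (structure per the stock manifest; the cidr value shown is for the alternate 10.244.0.0/16 case):

```yaml
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    ipPools:
      - blockSize: 26
        cidr: 10.244.0.0/16          # must match --pod-network-cidr
        encapsulation: VXLANCrossSubnet
        natOutgoing: Enabled
        nodeSelector: all()
```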
Watch the Calico pods come up in the calico-system namespace:
kubectl get pods -n calico-system -w
Wait until all Calico pods show Running status. This usually takes two to three minutes as the images are pulled. Once Calico is operational, check that the control plane node is now Ready:
kubectl get nodes
Step 6 – Join Worker Nodes to the Cluster
Run the kubeadm join command that was printed at the end of kubeadm init on each worker node. The command looks like this:
sudo kubeadm join 192.168.1.10:6443 --token <your-token> --discovery-token-ca-cert-hash sha256:<your-hash>
If you lost the join command or the token has expired (tokens are valid for 24 hours), generate a new one from the control plane:
kubeadm token create --print-join-command
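You can also rebuild the discovery hash yourself: it is the SHA-256 digest of the cluster CA's public key in DER form. A sketch (on the control plane, point CA at the standard kubeadm path /etc/kubernetes/pki/ca.crt; a throwaway certificate is generated here so the pipeline can be exercised anywhere):

```shell
# Recompute the --discovery-token-ca-cert-hash value from a CA certificate.
CA=${CA:-/tmp/demo-ca.crt}
if [ ! -f "$CA" ]; then
  # Demo-only self-signed cert standing in for /etc/kubernetes/pki/ca.crt
  openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
    -keyout /tmp/demo-ca.key -out "$CA" -subj "/CN=kubernetes" 2>/dev/null
fi

hash=$(openssl x509 -pubkey -noout -in "$CA" \
  | openssl pkey -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 -hex | awk '{print $NF}')

echo "sha256:$hash"
```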
After joining both workers, verify the full cluster from the control plane:
kubectl get nodes -o wide
All three nodes should show Ready status. The INTERNAL-IP column should display each node’s LAN address, and the CONTAINER-RUNTIME column should show containerd:// followed by the version.
Step 7 – Deploy a Test Application
Create an nginx deployment with two replicas and expose it through a NodePort service to verify cross-node pod communication:
kubectl create deployment nginx-test --image=nginx:latest --replicas=2
kubectl expose deployment nginx-test --type=NodePort --port=80
Check that the pods are scheduled across different worker nodes:
kubectl get pods -o wide
Find the assigned NodePort:
kubectl get svc nginx-test
The PORT(S) column will show something like 80:31234/TCP. You can access the nginx welcome page by opening http://192.168.1.11:31234 or any node IP with that port in a browser. If the page loads, your cluster networking is fully operational.
Clean up the test deployment when done:
kubectl delete deployment nginx-test
kubectl delete svc nginx-test
Step 8 – Install Kubernetes Dashboard (Optional)
The Kubernetes Dashboard provides a web-based UI for managing workloads, viewing logs, and monitoring cluster resources. It is entirely optional but helpful for teams that prefer a graphical interface. For a complete walkthrough, see our dedicated Kubernetes Dashboard installation guide.
Deploy the dashboard using Helm, the recommended method for Dashboard v7 and later. This assumes Helm is already installed on the machine where you run kubectl:
helm repo add kubernetes-dashboard https://kubernetes.github.io/dashboard/
helm upgrade --install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard --create-namespace --namespace kubernetes-dashboard
Create a service account and cluster role binding for dashboard access:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard
EOF
Generate a login token:
kubectl -n kubernetes-dashboard create token admin-user
Access the dashboard from your workstation with kubectl port-forward against the dashboard service (the Helm-based v7 chart fronts it with a Kong proxy service in the kubernetes-dashboard namespace), or expose it via a NodePort or Ingress depending on your environment.
Install metrics-server
The kubectl top command and the Dashboard’s resource graphs require metrics-server. Install it with:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
In self-signed or internal certificate environments, you may need to add the --kubelet-insecure-tls argument to the metrics-server deployment:
kubectl -n kube-system patch deployment metrics-server --type='json' -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--kubelet-insecure-tls"}]'
After a minute or two, verify that metrics are available:
kubectl top nodes
kubectl top pods -A
Troubleshooting Common Issues
kubelet Fails to Start
Check the kubelet logs for specific errors:
sudo journalctl -u kubelet -f
The most common cause on Ubuntu 24.04 is swap still being active. Confirm with free -h and make sure the /etc/fstab swap entry is commented out.
Nodes Stuck in NotReady State
This almost always means the CNI plugin is not installed or not running correctly. Check the Calico pods:
kubectl get pods -n calico-system
kubectl get pods -n tigera-operator
If Calico pods are in CrashLoopBackOff or ImagePullBackOff, check that your nodes have internet access and can reach quay.io and docker.io container registries.
CoreDNS Pods Stuck in Pending
CoreDNS will not schedule until a CNI is installed. If you have Calico running and CoreDNS is still Pending, describe the pods to check for scheduling issues:
kubectl describe pods -n kube-system -l k8s-app=kube-dns
kubeadm init Fails with Port Already in Use
If you are re-running kubeadm init after a failed attempt, reset the node first:
sudo kubeadm reset -f
sudo rm -rf /etc/cni/net.d
sudo iptables -F && sudo iptables -t nat -F && sudo iptables -t mangle -F && sudo iptables -X
Then run kubeadm init again.
Token Expired When Joining Workers
Bootstrap tokens created during kubeadm init expire after 24 hours. Generate a fresh token and join command:
kubeadm token create --print-join-command
containerd Fails After Config Changes
If you edited /etc/containerd/config.toml and containerd will not start, regenerate a clean config and reapply the systemd cgroup change:
containerd config default | sudo tee /etc/containerd/config.toml > /dev/null
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo systemctl restart containerd
UFW Blocking Cluster Communication
If nodes appear Ready but pods cannot communicate across nodes, UFW may be silently dropping traffic. Temporarily disable UFW to test:
sudo ufw disable
If the problem resolves, re-enable UFW and add the missing rules from Step 3. Pay attention to the Calico VXLAN port (4789/udp) and BGP port (179/tcp) – these are frequently forgotten.
Summary
You now have a fully functional three-node Kubernetes 1.32 cluster running on Ubuntu 24.04 LTS with containerd as the runtime and Calico handling pod networking. From here you can deploy workloads, configure Ingress controllers, set up persistent storage, or add more worker nodes by running the kubeadm join command on additional Ubuntu machines.
Key points to remember for ongoing maintenance:
- Kubernetes packages are held with apt-mark hold – when you are ready to upgrade, unhold them, update the repository channel to the next minor version, and follow the official kubeadm upgrade procedure.
- The containerd configuration at /etc/containerd/config.toml must keep SystemdCgroup = true – resetting this will break your cluster.
- Calico’s operator-based deployment makes upgrades straightforward – update the operator manifest URL to the newer version and reapply.
- Back up your etcd data regularly. On a single control plane setup, losing the master node means losing the entire cluster state.
Several issues in these steps. Trying to follow with containerd setup – an example error is no step to create /var/lib/kubelet/config.yaml. Another, it enters into a root bash but doesn’t exit. And kubeadm config images pull leads to ‘unknown service runtime.v1alpha2.ImageService’ error.
This not working:
sudo kubeadm config images pull
failed to pull image “k8s.gcr.io/kube-apiserver:v1.20.5″: output: time=”2021-03-25T12:37:30Z” level=fatal msg=”pulling image failed: rpc error: code = Unimplemented desc = unknown service runtime.v1alpha2.ImageService”
, error: exit status 1
To see the stack trace of this error execute with –v=5 or higher
Hi Guido, Because the command install kubelet doesn’t determine the version, if you follow command of this guide, it will install the last version of kubelet, it doesn’t support on Ubuntu 20.4.
Replace command install kubelet, kubeadm, kubectl as below:
apt update && apt install -y vim git curl wget kubeadm=1.22.1-00 kubelet=1.22.1-00 kubectl=1.22.1-00
Hope it’s helpful for you.
Hi Guido, Because the command install kubelet doesn’t determine the version, if you follow command of this guide, it will install the last version of kubelet, it doesn’t support on Ubuntu 20.4.
Replace command install kubelet, kubeadm, kubectl as below:
apt update && apt install -y vim git curl wget kubeadm=1.22.1-00 kubelet=1.22.1-00 kubectl=1.22.1-00
Hope it is helpful for you.
Its working. thanks.
Awesome working example! Thanks.
Welcome!
after reproducing the steps mentioned I found out that kubectl get nodes does not show the nodes
and shows the error The connection to the server localhost:8080 was refused – did you specify the right host or port?
Any help asap would be highly appreciated
Follow below commands and execute the “kubectl get nodes” using non-root user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
To install Calico I had to:
curl https://docs.projectcalico.org/manifests/calico.yaml -O
kubectl apply -f calico.yaml
Steps above did not work.
Any error when using steps in this guide?
yes
root@k-master:~# sudo apt install cri-o cri-o-runc
Reading package lists… Done
Building dependency tree
Reading state information… Done
E: Unable to locate package cri-o
kindly check
Did you follow steps under # Add Cri-o repo?
I followed using containerd and looks like @ihc have right. With his help i was able to run cluster
I just came here to say, Thank you ihc. Your comment saved me headche.
Some issues getting this to work. Here are the items I changed to get it going:
First, there was an issue getting containerd to start. The issue was that the config was not saved properly. In the “Installing containerd” section above, there was a missing `>` in the following command (I really hope my pre and /pre tags work):
containerd config default /etc/containerd/config.toml
It needs to be:
containerd config default > /etc/containerd/config.toml
Then there was an issue with flannel but that resolved itself:
https://github.com/flannel-io/flannel/issues/1482
Then, once the control plane node was up, the worker nodes would not go to ready state. I resolved this by setting the containerd.default_runtime (Again, I really hope my pre and /pre tags work):
[plugins.”io.containerd.grpc.v1.cri”.containerd.default_runtime]
runtime_type = “io.containerd.runtime.v1.linux”
Hopefully this helps someone else…
Thanks for the perfect working example and it worked for me.
Where does the IP of “cluster endpoint DNS name” come from? You’re using 172.29.20.5. My controlplane’s IP is 172.17.0.1. Should I use that?
Hey,
It will be an A record pointing to 172.17.0.1.
Hey Everyone,
Great article, however I have a question where are the steps for installing kubenetes on the worker nodes do you follow the same procedure as the Controller Node? Were do you stop to get the worker node properly funtional?
Sorry I am new to kubernetes
Thanks,
Michael
Hi,
It is captured in step 7.
Following your instructions completely using the CRIO install, no pods can access the internet. The master comes up fine and the nodes come up fine and I can do deployments and everything.
pods just can not access the internet. I tried to setup jenkins and it was not able to access the internet to install the plugins
Something is flawed in your instructions
Hi Chris.
I noted the issue was with CRIO subnet defined in /etc/cni/net.d/100-crio-bridge.conf config file. It is different from Pod network defined when bootstrapping k8s cluster.
You can retry with updated article. Basically change the subnet with command below:
sudo sed -i ‘s/10.85.0.0/192.168.0.0/g’ /etc/cni/net.d/100-crio-bridge.conf
Then restart crio service and bootstrap cluster. I hope this helps.
my home network is already running on 192.168.0 subnet. I thought you could not use a pod network the same as an already in use subnet?
sudo sed -i ‘s/10.85.0.0/192.168.0.0/g’ /etc/cni/net.d/100-crio-bridge.conf
sed: can’t read /etc/cni/net.d/100-crio-bridge.conf: No such file or directory
root@kube-master:~# more /etc/cni/net.d/100-crio-bridge.conf
more: stat of /etc/cni/net.d/100-crio-bridge.conf failed: No such file or directory
I tried all 3 container management types in this tutorial and none of the pods can access the internet. They can talk to each other but nothing in the outside world. I installed Jenkins in kubernetes and I need to install the plugins but can’t because if this issue. Please advise
See previous comment
Thanks for the reply.
I see the same issue when I build with docker or containerd as well
I’ve been having the same issue too. I’m using docker and none of my worker nodes can access the internet.
I am confused by setting the IP for cluster endpoint.
The below is my network. What IP can i use for endpoint?
4: docker0: mtu 1500 qdisc noqueue state DOWN group default
link/ether 02:42:71:4d:87:43 brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
valid_lft forever preferred_lft forever
It is mapped to primary interface IP address of your control plane node.
Thank you very much! the guide is very helpful and i managed to setup the K8 cluster.
Clear article!
But when I run:
sudo kubeadm config images pull –cri-socket /var/run/docker.sock
Then I get this response:
failed to pull image “k8s.gcr.io/kube-apiserver:v1.22.4″: output: time=”2021-12-02T20:00:52Z” level=fatal msg=”connect: connect endpoint ‘unix:///var/run/docker.sock’, make sure you are running as root and the endpoint has been started: context deadline exceeded”
, error: exit status 1
To see the stack trace of this error execute with –v=5 or higher
Does anyone know how to fix this?
All the previous steps were executed succesfully.
Please check if Docker service is up – systemctl status docker
Hi Josphat, Thank for writing this article. But I am facing similiar issue. Docker service is up for me but still not able to connect to this CRI-socket
user “test” has been added to group docker and changes applied with “newGrp docker” command.
K8s Version : 1.22
test@l2030017515:~$ docker version
Client: Docker Engine – Community
Version: 19.03.15
API version: 1.40
Go version: go1.13.15
Git commit: 99e3ed8919
Built: Sat Jan 30 03:17:01 2021
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine – Community
Engine:
Version: 19.03.15
API version: 1.40 (minimum version 1.12)
Go version: go1.13.15
Git commit: 99e3ed8919
Built: Sat Jan 30 03:15:30 2021
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.4.13
GitCommit: 9cc61520f4cd876b86e77edfeb88fbcd536d1f9d
runc:
Version: 1.0.3
GitCommit: v1.0.3-0-gf46b6ba
docker-init:
Version: 0.18.0
GitCommit: fec3683
test@l2030017515:~$
test@l2030017515:~$ sudo systemctl status docker
● docker.service – Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2022-03-08 13:23:10 IST; 18s ago
TriggeredBy: ● docker.socket
Docs: https://docs.docker.com
Main PID: 589981 (dockerd)
Tasks: 14
Memory: 39.1M
CGroup: /system.slice/docker.service
└─589981 /usr/bin/dockerd -H fd:// –containerd=/run/containerd/containerd.sock
Mar 08 13:23:09 l2030017515 dockerd[589981]: time=”2022-03-08T13:23:09.549821158+05:30″ level=warning msg=”Your kernel does not support cgroup rt runtime”
Mar 08 13:23:09 l2030017515 dockerd[589981]: time=”2022-03-08T13:23:09.549833221+05:30″ level=warning msg=”Your kernel does not support cgroup blkio weight”
Mar 08 13:23:09 l2030017515 dockerd[589981]: time=”2022-03-08T13:23:09.549844854+05:30″ level=warning msg=”Your kernel does not support cgroup blkio weight_device”
Mar 08 13:23:09 l2030017515 dockerd[589981]: time=”2022-03-08T13:23:09.550067148+05:30″ level=info msg=”Loading containers: start.”
Mar 08 13:23:09 l2030017515 dockerd[589981]: time=”2022-03-08T13:23:09.930384654+05:30″ level=info msg=”Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option>
Mar 08 13:23:10 l2030017515 dockerd[589981]: time=”2022-03-08T13:23:10.065481959+05:30″ level=info msg=”Loading containers: done.”
Mar 08 13:23:10 l2030017515 dockerd[589981]: time=”2022-03-08T13:23:10.431063768+05:30″ level=info msg=”Docker daemon” commit=99e3ed8919 graphdriver(s)=overlay2 version=19.03.15
Mar 08 13:23:10 l2030017515 dockerd[589981]: time=”2022-03-08T13:23:10.431230178+05:30″ level=info msg=”Daemon has completed initialization”
Mar 08 13:23:10 l2030017515 systemd[1]: Started Docker Application Container Engine.
Mar 08 13:23:10 l2030017515 dockerd[589981]: time=”2022-03-08T13:23:10.602166953+05:30″ level=info msg=”API listen on /run/docker.sock”
Error :
test@l2030017515:~$ sudo kubeadm init –apiserver-advertise-address=$IP –pod-network-cidr=192.168.0.0/16 –cri-socket /run/docker.sock –upload-certs
I0308 13:23:46.930188 590324 version.go:255] remote version is much newer: v1.23.4; falling back to: stable-1.22
[init] Using Kubernetes version: v1.22.7
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR CRI]: container runtime is not running: output: time=”2022-03-08T13:23:52+05:30″ level=fatal msg=”connect: connect endpoint ‘unix:///run/docker.sock’, make sure you are running as root and the endpoint has been started: context deadline exceeded”
, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `–ignore-preflight-errors=…`
To see the stack trace of this error execute with –v=5 or higher
Would be great of you suggest a resolution for the same.
br
sysadmin@sysadmin-virtual-machine:~$ systemctl status docker
● docker.service – Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Active: active (running) since Fri 2022-04-15 13:56:06 WIB; 22min ago
TriggeredBy: ● docker.socket
Docs: https://docs.docker.com
Main PID: 8624 (dockerd)
Tasks: 19
Memory: 44.3M
CGroup: /system.slice/docker.service
└─8624 /usr/bin/dockerd -H fd:// –containerd=/run/containerd/containerd.sock
Apr 15 13:56:18 sysadmin-virtual-machine dockerd[8624]: time=”2022-04-15T13:56:18.628776532+07:00″ level=info msg=”ignoring event” container=97a7af489a4131d6ccd4022ab9>
Apr 15 13:56:21 sysadmin-virtual-machine dockerd[8624]: time=”2022-04-15T13:56:21.884168954+07:00″ level=info msg=”ignoring event” container=e98e6eefe307e1161257876fbf>
Apr 15 13:56:21 sysadmin-virtual-machine dockerd[8624]: time=”2022-04-15T13:56:21.920915826+07:00″ level=info msg=”ignoring event” container=7b98046840f84ac9f6148e3670>
Apr 15 13:56:25 sysadmin-virtual-machine dockerd[8624]: time=”2022-04-15T13:56:25.571995048+07:00″ level=info msg=”ignoring event” container=b27cd9e8045fdb6145b4afd634>
Apr 15 13:56:25 sysadmin-virtual-machine dockerd[8624]: time=”2022-04-15T13:56:25.614309354+07:00″ level=info msg=”ignoring event” container=8a11505137a45fddf3df7c2cec>
Apr 15 13:56:28 sysadmin-virtual-machine dockerd[8624]: time=”2022-04-15T13:56:28.878086757+07:00″ level=info msg=”ignoring event” container=4ec09c5edf4939b4f567b82d0d>
Apr 15 13:56:29 sysadmin-virtual-machine dockerd[8624]: time=”2022-04-15T13:56:29.892410616+07:00″ level=info msg=”ignoring event” container=00207f98f2cb6276e58d9509bc>
Apr 15 13:56:31 sysadmin-virtual-machine dockerd[8624]: time=”2022-04-15T13:56:31.952429432+07:00″ level=info msg=”ignoring event” container=f218022d2f2c50fb3a63401f29>
Apr 15 13:56:32 sysadmin-virtual-machine dockerd[8624]: time=”2022-04-15T13:56:32.992268870+07:00″ level=info msg=”ignoring event” container=a29f524abb16b085cdbd4af591>
Apr 15 13:56:35 sysadmin-virtual-machine dockerd[8624]: time=”2022-04-15T13:56:35.043274360+07:00″ level=info msg=”ignoring event” container=8c3a0f063ddcfc5e4f2cb606a3>
lines 1-21/21 (END)
^C
sysadmin@sysadmin-virtual-machine:~$ sudo kubeadm config images pull –cri-socket /var/run/docker.sock
[sudo] password for sysadmin:
failed to pull image “k8s.gcr.io/kube-apiserver:v1.23.5″: output: time=”2022-04-15T14:19:22+07:00″ level=fatal msg=”connect: connect endpoint ‘unix:///var/run/docker.sock’, make sure you are running as root and the endpoint has been started: context deadline exceeded”
, error: exit status 1
To see the stack trace of this error execute with –v=5 or higher
sysadmin@sysadmin-virtual-machine:~$
still same problem, docker status “running”
If you only installed Docker runtime, you don’t need to run this command again. You just need sudo kubeadm config images pull
The certificate created in the init command is invalid and flagged by Chrome/Safari when trying to progress through step 10 installing the dashboard.
Woooowwwwwwww !!! You have been enriching the POST, I congratulate you and very complete … !!!
Thanks for the positive vibe you’re dropping our way. We’re humbled.
I ran into this problem. See console log, docker is installed and running:
jacob@lazykub1:~$ sudo kubeadm config images pull –cri-socket /var/run/docker.sock
failed to pull image “k8s.gcr.io/kube-apiserver:v1.23.1″: output: time=”2022-01-16T14:14:45Z” level=fatal msg=”connect: connect endpoint ‘unix:///var/run/docker.sock’, make sure you are running as root and the endpoint has been started: context deadline exceeded”
, error: exit status 1
To see the stack trace of this error execute with –v=5 or higher
jacob@lazykub1:~$ sudo systemctl status docker
● docker.service – Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Active: active (running) since Sun 2022-01-16 14:14:21 UTC; 3min 23s ago
TriggeredBy: ● docker.socket
Docs: https://docs.docker.com
Main PID: 4187 (dockerd)
Tasks: 8
Memory: 33.7M
CGroup: /system.slice/docker.service
└─4187 /usr/bin/dockerd -H fd:// –containerd=/run/containerd/containerd.sock
Jan 16 14:14:20 lazykub1 dockerd[4187]: time=”2022-01-16T14:14:20.947413897Z” level=warning msg=”Your kernel does not support CPU realtime scheduler”
Jan 16 14:14:20 lazykub1 dockerd[4187]: time=”2022-01-16T14:14:20.947421586Z” level=warning msg=”Your kernel does not support cgroup blkio weight”
Jan 16 14:14:20 lazykub1 dockerd[4187]: time=”2022-01-16T14:14:20.947428108Z” level=warning msg=”Your kernel does not support cgroup blkio weight_device”
Jan 16 14:14:20 lazykub1 dockerd[4187]: time=”2022-01-16T14:14:20.947609517Z” level=info msg=”Loading containers: start.”
Jan 16 14:14:21 lazykub1 dockerd[4187]: time=”2022-01-16T14:14:21.077788214Z” level=info msg=”Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option –bip can be used to set a preferred IP address”
Jan 16 14:14:21 lazykub1 dockerd[4187]: time=”2022-01-16T14:14:21.111888405Z” level=info msg=”Loading containers: done.”
Jan 16 14:14:21 lazykub1 dockerd[4187]: time=”2022-01-16T14:14:21.133944179Z” level=info msg=”Docker daemon” commit=459d0df graphdriver(s)=overlay2 version=20.10.12
Jan 16 14:14:21 lazykub1 dockerd[4187]: time=”2022-01-16T14:14:21.134263733Z” level=info msg=”Daemon has completed initialization”
Jan 16 14:14:21 lazykub1 systemd[1]: Started Docker Application Container Engine.
Jan 16 14:14:21 lazykub1 dockerd[4187]: time=”2022-01-16T14:14:21.154710559Z” level=info msg=”API listen on /run/docker.sock”
jacob@lazykub1:~$
are you manage to solve your problem? iam also facing same problem
I see your socket path is /run/containerd/containerd.sock
swap statement is not working for me.
“sudo sed -i ‘/ swap / s/^\(.*\)$/#\1/g’ /etc/fstab”
It does not comment out the swap line
# /etc/fstab: static file system information.
#
# Use ‘blkid’ to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
#
# / was on /dev/ubuntu-vg/ubuntu-lv during curtin installation
/dev/disk/by-id/dm-uuid-LVM-LMZ6lL63Lq08X4WdpxYG1q2BXTyEcLs2cS2oIdfaWHUALXM3rHgG4j26DG8Hys2M / ext4 defaults 0 1
# /boot was on /dev/sda2 during curtin installation
/dev/disk/by-uuid/d13a56c4-40f8-4994-8b23-06b8f2785f5d /boot ext4 defaults 0 1
/swap.img none swap sw 0 0
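The guide's own pattern sidesteps this pitfall: `/ swap /` with literal spaces never matches when the fstab fields are tab-separated, while `\s` matches either. A quick check against a throwaway file rather than the real /etc/fstab:

```shell
# Reproduce the failure mode with a tab-separated swap entry, then apply
# the whitespace-tolerant pattern ('\s' is a GNU sed extension matching
# spaces and tabs) to a throwaway copy instead of the real /etc/fstab.
printf '/swap.img\tnone\tswap\tsw\t0\t0\n' > /tmp/fstab.test
sed -i '/\sswap\s/ s/^/#/' /tmp/fstab.test
cat /tmp/fstab.test   # the line now starts with '#'
```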
Hi,
What are the contents of your fstab file?
Thank you for the tutorial!!!
After some tries it finally seems I got it working; now I'm looking at how to integrate it with GitLab.
A quick hint for those whose coredns pods are failing or never get ready:
Don't change the IP range. The post says it's possible to customize it and use sed to change it after install, but every time I did that the pod never started. Reading the file:
https://docs.projectcalico.org/manifests/custom-resources.yaml
there's the note:
# Note: The ipPools section cannot be modified post-install.
And the default range is 192.168.0.0/16.
sudo sed -i 's/10.85.0.0/192.168.0.0/g' /etc/cni/net.d/100-crio-bridge.conf
sed: can’t read /etc/cni/net.d/100-crio-bridge.conf: No such file or directory
Also seeing the following error which then stops anything further on working:
sudo kubeadm config images pull --cri-socket /var/run/docker.sock
failed to pull image "k8s.gcr.io/kube-apiserver:v1.23.3": output: time="2022-01-31T16:26:47Z" level=fatal msg="connect: connect endpoint 'unix:///var/run/docker.sock', make sure you are running as root and the endpoint has been started: context deadline exceeded"
, error: exit status 1
To see the stack trace of this error execute with --v=5 or higher
any ideas?
Are you using the Docker container engine? If so, is the service running?
We’ve updated the guide to use cri-dockerd shim. Also covered in a separate guide: https://computingforgeeks.com/install-mirantis-cri-dockerd-as-docker-engine-shim-for-kubernetes/
Hello,
I have an issue with this tutorial: my install is version 1.24.0. All works well, but I have a problem with tokens for service accounts. No token is generated when I create an SA, and there is no token in the default SA. Can you help me?
Thanks
Hello, I have an issue with this guide.
I have successfully installed my cluster with containerd and Calico.
Everything works well except one thing:
I don't have any token generated for service accounts. When I type this command:
kubectl get sa
the default SA is present but with 0 secrets, and the token section is blank.
It's the same when I try to create a new SA: no token.
Can I have help, please?
Thanks
If you need to specify a pod CIDR different from the default "192.168.0.0/16", you need to edit custom-resources.yaml for Calico before applying it.
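A sketch of that edit, run here against a local copy containing just the relevant fragment (the Installation structure is assumed from the Calico operator manifest, and 10.244.0.0/16 is only an example target CIDR):

```shell
# Minimal fragment of custom-resources.yaml containing the default pool.
cat > /tmp/custom-resources.yaml <<'EOF'
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    ipPools:
    - cidr: 192.168.0.0/16
EOF
# Swap in the CIDR you passed to kubeadm init BEFORE applying the manifest;
# per the note above, the ipPools section cannot be modified post-install.
sed -i 's#192.168.0.0/16#10.244.0.0/16#' /tmp/custom-resources.yaml
grep 'cidr:' /tmp/custom-resources.yaml
```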
Hi,
A great post, which helped me a lot.
I have been trying to host K8s on AWS EC2 instances (a task) and I have encountered two issues.
1. At the worker node:
Found multiple CRI endpoints on the host. Please define which one do you wish to use by setting the 'criSocket' field in the kubeadm configuration file: unix:///var/run/containerd/containerd.sock, unix:///var/run/cri-dockerd.sock
To see the stack trace of this error execute with --v=5 or higher
For this I have tried:
sudo kubeadm join {ip address} --token xgr4i8.kokoqjxo6yvs3d19 --discovery-token-ca-cert-hash sha256:d1b4e8ea7a87ae6b91198a2c0ac85663c48b2f2f9181d7075d13294336684379 \
--cri-socket /run/cri-dockerd.sock
but my command gets stuck.
2.
[preflight] Running pre-flight checks
[WARNING SystemVerification]: missing optional cgroups: blkio
It gets stuck with the above message. Please help me.
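On the first issue, one approach is to confirm which CRI sockets actually exist on the node and then name one explicitly — recent kubeadm expects the unix:// scheme in front of the path. A sketch (the join values are placeholders; the two socket paths come from the error message above):

```shell
# Check which of the two CRI sockets from the error message are present:
for s in /var/run/containerd/containerd.sock /var/run/cri-dockerd.sock; do
  if [ -S "$s" ]; then echo "found CRI socket: $s"; fi
done
# Then join naming one endpoint explicitly, with the unix:// scheme:
# sudo kubeadm join <control-plane-ip>:6443 --token <token> \
#   --discovery-token-ca-cert-hash sha256:<hash> \
#   --cri-socket unix:///var/run/cri-dockerd.sock
```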
I installed the K8s cluster (1 master, 2 workers) using this guide, with Docker as the container runtime. However, when I create Pods, Pods on different worker nodes get overlapping IPs, and a Pod on one node can't talk to a Pod on another node.
Took me a few attempts but got there in the end! Thanks for taking the time to put together.
Welcome Phil
Thank you very much.
If I'd add something to make this perfect, it would be details on how to generate the join token for adding new workers.
There is a link to a separate guide on joining a new node to an existing cluster, under "Join new Kubernetes Worker Node to an existing Cluster".
I followed this guide (docker runtime) with an additional step for Calico:
kubectl apply -f https://projectcalico.docs.tigera.io/v3.23/manifests/calico.yaml
An outstanding issue I have is even though I specified a pod network with kubeadm init, the pods are getting assigned addresses from the docker0 network (172.17.0.0/16) instead of the pod network.
Any ideas how to resolve? I found a few similar issues on Stackoverflow but haven’t been able to resolve mine.
I have a problem where my Docker containers cannot make external calls.
I built a simple application which does an API call, and it says "resource temporarily unavailable"; after exec'ing into the container and running apt-get update, it says it cannot connect to deb.
Any help with this? I set up the cluster as described above.
Very nice article!
I did manage to set up K8s, but recently I started getting "resource temporarily unavailable" errors in my pods when making external HTTP calls. Does anyone have a solution?
By the way, I tried to exec into a container and run "apt-get update", but it says I cannot connect to deb.
After installing the CRI-O runtime and using Calico as suggested, my calico-kube-controller is stuck in a Pending status.
NAMESPACE NAME READY STATUS RESTARTS AGE
calico-system calico-kube-controllers-657d56796-bvh25 0/1 Pending 0 30m
calico-system calico-node-tdkjf 1/1 Running 0 30m
calico-system calico-typha-89d87cb5c-fn887 1/1 Running 0 30m
kube-system coredns-6d4b75cb6d-dbbmk 1/1 Running 0 31m
kube-system coredns-6d4b75cb6d-twlkk 1/1 Running 0 31m
kube-system etcd-res-u20-template 1/1 Running 0 32m
kube-system kube-apiserver-res-u20-template 1/1 Running 0 32m
kube-system kube-controller-manager-res-u20-template 1/1 Running 0 32m
kube-system kube-proxy-cxmvt 1/1 Running 0 31m
kube-system kube-scheduler-res-u20-template 1/1 Running 0 32m
tigera-operator tigera-operator-6995cc5df5-4w9rt 1/1 Running 0 30m
The only thing in the log that drew my attention was
"E0808 23:12:25.258702 1 disruption.go:534] Error syncing PodDisruptionBudget calico-system/calico-typha, requeuing: Operation cannot be fulfilled on poddisruptionbudgets.policy "calico-typha": the object has been modified; please apply your changes to the latest version and try again"
Has anyone seen this issue? Any suggestions?
After installing the CRI-O runtime and the Calico network plugin, the calico-kube-controllers stay in a `Pending` state.
senften@res-k8s-master:~$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
calico-system calico-kube-controllers-657d56796-qzhxw 0/1 Pending 0 17h
calico-system calico-node-62k9g 1/1 Running 0 17h
calico-system calico-typha-8fdfcb55c-k6zcq 1/1 Running 0 17h
kube-system coredns-6d4b75cb6d-bknbp 1/1 Running 0 17h
kube-system coredns-6d4b75cb6d-jkfk7 1/1 Running 0 17h
kube-system etcd-res-k8s-master 1/1 Running 0 17h
kube-system kube-apiserver-res-k8s-master 1/1 Running 0 17h
kube-system kube-controller-manager-res-k8s-master 1/1 Running 0 17h
kube-system kube-proxy-jht5p 1/1 Running 0 17h
kube-system kube-scheduler-res-k8s-master 1/1 Running 0 17h
tigera-operator tigera-operator-6995cc5df5-j64s6 1/1 Running 0 17h
I think it may be related to this bug, [calico-kube-controllers pending since not tolerate the “node-role.kubernetes.io/control-plane” · Issue #6087 · projectcalico/calico](https://github.com/projectcalico/calico/issues/6087), but, after reading through the issue, I am still uncertain how to proceed or even sure this is the issue I’ve run into.
Any advice? Anyone else seeing this?
Thanks
For a single-node setup, allow Pods to run on the control plane node – https://computingforgeeks.com/how-to-schedule-pods-on-kubernetes-control-plane-node/
Thank you. Greatly appreciated.
I also found an answer by new-mikha in the comments of https://github.com/projectcalico/calico/issues/6087 so I downloaded https://docs.projectcalico.org/manifests/custom-resources.yaml referenced in step 6 and added
controlPlaneTolerations:
- key: node-role.kubernetes.io/control-plane
  effect: NoSchedule
- key: node-role.kubernetes.io/master
  effect: NoSchedule
as described in the comment and then, probably obviously, created the resources with this edited file.
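In context, the edited custom-resources.yaml would look roughly like this — a sketch: the surrounding Installation fields follow the stock manifest, and the placement of the tolerations under spec is assumed from the linked issue comment:

```yaml
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  # Let Calico's pods tolerate control-plane taints (both the newer and the
  # pre-1.24 taint keys), per the workaround in projectcalico/calico#6087.
  controlPlaneTolerations:
  - key: node-role.kubernetes.io/control-plane
    effect: NoSchedule
  - key: node-role.kubernetes.io/master
    effect: NoSchedule
  calicoNetwork:
    ipPools:
    - cidr: 192.168.0.0/16
```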
How do you add additional master node into the cluster?
While running this command:
sudo kubeadm config images pull --cri-socket unix:///run/cri-dockerd.sock
I am getting this error:
failed to pull image "registry.k8s.io/kube-apiserver:v1.25.2": output: time="2022-10-13T11:52:55+05:30" level=fatal msg="unable to determine image API version: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /run/cri-dockerd.sock: connect: no such file or directory\""
, error: exit status 1
To see the stack trace of this error execute with --v=5 or higher
I don't know why I am not able to create the file for Docker at the given path; I am following all the steps. Any solution?
Thanks for your blogs; they help a lot and relieve much stress.
I'm a beginner at all this. What's the use of Docker? Does Kubernetes run on top of Docker, or does a Docker container run inside the Pod? Please tell me, I'm confused about all of this.
See some resources that try to explain the difference:
– https://www.dynatrace.com/news/blog/kubernetes-vs-docker/
– https://azure.microsoft.com/en-us/topic/kubernetes-vs-docker/
– https://www.ibm.com/cloud/blog/kubernetes-vs-docker
Phew, got my master node all set up finally, thanks for the walkthrough. It's brilliant and requires some patience 🙂
Awesome… We are 😄 for you!
Thanks
Thanks for the comment and welcome
I followed your guide:
Host OS: Ubuntu 22.04
CNI and version: Flannel latest
CRI and version: CRI-O latest
kubeadm init --v=5 --pod-network-cidr=10.244.0.0/16 --upload-certs --control-plane-endpoint=k8s.mydomain.com
The cluster initialized successfully, but the coredns pods cannot start.
kubectl version --short
Client Version: v1.26.2
Kustomize Version: v4.5.7
Server Version: v1.26.2
kubectl cluster-info
Kubernetes control plane is running at https://k8s.mydomain.com:6443
CoreDNS is running at https://k8s.mydomain.com:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-787d4945fb-g8hhj 0/1 CreateContainerError 0 7m1s
coredns-787d4945fb-nkssp 0/1 CreateContainerError 0 6m29s
etcd-k8s-52ts-master1 1/1 Running 1 12h
kube-apiserver-k8s-52ts-master1 1/1 Running 1 12h
kube-controller-manager-k8s-52ts-master1 1/1 Running 1 12h
kube-proxy-m4mhc 1/1 Running 1 12h
kube-proxy-rrdpb 1/1 Running 0 12h
kube-scheduler-k8s-52ts-master1 1/1 Running 1 12h
kubectl describe pod coredns-787d4945fb-g8hhj -n kube-system
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 65s default-scheduler Successfully assigned kube-system/coredns-787d4945fb-g8hhj to k8s-52ts-worker1
Warning Failed 64s kubelet Error: container create failed: time="2023-03-17T08:26:01+07:00" level=warning msg="unable to get oom kill count" error="openat2 /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podd45fe767_098b_401f_8da9_1a1760d804f6.slice/crio-105d630363803602c6bfe9c1516b128192b95c519ff321ed1177fe0cacdc9b42.scope/memory.events: no such file or directory"
time="2023-03-17T08:26:01+07:00" level=error msg="container_linux.go:380: starting container process caused: exec: \"/coredns\": stat /coredns: no such file or directory"
Warning Failed 63s kubelet Error: container create failed: time="2023-03-17T08:26:02+07:00" level=warning msg="unable to get oom kill count" error="openat2 /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podd45fe767_098b_401f_8da9_1a1760d804f6.slice/crio-2bbebc0f6e74a3c5a8f5e460e758e322a94ffe97964f34e357d831ce3fced345.scope/memory.events: no such file or directory"
time="2023-03-17T08:26:02+07:00" level=error msg="container_linux.go:380: starting container process caused: exec: \"/coredns\": stat /coredns: no such file or directory"
Warning Failed 47s kubelet Error: container create failed: time="2023-03-17T08:26:18+07:00" level=error msg="container_linux.go:380: starting container process caused: exec: \"/coredns\": stat /coredns: no such file or directory"
Warning Failed 34s kubelet Error: container create failed: time="2023-03-17T08:26:31+07:00" level=error msg="container_linux.go:380: starting container process caused: exec: \"/coredns\": stat /coredns: no such file or directory"
Warning Failed 19s kubelet Error: container create failed: time="2023-03-17T08:26:46+07:00" level=error msg="container_linux.go:380: starting container process caused: exec: \"/coredns\": stat /coredns: no such file or directory"
Normal Pulled 9s (x6 over 64s) kubelet Container image "registry.k8s.io/coredns/coredns:v1.9.3" already present on machine
Hi,
Did you install network plugin?
Hi, I reinstalled the cluster with Docker as the CRI, Mirantis cri-dockerd as the shim, and Flannel as the CNI, and it works now.
Hi,
Thanks for this great tutorial. It's really useful and has helped me bootstrap multiple clusters since 2022.
I want to mention two points and hope they will help someone.
1. For containerd, the old Debian package (1.4) does not work with v1.25 and higher; the version that works is 1.6. Here is one of the related errors:
failed to pull image "registry.k8s.io/kube-apiserver:v1.25.10": output: time="2023-05-21T10:16:39Z" level=fatal msg="validate service connection: CRI v1 image API is not implemented for endpoint \"unix:///var/run/containerd/containerd.sock\": rpc error: code = Unimplemented desc = unknown service runtime.v1.ImageService"
2. The URLs in the Calico setup are no longer working; the updated version can be found here:
https://docs.tigera.io/calico/latest/getting-started/kubernetes/self-managed-onprem/onpremises
Hi Josphat
Thanks for this guide, very detail and helpful.
Adding one more thing to the Calico installation part
curl https://raw.githubusercontent.com/projectcalico/calico/v3.25.1/manifests/custom-resources.yaml -O
as custom-resources.yaml seems to use 192.168.0.0/16 as the default CIDR, but your cluster init command above uses --pod-network-cidr=172.24.0.0/16.
If you want to deploy the Calico network plugin, you must manually update the CIDR in this YAML file to:
cidr: 172.24.0.0/16
Thanks for this helpful comment; we've updated the article and it now reflects this.