OpenShift Container Platform is Red Hat’s enterprise Kubernetes distribution that automates installation, upgrades, and lifecycle management for containerized applications. All cluster nodes run Red Hat CoreOS (RHCOS), which includes the kubelet and CRI-O container runtime optimized for Kubernetes workloads.

This guide walks through deploying OpenShift 4.17 on a single KVM hypervisor using libvirt, with a bastion VM handling DNS, DHCP, load balancing, and PXE boot services. The KVM host runs RHEL 10, Rocky Linux 10, or Fedora. This setup works for proof-of-concept, lab, and development environments – production deployments need multiple hypervisors with HA load balancers.

Prerequisites for Deploying OpenShift on KVM

Before starting, make sure you have the following in place:

  • A bare-metal server or workstation running RHEL 10, Rocky Linux 10, or Fedora with KVM support (Intel VT-x or AMD-V)
  • At least 128 GB RAM and 16 CPU cores (for 1 bootstrap + 3 masters + 3 workers)
  • 600 GB+ free disk space on /var/lib/libvirt/images
  • Root or sudo access on the hypervisor
  • A Red Hat account with an active subscription to download the pull secret from console.redhat.com
  • A registered domain name or internal DNS zone (we use example.com throughout this guide)
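The hardware prerequisites above can be checked in a few commands before you commit to the install. This is an illustrative sanity check matching the lab sizing in this guide:

```shell
# Hypervisor sanity check against the lab sizing above
grep -cE 'vmx|svm' /proc/cpuinfo      # > 0 means VT-x/AMD-V is available
nproc                                 # want 16+ cores
free -g | awk '/^Mem:/ {print $2 " GB RAM"}'   # want 128+ GB
df -h /var/lib/libvirt/images         # want 600+ GB free
```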

Hardware Requirements per VM

Red Hat’s minimum and recommended hardware requirements for each cluster VM are shown below.

VM Role            OS            vCPU (Min/Rec)   RAM     Storage
Bootstrap          RHCOS         4 / 4            16 GB   120 GB
Control Plane      RHCOS         4 / 8            16 GB   120 GB
Compute (Worker)   RHCOS         2 / 4            8 GB    120 GB
Bastion/Helper     Fedora/RHEL   2                4 GB    20 GB

Lab Network Layout

Here is the network layout and IP allocation used throughout this guide. Replace domain, cluster name, and IP addresses with your own values.

  • Base domain: example.com
  • Cluster name: ocp4
  • KVM virtual network: openshift4 (192.168.100.0/24)
  • Gateway: 192.168.100.1
  • Bastion node (DNS, DHCP, HAProxy, TFTP, httpd): 192.168.100.254
Hostname                     MAC Address         IP Address
bootstrap.ocp4.example.com   52:54:00:a4:db:5f   192.168.100.10
master01.ocp4.example.com    52:54:00:8b:a1:17   192.168.100.11
master02.ocp4.example.com    52:54:00:ea:8b:9d   192.168.100.12
master03.ocp4.example.com    52:54:00:f8:87:c7   192.168.100.13
worker01.ocp4.example.com    52:54:00:31:4a:39   192.168.100.21
worker02.ocp4.example.com    52:54:00:6a:37:32   192.168.100.22
worker03.ocp4.example.com    52:54:00:95:d4:ed   192.168.100.23

Generate unique MAC addresses for your setup with this command:

date +%s | md5sum | head -c 6 | sed -e 's/\([0-9A-Fa-f]\{2\}\)/\1:/g' -e 's/\(.*\):$/\1/' | sed -e 's/^/52:54:00:/';echo

Step 1: Install KVM and Libvirt on the Hypervisor

First, confirm that your CPU supports hardware virtualization.

$ grep -cE 'vmx|svm' /proc/cpuinfo
8

Any value greater than 0 means virtualization extensions are present. Next, install the KVM/libvirt packages on your hypervisor. On RHEL 10 or Rocky Linux 10, run:

sudo dnf install -y qemu-kvm libvirt virt-install virt-viewer \
  bridge-utils libguestfs-tools virt-manager libvirt-client

Enable and start the libvirtd service.

sudo systemctl enable --now libvirtd

Verify libvirtd is running.

$ sudo systemctl status libvirtd
● libvirtd.service - Virtualization daemon
     Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled)
     Active: active (running)

Create the OpenShift Virtual Network

Create a dedicated NAT virtual network for the OpenShift cluster. Write the network definition file.

sudo vim /tmp/openshift4-net.xml

Add the following content:

<network>
  <name>openshift4</name>
  <forward mode='nat'>
    <nat>
      <port start='1024' end='65535'/>
    </nat>
  </forward>
  <bridge name='openshift4' stp='on' delay='0'/>
  <domain name='openshift4'/>
  <ip address='192.168.100.1' netmask='255.255.255.0'>
  </ip>
</network>

Define, autostart, and start the network.

$ sudo virsh net-define --file /tmp/openshift4-net.xml
Network openshift4 defined from /tmp/openshift4-net.xml

$ sudo virsh net-autostart openshift4
Network openshift4 marked as autostarted

$ sudo virsh net-start openshift4
Network openshift4 started

Confirm the bridge interface was created. It reports NO-CARRIER/DOWN until the first VM attaches to it, which is expected at this stage.

$ ip link show openshift4
5: openshift4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 state DOWN
    link/ether 52:54:00:2b:47:9a brd ff:ff:ff:ff:ff:ff

Step 2: Create the Bastion / Helper VM

The bastion VM hosts all supporting infrastructure services for the OpenShift deployment: BIND DNS, DHCP, HAProxy load balancer, TFTP/PXE boot, and Apache httpd for serving RHCOS images and ignition files. It also serves as the management workstation where we run the openshift-install and oc commands.

Build a Fedora 41 VM image using virt-builder. You can also use a Rocky Linux or CentOS Stream image.

sudo virt-builder fedora-41 --format qcow2 \
  --size 20G -o /var/lib/libvirt/images/ocp-bastion-server.qcow2 \
  --root-password password:StrongRootPassw0rd

Create and import the VM.

sudo virt-install \
  --name ocp-bastion-server \
  --ram 4096 \
  --vcpus 2 \
  --disk path=/var/lib/libvirt/images/ocp-bastion-server.qcow2 \
  --os-variant fedora41 \
  --network bridge=openshift4 \
  --graphics none \
  --serial pty \
  --console pty \
  --boot hd \
  --import

Log in as root with the password set above. Set a static IP on the bastion VM so all cluster nodes can reach it consistently.

nmcli con delete "Wired connection 1"
nmcli con add type ethernet con-name enp1s0 ifname enp1s0 \
  connection.autoconnect yes ipv4.method manual \
  ipv4.address 192.168.100.254/24 ipv4.gateway 192.168.100.1 \
  ipv4.dns 8.8.8.8

Test connectivity from the bastion VM.

# ping -c 2 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=117 time=4.98 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=117 time=5.14 ms

Update the OS and install required packages.

sudo dnf -y upgrade
sudo dnf -y install git vim wget curl bash-completion tree tar firewalld bind-utils

Reboot after the upgrade completes.

sudo reboot

Reconnect to the bastion via virsh console or SSH from the hypervisor.

$ sudo virsh console ocp-bastion-server
Connected to domain 'ocp-bastion-server'
Escape character is ^] (Ctrl + ])

Enable domain autostart so the bastion VM starts automatically when the hypervisor boots.

sudo virsh autostart ocp-bastion-server

Step 3: Configure Ansible Variables on Bastion

We use Ansible to automate configuration of DNS, DHCP, HAProxy, and PXE on the bastion node. Install Ansible on the bastion.

sudo dnf -y install ansible-core vim wget curl bash-completion tree tar

Clone the OCP4 Ansible helper repository that contains all tasks and Jinja2 templates.

cd ~/
git clone https://github.com/jmutai/ocp4_ansible.git
cd ~/ocp4_ansible

Edit the variables file to match your lab environment. Every IP, MAC address, domain, and cluster name must be correct – errors here will break the entire deployment.

vim vars/main.yml

Set all values carefully. Here is the full variables file with comments explaining each field:

---
ppc64le: false
uefi: false
disk: vda
helper:
  name: "bastion"
  ipaddr: "192.168.100.254"
  networkifacename: "enp1s0"
dns:
  domain: "example.com"
  clusterid: "ocp4"
  forwarder1: "8.8.8.8"
  forwarder2: "1.1.1.1"
  lb_ipaddr: "{{ helper.ipaddr }}"
dhcp:
  router: "192.168.100.1"
  bcast: "192.168.100.255"
  netmask: "255.255.255.0"
  poolstart: "192.168.100.10"
  poolend: "192.168.100.50"
  ipid: "192.168.100.0"
  netmaskid: "255.255.255.0"
  ntp: "time.google.com"
  dns: ""
bootstrap:
  name: "bootstrap"
  ipaddr: "192.168.100.10"
  macaddr: "52:54:00:a4:db:5f"
masters:
  - name: "master01"
    ipaddr: "192.168.100.11"
    macaddr: "52:54:00:8b:a1:17"
  - name: "master02"
    ipaddr: "192.168.100.12"
    macaddr: "52:54:00:ea:8b:9d"
  - name: "master03"
    ipaddr: "192.168.100.13"
    macaddr: "52:54:00:f8:87:c7"
workers:
  - name: "worker01"
    ipaddr: "192.168.100.21"
    macaddr: "52:54:00:31:4a:39"
  - name: "worker02"
    ipaddr: "192.168.100.22"
    macaddr: "52:54:00:6a:37:32"
  - name: "worker03"
    ipaddr: "192.168.100.23"
    macaddr: "52:54:00:95:d4:ed"

Verify the ansible.cfg and inventory files are correctly set.

$ cat ansible.cfg
[defaults]
inventory = inventory
host_key_checking = False
deprecation_warnings = False
retry_files = false

$ cat inventory
[vms_host]
localhost ansible_connection=local

Step 4: Install and Configure DHCP Server

The DHCP server assigns fixed IP addresses to each cluster node based on its MAC address. This is critical because OpenShift nodes must get consistent IPs that match the DNS records. Install the dhcp-server package on the bastion.

sudo dnf -y install dhcp-server

Enable the dhcpd service.

sudo systemctl enable dhcpd

Backup the default configuration and run the Ansible playbook to generate the correct DHCP config from your variables.

sudo mv /etc/dhcp/dhcpd.conf /etc/dhcp/dhcpd.conf.bak
ansible-playbook tasks/configure_dhcpd.yml

Expected output shows the template being applied and dhcpd restarting.

PLAY RECAP *************************************************************
localhost  : ok=3    changed=2    unreachable=0    failed=0    skipped=1

Confirm dhcpd is running.

$ systemctl status dhcpd
● dhcpd.service - DHCPv4 Server Daemon
     Loaded: loaded (/usr/lib/systemd/system/dhcpd.service; enabled)
     Active: active (running)
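Before moving on, you can also parse-check the generated configuration directly. dhcpd's test mode validates the file without touching the running daemon (the path shown is the package default used above):

```shell
# Parse-check the generated DHCP config without restarting the daemon
sudo dhcpd -t -cf /etc/dhcp/dhcpd.conf
```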

Step 5: Configure BIND DNS Server

OpenShift requires specific DNS records for API, etcd SRV records, and wildcard entries for applications. Without correct DNS, the cluster installation fails. Install BIND on the bastion.

sudo dnf -y install bind bind-utils

Enable the named service.

sudo systemctl enable named

Install the DNS serial number generator script used by the Ansible templates.

sudo cp files/set-dns-serial.sh /usr/local/bin/
sudo chmod a+x /usr/local/bin/set-dns-serial.sh

Run the DNS configuration playbook. This creates both the forward zone (ocp4.example.com) and the reverse zone files with all required A, SRV, and CNAME records.

$ ansible-playbook tasks/configure_bind_dns.yml

PLAY RECAP *************************************************************
localhost  : ok=7    changed=6    unreachable=0    failed=0    skipped=0

Verify the DNS service is running and responding correctly. Test the etcd SRV records that OpenShift requires.

$ dig @127.0.0.1 -t srv _etcd-server-ssl._tcp.ocp4.example.com +short
0 10 2380 etcd-0.ocp4.example.com.
0 10 2380 etcd-1.ocp4.example.com.
0 10 2380 etcd-2.ocp4.example.com.

Test the API and wildcard DNS entries.

$ dig @127.0.0.1 api.ocp4.example.com +short
192.168.100.254

$ dig @127.0.0.1 '*.apps.ocp4.example.com' +short
192.168.100.254

$ dig @127.0.0.1 bootstrap.ocp4.example.com +short
192.168.100.10
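BIND also ships validation tools that catch zone file mistakes before clients do. A quick check (the zone file path is a placeholder; use the file the playbook actually created under /var/named/):

```shell
# Validate named.conf syntax and the generated forward zone
sudo named-checkconf /etc/named.conf
sudo named-checkzone ocp4.example.com /var/named/zonefile.db   # adjust to your zone file name
```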

Point the bastion’s own DNS resolver to itself.

nmcli connection modify enp1s0 ipv4.dns "192.168.100.254"
nmcli connection reload
nmcli connection up enp1s0

Open the required firewall ports on the bastion for all services.

sudo firewall-cmd --add-service={dhcp,tftp,http,https,dns} --permanent
sudo firewall-cmd --reload

Step 6: Setup TFTP and PXE Boot Services

PXE boot allows the OpenShift cluster VMs to network-boot directly into the RHCOS installer without manual ISO mounting. Install TFTP and syslinux packages.

sudo dnf -y install tftp-server syslinux

Enable and start the TFTP service.

sudo systemctl enable --now tftp

Prepare the PXE boot directory structure.

sudo mkdir -p /var/lib/tftpboot/pxelinux.cfg
sudo cp -rvf /usr/share/syslinux/* /var/lib/tftpboot
sudo mkdir -p /var/lib/tftpboot/rhcos

Download the RHCOS kernel and initramfs files from the RHCOS image mirror.

wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/rhcos-live-kernel-x86_64
sudo mv rhcos-live-kernel-x86_64 /var/lib/tftpboot/rhcos/kernel

Download the initramfs image.

wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/rhcos-live-initramfs.x86_64.img
sudo mv rhcos-live-initramfs.x86_64.img /var/lib/tftpboot/rhcos/initramfs.img

Restore SELinux context on the TFTP files.

sudo restorecon -RFv /var/lib/tftpboot/rhcos
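You can confirm the TFTP server actually answers before any node tries to PXE boot. curl speaks tftp:// (assuming your distro's curl build includes TFTP support):

```shell
# Fetch a file over TFTP to confirm the server responds
curl -s -o /dev/null tftp://127.0.0.1/pxelinux.0 && echo "TFTP OK"
```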

Configure Apache httpd for RHCOS Images

Apache serves the rootfs image and ignition files to the nodes during PXE boot. Install httpd.

sudo dnf -y install httpd

Change httpd to listen on port 8080 instead of 80, since HAProxy will use ports 80 and 443. Update the Listen directive in /etc/httpd/conf/httpd.conf:

sudo sed -i 's/^Listen 80$/Listen 8080/' /etc/httpd/conf/httpd.conf

Remove the default welcome page and start httpd.

sudo rm -f /etc/httpd/conf.d/welcome.conf
sudo systemctl enable --now httpd

Open port 8080 in the firewall.

sudo firewall-cmd --add-port=8080/tcp --permanent
sudo firewall-cmd --reload

Download the RHCOS rootfs image and place it in the httpd directory.

sudo mkdir -p /var/www/html/rhcos
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/rhcos-live-rootfs.x86_64.img
sudo mv rhcos-live-rootfs.x86_64.img /var/www/html/rhcos/rootfs.img
sudo restorecon -RFv /var/www/html/rhcos

Now run the Ansible playbook to generate the PXE boot configuration files for each node. Each file is named using the node’s MAC address and contains the correct kernel arguments, rootfs URL, and ignition file URL.

$ ansible-playbook tasks/configure_tftp_pxe.yml

PLAY RECAP *************************************************************
localhost  : ok=5    changed=4    unreachable=0    failed=0    skipped=0

Verify the PXE config files were created for all nodes.

$ ls -1 /var/lib/tftpboot/pxelinux.cfg/
01-52-54-00-31-4a-39
01-52-54-00-6a-37-32
01-52-54-00-8b-a1-17
01-52-54-00-95-d4-ed
01-52-54-00-a4-db-5f
01-52-54-00-ea-8b-9d
01-52-54-00-f8-87-c7
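Each generated file looks roughly like the following bootstrap example. This is illustrative only; the exact kernel arguments come from the repository's Jinja2 template, and the URLs assume the bastion httpd setup from earlier in this guide:

```
default menu.c32
prompt 0
timeout 30
label rhcos
  kernel rhcos/kernel
  append initrd=rhcos/initramfs.img rd.neednet=1 coreos.inst.install_dev=/dev/vda coreos.live.rootfs_url=http://192.168.100.254:8080/rhcos/rootfs.img coreos.inst.ignition_url=http://192.168.100.254:8080/ignition/bootstrap.ign
```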

Step 7: Configure HAProxy Load Balancer

HAProxy distributes traffic across the OpenShift API servers (port 6443), machine config server (port 22623), and application routes (ports 80 and 443). Install HAProxy.

sudo dnf install -y haproxy

Allow HAProxy to bind to any port via SELinux.

sudo setsebool -P haproxy_connect_any 1

Back up the default config and run the Ansible playbook to generate the OpenShift-specific HAProxy configuration.

sudo mv /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.default
ansible-playbook tasks/configure_haproxy_lb.yml

Configure SELinux for the custom ports. The semanage tool is provided by the policycoreutils-python-utils package.

sudo dnf -y install policycoreutils-python-utils
sudo semanage port -a -t http_port_t -p tcp 6443
sudo semanage port -a -t http_port_t -p tcp 22623
sudo semanage port -a -t http_port_t -p tcp 32700

Open firewall ports for API and application traffic.

sudo firewall-cmd --add-service={http,https} --permanent
sudo firewall-cmd --add-port={6443,22623}/tcp --permanent
sudo firewall-cmd --reload

Confirm HAProxy is active.

$ systemctl status haproxy
● haproxy.service - HAProxy Load Balancer
     Loaded: loaded (/usr/lib/systemd/system/haproxy.service; enabled)
     Active: active (running)
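The generated configuration follows the standard UPI pattern: TCP passthrough with no TLS termination at the load balancer. A representative excerpt for the API backend (illustrative; backend names and options come from the repository's template):

```
frontend api-server
    bind *:6443
    mode tcp
    default_backend api-server

backend api-server
    mode tcp
    balance roundrobin
    server bootstrap 192.168.100.10:6443 check
    server master01 192.168.100.11:6443 check
    server master02 192.168.100.12:6443 check
    server master03 192.168.100.13:6443 check
```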

Step 8: Install OpenShift Installer and CLI

Download the latest OpenShift 4.17 client and installer binaries from the official mirror. Run these on the bastion node.

wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/latest/openshift-client-linux.tar.gz
tar xvf openshift-client-linux.tar.gz
sudo mv oc kubectl /usr/local/bin
rm -f README.md LICENSE openshift-client-linux.tar.gz

Download and install the openshift-install binary.

wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/latest/openshift-install-linux.tar.gz
tar xvf openshift-install-linux.tar.gz
sudo mv openshift-install /usr/local/bin
rm -f README.md LICENSE openshift-install-linux.tar.gz

Verify all three binaries work.

$ openshift-install version
openshift-install 4.17.9
built from commit 15a9f9e5af6dee9caece196f8d20e6b3cd0cc2cc
release image quay.io/openshift-release-dev/ocp-release@sha256:cc7433a6c0a3e698630992bd8d63dbbc25659dfc292d4bfa6882a977010e12a3

$ oc version --client
Client Version: 4.17.9

$ kubectl version --client
Client Version: v1.30.5

Generate an SSH key pair for accessing CoreOS nodes during troubleshooting.

ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa

Step 9: Generate Ignition Files

Ignition files contain the cluster configuration that RHCOS nodes consume during first boot. You need a pull secret from Red Hat and an install-config.yaml file to generate them.

Download the Pull Secret

Visit console.redhat.com and download your pull secret. Save it on the bastion.

mkdir -p ~/.openshift
vim ~/.openshift/pull-secret

Paste the pull secret JSON content and save the file.

Create install-config.yaml

Create the base installation configuration file. The baseDomain must match your DNS domain, and metadata.name must match the cluster ID configured in DNS.

vim ~/install-config-base.yaml

Add the following content, replacing the pull secret and SSH key with your actual values:

apiVersion: v1
baseDomain: example.com
compute:
- hyperthreading: Enabled
  name: worker
  replicas: 0
controlPlane:
  hyperthreading: Enabled
  name: master
  replicas: 3
metadata:
  name: ocp4
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16
platform:
  none: {}
fips: false
pullSecret: 'PASTE-YOUR-PULL-SECRET-HERE'
sshKey: 'PASTE-YOUR-SSH-PUBLIC-KEY-HERE'

The compute.replicas: 0 setting means workers will be added manually after the initial install. Setting platform: none marks this as a bare-metal/UPI deployment. OVNKubernetes is the default network plugin for OpenShift 4.17.

Create the working directory and copy the config into it. The openshift-install command deletes install-config.yaml during manifest generation, so always keep a backup copy.

mkdir -p ~/ocp4
cp ~/install-config-base.yaml ~/ocp4/install-config.yaml
cd ~/ocp4

Generate the Kubernetes manifests.

$ openshift-install create manifests
INFO Consuming Install Config from target directory
INFO Manifests created in: manifests and openshift

Disable pod scheduling on master nodes so only workers run application workloads.

sed -i 's/mastersSchedulable: true/mastersSchedulable: false/' manifests/cluster-scheduler-02-config.yml

Generate the ignition files.

$ openshift-install create ignition-configs
INFO Consuming Common Manifests from target directory
INFO Consuming OpenShift Install (Manifests) from target directory
INFO Consuming Master Machines from target directory
INFO Consuming Worker Machines from target directory
INFO Consuming Openshift Manifests from target directory
INFO Ignition-Configs created in: . and auth

Verify the files were created.

$ ls ~/ocp4/
auth  bootstrap.ign  master.ign  metadata.json  worker.ign

Copy the ignition files to the httpd server directory so nodes can fetch them during PXE boot.

sudo mkdir -p /var/www/html/ignition
sudo cp -v ~/ocp4/*.ign /var/www/html/ignition/
sudo chmod 644 /var/www/html/ignition/*.ign
sudo restorecon -RFv /var/www/html/
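It is worth confirming the ignition files are reachable the same way the nodes will fetch them during PXE boot:

```shell
# Check that httpd serves the bootstrap ignition file on port 8080
curl -sI http://192.168.100.254:8080/ignition/bootstrap.ign | head -1
```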

Ensure all bastion services are running before creating cluster VMs.

sudo systemctl enable --now haproxy dhcpd httpd tftp named
sudo systemctl restart haproxy dhcpd httpd tftp named

Step 10: Create Bootstrap, Master, and Worker VMs

All commands in this step run on the KVM hypervisor host, not the bastion VM. Each VM PXE boots from the network, pulls the correct ignition file based on its MAC address, installs RHCOS, and reboots.

Bootstrap Node

Create the bootstrap VM first. It initializes the etcd cluster and control plane components.

sudo virt-install -n bootstrap.ocp4.example.com \
  --description "Bootstrap Machine for OpenShift 4 Cluster" \
  --ram=16384 \
  --vcpus=4 \
  --os-variant=rhel9-unknown \
  --noreboot \
  --disk pool=default,bus=virtio,size=120 \
  --graphics none \
  --serial pty \
  --console pty \
  --pxe \
  --network bridge=openshift4,mac=52:54:00:a4:db:5f

Once the PXE install completes and the VM shuts down, start it.

sudo virsh start bootstrap.ocp4.example.com

Master Nodes

Create all three master (control plane) nodes.

sudo virt-install -n master01.ocp4.example.com \
  --description "Master01 for OpenShift 4 Cluster" \
  --ram=16384 \
  --vcpus=8 \
  --os-variant=rhel9-unknown \
  --noreboot \
  --disk pool=default,bus=virtio,size=120 \
  --graphics none \
  --serial pty \
  --console pty \
  --pxe \
  --network bridge=openshift4,mac=52:54:00:8b:a1:17

Repeat for master02 and master03 with their respective MAC addresses.

sudo virt-install -n master02.ocp4.example.com \
  --description "Master02 for OpenShift 4 Cluster" \
  --ram=16384 \
  --vcpus=8 \
  --os-variant=rhel9-unknown \
  --noreboot \
  --disk pool=default,bus=virtio,size=120 \
  --graphics none \
  --serial pty \
  --console pty \
  --pxe \
  --network bridge=openshift4,mac=52:54:00:ea:8b:9d

Create master03.

sudo virt-install -n master03.ocp4.example.com \
  --description "Master03 for OpenShift 4 Cluster" \
  --ram=16384 \
  --vcpus=8 \
  --os-variant=rhel9-unknown \
  --noreboot \
  --disk pool=default,bus=virtio,size=120 \
  --graphics none \
  --serial pty \
  --console pty \
  --pxe \
  --network bridge=openshift4,mac=52:54:00:f8:87:c7

Start all master nodes after RHCOS installation completes.

sudo virsh start master01.ocp4.example.com
sudo virsh start master02.ocp4.example.com
sudo virsh start master03.ocp4.example.com

Worker Nodes

Create the worker nodes using a similar process.

for i in 01 02 03; do
  case $i in
    01) MAC="52:54:00:31:4a:39" ;;
    02) MAC="52:54:00:6a:37:32" ;;
    03) MAC="52:54:00:95:d4:ed" ;;
  esac
  sudo virt-install -n worker${i}.ocp4.example.com \
    --description "Worker${i} for OpenShift 4 Cluster" \
    --ram=8192 \
    --vcpus=4 \
    --os-variant=rhel9-unknown \
    --noreboot \
    --disk pool=default,bus=virtio,size=120 \
    --graphics none \
    --serial pty \
    --console pty \
    --pxe \
    --network bridge=openshift4,mac=$MAC
done

Start the worker nodes after installation completes.

sudo virsh start worker01.ocp4.example.com
sudo virsh start worker02.ocp4.example.com
sudo virsh start worker03.ocp4.example.com

You can check PXE boot and DHCP logs on the bastion to troubleshoot any issues.

journalctl -f -u tftp
journalctl -f -u dhcpd

Step 11: Monitor Installation Progress

Back on the bastion node, monitor the bootstrap process. This step waits for the bootstrap node to bring up the temporary control plane and hand off to the master nodes.

cd ~/ocp4
openshift-install wait-for bootstrap-complete --log-level=info

Expected output when bootstrap succeeds:

INFO Waiting up to 20m0s for the Kubernetes API at https://api.ocp4.example.com:6443...
INFO API v1.30.5+5765b01 up
INFO Waiting up to 30m0s for bootstrapping to complete...
INFO It is now safe to remove the bootstrap resources
INFO Time elapsed: 15m22s

Once you see the “safe to remove bootstrap” message, shut down and delete the bootstrap VM from the hypervisor to free resources.

sudo virsh destroy bootstrap.ocp4.example.com
sudo virsh undefine bootstrap.ocp4.example.com --remove-all-storage

Also remove the bootstrap backend from the HAProxy configuration on the bastion. Edit /etc/haproxy/haproxy.cfg and comment out or remove the bootstrap server lines from the api-server and machine-config-server backends. Then restart HAProxy.

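If the backend server lines reference the bootstrap node by name (as the repository's template does), they can be commented out in one pass. Verify the result before restarting:

```shell
# Comment out any HAProxy backend lines for the bootstrap node,
# keeping a .bak copy of the original config
sudo sed -i.bak '/server bootstrap/s/^/#/' /etc/haproxy/haproxy.cfg
grep -n 'bootstrap' /etc/haproxy/haproxy.cfg
```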
sudo systemctl restart haproxy

Approve Worker CSRs

Worker nodes need their certificate signing requests (CSRs) approved before they join the cluster. Export the kubeconfig and check for pending CSRs.

export KUBECONFIG=~/ocp4/auth/kubeconfig

List pending CSRs.

$ oc get csr | grep Pending
csr-abc12   3m    system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
csr-def34   3m    system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending
csr-ghi56   3m    system:serviceaccount:openshift-machine-config-operator:node-bootstrapper   Pending

Approve all pending CSRs. You may need to run this command twice as a second set of CSRs appears after the first batch is approved.

oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve

Wait for the full installation to complete.

$ openshift-install wait-for install-complete --log-level=info
INFO Waiting up to 40m0s for the cluster at https://api.ocp4.example.com:6443 to initialize...
INFO Waiting up to 10m0s for the openshift-console route to be created...
INFO Install complete!
INFO To access the cluster as the system:admin user when using 'oc', run
    export KUBECONFIG=/root/ocp4/auth/kubeconfig
INFO Access the OpenShift web console here: https://console-openshift-console.apps.ocp4.example.com
INFO Login to the console with user: "kubeadmin", password: "AbCdE-FgHiJ-KlMnO-PqRsT"
INFO Time elapsed: 28m15s

Verify all nodes are in Ready state.

$ oc get nodes
NAME                          STATUS   ROLES           AGE   VERSION
master01.ocp4.example.com     Ready    control-plane   30m   v1.30.5+5765b01
master02.ocp4.example.com     Ready    control-plane   30m   v1.30.5+5765b01
master03.ocp4.example.com     Ready    control-plane   30m   v1.30.5+5765b01
worker01.ocp4.example.com     Ready    worker          10m   v1.30.5+5765b01
worker02.ocp4.example.com     Ready    worker          10m   v1.30.5+5765b01
worker03.ocp4.example.com     Ready    worker          10m   v1.30.5+5765b01

Check that all cluster operators are available.

$ oc get clusteroperators
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED
authentication                             4.17.9    True        False         False
cloud-credential                           4.17.9    True        False         False
cluster-autoscaler                         4.17.9    True        False         False
config-operator                            4.17.9    True        False         False
console                                    4.17.9    True        False         False
dns                                        4.17.9    True        False         False
etcd                                       4.17.9    True        False         False
...

Step 12: Access the OpenShift Web Console

The OpenShift web console is accessible at https://console-openshift-console.apps.ocp4.example.com. Since this is a lab environment, you need to resolve this DNS name to the bastion IP (192.168.100.254) from your workstation. Add these entries to your workstation’s /etc/hosts file or configure your workstation to use the bastion as its DNS server.

192.168.100.254 console-openshift-console.apps.ocp4.example.com
192.168.100.254 oauth-openshift.apps.ocp4.example.com
192.168.100.254 api.ocp4.example.com

Log in using the kubeadmin credentials displayed at the end of the installation. The password is stored in ~/ocp4/auth/kubeadmin-password on the bastion.

cat ~/ocp4/auth/kubeadmin-password

You can also log in from the command line.

oc login -u kubeadmin -p $(cat ~/ocp4/auth/kubeadmin-password) https://api.ocp4.example.com:6443

Step 13: Add Additional Worker Nodes

To add more worker nodes after the initial deployment, generate a new MAC address for the node, add it to the Ansible variables, regenerate the DHCP and DNS configs, and create a new PXE boot entry. Then create the VM using virt-install with PXE boot and approve the pending CSR.

Update the vars/main.yml file with the new worker details.

vim ~/ocp4_ansible/vars/main.yml

Add a new entry under the workers section, for example:

  - name: "worker04"
    ipaddr: "192.168.100.24"
    macaddr: "52:54:00:aa:bb:cc"

Re-run the Ansible playbooks to update DHCP, DNS, and PXE.

cd ~/ocp4_ansible
ansible-playbook tasks/configure_dhcpd.yml
ansible-playbook tasks/configure_bind_dns.yml
ansible-playbook tasks/configure_tftp_pxe.yml

Create the new worker VM on the hypervisor and approve its CSR after it boots.

oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve

Step 14: Setup Persistent Storage with NFS

OpenShift needs persistent storage for the internal image registry and stateful workloads. For a lab setup, an NFS export is a quick option. Configure it on the bastion or a dedicated storage node.

Install the NFS server on the bastion.

sudo dnf install -y nfs-utils
sudo systemctl enable --now nfs-server

Create the export directory and set permissions.

sudo mkdir -p /srv/nfs/ocp-registry
sudo chmod 777 /srv/nfs/ocp-registry

Add the export to /etc/exports.

echo "/srv/nfs/ocp-registry 192.168.100.0/24(rw,sync,no_root_squash,no_subtree_check)" | sudo tee -a /etc/exports

Export the shares and open the firewall.

sudo exportfs -rav
sudo firewall-cmd --add-service=nfs --permanent
sudo firewall-cmd --reload
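Confirm the export is visible from the cluster subnet before wiring it into OpenShift:

```shell
# List exports advertised by the bastion NFS server
showmount -e 192.168.100.254
```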

Create a PersistentVolume in OpenShift for the image registry.

oc apply -f - <<YAML
apiVersion: v1
kind: PersistentVolume
metadata:
  name: image-registry-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany
  nfs:
    path: /srv/nfs/ocp-registry
    server: 192.168.100.254
  persistentVolumeReclaimPolicy: Retain
YAML

Create the PersistentVolumeClaim and patch the image registry operator to use it.

oc apply -f - <<YAML
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: image-registry-storage
  namespace: openshift-image-registry
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Gi
YAML

Patch the registry operator to use the PVC and set it to Managed.

oc patch configs.imageregistry.operator.openshift.io cluster \
  --type merge --patch '{"spec":{"storage":{"pvc":{"claim":"image-registry-storage"}},"managementState":"Managed"}}'

Verify the registry pods are running.

$ oc get pods -n openshift-image-registry
NAME                                               READY   STATUS    RESTARTS   AGE
image-registry-5c8f7b6d4f-x2k9j                   1/1     Running   0          2m

Day-2 Operations Overview

Once the cluster is running, there are several day-2 tasks to handle for a production-ready environment:

  • Identity provider – Replace kubeadmin with an identity provider like LDAP, HTPasswd, or OpenID Connect. Run oc delete secret kubeadmin -n kube-system after configuring an alternative admin user
  • Cluster certificates – Replace the default self-signed certificates with trusted TLS certificates for the API and wildcard ingress routes
  • Monitoring and alerting – OpenShift ships with Prometheus and Alertmanager. Configure persistent storage for monitoring data and set up alert receivers
  • Cluster logging – Deploy the OpenShift Logging Operator (based on Loki or Elasticsearch) to collect and store container logs
  • Node scaling – Add or remove worker nodes as workload demand changes. In a KVM environment, this means creating or destroying VMs and approving CSRs
  • Cluster updates – Use the OpenShift web console or oc adm upgrade to apply platform updates. Test upgrades in a staging environment first
  • Backup and disaster recovery – Back up etcd regularly using etcdctl snapshot save from a master node. Store backups off-cluster
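As an example of the backup task, an etcd snapshot can be taken with the cluster-backup.sh script that ships on every control plane node, driven from the bastion via oc debug (substitute one of your master node names):

```shell
# Take an etcd snapshot on one control plane node;
# backup files land under /home/core/assets/backup on that node
oc debug node/master01.ocp4.example.com -- chroot /host \
  /usr/local/bin/cluster-backup.sh /home/core/assets/backup
```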

Conclusion

You now have a fully functional OpenShift 4.17 cluster running on KVM/libvirt with three control plane nodes and three worker nodes. The bastion VM provides all supporting infrastructure – DNS, DHCP, load balancing, and PXE boot services. For production use, replace the single HAProxy instance with a highly available load balancer, use dedicated DNS infrastructure, add proper TLS certificates, and configure an external storage solution like Ceph or GlusterFS for persistent storage.

Related Guides

56 COMMENTS

  1. First of all, I’m new to this openshift environment but this document has given me a clear picture of ocp4 installaion. Thank you so much for sharing this document and it helps me a lot 🙂

  2. Hi Josphat,

    first thank you for the tutorial learn alot. Iam actually having issue with dns, after I run ansible, I stoped in this bug as below:

    [root@fedora ocp4_ansible]# nmcli connection up enp1s0
    Error: Connection activation failed: IP configuration could not be reserved (no available address, timeout, etc.)
    Hint: use 'journalctl -xe NM_CONNECTION=3771b159-c3a1-4e68-9b93-ea8150ee0ec5 + NM_DEVICE=enp1s0' to get more details.

    is there any step I need to do on the bastion server or on my main Fedora server?
    I appreciate your support on this.

      • I am using the same subnet as yours.
        So I am really not sure how to get it to work; I have been trying for 4 days with this issue. I appreciate your support, Josphat.
        ————————————————
        [root@fedora etc]# systemctl status named
        ● named.service - Berkeley Internet Name Domain (DNS)
        Loaded: loaded (/usr/lib/systemd/system/named.service; enabled; vendor pre>
        Active: active (running) since Fri 2021-12-24 22:20:30 EST; 9h ago
        Process: 2903 ExecStartPre=/bin/bash -c if [ ! "$DISABLE_ZONE_CHECKING" == >
        Process: 2905 ExecStart=/usr/sbin/named -u named -c ${NAMEDCONF} $OPTIONS (>
        Main PID: 2906 (named)
        Tasks: 8 (limit: 4668)
        Memory: 24.6M
        CPU: 163ms
        CGroup: /system.slice/named.service
        └─2906 /usr/sbin/named -u named -c /etc/named.conf

        Dec 24 22:20:30 fedora named[2906]: zone ocp4.example.com/IN: loaded serial 202>
        Dec 24 22:20:30 fedora named[2906]: zone localhost.localdomain/IN: loaded seria>
        Dec 24 22:20:30 fedora named[2906]: all zones loaded
        Dec 24 22:20:30 fedora systemd[1]: Started Berkeley Internet Name Domain (DNS).
        Dec 24 22:20:30 fedora named[2906]: running
        Dec 24 22:20:30 fedora named[2906]: managed-keys-zone: Key 20326 for zone . is >
        Dec 24 22:31:38 fedora named[2906]: no longer listening on 192.168.100.254#53
        Dec 24 22:32:24 fedora named[2906]: listening on IPv4 interface enp1s0, 192.168>
        Dec 24 22:35:46 fedora named[2906]: no longer listening on 192.168.100.254#53
        Dec 24 22:36:32 fedora named[2906]: listening on IPv4 interface enp1s0, 192.168
        ——————————–
        [root@fedora dhcp]# cat dhcpd.conf
        authoritative;
        ddns-update-style interim;
        default-lease-time 14400;
        max-lease-time 14400;
        allow booting;
        allow bootp;

        option routers 192.168.100.1;
        option broadcast-address 192.168.100.255;
        option subnet-mask 255.255.255.0;
        option domain-name-servers 192.168.100.254;
        option ntp-servers time.google.com;
        option domain-name "ocp4.example.com";
        option domain-search "ocp4.example.com", "example.com";

        subnet 192.168.100.0 netmask 255.255.255.0 {
          interface enp1s0;
          pool {
            range 192.168.100.10 192.168.100.50;
            # Static entries
            host bootstrap { hardware ethernet 52:54:00:a4:db:5f; fixed-address 192.168.100.10; }
            host master01 { hardware ethernet 52:54:00:8b:a1:17; fixed-address 192.168.100.11; }
            host master02 { hardware ethernet 52:54:00:ea:8b:9d; fixed-address 192.168.100.12; }
            host master03 { hardware ethernet 52:54:00:f8:87:c7; fixed-address 192.168.100.13; }
            host worker01 { hardware ethernet 52:54:00:31:4a:39; fixed-address 192.168.100.21; }
            host worker02 { hardware ethernet 52:54:00:6a:37:32; fixed-address 192.168.100.22; }
            host worker03 { hardware ethernet 52:54:00:95:d4:ed; fixed-address 192.168.100.23; }
            # this will not give out addresses to hosts not listed above
            deny unknown-clients;

            # this is PXE specific
            filename "pxelinux.0";

            next-server 192.168.100.254;
          }
        }

  3. I have problem with bastion/helper MAC addresse,

    [root@fedora ocp4_ansible]# systemctl status dhcpd
    ● dhcpd.service – DHCPv4 Server Daemon
    Loaded: loaded (/usr/lib/systemd/system/dhcpd.service; enabled; vendor pre>
    Active: active (running) since Sat 2021-12-25 08:56:12 EST; 3min 37s ago
    Docs: man:dhcpd(8)
    man:dhcpd.conf(5)
    Main PID: 4423 (dhcpd)
    Status: "Dispatching packets…"
    Tasks: 1 (limit: 4668)
    Memory: 9.7M
    CPU: 9ms
    CGroup: /system.slice/dhcpd.service
    └─4423 /usr/sbin/dhcpd -f -cf /etc/dhcp/dhcpd.conf -user dhcpd -gr>

    Dec 25 08:56:12 fedora dhcpd[4423]: Listening on LPF/enp1s0/52:54:00:bd:0e:99/1>
    Dec 25 08:56:12 fedora dhcpd[4423]: Sending on LPF/enp1s0/52:54:00:bd:0e:99/1>
    Dec 25 08:56:12 fedora dhcpd[4423]: Sending on Socket/fallback/fallback-net
    Dec 25 08:56:12 fedora dhcpd[4423]: Server starting service.
    Dec 25 08:56:12 fedora systemd[1]: Started DHCPv4 Server Daemon.
    Dec 25 08:58:48 fedora dhcpd[4423]: DHCPDISCOVER from 52:54:00:bd:0e:99 via enp>
    Dec 25 08:58:50 fedora dhcpd[4423]: DHCPDISCOVER from 52:54:00:bd:0e:99 via enp>
    Dec 25 08:58:55 fedora dhcpd[4423]: DHCPDISCOVER from 52:54:00:bd:0e:99 via enp>
    Dec 25 08:59:03 fedora dhcpd[4423]: DHCPDISCOVER from 52:54:00:bd:0e:99 via enp>
    Dec 25 08:59:20 fedora dhcpd[4423]: DHCPDISCOVER from 52:54:00:bd:0e:99 via enp>

  4. Yes, it worked. virt-manager is excellent for the UI and managing the images. Thank you a lot for this amazing deep dive into OpenShift and preparing the prerequisites.

  5. After installing OpenShift 4.9 with your howto I cannot reach the web console. When checking the logs for the console pod:
    E0128 13:24:18.253509 1 auth.go:231] error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.ocp4.lagrange.com/oauth/token failed: Head "https://oauth-openshift.apps.ocp4.lagrange.com": EOF
    E0128 13:24:28.293887 1 auth.go:231] error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.ocp4.lagrange.com/oauth/token failed: Head "https://oauth-openshift.apps.ocp4.lagrange.com": EOF

    What is wrong?

  6. I’m running into an error with the workers and masters timing out trying to get to the ignition scripts; there’s a timeout trying to reach https://api-int, which is listening. I created a self-signed cert and added that to the haproxy config, but am still seeing the tcp dial io timeout.

    Any ideas?

  7. I’d submitted a comment about having a timeout / possible ssl issue, but that was entirely my networking. Working now. This is an awesome step-by-step, and it has given me a lot of valuable stick time on this openshift nonsense. I genuinely appreciate what you’ve done here. Thanks very much!

  8. [root@api ~]# oc get co authentication
    NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
    authentication 4.9.19 False False True 7m7s OAuthServerRouteEndpointAccessibleControllerAvailable: route.route.openshift.io “oauth-openshift” not found
    OAuthServerServiceEndpointAccessibleControllerAvailable: Get “https://172.30.21.224:443/healthz”: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    OAuthServerServiceEndpointsEndpointAccessibleControllerAvailable: endpoints “oauth-openshift” not found
    ReadyIngressNodesAvailable: Authentication requires functional ingress which requires at least one schedulable and ready node. Got 0 worker nodes, 1 master nodes, 0 custom target nodes (none are schedulable or ready for ingress pods).
    WellKnownAvailable: The well-known endpoint is not yet available: failed to get oauth metadata from openshift-config-managed/oauth-openshift ConfigMap: configmap “oauth-openshift” not found (check authentication operator, it is supposed to create this)

    any idea?

  9. Hi, Josphat,

    Thank you for your very useful blog.

    In section: Download Pull Secret
    You mention:
    Visit cloud.redhat.com and select “Bare Metal” then “UPI”.

    I have no idea how to get the pull secret from this description. There is no “Bare Metal” option on the cloud.redhat.com page.

    Could you help on that?

    Thanks

  10. Hi Josphat,

    I’ve followed your guide to the letter. It’s been incredibly educational and I learned a lot on how OCP operates. What I’m completely stuck on is accessing the Openshift console. Is there a particular forwarding piece I’m missing here? How is everyone else doing it?

    I’ve been running this on KVM on RHEL8.x. How can I access the openshift console externally?

    Thanks

    • We’re delighted that you at least got the setup up and running. Access to the console is through the HAProxy LB and the mapped DNS names.

      You can get the console address by running the command:

      oc whoami --show-console

      If your DNS is working ok you should be able to access OpenShift Console.
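      If the workstation sits outside the KVM host, the cluster names must also resolve there and the HAProxy ports must be reachable. A hedged sketch of config fragments for that (203.0.113.10 stands in for the hypervisor’s routable IP, which is an assumption for this lab):

      ```
      # On the KVM host: forward console/API traffic to the bastion's HAProxy
      sudo firewall-cmd --add-forward-port=port=443:proto=tcp:toport=443:toaddr=192.168.100.254
      sudo firewall-cmd --add-forward-port=port=6443:proto=tcp:toport=6443:toaddr=192.168.100.254
      sudo firewall-cmd --add-masquerade

      # On the external workstation: /etc/hosts entries pointing at the KVM host
      203.0.113.10  api.ocp4.example.com
      203.0.113.10  console-openshift-console.apps.ocp4.example.com
      203.0.113.10  oauth-openshift.apps.ocp4.example.com
      ```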

      • I’m still a bit lost here. If I’m logged into the bastion vm, all this works fine. OC commands work, etc. I can even login to the cluster from the KVM host itself. What I can’t seem to understand is how to get to the openshift console from outside the KVM host. The KVM host lives in a datacenter and is just running RHEL 8 server with no desktop. HAProxy works fine on the KVM host as it can get to all the services on the cluster running internally.

        DNS is working inside the cluster. The nodes can all resolve each other no problem. But the ip range is private inside.

        I feel like I’m missing a step here. I reviewed and followed your directions above multiple times, but sitting here at my laptop, I cannot open the console. Does the DNS need to be resolvable outside the KVM host as well? Are there ports that need to be forwarded on the KVM host?

        • I had the same issue. Is it possible the tutorial assumes the hypervisor host has a GUI, and just dropped to the terminal?

          In any case, I spun up a VM with a GUI with both openshift and default networks and am finally at the web console.

  11. Hi Josphat, amazing work. Works like a charm. I knew Kubernetes and all, but for OCP installation I could not find clear instructions on how to install. Your guide was the guiding light for me. Thanks a ton. I would like to buy you 10 coffees. Please DM me.

  12. I installed this, but when I try to validate things, “oc whoami”, I get the following error. DNS seems to be picking it up, but not sure where else to look: (newbie)

    oc whoami
    Unable to connect to the server: dial tcp: lookup api.ocp4.nsidedabox.local: no such host
    [phillip@fedora ~]$ nslookup api.ocp4.nsidedabox.local
    Server: 192.168.4.5
    Address: 192.168.4.5#53

    Name: api.ocp4.nsidedabox.local
    Address: 192.168.100.254

  13. While I have deployed well over 100 OCP clusters, starting at 4.2 with the latest at 4.11.0, I have a need to build an OCP 4.10 cluster on KVM, and that is a platform Red Hat only has documentation for with LinuxONE. So your guide is very valuable and much appreciated. I am curious, though, whether you have a similar guide using RHEL or CentOS 8 worker nodes in place of RHCOS? My next demo cluster needs to have RHEL or CentOS worker nodes and I’d like to run on KVM if possible. Thanks

  14. My cluster works, however when I try to login using “oc login -u ” I get error: x509: certificate signed by unknown authority all the time.
    What can I do to mitigate this?

  15. When I create the Bootstrap Virtual Machine using command below:

    sudo virt-install -n bootstrap.ocp4.example.com \
    --description "Bootstrap Machine for Openshift 4 Cluster" \
    --ram=8192 \
    --vcpus=4 \
    --os-type=Linux \
    --os-variant=rhel8.0 \
    --noreboot \
    --disk pool=default,bus=virtio,size=50 \
    --graphics none \
    --serial pty \
    --console pty \
    --pxe \
    --network bridge=openshift4,mac=52:54:00:a4:db:5f

    It gets stuck as below:

    [** ] A start job is running for Acquire …ootfs image (51min 53s / no limit)[ 3118.047630] coreos-livepxe-rootfs[865]: curl: (7) Failed to connect to 192.168.100.254 port 8080: Connection timed out
    [ 3118.052072] coreos-livepxe-rootfs[805]: Couldn't establish connectivity with the server specified by:
    [ 3118.055647] coreos-livepxe-rootfs[805]: coreos.live.rootfs_url=http://192.168.100.254:8080/rhcos/rootfs.img
    [ 3118.059331] coreos-livepxe-rootfs[805]: Retrying in 5s…
    [ * coreos-livepxe-rootfs[867]: curl: (7) Failed to connect to 192.168.100.254 port 8080: Connection timed out

    Can you shed a light on this?

  16. hi Josphat Mutai

    can you advise on my problem when firing up the master node :

    GET error: Get "https://api-int.ocp4.example.com:22623/config/master"

    I can telnet to the port successfully from the bastion host, so it means the service is working fine on port 22623. But when I try to access the specific URL using curl, this is the error I got:

    OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to api-int.ocp4.example.com:22623
    curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to api-int.ocp4.example.com:22623

    please advise.

  17. I don’t have much resource on my machine. Is it possible to install only one master and one worker node, and will it still work?

  18. When I tried creating bootstrap machine in virsh

    sudo virt-install -n bootstrap.ocp4.example.com --description "Bootstrap Machine for Openshift 4 Cluster" --ram=4096 --vcpus=4 --os-type=Linux --os-variant=rhel8.4 --noreboot --disk pool=default,bus=virtio,size=50 --graphics none --serial pty --console pty --pxe --network bridge=openshift4,mac=52:54:00:a4:db:5f

    The issue I got is,

    [FAILED] Failed to start Rebuild Journal Catalog.
    See 'systemctl status systemd-journal-catalog-update.service' for details.
    :
    :
    :
    [ OK ] Stopped Update is Completed.
    [ OK ] Stopped Rebuild Hardware Database.
    [ OK ] Stopped Rebuild Dynamic Linker Cache.
    [ OK ] Stopped CoreOS: Set printk To Level 4 (warn).
    [ OK ] Stopped target Local Encrypted Volumes.
    [ OK ] Stopped target Local Encrypted Volumes (Pre).
    [ OK ] Stopped Dispatch Password Requests to Console Directory Watch.
    [ OK ] Stopped Load Kernel Modules.
    [ OK ] Stopped Load/Save Random Seed.
    [ OK ] Stopped Update UTMP about System Boot/Shutdown.
    Unmounting var.mount…
    [ OK ] Stopped Create Volatile Files and Directories.
    [ OK ] Stopped target Local File Systems.
    Unmounting Temporary Directory (/tmp)…
    Unmounting /run/ephemeral…
    Unmounting /etc…
    [FAILED] Failed unmounting var.mount.
    [FAILED] Failed unmounting /etc.
    [ OK ] Unmounted Temporary Directory (/tmp).
    [ OK ] Stopped target Swap.
    [ OK ] Unmounted /run/ephemeral.
    [ OK ] Reached target Unmount All Filesystems.
    [ OK ] Stopped target Local File Systems (Pre).
    [ OK ] Stopped Create Static Device Nodes in /dev.
    [ OK ] Stopped Create System Users.
    Stopping Monitoring of LVM2 mirrors…ng dmeventd or progress polling…
    [ OK ] Stopped Monitoring of LVM2 mirrors,…sing dmeventd or progress polling.
    [ OK ] Reached target Shutdown.
    [ OK ] Reached target Final Step.
    [ OK ] Started Reboot.
    [ OK ] Reached target Reboot.
    [ 68.354526] watchdog: watchdog0: watchdog did not stop!
    [ 68.458591] reboot: Restarting system

    Why I’m getting this error

    [FAILED] Failed to start Rebuild Journal Catalog.
    [FAILED] Failed unmounting var.mount.
    [FAILED] Failed unmounting /etc.

    • Same here, and it then prevents the automation of the whole cluster. I should mention that I use a nested virtualization.

  19. Hello Josphat,

    thanks for the step-by-step installation guide, which is well defined!
    Could you please tell me which steps I need to follow in order to add an additional worker after the cluster is initiated? Is there any way to use the existing YAML files to add a worker to the existing cluster?

    Thank you,

    Leo

  20. Josphat,

    Great article. I have been trying for 2 months to do this. It’s fairly complex. I now have a running cluster and I’m starting to play with it. I used Fedora 38 without any problems. I used NetworkManager, so I had to change the bastion network setup. Everything else ran pretty much without a hitch. Thank you for doing this.

    Brian

  21. Hi Josphat,
    your post was very helpful for us in changing our existing OCP installation tooling. We used an old “helpernodectl” binary, and OCP 4.13 didn’t get installed with that tool.
    With your blog we were able to remove the helpernodectl binary and implement all the steps with Ansible. We now have one Ansible playbook that runs on KVM and VMware.
    I have only one open question about the step with the TFTP service.
    Why do we need to create this “start-tftp.sh” script?
    What will happen if we don’t have this script?
    Maybe you can give us a short explanation!
    Great job! Thank you

  22. The current openshift-installer from that wget is 4.13.9, and when I try to create manifests it is missing the manifests/04-openshift-machine-config-operator.yaml file. So does the procedure still apply to versions later than 4.13.4, which is used in the example above? I did continue and built the bootstrap without error, but it cannot start kubelet.service, and ‘get pods’ fails with
    E0821 11:22:35.502291 6321 memcache.go:238] couldn't get current server API group list: Get "https://api.ocp4.example.com:6443/api?timeout=32s": EOF
    Unable to connect to the server: EOF
    I can see port 6443 is in LISTEN mode, so I am wondering if something changed? Any help please?

  23. Hi. Excellent insight. Thank you!
    Has anyone ever stumbled on the error below? The bootstrapper seems to be working fine, but it seems none of the masters are able to join. The lab is running on Fedora 38; PowerEdge; 256 GB RAM.

    Aug 30 18:30:02 master02.ocp4.example.com kubenswrapper[2338]: E0830 18:30:02.176004 2338 kubelet.go:2495] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: No CNI configuration file in /etc/kubernetes/cni/net.d/. Has your network provider started?"

  24. I think that this error is just a result of some previous errors. The operator controllers are not fully coming up. The Internet is fast, and so is the mem/CPU power. Will give it some more shots…

  25. `openshift-install` is not showing a KVM platform option. Any idea how to choose the platform?

    $ openshift-install create ignition-configs
    ? SSH Public Key /root/.ssh/id_rsa.pub
    ? Platform [Use arrows to move, type to filter, ? for more help]
    gcp
    ibmcloud
    nutanix
    openstack
    > ovirt
    powervs
    vsphere

  26. Thank you for the helpful article. By using it, I was able to successfully stand up an OpenShift 4.17 libvirt VM cluster. However, I had to modify the `virt-install` command for creating the control nodes. The command uses the argument `--ram=8192`, which does not match the “preferred” amount of memory (16 GB) specified at the beginning of the article.

    I was having a problem during the installation where my worker nodes were not coming up. When I checked resources on the control node, output showed that memory use was close to 100% when trying to bring up worker nodes.

    ```
    oc describe node master01.ocp4.local
    ```

    ```
    Allocated resources:
      (Total limits may be over 100 percent, i.e., overcommitted.)
      Resource           Requests     Limits
      --------           --------     ------
      cpu                1306m (37%)  10m (0%)
      memory             6537Mi (95%) 0 (0%)
      ephemeral-storage  0 (0%)       0 (0%)
      hugepages-1Gi      0 (0%)       0 (0%)
      hugepages-2Mi      0 (0%)       0 (0%)
    ```

    When I destroyed and undefined the control node and created it with `--ram=16384`, the three worker nodes successfully came up. Checking the control node’s resources later, more memory than the originally allocated 8 GB is needed:

    ```
    Allocated resources:
      (Total limits may be over 100 percent, i.e., overcommitted.)
      Resource           Requests     Limits
      --------           --------     ------
      cpu                2236m (29%)  10m (0%)
      memory             9501Mi (63%) 0 (0%)
      ephemeral-storage  0 (0%)       0 (0%)
      hugepages-1Gi      0 (0%)       0 (0%)
      hugepages-2Mi      0 (0%)       0 (0%)
    ```
