OpenShift Container Platform is Red Hat’s enterprise Kubernetes distribution that automates installation, upgrades, and lifecycle management for containerized applications. All cluster nodes run Red Hat CoreOS (RHCOS), which includes the kubelet and CRI-O container runtime optimized for Kubernetes workloads.
This guide walks through deploying OpenShift 4.17 on a single KVM hypervisor using libvirt, with a bastion VM handling DNS, DHCP, load balancing, and PXE boot services. The KVM host runs RHEL 10, Rocky Linux 10, or Fedora. This setup works for proof-of-concept, lab, and development environments – production deployments need multiple hypervisors with HA load balancers.
Prerequisites for Deploying OpenShift on KVM
Before starting, make sure you have the following in place:
- A bare-metal server or workstation running RHEL 10, Rocky Linux 10, or Fedora with KVM support (Intel VT-x or AMD-V)
- At least 128 GB RAM and 16 CPU cores (for 1 bootstrap + 3 masters + 3 workers)
- 600 GB+ free disk space on /var/lib/libvirt/images
- Root or sudo access on the hypervisor
- A Red Hat account with an active subscription to download the pull secret from console.redhat.com
- A registered domain name or internal DNS zone (we use example.com throughout this guide)
Hardware Requirements per VM
Red Hat’s minimum and recommended hardware requirements for each cluster VM are shown below.
| VM Role | OS | vCPU (Min/Rec) | RAM | Storage |
|---|---|---|---|---|
| Bootstrap | RHCOS | 4 / 4 | 16 GB | 120 GB |
| Control Plane | RHCOS | 4 / 8 | 16 GB | 120 GB |
| Compute (Worker) | RHCOS | 2 / 4 | 8 GB | 120 GB |
| Bastion/Helper | Fedora/RHEL | 2 / 2 | 4 GB | 20 GB |
Lab Network Layout
Here is the network layout and IP allocation used throughout this guide. Replace domain, cluster name, and IP addresses with your own values.
- Base domain: example.com
- Cluster name: ocp4
- KVM virtual network: openshift4 (192.168.100.0/24)
- Gateway: 192.168.100.1
- Bastion node (DNS, DHCP, HAProxy, TFTP, httpd): 192.168.100.254
| Hostname | MAC Address | IP Address |
|---|---|---|
| bootstrap.ocp4.example.com | 52:54:00:a4:db:5f | 192.168.100.10 |
| master01.ocp4.example.com | 52:54:00:8b:a1:17 | 192.168.100.11 |
| master02.ocp4.example.com | 52:54:00:ea:8b:9d | 192.168.100.12 |
| master03.ocp4.example.com | 52:54:00:f8:87:c7 | 192.168.100.13 |
| worker01.ocp4.example.com | 52:54:00:31:4a:39 | 192.168.100.21 |
| worker02.ocp4.example.com | 52:54:00:6a:37:32 | 192.168.100.22 |
| worker03.ocp4.example.com | 52:54:00:95:d4:ed | 192.168.100.23 |
Generate unique MAC addresses for your setup with this command:
date +%s | md5sum | head -c 6 | sed -e 's/\([0-9A-Fa-f]\{2\}\)/\1:/g' -e 's/\(.*\):$/\1/' | sed -e 's/^/52:54:00:/';echo
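If you prefer to generate all node MACs in one pass, the one-liner above can be wrapped in a small loop. This is a hypothetical helper, not part of the original tooling: it generates one locally administered MAC per node under the KVM/QEMU 52:54:00 prefix and validates the format before you paste it into the table above.

```shell
#!/usr/bin/env bash
# Generate one MAC per cluster node under the KVM/QEMU OUI 52:54:00.
# gen_mac is an illustrative helper; any generator that avoids duplicates works.
gen_mac() {
  # Three random octets appended to the libvirt/KVM prefix
  printf '52:54:00:%02x:%02x:%02x\n' $((RANDOM % 256)) $((RANDOM % 256)) $((RANDOM % 256))
}

for node in bootstrap master01 master02 master03 worker01 worker02 worker03; do
  mac=$(gen_mac)
  # Sanity-check: six colon-separated lowercase hex octets
  echo "$mac" | grep -Eq '^([0-9a-f]{2}:){5}[0-9a-f]{2}$' || { echo "bad MAC: $mac" >&2; exit 1; }
  echo "$node -> $mac"
done
```

The loop does not check for collisions between the generated addresses, so eyeball the output (or pipe it through `sort | uniq -d`) before committing the values.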
Step 1: Install KVM and Libvirt on the Hypervisor
First, confirm that your CPU supports hardware virtualization.
$ grep -cE 'vmx|svm' /proc/cpuinfo
8
Any value greater than 0 means the virtualization extensions are present. Next, install the KVM and libvirt packages on your hypervisor. On RHEL 10 or Rocky Linux 10, run:
sudo dnf install -y qemu-kvm libvirt virt-install virt-viewer \
bridge-utils libguestfs-tools virt-manager libvirt-client
Enable and start the libvirtd service.
sudo systemctl enable --now libvirtd
Verify libvirtd is running.
$ sudo systemctl status libvirtd
● libvirtd.service - Virtualization daemon
Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled)
Active: active (running)
Create the OpenShift Virtual Network
Create a dedicated NAT virtual network for the OpenShift cluster. Write the network definition file.
sudo vim /tmp/openshift4-net.xml
Add the following content:
<network>
  <name>openshift4</name>
  <forward mode='nat'>
    <nat>
      <port start='1024' end='65535'/>
    </nat>
  </forward>
  <bridge name='openshift4' stp='on' delay='0'/>
  <domain name='openshift4'/>
  <ip address='192.168.100.1' netmask='255.255.255.0'/>
</network>
Define, autostart, and start the network.
$ sudo virsh net-define --file /tmp/openshift4-net.xml
Network openshift4 defined from /tmp/openshift4-net.xml
$ sudo virsh net-autostart openshift4
Network openshift4 marked as autostarted
$ sudo virsh net-start openshift4
Network openshift4 started
Confirm the bridge interface exists. It will report NO-CARRIER/DOWN until a VM attaches to it; that is expected at this stage.
$ ip link show openshift4
5: openshift4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 state DOWN
link/ether 52:54:00:2b:47:9a brd ff:ff:ff:ff:ff:ff
Step 2: Create the Bastion / Helper VM
The bastion VM hosts all supporting infrastructure services for the OpenShift deployment: BIND DNS, DHCP, HAProxy load balancer, TFTP/PXE boot, and Apache httpd for serving RHCOS images and ignition files. It also serves as the management workstation where we run the openshift-install and oc commands.
Build a Fedora 41 VM image using virt-builder. You can also use a Rocky Linux or CentOS Stream image.
sudo virt-builder fedora-41 --format qcow2 \
--size 20G -o /var/lib/libvirt/images/ocp-bastion-server.qcow2 \
--root-password password:StrongRootPassw0rd
Create and import the VM.
sudo virt-install \
--name ocp-bastion-server \
--ram 4096 \
--vcpus 2 \
--disk path=/var/lib/libvirt/images/ocp-bastion-server.qcow2 \
--os-variant fedora41 \
--network bridge=openshift4 \
--graphics none \
--serial pty \
--console pty \
--boot hd \
--import
Log in as root with the password set above. Set a static IP on the bastion VM so all cluster nodes can reach it consistently.
nmcli con delete "Wired connection 1"
nmcli con add type ethernet con-name enp1s0 ifname enp1s0 \
connection.autoconnect yes ipv4.method manual \
ipv4.address 192.168.100.254/24 ipv4.gateway 192.168.100.1 \
ipv4.dns 8.8.8.8
Test connectivity from the bastion VM.
# ping -c 2 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=117 time=4.98 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=117 time=5.14 ms
Update the OS and install required packages.
sudo dnf -y upgrade
sudo dnf -y install git vim wget curl bash-completion tree tar firewalld bind-utils
Reboot after the upgrade completes.
sudo reboot
Reconnect to the bastion via virsh console or SSH from the hypervisor.
$ sudo virsh console ocp-bastion-server
Connected to domain 'ocp-bastion-server'
Escape character is ^] (Ctrl + ])
Enable domain autostart so the bastion VM starts automatically when the hypervisor boots.
sudo virsh autostart ocp-bastion-server
Step 3: Configure Ansible Variables on Bastion
We use Ansible to automate configuration of DNS, DHCP, HAProxy, and PXE on the bastion node. Install Ansible on the bastion.
sudo dnf -y install ansible-core vim wget curl bash-completion tree tar
Clone the OCP4 Ansible helper repository that contains all tasks and Jinja2 templates.
cd ~/
git clone https://github.com/jmutai/ocp4_ansible.git
cd ~/ocp4_ansible
Edit the variables file to match your lab environment. Every IP, MAC address, domain, and cluster name must be correct – errors here will break the entire deployment.
vim vars/main.yml
Set all values carefully. Here is the full variables file with comments explaining each field:
---
ppc64le: false
uefi: false
disk: vda
helper:
  name: "bastion"
  ipaddr: "192.168.100.254"
  networkifacename: "enp1s0"   # must match the bastion's NIC name
dns:
  domain: "example.com"
  clusterid: "ocp4"
  forwarder1: "8.8.8.8"
  forwarder2: "1.1.1.1"
  lb_ipaddr: "{{ helper.ipaddr }}"
dhcp:
  router: "192.168.100.1"
  bcast: "192.168.100.255"
  netmask: "255.255.255.0"
  poolstart: "192.168.100.10"
  poolend: "192.168.100.50"
  ipid: "192.168.100.0"
  netmaskid: "255.255.255.0"
  ntp: "time.google.com"
  dns: ""
bootstrap:
  name: "bootstrap"
  ipaddr: "192.168.100.10"
  macaddr: "52:54:00:a4:db:5f"
masters:
  - name: "master01"
    ipaddr: "192.168.100.11"
    macaddr: "52:54:00:8b:a1:17"
  - name: "master02"
    ipaddr: "192.168.100.12"
    macaddr: "52:54:00:ea:8b:9d"
  - name: "master03"
    ipaddr: "192.168.100.13"
    macaddr: "52:54:00:f8:87:c7"
workers:
  - name: "worker01"
    ipaddr: "192.168.100.21"
    macaddr: "52:54:00:31:4a:39"
  - name: "worker02"
    ipaddr: "192.168.100.22"
    macaddr: "52:54:00:6a:37:32"
  - name: "worker03"
    ipaddr: "192.168.100.23"
    macaddr: "52:54:00:95:d4:ed"
Verify the ansible.cfg and inventory files are correctly set.
$ cat ansible.cfg
[defaults]
inventory = inventory
host_key_checking = False
deprecation_warnings = False
retry_files = false
$ cat inventory
[vms_host]
localhost ansible_connection=local
Step 4: Install and Configure DHCP Server
The DHCP server assigns fixed IP addresses to each cluster node based on its MAC address. This is critical because OpenShift nodes must get consistent IPs that match the DNS records. Install the dhcp-server package on the bastion.
sudo dnf -y install dhcp-server
Enable the dhcpd service.
sudo systemctl enable dhcpd
Backup the default configuration and run the Ansible playbook to generate the correct DHCP config from your variables.
sudo mv /etc/dhcp/dhcpd.conf /etc/dhcp/dhcpd.conf.bak
ansible-playbook tasks/configure_dhcpd.yml
Expected output shows the template being applied and dhcpd restarting.
PLAY RECAP *************************************************************
localhost : ok=3 changed=2 unreachable=0 failed=0 skipped=1
Confirm dhcpd is running.
$ systemctl status dhcpd
● dhcpd.service - DHCPv4 Server Daemon
Loaded: loaded (/usr/lib/systemd/system/dhcpd.service; enabled)
Active: active (running)
Step 5: Configure BIND DNS Server
OpenShift requires specific DNS records for API, etcd SRV records, and wildcard entries for applications. Without correct DNS, the cluster installation fails. Install BIND on the bastion.
sudo dnf -y install bind bind-utils
Enable the named service.
sudo systemctl enable named
Install the DNS serial number generator script used by the Ansible templates.
sudo cp files/set-dns-serial.sh /usr/local/bin/
sudo chmod a+x /usr/local/bin/set-dns-serial.sh
Run the DNS configuration playbook. It creates the forward zone for example.com (containing all cluster records under ocp4.example.com) and the reverse zone file, with all required A, SRV, and CNAME records.
$ ansible-playbook tasks/configure_bind_dns.yml
PLAY RECAP *************************************************************
localhost : ok=7 changed=6 unreachable=0 failed=0 skipped=0
Verify the DNS service is running and responding correctly. Test the etcd SRV records that OpenShift requires.
$ dig @127.0.0.1 -t srv _etcd-server-ssl._tcp.ocp4.example.com +short
0 10 2380 etcd-0.ocp4.example.com.
0 10 2380 etcd-1.ocp4.example.com.
0 10 2380 etcd-2.ocp4.example.com.
Test the API and wildcard DNS entries.
$ dig @127.0.0.1 api.ocp4.example.com +short
192.168.100.254
$ dig @127.0.0.1 "*.apps.ocp4.example.com" +short
192.168.100.254
$ dig @127.0.0.1 bootstrap.ocp4.example.com +short
192.168.100.10
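To avoid spot-checking records one by one, you can enumerate every name a UPI install resolves and loop dig over the list. The helper below is a sketch using the lab's cluster name and domain; `required_records` is an illustrative function, not part of any official tooling:

```shell
#!/usr/bin/env bash
# Print every DNS name the installer and nodes will resolve, so they can
# all be verified in one loop. Values match the lab used in this guide.
CLUSTER=ocp4
DOMAIN=example.com

required_records() {
  echo "api.${CLUSTER}.${DOMAIN}"
  echo "api-int.${CLUSTER}.${DOMAIN}"
  echo "test.apps.${CLUSTER}.${DOMAIN}"   # any name under *.apps must resolve
  echo "bootstrap.${CLUSTER}.${DOMAIN}"
  for n in master01 master02 master03 worker01 worker02 worker03; do
    echo "${n}.${CLUSTER}.${DOMAIN}"
  done
}

required_records
```

On the bastion you would then run something like `for n in $(required_records); do echo "$n -> $(dig @127.0.0.1 "$n" +short)"; done` and confirm no line comes back empty.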
Point the bastion’s own DNS resolver to itself.
nmcli connection modify enp1s0 ipv4.dns "192.168.100.254"
nmcli connection reload
nmcli connection up enp1s0
Open the required firewall ports on the bastion for all services.
sudo firewall-cmd --add-service={dhcp,tftp,http,https,dns} --permanent
sudo firewall-cmd --reload
Step 6: Setup TFTP and PXE Boot Services
PXE boot allows the OpenShift cluster VMs to network-boot directly into the RHCOS installer without manual ISO mounting. Install TFTP and syslinux packages.
sudo dnf -y install tftp-server syslinux
Enable and start the TFTP service.
sudo systemctl enable --now tftp
Prepare the PXE boot directory structure.
sudo mkdir -p /var/lib/tftpboot/pxelinux.cfg
sudo cp -rvf /usr/share/syslinux/* /var/lib/tftpboot
sudo mkdir -p /var/lib/tftpboot/rhcos
Download the RHCOS live kernel and initramfs files from the RHCOS image mirror. (Recent releases publish live-* artifacts; the older installer-* file names are no longer shipped.)
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/rhcos-live-kernel-x86_64
sudo mv rhcos-live-kernel-x86_64 /var/lib/tftpboot/rhcos/kernel
Download the live initramfs image.
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/rhcos-live-initramfs.x86_64.img
sudo mv rhcos-live-initramfs.x86_64.img /var/lib/tftpboot/rhcos/initramfs.img
Restore SELinux context on the TFTP files.
sudo restorecon -RFv /var/lib/tftpboot/rhcos
Configure Apache httpd for RHCOS Images
Apache serves the rootfs image and ignition files to the nodes during PXE boot. Install httpd.
sudo dnf -y install httpd
Change httpd to listen on port 8080 instead of 80, since HAProxy will use ports 80 and 443. Edit /etc/httpd/conf/httpd.conf and change the Listen directive.
sudo sed -i 's/^Listen 80$/Listen 8080/' /etc/httpd/conf/httpd.conf
Remove the default welcome page and start httpd.
sudo rm -f /etc/httpd/conf.d/welcome.conf
sudo systemctl enable --now httpd
Open port 8080 in the firewall.
sudo firewall-cmd --add-port=8080/tcp --permanent
sudo firewall-cmd --reload
Download the RHCOS rootfs image and place it in the httpd directory.
sudo mkdir -p /var/www/html/rhcos
wget https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/latest/rhcos-live-rootfs.x86_64.img
sudo mv rhcos-live-rootfs.x86_64.img /var/www/html/rhcos/rootfs.img
sudo restorecon -RFv /var/www/html/rhcos
Now run the Ansible playbook to generate the PXE boot configuration files for each node. Each file is named using the node’s MAC address and contains the correct kernel arguments, rootfs URL, and ignition file URL.
$ ansible-playbook tasks/configure_tftp_pxe.yml
PLAY RECAP *************************************************************
localhost : ok=5 changed=4 unreachable=0 failed=0 skipped=0
Verify the PXE config files were created for all nodes.
$ ls -1 /var/lib/tftpboot/pxelinux.cfg/
01-52-54-00-31-4a-39
01-52-54-00-6a-37-32
01-52-54-00-8b-a1-17
01-52-54-00-95-d4-ed
01-52-54-00-a4-db-5f
01-52-54-00-ea-8b-9d
01-52-54-00-f8-87-c7
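Each filename is `01-` (the ARP hardware type for Ethernet) followed by the node's MAC address, lowercased and with colons replaced by dashes. The contents follow this general shape — a representative sketch, not the literal template output; the exact kernel arguments come from the playbook's Jinja2 templates, the URLs assume the bastion at 192.168.100.254:8080, and the ignition file name varies per role (bootstrap.ign, master.ign, or worker.ign):

```
default menu.c32
prompt 0
timeout 30

label rhcos-install
  kernel rhcos/kernel
  append initrd=rhcos/initramfs.img nomodeset rd.neednet=1 \
    coreos.inst.install_dev=/dev/vda \
    coreos.live.rootfs_url=http://192.168.100.254:8080/rhcos/rootfs.img \
    coreos.inst.ignition_url=http://192.168.100.254:8080/ignition/master.ign
```

If a node boots the installer but fails mid-install, these are the first values to double-check against the httpd paths configured above.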
Step 7: Configure HAProxy Load Balancer
HAProxy distributes traffic across the OpenShift API servers (port 6443), machine config server (port 22623), and application routes (ports 80 and 443). Install HAProxy.
sudo dnf install -y haproxy
Allow HAProxy to bind to any port via SELinux.
sudo setsebool -P haproxy_connect_any 1
Back up the default config and run the Ansible playbook to generate the OpenShift-specific HAProxy configuration.
sudo mv /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.default
ansible-playbook tasks/configure_haproxy_lb.yml
Configure SELinux for the custom ports.
sudo semanage port -a -t http_port_t -p tcp 6443
sudo semanage port -a -t http_port_t -p tcp 22623
sudo semanage port -a -t http_port_t -p tcp 32700
Open firewall ports for API and application traffic.
sudo firewall-cmd --add-service={http,https} --permanent
sudo firewall-cmd --add-port={6443,22623}/tcp --permanent
sudo firewall-cmd --reload
Confirm HAProxy is active.
$ systemctl status haproxy
● haproxy.service - HAProxy Load Balancer
Loaded: loaded (/usr/lib/systemd/system/haproxy.service; enabled)
Active: active (running)
Step 8: Install OpenShift Installer and CLI
Download the OpenShift 4.17 client and installer binaries from the official mirror. The stable-4.17 path pins the 4.17 release stream (the latest directory may already point at a newer minor version). Run these on the bastion node.
wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/stable-4.17/openshift-client-linux.tar.gz
tar xvf openshift-client-linux.tar.gz
sudo mv oc kubectl /usr/local/bin
rm -f README.md LICENSE openshift-client-linux.tar.gz
Download and install the openshift-install binary.
wget https://mirror.openshift.com/pub/openshift-v4/clients/ocp/stable-4.17/openshift-install-linux.tar.gz
tar xvf openshift-install-linux.tar.gz
sudo mv openshift-install /usr/local/bin
rm -f README.md LICENSE openshift-install-linux.tar.gz
Verify all three binaries work.
$ openshift-install version
openshift-install 4.17.9
built from commit 15a9f9e5af6dee9caece196f8d20e6b3cd0cc2cc
release image quay.io/openshift-release-dev/ocp-release@sha256:cc7433a6c0a3e698630992bd8d63dbbc25659dfc292d4bfa6882a977010e12a3
$ oc version --client
Client Version: 4.17.9
$ kubectl version --client
Client Version: v1.30.5
Generate an SSH key pair for accessing CoreOS nodes during troubleshooting.
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
Step 9: Generate Ignition Files
Ignition files contain the cluster configuration that RHCOS nodes consume during first boot. You need a pull secret from Red Hat and an install-config.yaml file to generate them.
Download the Pull Secret
Visit console.redhat.com and download your pull secret. Save it on the bastion.
mkdir -p ~/.openshift
vim ~/.openshift/pull-secret
Paste the pull secret JSON content and save the file.
Create install-config.yaml
Create the base installation configuration file. The baseDomain must match your DNS domain, and metadata.name must match the cluster ID configured in DNS.
vim ~/install-config-base.yaml
Add the following content, replacing the pull secret and SSH key with your actual values:
apiVersion: v1
baseDomain: example.com
compute:
- hyperthreading: Enabled
  name: worker
  replicas: 0
controlPlane:
  hyperthreading: Enabled
  name: master
  replicas: 3
metadata:
  name: ocp4
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16
platform:
  none: {}
fips: false
pullSecret: 'PASTE-YOUR-PULL-SECRET-HERE'
sshKey: 'PASTE-YOUR-SSH-PUBLIC-KEY-HERE'
The compute.replicas: 0 setting means workers will be added manually after the initial install. The platform: none tells OpenShift this is a bare-metal/UPI deployment. OVNKubernetes is the default network plugin for OpenShift 4.17.
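Rather than pasting the secrets by hand, the two placeholders can be filled in from files on disk. A sketch — `render_install_config` is an illustrative helper, not part of the official tooling, and the file paths are the ones used elsewhere in this guide:

```shell
#!/usr/bin/env bash
# Substitute the pull secret and SSH public key into the install-config
# template. Uses awk rather than sed because the pull secret JSON contains
# characters (quotes, slashes) that sed replacement strings mishandle.
# Caveat: awk's gsub treats `&` in the replacement specially, which is fine
# for typical pull secrets (JSON of base64 tokens).
render_install_config() {
  local template=$1 pull_secret_file=$2 ssh_key_file=$3
  awk -v ps="$(cat "$pull_secret_file")" -v key="$(cat "$ssh_key_file")" '
    { gsub(/PASTE-YOUR-PULL-SECRET-HERE/, ps)
      gsub(/PASTE-YOUR-SSH-PUBLIC-KEY-HERE/, key)
      print }' "$template"
}

# Example (run after generating the SSH key in Step 8):
# render_install_config ~/install-config-base.yaml \
#   ~/.openshift/pull-secret ~/.ssh/id_rsa.pub > ~/ocp4/install-config.yaml
```

Rendering into ~/ocp4/install-config.yaml this way replaces the manual copy step below while keeping the untouched base template as your backup.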
Create the working directory and copy the config into it. The openshift-install command deletes install-config.yaml during manifest generation, so always keep a backup copy.
mkdir -p ~/ocp4
cp ~/install-config-base.yaml ~/ocp4/install-config.yaml
cd ~/ocp4
Generate the Kubernetes manifests.
$ openshift-install create manifests
INFO Consuming Install Config from target directory
INFO Manifests created in: manifests and openshift
Disable pod scheduling on master nodes so only workers run application workloads.
sed -i 's/mastersSchedulable: true/mastersSchedulable: false/' manifests/cluster-scheduler-02-config.yml
Generate the ignition files.
$ openshift-install create ignition-configs
INFO Consuming Common Manifests from target directory
INFO Consuming OpenShift Install (Manifests) from target directory
INFO Consuming Master Machines from target directory
INFO Consuming Worker Machines from target directory
INFO Consuming Openshift Manifests from target directory
INFO Ignition-Configs created in: . and auth
Verify the files were created.
$ ls ~/ocp4/
auth bootstrap.ign master.ign metadata.json worker.ign
Copy the ignition files to the httpd server directory so nodes can fetch them during PXE boot.
sudo mkdir -p /var/www/html/ignition
sudo cp -v ~/ocp4/*.ign /var/www/html/ignition/
sudo chmod 644 /var/www/html/ignition/*.ign
sudo restorecon -RFv /var/www/html/
Ensure all bastion services are running before creating cluster VMs.
sudo systemctl enable --now haproxy dhcpd httpd tftp named
sudo systemctl restart haproxy dhcpd httpd tftp named
Step 10: Create Bootstrap, Master, and Worker VMs
All commands in this step run on the KVM hypervisor host, not the bastion VM. Each VM PXE boots from the network, pulls the correct ignition file based on its MAC address, installs RHCOS, and reboots.
Bootstrap Node
Create the bootstrap VM first. It initializes the etcd cluster and control plane components.
sudo virt-install -n bootstrap.ocp4.example.com \
--description "Bootstrap Machine for OpenShift 4 Cluster" \
--ram=16384 \
--vcpus=4 \
--os-variant=rhel9-unknown \
--noreboot \
--disk pool=default,bus=virtio,size=120 \
--graphics none \
--serial pty \
--console pty \
--pxe \
--network bridge=openshift4,mac=52:54:00:a4:db:5f
Once the PXE install completes and the VM shuts down, start it.
sudo virsh start bootstrap.ocp4.example.com
Master Nodes
Create all three master (control plane) nodes.
sudo virt-install -n master01.ocp4.example.com \
--description "Master01 for OpenShift 4 Cluster" \
--ram=16384 \
--vcpus=8 \
--os-variant=rhel9-unknown \
--noreboot \
--disk pool=default,bus=virtio,size=120 \
--graphics none \
--serial pty \
--console pty \
--pxe \
--network bridge=openshift4,mac=52:54:00:8b:a1:17
Repeat for master02 and master03 with their respective MAC addresses.
sudo virt-install -n master02.ocp4.example.com \
--description "Master02 for OpenShift 4 Cluster" \
--ram=16384 \
--vcpus=8 \
--os-variant=rhel9-unknown \
--noreboot \
--disk pool=default,bus=virtio,size=120 \
--graphics none \
--serial pty \
--console pty \
--pxe \
--network bridge=openshift4,mac=52:54:00:ea:8b:9d
Create master03.
sudo virt-install -n master03.ocp4.example.com \
--description "Master03 for OpenShift 4 Cluster" \
--ram=16384 \
--vcpus=8 \
--os-variant=rhel9-unknown \
--noreboot \
--disk pool=default,bus=virtio,size=120 \
--graphics none \
--serial pty \
--console pty \
--pxe \
--network bridge=openshift4,mac=52:54:00:f8:87:c7
Start all master nodes after RHCOS installation completes.
sudo virsh start master01.ocp4.example.com
sudo virsh start master02.ocp4.example.com
sudo virsh start master03.ocp4.example.com
Worker Nodes
Create the worker nodes using a similar process.
for i in 01 02 03; do
case $i in
01) MAC="52:54:00:31:4a:39" ;;
02) MAC="52:54:00:6a:37:32" ;;
03) MAC="52:54:00:95:d4:ed" ;;
esac
sudo virt-install -n worker${i}.ocp4.example.com \
--description "Worker${i} for OpenShift 4 Cluster" \
--ram=8192 \
--vcpus=4 \
--os-variant=rhel9-unknown \
--noreboot \
--disk pool=default,bus=virtio,size=120 \
--graphics none \
--serial pty \
--console pty \
--pxe \
--network bridge=openshift4,mac=$MAC
done
Start the worker nodes after installation completes.
sudo virsh start worker01.ocp4.example.com
sudo virsh start worker02.ocp4.example.com
sudo virsh start worker03.ocp4.example.com
You can check PXE boot and DHCP logs on the bastion to troubleshoot any issues.
journalctl -f -u tftp
journalctl -f -u dhcpd
Step 11: Monitor Installation Progress
Back on the bastion node, monitor the bootstrap process. This step waits for the bootstrap node to bring up the temporary control plane and hand off to the master nodes.
cd ~/ocp4
openshift-install wait-for bootstrap-complete --log-level=info
Expected output when bootstrap succeeds:
INFO Waiting up to 20m0s for the Kubernetes API at https://api.ocp4.example.com:6443...
INFO API v1.30.5+5765b01 up
INFO Waiting up to 30m0s for bootstrapping to complete...
INFO It is now safe to remove the bootstrap resources
INFO Time elapsed: 15m22s
Once you see the “safe to remove bootstrap” message, shut down and delete the bootstrap VM from the hypervisor to free resources.
sudo virsh destroy bootstrap.ocp4.example.com
sudo virsh undefine bootstrap.ocp4.example.com --remove-all-storage
Also remove the bootstrap backend from the HAProxy configuration on the bastion. Edit /etc/haproxy/haproxy.cfg and comment out or remove the bootstrap server lines from the api-server and machine-config-server backends. Then restart HAProxy.
sudo systemctl restart haproxy
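The edit can also be scripted. A sketch using sed, assuming the generated config names the bootstrap backend entries `server bootstrap ...` as the helper templates in this guide do:

```shell
#!/usr/bin/env bash
# Comment out every HAProxy backend line that references the bootstrap node.
# remove_bootstrap_backend prints the modified config without touching the file,
# so you can preview the change before applying it in place.
remove_bootstrap_backend() {
  sed -E 's/^([[:space:]]*server[[:space:]]+bootstrap[[:space:]].*)/# \1/' "$1"
}

# Preview:
# remove_bootstrap_backend /etc/haproxy/haproxy.cfg | diff /etc/haproxy/haproxy.cfg - || true
# Apply in place, then restart HAProxy as shown above:
# sudo sed -E -i 's/^([[:space:]]*server[[:space:]]+bootstrap[[:space:]].*)/# \1/' /etc/haproxy/haproxy.cfg
```

Commenting rather than deleting keeps the lines handy in case you ever redeploy the bootstrap node.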
Approve Worker CSRs
Worker nodes need their certificate signing requests (CSRs) approved before they join the cluster. Export the kubeconfig and check for pending CSRs.
export KUBECONFIG=~/ocp4/auth/kubeconfig
List pending CSRs.
$ oc get csr | grep Pending
csr-abc12 3m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending
csr-def34 3m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending
csr-ghi56 3m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending
Approve all pending CSRs. You may need to run this command twice as a second set of CSRs appears after the first batch is approved.
oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
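The go-template above selects every CSR whose .status is empty, i.e. the ones still pending. If the template syntax is hard to audit, an equivalent selection can be done with plain awk on the tabular output — a sketch, with the live-cluster usage shown as a comment since it needs oc:

```shell
#!/usr/bin/env bash
# Read `oc get csr --no-headers`-style lines on stdin and print the name
# column of every CSR whose CONDITION (last field) is Pending.
pending_csrs() {
  awk '$NF == "Pending" {print $1}'
}

# Usage against a live cluster:
# oc get csr --no-headers | pending_csrs | xargs --no-run-if-empty oc adm certificate approve
```

`xargs --no-run-if-empty` avoids calling oc with no arguments once everything has been approved.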
Wait for the full installation to complete.
$ openshift-install wait-for install-complete --log-level=info
INFO Waiting up to 40m0s for the cluster at https://api.ocp4.example.com:6443 to initialize...
INFO Waiting up to 10m0s for the openshift-console route to be created...
INFO Install complete!
INFO To access the cluster as the system:admin user when using 'oc', run
export KUBECONFIG=/root/ocp4/auth/kubeconfig
INFO Access the OpenShift web console here: https://console-openshift-console.apps.ocp4.example.com
INFO Login to the console with user: "kubeadmin", password: "AbCdE-FgHiJ-KlMnO-PqRsT"
INFO Time elapsed: 28m15s
Verify all nodes are in Ready state.
$ oc get nodes
NAME STATUS ROLES AGE VERSION
master01.ocp4.example.com Ready control-plane 30m v1.30.5+5765b01
master02.ocp4.example.com Ready control-plane 30m v1.30.5+5765b01
master03.ocp4.example.com Ready control-plane 30m v1.30.5+5765b01
worker01.ocp4.example.com Ready worker 10m v1.30.5+5765b01
worker02.ocp4.example.com Ready worker 10m v1.30.5+5765b01
worker03.ocp4.example.com Ready worker 10m v1.30.5+5765b01
Check that all cluster operators are available.
$ oc get clusteroperators
NAME VERSION AVAILABLE PROGRESSING DEGRADED
authentication 4.17.9 True False False
cloud-credential 4.17.9 True False False
cluster-autoscaler 4.17.9 True False False
config-operator 4.17.9 True False False
console 4.17.9 True False False
dns 4.17.9 True False False
etcd 4.17.9 True False False
...
Step 12: Access the OpenShift Web Console
The OpenShift web console is accessible at https://console-openshift-console.apps.ocp4.example.com. Since this is a lab environment, you need to resolve this DNS name to the bastion IP (192.168.100.254) from your workstation. Add these entries to your workstation’s /etc/hosts file or configure your workstation to use the bastion as its DNS server.
192.168.100.254 console-openshift-console.apps.ocp4.example.com
192.168.100.254 oauth-openshift.apps.ocp4.example.com
192.168.100.254 api.ocp4.example.com
Log in using the kubeadmin credentials displayed at the end of the installation. The password is stored in ~/ocp4/auth/kubeadmin-password on the bastion.
cat ~/ocp4/auth/kubeadmin-password
You can also log in from the command line.
oc login -u kubeadmin -p $(cat ~/ocp4/auth/kubeadmin-password) https://api.ocp4.example.com:6443
Step 13: Add Additional Worker Nodes
To add more worker nodes after the initial deployment, generate a new MAC address for the node, add it to the Ansible variables, regenerate the DHCP and DNS configs, and create a new PXE boot entry. Then create the VM using virt-install with PXE boot and approve the pending CSR.
Update the vars/main.yml file with the new worker details.
vim ~/ocp4_ansible/vars/main.yml
Add a new entry under the workers section, for example:
  - name: "worker04"
    ipaddr: "192.168.100.24"
    macaddr: "52:54:00:aa:bb:cc"
Re-run the Ansible playbooks to update DHCP, DNS, and PXE.
cd ~/ocp4_ansible
ansible-playbook tasks/configure_dhcpd.yml
ansible-playbook tasks/configure_bind_dns.yml
ansible-playbook tasks/configure_tftp_pxe.yml
Create the new worker VM on the hypervisor and approve its CSR after it boots.
oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
Step 14: Setup Persistent Storage with NFS
OpenShift needs persistent storage for the internal image registry and stateful workloads. For a lab setup, a simple NFS export is a quick option. Configure the export on the bastion or a dedicated storage node.
Install the NFS server on the bastion.
sudo dnf install -y nfs-utils
sudo systemctl enable --now nfs-server
Create the export directory and set permissions.
sudo mkdir -p /srv/nfs/ocp-registry
sudo chmod 777 /srv/nfs/ocp-registry
Add the export to /etc/exports.
echo "/srv/nfs/ocp-registry 192.168.100.0/24(rw,sync,no_root_squash,no_subtree_check)" | sudo tee -a /etc/exports
Export the shares and open the firewall.
sudo exportfs -rav
sudo firewall-cmd --add-service=nfs --permanent
sudo firewall-cmd --reload
Create a PersistentVolume in OpenShift for the image registry.
oc apply -f - <<YAML
apiVersion: v1
kind: PersistentVolume
metadata:
  name: image-registry-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
  - ReadWriteMany
  nfs:
    path: /srv/nfs/ocp-registry
    server: 192.168.100.254
  persistentVolumeReclaimPolicy: Retain
YAML
Create the PersistentVolumeClaim and patch the image registry operator to use it.
oc apply -f - <<YAML
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: image-registry-storage
  namespace: openshift-image-registry
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 100Gi
YAML
Patch the registry operator to use the PVC and set it to Managed.
oc patch configs.imageregistry.operator.openshift.io cluster \
--type merge --patch '{"spec":{"storage":{"pvc":{"claim":"image-registry-storage"}},"managementState":"Managed"}}'
Verify the registry pods are running.
$ oc get pods -n openshift-image-registry
NAME READY STATUS RESTARTS AGE
image-registry-5c8f7b6d4f-x2k9j 1/1 Running 0 2m
Day-2 Operations Overview
Once the cluster is running, there are several day-2 tasks to handle for a production-ready environment:
- Identity provider – Replace kubeadmin with an identity provider such as LDAP, HTPasswd, or OpenID Connect. Run oc delete secret kubeadmin -n kube-system after configuring an alternative admin user
- Cluster certificates – Replace the default self-signed certificates with trusted TLS certificates for the API and wildcard ingress routes
- Monitoring and alerting – OpenShift ships with Prometheus and Alertmanager. Configure persistent storage for monitoring data and set up alert receivers
- Cluster logging – Deploy the OpenShift Logging Operator (based on Loki or Elasticsearch) to collect and store container logs
- Node scaling – Add or remove worker nodes as workload demand changes. In a KVM environment, this means creating or destroying VMs and approving CSRs
- Cluster updates – Use the OpenShift web console or oc adm upgrade to apply platform updates. Test upgrades in a staging environment first
- Backup and disaster recovery – Back up etcd regularly using etcdctl snapshot save from a master node. Store backups off-cluster
Conclusion
You now have a fully functional OpenShift 4.17 cluster running on KVM/libvirt with three control plane nodes and three worker nodes. The bastion VM provides all supporting infrastructure – DNS, DHCP, load balancing, and PXE boot services. For production use, replace the single HAProxy instance with a highly available load balancer, use dedicated DNS infrastructure, add proper TLS certificates, and configure an external storage solution like Ceph or GlusterFS for persistent storage.
Related Guides
- Red Hat OpenShift 4 New Features
- How To Install Harbor Registry on Kubernetes / OpenShift
- Install Project Quay Registry on OpenShift With Operator
- Configure Master / Slave BIND DNS on CentOS 8 / RHEL 8
- Prevent Users from Creating Projects in OpenShift / OKD Cluster
Hi Josphat,
First, thank you for the tutorial, I learned a lot. I'm actually having an issue with DNS. After I ran the Ansible playbooks, I got stuck on this error:
[root@fedora ocp4_ansible]# nmcli connection up enp1s0
Error: Connection activation failed: IP configuration could not be reserved (no available address, timeout, etc.)
Hint: use 'journalctl -xe NM_CONNECTION=3771b159-c3a1-4e68-9b93-ea8150ee0ec5 + NM_DEVICE=enp1s0' to get more details.
Is there any step I need to do on the bastion server or on my main Fedora server?
I appreciate your support on this.
Hi,
This seems to be an issue with your DHCP server assigning addresses. Please validate the configs on the DHCP server – the address pool and whether it is working.
I am using the same subnet as yours, so I'm really not sure how to get it working; I've been trying for 4 days with this issue. I appreciate your support, Josphat.
----------------------------------------
[root@fedora etc]# systemctl status named
● named.service - Berkeley Internet Name Domain (DNS)
Loaded: loaded (/usr/lib/systemd/system/named.service; enabled; vendor pre>
Active: active (running) since Fri 2021-12-24 22:20:30 EST; 9h ago
Process: 2903 ExecStartPre=/bin/bash -c if [ ! "$DISABLE_ZONE_CHECKING" == >
Process: 2905 ExecStart=/usr/sbin/named -u named -c ${NAMEDCONF} $OPTIONS (>
Main PID: 2906 (named)
Tasks: 8 (limit: 4668)
Memory: 24.6M
CPU: 163ms
CGroup: /system.slice/named.service
└─2906 /usr/sbin/named -u named -c /etc/named.conf
Dec 24 22:20:30 fedora named[2906]: zone ocp4.example.com/IN: loaded serial 202>
Dec 24 22:20:30 fedora named[2906]: zone localhost.localdomain/IN: loaded seria>
Dec 24 22:20:30 fedora named[2906]: all zones loaded
Dec 24 22:20:30 fedora systemd[1]: Started Berkeley Internet Name Domain (DNS).
Dec 24 22:20:30 fedora named[2906]: running
Dec 24 22:20:30 fedora named[2906]: managed-keys-zone: Key 20326 for zone . is >
Dec 24 22:31:38 fedora named[2906]: no longer listening on 192.168.100.254#53
Dec 24 22:32:24 fedora named[2906]: listening on IPv4 interface enp1s0, 192.168>
Dec 24 22:35:46 fedora named[2906]: no longer listening on 192.168.100.254#53
Dec 24 22:36:32 fedora named[2906]: listening on IPv4 interface enp1s0, 192.168
----------------------------------------
[root@fedora dhcp]# cat dhcpd.conf
authoritative;
ddns-update-style interim;
default-lease-time 14400;
max-lease-time 14400;
allow booting;
allow bootp;
option routers 192.168.100.1;
option broadcast-address 192.168.100.255;
option subnet-mask 255.255.255.0;
option domain-name-servers 192.168.100.254;
option ntp-servers time.google.com;
option domain-name "ocp4.example.com";
option domain-search "ocp4.example.com", "example.com";
subnet 192.168.100.0 netmask 255.255.255.0 {
interface enp1s0;
pool {
range 192.168.100.10 192.168.100.50;
# Static entries
host bootstrap { hardware ethernet 52:54:00:a4:db:5f; fixed-address 192.168.100.10; }
host master01 { hardware ethernet 52:54:00:8b:a1:17; fixed-address 192.168.100.11; }
host master02 { hardware ethernet 52:54:00:ea:8b:9d; fixed-address 192.168.100.12; }
host master03 { hardware ethernet 52:54:00:f8:87:c7; fixed-address 192.168.100.13; }
host worker01 { hardware ethernet 52:54:00:31:4a:39; fixed-address 192.168.100.21; }
host worker02 { hardware ethernet 52:54:00:6a:37:32; fixed-address 192.168.100.22; }
host worker03 { hardware ethernet 52:54:00:95:d4:ed; fixed-address 192.168.100.23; }
# this will not give out addresses to hosts not listed above
deny unknown-clients;
# this is PXE specific
filename "pxelinux.0";
next-server 192.168.100.254;
}
}
I have a problem with the bastion/helper MAC address:
[root@fedora ocp4_ansible]# systemctl status dhcpd
● dhcpd.service – DHCPv4 Server Daemon
Loaded: loaded (/usr/lib/systemd/system/dhcpd.service; enabled; vendor pre>
Active: active (running) since Sat 2021-12-25 08:56:12 EST; 3min 37s ago
Docs: man:dhcpd(8)
man:dhcpd.conf(5)
Main PID: 4423 (dhcpd)
Status: “Dispatching packets…”
Tasks: 1 (limit: 4668)
Memory: 9.7M
CPU: 9ms
CGroup: /system.slice/dhcpd.service
└─4423 /usr/sbin/dhcpd -f -cf /etc/dhcp/dhcpd.conf -user dhcpd -gr>
Dec 25 08:56:12 fedora dhcpd[4423]: Listening on LPF/enp1s0/52:54:00:bd:0e:99/1>
Dec 25 08:56:12 fedora dhcpd[4423]: Sending on LPF/enp1s0/52:54:00:bd:0e:99/1>
Dec 25 08:56:12 fedora dhcpd[4423]: Sending on Socket/fallback/fallback-net
Dec 25 08:56:12 fedora dhcpd[4423]: Server starting service.
Dec 25 08:56:12 fedora systemd[1]: Started DHCPv4 Server Daemon.
Dec 25 08:58:48 fedora dhcpd[4423]: DHCPDISCOVER from 52:54:00:bd:0e:99 via enp>
Dec 25 08:58:50 fedora dhcpd[4423]: DHCPDISCOVER from 52:54:00:bd:0e:99 via enp>
Dec 25 08:58:55 fedora dhcpd[4423]: DHCPDISCOVER from 52:54:00:bd:0e:99 via enp>
Dec 25 08:59:03 fedora dhcpd[4423]: DHCPDISCOVER from 52:54:00:bd:0e:99 via enp>
Dec 25 08:59:20 fedora dhcpd[4423]: DHCPDISCOVER from 52:54:00:bd:0e:99 via enp>
You know what, I will do it manually through the virt-manager UI. I like the way you did it through the CLI, but it did not work for me 🙂
Yes, it worked with virt-manager through the UI, which is excellent for managing the images. Thank you a lot for this amazing deep dive into OpenShift and preparing the prerequisites.
I don’t see a response from writer on how you fixed this. I am having same issue.
After installing OpenShift 4.9 with your how-to, I cannot reach the web console. When checking the logs for the console pod:
E0128 13:24:18.253509 1 auth.go:231] error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.ocp4.lagrange.com/oauth/token failed: Head "https://oauth-openshift.apps.ocp4.lagrange.com": EOF
E0128 13:24:28.293887 1 auth.go:231] error contacting auth provider (retrying in 10s): request to OAuth issuer endpoint https://oauth-openshift.apps.ocp4.lagrange.com/oauth/token failed: Head "https://oauth-openshift.apps.ocp4.lagrange.com": EOF
What is wrong?
I’m running into an error with the workers and masters timing out trying to get to the ignition scripts; there’s a timeout trying to reach https://api-int, which is listening. I created a self-signed cert and added that to the haproxy config, but am still seeing the tcp dial io timeout.
Any ideas?
I’d submitted a comment about having a timeout / possible ssl issue, but that was entirely my networking. Working now. This is an awesome step-by-step, and it has given me a lot of valuable stick time on this openshift nonsense. I genuinely appreciate what you’ve done here. Thanks very much!
Thank you Jason.
Hi Jason,
Can you please let me know how your issue was resolved? I have faced the same timeout issue in my environment.
[root@api ~]# oc get co authentication
NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
authentication 4.9.19 False False True 7m7s OAuthServerRouteEndpointAccessibleControllerAvailable: route.route.openshift.io "oauth-openshift" not found
OAuthServerServiceEndpointAccessibleControllerAvailable: Get "https://172.30.21.224:443/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
OAuthServerServiceEndpointsEndpointAccessibleControllerAvailable: endpoints "oauth-openshift" not found
ReadyIngressNodesAvailable: Authentication requires functional ingress which requires at least one schedulable and ready node. Got 0 worker nodes, 1 master nodes, 0 custom target nodes (none are schedulable or ready for ingress pods).
WellKnownAvailable: The well-known endpoint is not yet available: failed to get oauth metadata from openshift-config-managed/oauth-openshift ConfigMap: configmap "oauth-openshift" not found (check authentication operator, it is supposed to create this)
any idea?
Do you have an active worker node in your cluster?
Hi, Josphat,
Thank you for your very useful blog.
In section: Download Pull Secret
You mention:
Visit cloud.redhat.com and select "Bare Metal" then "UPI".
I have no idea how to get the pull secret from this description. There is no "Bare Metal" option on the cloud.redhat.com page.
Could you help with that?
Thanks
Hi Ken. Use the link https://console.redhat.com/openshift/install/pull-secret
Hi Josphat,
I’ve followed your guide to the letter. It’s been incredibly educational and I learned a lot on how OCP operates. What I’m completely stuck on is accessing the Openshift console. Is there a particular forwarding piece I’m missing here? How is everyone else doing it?
I’ve been running this on KVM on RHEL8.x. How can I access the openshift console externally?
Thanks
We’re delighted that you at least got the setup up and running. Access to the console is through the HAProxy LB and the mapped DNS name.
You can get the console address from the cluster. If your DNS is working OK, you should be able to access the OpenShift Console.
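For reference, the console URL can be read from a working cluster with `oc`; a minimal example, assuming a valid kubeconfig:

```shell
# Print the web console URL
oc whoami --show-console

# Or read the console route directly
oc get route console -n openshift-console -o jsonpath='{.spec.host}{"\n"}'
```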
I’m still a bit lost here. If I’m logged into the bastion vm, all this works fine. OC commands work, etc. I can even login to the cluster from the KVM host itself. What I can’t seem to understand is how to get to the openshift console from outside the KVM host. The KVM host lives in a datacenter and is just running RHEL 8 server with no desktop. HAProxy works fine on the KVM host as it can get to all the services on the cluster running internally.
DNS is working inside the cluster. The nodes can all resolve each other no problem. But the ip range is private inside.
I feel like I’m missing a step here. I reviewed and followed your directions above multiple times, but sitting here at my laptop, I cannot open the console. Does the DNS need to be resolvable outside the KVM host as well? Are there ports that need to be forwarded on the KVM host?
I had the same issue. It is possible the tutorial assumes the hypervisor host has a gui and just dropped to the terminal?
In any case, I spun up a VM with a GUI with both openshift and default networks and am finally at the web console.
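For anyone in the same situation, one possible approach (not covered in the article, so treat it as a sketch) is to tunnel the lab subnet from your laptop through the KVM host and resolve the `*.apps` names against the bastion's HAProxy address:

```shell
# On the laptop: route the lab network through the KVM host over SSH.
# "user@kvm-host" is a placeholder for your own hypervisor login.
sshuttle -r user@kvm-host 192.168.100.0/24

# Then map the console and OAuth hostnames to the bastion in /etc/hosts:
# 192.168.100.254 console-openshift-console.apps.ocp4.example.com
# 192.168.100.254 oauth-openshift.apps.ocp4.example.com
```

With the tunnel up and names resolving to the bastion, the browser traffic lands on HAProxy, which forwards it to the ingress pods.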
Hi Josphat, amazing work. Works like a charm. I knew Kubernetes and all, but for OCP installation I could not find clear instructions on how to install. Your guide was the guiding light for me. Thanks a ton. I would like to buy you 10 coffees. Please DM me.
The TFTP files have changed to the names below; the wget commands don’t work anymore. Great blog though, Josphat:
rhcos-live-initramfs.x86_64.img
rhcos-live-kernel-x86_64
rhcos-live-rootfs.x86_64.img
The files are available here: https://console.redhat.com/openshift/install/metal/user-provisioned
Thanks for the tip. We’ve updated the guide accordingly.
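For reference, fetching the renamed artifacts might look like the sketch below; the mirror path and version directory are assumptions, so confirm the current layout on mirror.openshift.com before copying:

```shell
# Hypothetical example: download the RHCOS live PXE artifacts
BASE=https://mirror.openshift.com/pub/openshift-v4/x86_64/dependencies/rhcos/latest
wget ${BASE}/rhcos-live-kernel-x86_64
wget ${BASE}/rhcos-live-initramfs.x86_64.img
wget ${BASE}/rhcos-live-rootfs.x86_64.img
```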
I installed this, but when I try to validate things with `oc whoami`, I get the following error. DNS seems to be picking it up, but I’m not sure where else to look (newbie):
oc whoami
Unable to connect to the server: dial tcp: lookup api.ocp4.nsidedabox.local: no such host
[phillip@fedora ~]$ nslookup api.ocp4.nsidedabox.local
Server: 192.168.4.5
Address: 192.168.4.5#53
Name: api.ocp4.nsidedabox.local
Address: 192.168.100.254
While I have deployed well over 100 OCP clusters, starting at 4.2 with the latest at 4.11.0, I now need to build an OCP 4.10 cluster on KVM, and that is a platform Red Hat only has documentation for with LinuxONE. So your guide is very valuable and much appreciated. I am, though, curious whether you have a similar guide using RHEL or CentOS 8 worker nodes in place of RHCOS? My next demo cluster needs to have RHEL or CentOS worker nodes, and I’d like to run on KVM if possible. Thanks
At the moment we don’t have one. We’ll play with it and hopefully write about it. Thank you.
My cluster works; however, when I try to log in using "oc login -u ", I get the error x509: certificate signed by unknown authority every time.
What can I do to mitigate this?
Which browser?, Did you try Firefox?
It’s not in the browser; it’s on the command line when I try
oc login -u …….
When I create the Bootstrap Virtual Machine using command below:
sudo virt-install -n bootstrap.ocp4.example.com \
--description "Bootstrap Machine for Openshift 4 Cluster" \
--ram=8192 \
--vcpus=4 \
--os-type=Linux \
--os-variant=rhel8.0 \
--noreboot \
--disk pool=default,bus=virtio,size=50 \
--graphics none \
--serial pty \
--console pty \
--pxe \
--network bridge=openshift4,mac=52:54:00:a4:db:5f
It gets stuck as below:
[** ] A start job is running for Acquire …ootfs image (51min 53s / no limit)[ 3118.047630] coreos-livepxe-rootfs[865]: curl: (7) Failed to connect to 192.168.100.254 port 8080: Connection timed out
[ 3118.052072] coreos-livepxe-rootfs[805]: Couldn’t establish connectivity with the server specified by:
[ 3118.055647] coreos-livepxe-rootfs[805]: coreos.live.rootfs_url=http://192.168.100.254:8080/rhcos/rootfs.img
[ 3118.059331] coreos-livepxe-rootfs[805]: Retrying in 5s…
[ * coreos-livepxe-rootfs[867]: curl: (7) Failed to connect to 192.168.100.254 port 8080: Connection timed out
Can you shed some light on this?
Hi Josphat Mutai,
Can you advise on my problem when firing up the master node:
GET error: Get "https://api-int.ocp4.example.com:22623/config/master"
I can telnet to the port successfully from the bastion host, so it means the service is working fine on port 22623. But when I try to access the specific URL using curl, this is the error I get:
OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to api-int.ocp4.example.com:22623
curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to api-int.ocp4.example.com:22623
please advise.
Arry, can you please tell me how were you able to resolve the issue?
I don’t have many resources on my machine. Is it possible to install only one master and one worker node, and will it still work?
When I tried creating bootstrap machine in virsh
sudo virt-install -n bootstrap.ocp4.example.com --description "Bootstrap Machine for Openshift 4 Cluster" --ram=4096 --vcpus=4 --os-type=Linux --os-variant=rhel8.4 --noreboot --disk pool=default,bus=virtio,size=50 --graphics none --serial pty --console pty --pxe --network bridge=openshift4,mac=52:54:00:a4:db:5f
The issue I got is,
[FAILED] Failed to start Rebuild Journal Catalog.
See ‘systemctl status systemd-journal-catalog-update.service’ for details.
:
:
:
[ OK ] Stopped Update is Completed.
[ OK ] Stopped Rebuild Hardware Database.
[ OK ] Stopped Rebuild Dynamic Linker Cache.
[ OK ] Stopped CoreOS: Set printk To Level 4 (warn).
[ OK ] Stopped target Local Encrypted Volumes.
[ OK ] Stopped target Local Encrypted Volumes (Pre).
[ OK ] Stopped Dispatch Password Requests to Console Directory Watch.
[ OK ] Stopped Load Kernel Modules.
[ OK ] Stopped Load/Save Random Seed.
[ OK ] Stopped Update UTMP about System Boot/Shutdown.
Unmounting var.mount…
[ OK ] Stopped Create Volatile Files and Directories.
[ OK ] Stopped target Local File Systems.
Unmounting Temporary Directory (/tmp)…
Unmounting /run/ephemeral…
Unmounting /etc…
[FAILED] Failed unmounting var.mount.
[FAILED] Failed unmounting /etc.
[ OK ] Unmounted Temporary Directory (/tmp).
[ OK ] Stopped target Swap.
[ OK ] Unmounted /run/ephemeral.
[ OK ] Reached target Unmount All Filesystems.
[ OK ] Stopped target Local File Systems (Pre).
[ OK ] Stopped Create Static Device Nodes in /dev.
[ OK ] Stopped Create System Users.
Stopping Monitoring of LVM2 mirrors…ng dmeventd or progress polling…
[ OK ] Stopped Monitoring of LVM2 mirrors,…sing dmeventd or progress polling.
[ OK ] Reached target Shutdown.
[ OK ] Reached target Final Step.
[ OK ] Started Reboot.
[ OK ] Reached target Reboot.
[ 68.354526] watchdog: watchdog0: watchdog did not stop!
[ 68.458591] reboot: Restarting system
Why am I getting these errors?
[FAILED] Failed to start Rebuild Journal Catalog.
[FAILED] Failed unmounting var.mount.
[FAILED] Failed unmounting /etc.
Same here, and it then prevents the automation of the whole cluster. I should mention that I use nested virtualization.
Hello Josphat,
Thanks for the step-by-step installation guide, which is well defined!
Could you please tell me which steps I need to follow in order to add an additional worker after the cluster is initiated? Is there any way to use the existing YAML files to add a worker to the existing cluster?
Thank you,
Leo
Similar steps to bootstrapping the initial nodes, as covered in the article.
Josphat,
Great article. I have been trying for 2 months to do this. It’s fairly complex. I now have a running cluster and I’m starting to play with it. I used Fedora 38 without any problems. I used NetworkManager, so I had to change the bastion network setup. Everything else ran pretty much without a hitch. Thank you for doing this.
Brian
Nice! Happy for you, Brian, that you managed to bring up the cluster.
Hi Josphat,
Your post was very helpful for us in changing our existing OCP installation tool. We used an old "helpernodectl" binary, and OCP 4.13 did not get installed with that tool.
With your blog we were able to remove the helpernodectl binary tool and implement all the steps with Ansible. We now have one Ansible playbook that runs on KVM and VMware.
I have only one open question about the step with the TFTP service.
Why do we need to create this "start-tftp.sh" script?
What will happen if we don’t have this script?
Could you give us a short explanation?
Great job! Thank you
Creating start-tftp.sh is not a must.
The current openshift-install from that wget is 4.13.9, and when I try to create manifests it is missing the manifests/04-openshift-machine-config-operator.yaml file. So does the procedure still apply to versions later than 4.13.4, which is used in the above example? I did continue and built the bootstrap without error, but it cannot start kubelet.service, and 'get pods' fails with:
"E0821 11:22:35.502291 6321 memcache.go:238] couldn't get current server API group list: Get "https://api.ocp4.example.com:6443/api?timeout=32s": EOF
Unable to connect to the server: EOF"
I can see the port 6443 is in LISTEN mode so wondering if something changed? Any help please?
We’ve validated the article against the latest release of OpenShift and it works.
Hi. Excellent insight. Thank you!
Has anyone ever stumbled on the error below? The bootstrapper seems to be working fine, but it seems none of the masters are able to join. The lab is running on Fedora 38 on a PowerEdge with 256 GB RAM.
Aug 30 18:30:02 master02.ocp4.example.com kubenswrapper[2338]: E0830 18:30:02.176004 2338 kubelet.go:2495] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: No CNI configuration file in /etc/kubernetes/cni/net.d/. Has your network provider started?"
I have the same issue; I would appreciate help with this one, please.
It seems the deployment is not yet complete. How fast is your internet? Also, SSH into one of the control plane nodes and run `journalctl -f` to see what’s happening.
Have you resolved this? I have the same problem.
I think this error is just a result of some previous errors. The operator controllers are not fully coming up. The internet is fast. So is the mem/CPU power. Will give it some more shots…
The `openshift-install` command is not showing a KVM platform; any idea how to choose the platform?
$ openshift-install create ignition-configs
? SSH Public Key /root/.ssh/id_rsa.pub
? Platform [Use arrows to move, type to filter, ? for more help]
gcp
ibmcloud
nutanix
openstack
> ovirt
powervs
vsphere
Thank you for the helpful article. By using it, I was able to successfully stand up an OpenShift 4.17 libvirt VM cluster. However, I had to modify the `virt-install` command for creating the control nodes. The command uses the argument `--ram=8192`; however, this does not match the "preferred" amount of memory (16 GB) specified at the beginning of the article.
I was having a problem during the installation where my worker nodes were not coming up. When I checked resources on the control node, output showed that memory use was close to 100% when trying to bring up worker nodes.
```
oc describe node master01.ocp4.local
```
```
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests      Limits
  --------           --------      ------
  cpu                1306m (37%)   10m (0%)
  memory             6537Mi (95%)  0 (0%)
  ephemeral-storage  0 (0%)        0 (0%)
  hugepages-1Gi      0 (0%)        0 (0%)
  hugepages-2Mi      0 (0%)        0 (0%)
```
When I destroyed and undefined the control node and created it with `--ram=16384`, the three worker nodes successfully came up. Checking the control node’s resources later, more memory than the originally allocated 8 GB is needed:
```
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests      Limits
  --------           --------      ------
  cpu                2236m (29%)   10m (0%)
  memory             9501Mi (63%)  0 (0%)
  ephemeral-storage  0 (0%)        0 (0%)
  hugepages-1Gi      0 (0%)        0 (0%)
  hugepages-2Mi      0 (0%)        0 (0%)
```
Hmm, so the setup was only successful with 16 GB?
For me, yes.