Containerization has seen widespread adoption in recent years, largely because most businesses around the world have opted to modernize their applications for the cloud. This shift has brought tools like Docker, Podman, and Kubernetes into play.

Kubernetes is a free and open-source container orchestration tool that has been widely adopted over the past decade. It allows users to run applications across a cluster of hosts while offering benefits such as autoscaling, automatic bin packing, self-healing, service discovery, storage orchestration, automated rollouts/rollbacks, and load balancing.

Why OKD?

Have you ever wondered if there is an open-source version of OpenShift? Yes, there is. OKD, also referred to as Origin, is the upstream and community-supported version of the Red Hat OpenShift Container Platform (OCP). It is optimized for continuous application development and serves as the upstream code base upon which Red Hat OpenShift Online and Red Hat OpenShift Container Platform are built. OKD adds a number of developer- and operations-centric tools on top of Kubernetes to enable rapid application development, easy deployment and scaling, and long-term lifecycle maintenance for small and large teams.

The amazing features associated with OKD are:

  • Free and open-source: all contributions to the project are accepted. OKD uses the Apache 2 license and does not require any contributor agreement to submit patches.
  • Monitoring and alerting: with OKD, you can efficiently monitor systems and services using tools such as Prometheus and Grafana.
  • LDAP/Active Directory authentication: you can authenticate to the services in your cluster using LDAP/Active Directory users.
  • Easy-to-use web interface: OKD provides an easy-to-use web console that gives an overview of the resources in your cluster. You can also manage those resources from this dashboard.

In this guide, we will learn how to set up a Multi-Node OKD 4 Cluster.

Lab Environment setup

In this guide, we will have 6 nodes provisioned as shown:

Task                          OS                Hostname               IP Address       Memory   vCPU   Storage
Manager Node (DNS, HAProxy)   CentOS Stream 8   apps                   192.168.205.16   4GB      4      50GB
Bootstrap Node                Fedora CoreOS     okd4-bootstrap         192.168.205.29   16GB     4      50GB
Control Node 1                Fedora CoreOS     okd4-control-plane-1   192.168.205.30   16GB     4      50GB
Control Node 2                Fedora CoreOS     okd4-control-plane-2   192.168.205.31   16GB     4      50GB
Control Node 3                Fedora CoreOS     okd4-control-plane-3   192.168.205.32   16GB     4      50GB
Compute Node 1                Fedora CoreOS     okd4-compute-1         192.168.205.33   16GB     4      50GB

Step 1 – Install and Configure Dnsmasq service

We will begin by setting up Dnsmasq on the manager node to resolve DNS requests for the Kubernetes cluster. First, disable systemd-resolved, which listens on port 53 and may conflict with the Dnsmasq service:

sudo -i
systemctl disable systemd-resolved
systemctl stop systemd-resolved
sudo killall -9 dnsmasq
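Optionally, confirm that nothing is still listening on port 53 before continuing. This is just a sanity check; ss ships with the iproute package available by default on CentOS Stream 8, and the command should return no output:

ss -tulpn | grep ':53'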

Proceed and remove the resolv.conf symbolic link

# ls -lh /etc/resolv.conf 
lrwxrwxrwx 1 root root 39 Aug  8 15:52 /etc/resolv.conf -> ../run/systemd/resolve/stub-resolv.conf

# unlink /etc/resolv.conf

Now create a new resolv.conf file and set your DNS server manually.

# vim /etc/resolv.conf
nameserver 8.8.8.8

Now install the Dnsmasq package on CentOS Stream 8

dnf -y install dnsmasq

Once installed, proceed and configure it:

mv /etc/dnsmasq.conf  /etc/dnsmasq.conf.bak
vim /etc/dnsmasq.conf

In the file, add the lines below:

port=53
domain-needed
bogus-priv

strict-order
expand-hosts
domain=okd.computingforgeeks.com
address=/apps.okd.computingforgeeks.com/192.168.205.16

In this case, the base domain is computingforgeeks.com, the cluster domain is okd.computingforgeeks.com, and any request under apps.okd.computingforgeeks.com resolves to 192.168.205.16 (the manager node, where HAProxy will run). Once the desired changes have been made, save the file and restart the service:

systemctl restart dnsmasq

Now add the DNS records to /etc/hosts. Dnsmasq will reply to all requests using the records here.

# vim /etc/hosts
192.168.205.16  api-int.okd.computingforgeeks.com api
192.168.205.29  okd4-bootstrap.okd.computingforgeeks.com          okd4-bootstrap
192.168.205.30  okd4-control-plane-1.okd.computingforgeeks.com    okd4-control-plane-1
192.168.205.31  okd4-control-plane-2.okd.computingforgeeks.com    okd4-control-plane-2
192.168.205.32  okd4-control-plane-3.okd.computingforgeeks.com    okd4-control-plane-3
192.168.205.33  okd4-compute-1.okd.computingforgeeks.com          okd4-compute-1

Also modify /etc/resolv.conf:

# vim /etc/resolv.conf
nameserver 127.0.0.1
nameserver 8.8.8.8

Restart the service to apply the changes:

systemctl restart dnsmasq

Allow the port through the firewall:

firewall-cmd --permanent --add-port=53/udp
firewall-cmd --reload

Test if the service is working as desired:

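For example, you can query the local Dnsmasq instance directly with dig. This is a quick sanity check (dig is provided by the bind-utils package, which you may need to install first); the hostnames are the records defined above:

dnf -y install bind-utils
dig +short api-int.okd.computingforgeeks.com @127.0.0.1
dig +short okd4-bootstrap.okd.computingforgeeks.com @127.0.0.1
dig +short console-openshift-console.apps.okd.computingforgeeks.com @127.0.0.1

Each query should return the corresponding IP address from /etc/hosts, and anything under apps.okd.computingforgeeks.com should return 192.168.205.16.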

Step 2 – Install and Configure HAProxy

HAProxy will be used in this setup as a load balancer in front of the hosts in the cluster. Install HAProxy on CentOS Stream 8 with the command:

dnf install haproxy git -y

Now configure it using the sample OKD files, downloaded as shown:

cd
git clone https://github.com/cragr/okd4_files.git
cd okd4_files

Now modify the HAProxy file as desired.

vim haproxy.cfg

In the file, adjust your hosts accordingly

.....
backend okd4_k8s_api_be
    balance source
    mode tcp
    server      okd4-bootstrap 192.168.205.29:6443 check
    server      okd4-control-plane-1 192.168.205.30:6443 check
    server      okd4-control-plane-2 192.168.205.31:6443 check
    server      okd4-control-plane-3 192.168.205.32:6443 check

frontend okd4_machine_config_server_fe
    bind :22623
    default_backend okd4_machine_config_server_be
    mode tcp
    option tcplog

backend okd4_machine_config_server_be
    balance source
    mode tcp
    server      okd4-bootstrap 192.168.205.29:22623 check
    server      okd4-control-plane-1 192.168.205.30:22623 check
    server      okd4-control-plane-2 192.168.205.31:22623 check
    server      okd4-control-plane-3 192.168.205.32:22623 check

frontend okd4_http_ingress_traffic_fe
    bind :80
    default_backend okd4_http_ingress_traffic_be
    mode tcp
    option tcplog

backend okd4_http_ingress_traffic_be
    balance source
    mode tcp
    server      okd4-compute-1 192.168.205.33:80 check
#    server      okd4-compute-2 192.168.205.34:80 check

frontend okd4_https_ingress_traffic_fe
    bind *:443
    default_backend okd4_https_ingress_traffic_be
    mode tcp
    option tcplog

backend okd4_https_ingress_traffic_be
    balance source
    mode tcp
    server      okd4-compute-1 192.168.205.33:443 check
#    server      okd4-compute-2 192.168.205.34:443 check

Save the file and copy it to the HAProxy configuration directory:

sudo mv /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.bak
cp haproxy.cfg /etc/haproxy/haproxy.cfg
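Before starting the service, it is a good idea to validate the configuration syntax. HAProxy provides a built-in check mode via the -c flag; it reports whether the configuration is valid or points at the offending lines:

haproxy -c -f /etc/haproxy/haproxy.cfg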

Adjust SELinux so that HAProxy can connect to the backends and bind to the custom ports:

setsebool -P haproxy_connect_any 1
setsebool -P httpd_can_network_connect on
setsebool -P httpd_graceful_shutdown on
setsebool -P httpd_can_network_relay on
setsebool -P nis_enabled on
semanage port -a -t http_port_t -p tcp 6443
semanage port -a -t http_port_t -p tcp 22623
semanage port -a -t http_port_t -p tcp 1936

Start and enable the service:

systemctl enable haproxy
systemctl start haproxy

Verify that the service is running:

# systemctl status haproxy
● haproxy.service - HAProxy Load Balancer
   Loaded: loaded (/usr/lib/systemd/system/haproxy.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2022-11-09 09:45:11 EST; 6s ago
  Process: 33433 ExecStartPre=/usr/sbin/haproxy -f $CONFIG -f $CFGDIR -c -q $OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 33436 (haproxy)
    Tasks: 2 (limit: 29496)
   Memory: 4.8M
   CGroup: /system.slice/haproxy.service
           ├─33436 /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -f /etc/haproxy/conf.d -p /run/haproxy.pid
           └─33438 /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -f /etc/haproxy/conf.d -p /run/haproxy.pid

Allow the OKD ports through the firewall:

firewall-cmd --add-port={6443/tcp,22623/tcp,1936/tcp}
firewall-cmd --add-service={http,https}
firewall-cmd --runtime-to-permanent
firewall-cmd --reload

Step 3 – Configure Apache web server to host files

In this guide, we will use Apache to serve the ignition files during the Fedora CoreOS installation. Install it with the command:

dnf install -y httpd

Change the listen port to 8080

sed -i 's/Listen 80/Listen 8080/' /etc/httpd/conf/httpd.conf

Also modify SELinux:

sudo setsebool -P httpd_read_user_content 1

Start and enable the service:

systemctl enable httpd
systemctl start httpd

Allow the port through the firewall:

firewall-cmd --permanent --add-port=8080/tcp
firewall-cmd --reload

Verify if the server is up:

curl localhost:8080

Step 4 – Install openshift-installer and oc client

To be able to install and manage OKD, we need the openshift-install and oc client binaries. Download the latest available versions from the OKD releases page.

It is also possible to pull the files using wget as shown:

cd ~/
VER=$(curl -s https://api.github.com/repos/okd-project/okd/releases/latest|grep tag_name|cut -d '"' -f 4|sed 's/v//')
wget https://github.com/okd-project/okd/releases/download/${VER}/openshift-client-linux-${VER}.tar.gz
wget https://github.com/okd-project/okd/releases/download/${VER}/openshift-install-linux-${VER}.tar.gz

Extract the downloaded files:

tar -zxvf openshift-client-linux-*.tar.gz
tar -zxvf openshift-install-linux-*.tar.gz

Move the binaries to your PATH:

sudo mv kubectl oc openshift-install /usr/local/bin/

Export the PATH:

echo "export PATH=\$PATH:/usr/local/bin" | sudo tee -a /etc/profile
source /etc/profile

Verify the installation:

# oc version
Client Version: 4.14.0-0.okd-2023-10-28-073550
Kustomize Version: v5.0.1

# openshift-install version
openshift-install 4.14.0-0.okd-2023-10-28-073550
built from commit 03546e550ae68f6b36d78d78b539450e66b5f6c2
release image quay.io/openshift/okd@sha256:7a6200e347a1b857e47f2ab0735eb1303af7d796a847d79ef9706f217cd12f5c
release architecture amd64

# kubectl version --short --client
Client Version: v1.27.4
Kustomize Version: v5.0.1

Step 5 – Setup the openshift-installer

We now need to prepare the install-config.yaml file. In this file, you will provide your SSH public key and the pull secret obtained from Red Hat. First, generate the SSH keys:

# ssh-keygen -q -N ""
Enter file in which to save the key (/root/.ssh/id_rsa): 
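The public key is what you will paste into install-config.yaml later, so print it out and keep it handy:

cat ~/.ssh/id_rsa.pub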

Create an install directory and copy the file:

cd
mkdir install_dir
cp okd4_files/install-config.yaml ./install_dir

Now modify the configuration file.

vim ./install_dir/install-config.yaml

Replace the required items:

apiVersion: v1
baseDomain: computingforgeeks.com
metadata:
  name: okd

compute:
- hyperthreading: Enabled
  name: worker
  replicas: 0

controlPlane:
  hyperthreading: Enabled
  name: master
  replicas: 3

networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16

platform:
  none: {}

fips: false

pullSecret: 'PASTE-PULL-SECRET-HERE'
sshKey: 'PASTE-YOUR-SSH-PUBKEY-HERE'

In the file, replace:

  • baseDomain : specify the base domain name
  • metadata.name : the cluster name; together with baseDomain it forms the cluster domain (okd.computingforgeeks.com) configured in Dnsmasq
  • controlPlane.replicas : specify the number of Control Plane Nodes
  • pullSecret : paste the contents of the Pull Secret you obtained (see the note after this list if you do not have one)
  • sshKey : paste the contents of the SSH public key you generated above
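Note that, unlike OCP, OKD does not strictly require a Red Hat pull secret. The OKD documentation mentions that a placeholder value can be used if you only need the community images; the snippet below is that commonly cited placeholder (treat it as an assumption and substitute your real pull secret if you have one):

pullSecret: '{"auths":{"fake":{"auth":"aWQ6cGFzcwo="}}}'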

Save the file and take a backup:

cp ./install_dir/install-config.yaml ./install_dir/install-config.yaml.bak

Now generate the manifests used to bootstrap the cluster:

# openshift-install create manifests --dir=install_dir/
INFO Consuming Install Config from target directory 
WARNING Making control-plane schedulable by setting MastersSchedulable to true for Scheduler cluster settings 
INFO Manifests created in: install_dir/manifests and install_dir/openshift

To prevent pods from being scheduled on the control plane machines, modify the manifest:

sed -i 's/mastersSchedulable: true/mastersSchedulable: False/' install_dir/manifests/cluster-scheduler-02-config.yml
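You can quickly confirm that the change took effect:

grep mastersSchedulable install_dir/manifests/cluster-scheduler-02-config.yml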

Also generate the ignition files for Fedora CoreOS installation:

# openshift-install create ignition-configs --dir=install_dir/
INFO Consuming Common Manifests from target directory 
INFO Consuming Master Machines from target directory 
INFO Consuming Openshift Manifests from target directory 
INFO Consuming OpenShift Install (Manifests) from target directory 
INFO Consuming Worker Machines from target directory 
INFO Ignition-Configs created in: install_dir and install_dir/auth

Now you need to upload the files to be hosted by the Apache web server:

mkdir /var/www/html/okd4
cp -R install_dir/* /var/www/html/okd4/
chown -R apache: /var/www/html/
chmod -R 755 /var/www/html/

Test if everything is okay:

# curl localhost:8080/okd4/metadata.json
{"clusterName":"okd","clusterID":"9c43ce0c-d081-4051-a118-36a201648034","infraID":"okd-6rflz"}

Step 6 – Download and Start the Fedora CoreOS Nodes

This OKD cluster will use Fedora CoreOS for the bootstrap, control plane, and compute nodes. To perform the installation, you need to download the Fedora CoreOS bare metal (ISO) image. This ISO image can be used to set up the hosts on any hypervisor such as Proxmox, VirtualBox, VMware, etc., as long as they are on the same network as the manager node.

This can also be pulled using wget as shown:

RELEASE="38.20231014.3.0"
wget https://builds.coreos.fedoraproject.org/prod/streams/stable/builds/$RELEASE/x86_64/fedora-coreos-$RELEASE-live.x86_64.iso

Now provision the bootstrap, control and compute nodes to meet the required specifications. Load the downloaded ISO image and boot them systematically as shown below.

a. Start the bootstrap node

The bootstrap node is the first to be started. Allow the system to boot uninterrupted into the Fedora CoreOS live environment.

From here, we need to configure the network. In this case, we will use the nmtui tool. From the command line, execute:

nmtui

In the nmtui interface, choose Edit a connection and select the appropriate network interface.

Now set a static IP address and use the manager node (192.168.205.16) as the DNS server. You can also provide 8.8.8.8 as the alternative DNS for the node.

Save the changes, then go back and deactivate and reactivate the connection so that the new settings take effect.

You should now have a static IP address configured on your node. Verify this with the command:

ip a

Determine the disk name:

sudo fdisk -l

From the output, note the name of the target disk. In this setup it is /dev/sda.

Now initiate the installation using the hosted ignition file and copy the network configuration:

sudo coreos-installer install /dev/sda \
--ignition-url=http://192.168.205.16:8080/okd4/bootstrap.ign \
--insecure-ignition \
--copy-network

The installer downloads the Fedora CoreOS image and writes it to the disk.

Once complete, reboot your node:

sudo reboot

b. Start the 3 control plane nodes

Now, using similar steps, allow the control plane nodes to boot and set their static IPs. Remember to provide the manager node as the DNS server and 8.8.8.8 as the alternative DNS.

Proceed and activate the connection for the changes to apply.

Here too, identify the target disk:

sudo fdisk -l

Now start the installation using the correct ignition file:

sudo coreos-installer install /dev/sda \
--ignition-url=http://192.168.205.16:8080/okd4/master.ign \
--insecure-ignition \
--copy-network

Once the installation is complete, restart the nodes:

sudo reboot

c. Start the compute nodes

Power on the compute nodes and set the static IP addresses exactly as we did for other nodes above. To perform the installation, use the command:

sudo coreos-installer install /dev/sda \
--ignition-url=http://192.168.205.16:8080/okd4/worker.ign \
--insecure-ignition \
--copy-network

Once complete, restart the nodes:

sudo reboot

d. Monitor the bootstrap installation

Now monitor the installation from the manager node using the command:

openshift-install --dir=install_dir/ wait-for bootstrap-complete --log-level=info

This takes a while to complete, typically around 20 minutes. Once complete, the installer reports that the bootstrap process is complete and that it is safe to remove the bootstrap resources.

Now you can modify your HAProxy config and disable the bootstrap node:

sed -i '/ okd4-bootstrap /s/^/#/' /etc/haproxy/haproxy.cfg
systemctl reload haproxy
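You can confirm that the bootstrap entries are now commented out:

grep okd4-bootstrap /etc/haproxy/haproxy.cfg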

You can also shut down the bootstrap node since it is no longer needed. If required, you can SSH into the Fedora CoreOS nodes from the manager node:

ssh core@fcos-node-ip
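If the bootstrap stage stalls, you can follow its progress from the bootstrap node itself (before shutting it down). The journalctl invocation below is the one suggested in the upstream installation documentation; unit names can vary slightly between releases:

ssh core@192.168.205.29
journalctl -b -f -u release-image.service -u bootkube.service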

e. Login to the cluster and approve CSRs

When viewing the available nodes, you will notice that the worker nodes are not displayed.

export KUBECONFIG=~/install_dir/auth/kubeconfig
oc get nodes


This is because you need to approve the pending CSRs. To view them, use the command:

# oc get csr
NAME                                             AGE     SIGNERNAME                                    REQUESTOR                                                                         REQUESTEDDURATION   CONDITION
csr-2tdkf                                        11m     kubernetes.io/kubelet-serving                 system:node:okd4-control-plane-1.okd.computingforgeeks.com                        <none>              Approved,Issued
csr-42xkn                                        12m     kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper         <none>              Approved,Issued
csr-4gt5d                                        12m     kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper         <none>              Approved,Issued
csr-59z9c                                        12m     kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper         <none>              Approved,Issued
csr-j8t5t                                        12m     kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper         <none>              Approved,Issued
csr-jlmmp                                        11m     kubernetes.io/kubelet-serving                 system:node:okd4-control-plane-2.okd.computingforgeeks.com                        <none>              Approved,Issued
csr-jnmvm                                        12m     kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper         <none>              Approved,Issued
csr-ldvwt                                        11m     kubernetes.io/kubelet-serving                 system:node:okd4-control-plane-3.okd.computingforgeeks.com                        <none>              Approved,Issued
csr-mkz82                                        100s    kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper         <none>              Pending
csr-xbp6c                                        12m     kubernetes.io/kube-apiserver-client-kubelet   system:serviceaccount:openshift-machine-config-operator:node-bootstrapper         <none>              Approved,Issued
system:openshift:openshift-authenticator-bq85w   9m29s   kubernetes.io/kube-apiserver-client           system:serviceaccount:openshift-authentication-operator:authentication-operator   <none>              Approved,Issued
system:openshift:openshift-monitoring-m7scj      7m48s   kubernetes.io/kube-apiserver-client           system:serviceaccount:openshift-monitoring:cluster-monitoring-operator            <none>              Approved,Issued

To approve these CSRs, we will install the jq package, which makes it easy to approve several of them at once.

VER=$(curl -s https://api.github.com/repos/stedolan/jq/releases/latest|grep tag_name|cut -d '"' -f 4|sed 's/v//g')
wget -O jq https://github.com/stedolan/jq/releases/download/$VER/jq-linux64
chmod +x jq
sudo mv jq /usr/local/bin/
jq --version

Now approve the pending CSRs:

# oc get csr -ojson | jq -r '.items[] | select(.status == {} ) | .metadata.name' | xargs oc adm certificate approve
certificatesigningrequest.certificates.k8s.io/csr-mkz82 approved

After some time, the worker node should appear in the output of oc get nodes. You may need to approve a second round of CSRs (the kubelet serving certificates) before the node reports a Ready status.

Once all the nodes are listed and report a Ready status, you are set to proceed.
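Optionally, you can also confirm that the cluster operators have finished rolling out before moving on; all of them should eventually report Available=True:

oc get clusteroperators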

Step 7 – Access the OKD Web Console

Now you can access several dashboards provided by OKD. To get all the available routes, use the command:

# oc get routes -A
NAMESPACE                  NAME              HOST/PORT                                                            PATH   SERVICES          PORT    TERMINATION            WILDCARD
openshift-authentication   oauth-openshift   oauth-openshift.apps.okd.computingforgeeks.com                              oauth-openshift   6443    passthrough/Redirect   None
openshift-console          console           console-openshift-console.apps.okd.computingforgeeks.com                    console           https   reencrypt/Redirect     None
openshift-console          downloads         downloads-openshift-console.apps.okd.computingforgeeks.com                  downloads         http    edge/Redirect          None
openshift-ingress-canary   canary            canary-openshift-ingress-canary.apps.okd.computingforgeeks.com              ingress-canary    8080    edge/Redirect          None

Now, using the provided URLs, you can access the various dashboards. To access the web console, open the console URL in a browser on the manager node (or on any machine that uses the manager node for DNS). In this case, the URL is https://console-openshift-console.apps.okd.computingforgeeks.com

Get the kubeadmin password:

cat ~/install_dir/auth/kubeadmin-password
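If you prefer the CLI, you can also log in with oc using the same credentials. This is a minimal sketch assuming the api record resolves as configured earlier; you may need to add --insecure-skip-tls-verify if your shell does not yet trust the cluster certificates:

oc login https://api.okd.computingforgeeks.com:6443 -u kubeadmin -p "$(cat ~/install_dir/auth/kubeadmin-password)"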

You can now log in using the kubeadmin user and the password obtained above. Once authenticated, you will land on the OKD web console dashboard.

From the console, you can also view the running pods and other workloads.

Now proceed and run your desired workloads on your OKD cluster.


Verdict

That marks the end of this detailed guide on how to set up a Multi-Node OKD 4 Cluster. I hope it was informative.
