Containers

Use Nginx as Kubernetes API Server Load Balancer (port 6443)

An HA Kubernetes cluster needs a load balancer sitting in front of the control plane so that kubectl and every kubelet can reach the API on a single endpoint even when one or two control plane nodes are offline. Cloud-managed clusters hand you that load balancer. On-prem, you build it. Nginx in TCP (stream) mode is the workhorse choice: small footprint, deterministic, zero external dependencies, and it speaks Layer 4 so TLS stays end-to-end between client and kube-apiserver.

This guide walks through the full setup on Rocky Linux / AlmaLinux and on Ubuntu / Debian: installing nginx from the official upstream repo, enabling the stream module, writing a production-grade TCP load-balancer config for kube-apiserver on port 6443, opening firewalld or UFW, labeling the port for SELinux, and verifying the LB is passing traffic. It closes with the keepalived + VIP pattern you need to remove the nginx host itself as a single point of failure.

Tested April 2026 on Rocky Linux 10.1 (kernel 6.12), nginx 1.30.0 from the official nginx.org RHEL repo, SELinux enforcing. Same config applies on Rocky/AlmaLinux 9, Ubuntu 24.04, and Debian 13.

Why Layer 4 for kube-apiserver

kube-apiserver speaks mTLS on port 6443. Clients (kubectl, kubelet, controller-manager, scheduler, every in-cluster component using the service account token) present certificates or bearer tokens over TLS, and the server presents its own certificate. A Layer 7 load balancer would have to terminate that TLS, which means the apiserver never sees the client certificate and certificate-based authentication breaks unless you re-issue certificates signed by the cluster CA. That is why every serious kubeadm HA topology puts a Layer 4 TCP proxy in front of 6443, not an HTTP reverse proxy. Nginx stream, HAProxy in mode tcp, and cloud Network Load Balancers all fit this shape.

The topology we are building:

           kubectl, kubelets, internal components
                         |
                         v  (TCP/6443, TLS end-to-end)
                  +------+------+
                  |  nginx LB   |   (this host)
                  +------+------+
                         |
         +---------------+---------------+
         v               v               v
   cp01:6443        cp02:6443       cp03:6443
   (kube-apiserver) (kube-apiserver) (kube-apiserver)

One nginx in front of three API servers keeps the API endpoint reachable even with two control plane nodes offline (note that with stacked etcd, quorum still needs two of three members, so the cluster only accepts writes while at least two nodes are healthy). It does not yet tolerate the loss of the nginx host itself. That is what keepalived handles, covered at the end.

Prerequisites

  • A dedicated host (VM or bare metal) for the load balancer, or a pair for HA
  • Network reachability from the LB host to every control plane node on port 6443/tcp
  • Rocky Linux 9/10, AlmaLinux 9/10, Ubuntu 22.04/24.04, or Debian 12/13
  • Root or sudo access on the LB host
  • DNS: an A record (for example k8s-api.example.com) pointing at the LB IP. Do this before running kubeadm init and use the name as --control-plane-endpoint. Pointing clients at an IP makes changing the LB later much harder.
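
Before going further, confirm the record resolves from the LB host. The name is the example from the list above; getent exercises the same resolver path kubelets will use:

dig +short k8s-api.example.com
getent hosts k8s-api.example.com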

Install nginx from the official repo

The distro-packaged nginx is fine for HTTP, but this is the one case where the official nginx.org build matters: it is compiled with the stream module baked in, and every release since 1.25 has it statically linked. On Rocky 10, the AppStream package is 1.26 and also includes stream, but we will use the official repo here for a predictable, current baseline across all supported distros.

On Rocky Linux / AlmaLinux / CentOS Stream

Disable the AppStream module so it does not fight the upstream repo during updates:

sudo dnf module reset nginx -y
sudo dnf module disable nginx -y

Drop in the official nginx repo. The $releasever variable resolves to 9 or 10 depending on the host:

sudo tee /etc/yum.repos.d/nginx.repo <<'EOF'
[nginx-stable]
name=nginx stable repo
baseurl=https://nginx.org/packages/rhel/$releasever/$basearch/
gpgcheck=1
enabled=1
gpgkey=https://nginx.org/keys/nginx_signing.key
module_hotfixes=true
EOF

Install the package:

sudo dnf install -y nginx
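
Worth a quick check that the package really came from the upstream repo and not AppStream; for installed packages, dnf info reports the source in its From repo field:

dnf info nginx | grep -E '^(Version|From repo)'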

On Ubuntu / Debian

Pull in the dependencies and the nginx signing key:

sudo apt update
sudo apt install -y curl gnupg2 ca-certificates lsb-release
curl -fsSL https://nginx.org/keys/nginx_signing.key | \
  sudo gpg --dearmor -o /etc/apt/trusted.gpg.d/nginx.gpg

Add the repo. Pick the line that matches your distro:

# Ubuntu
echo "deb https://nginx.org/packages/ubuntu $(lsb_release -cs) nginx" | \
  sudo tee /etc/apt/sources.list.d/nginx.list

# Debian
echo "deb https://nginx.org/packages/debian $(lsb_release -cs) nginx" | \
  sudo tee /etc/apt/sources.list.d/nginx.list

Install nginx:

sudo apt update
sudo apt install -y nginx
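
Confirm apt resolved the package from nginx.org rather than the distro archive; the installed version shown by apt policy should trace back to nginx.org/packages:

apt policy nginx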

Confirm stream is available

Grep the build flags for the stream module. All three flags should be present:

nginx -V 2>&1 | tr ' ' '\n' | grep -i stream

Real output from the test host:

--with-stream
--with-stream_realip_module
--with-stream_ssl_module

If the grep returns nothing, you are not on the official nginx build. On Debian/Ubuntu, apt install libnginx-mod-stream adds the module. On any distro, switching to the nginx.org repo is the cleanest fix.
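
One caveat if you stay on the distro package: libnginx-mod-stream builds stream as a dynamic module, so nginx must load it explicitly. The package normally drops a loader file into /etc/nginx/modules-enabled/; if your nginx.conf does not include that directory, add this directive at the very top of nginx.conf, outside every block (path as shipped by the Debian/Ubuntu package):

load_module modules/ngx_stream_module.so;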

Configure nginx as a TCP load balancer for 6443

The stream block lives at the top level of nginx.conf, not inside http {}. Rather than cramming everything into one file, we add a dedicated /etc/nginx/stream.d/ include directory. That keeps the apiserver config separate from anything else nginx is fronting and makes drop-in additions painless later.

Create the include directory:

sudo mkdir -p /etc/nginx/stream.d

Open the main config:

sudo vi /etc/nginx/nginx.conf

Make it look like this. Note the stream block sits outside http. The worker_rlimit_nofile bump matters because the default systemd unit caps file descriptors at 1024, and nginx will complain at start otherwise:

user  nginx;
worker_processes      auto;
worker_rlimit_nofile  65535;
error_log  /var/log/nginx/error.log notice;
pid        /var/run/nginx.pid;

events {
    worker_connections  4096;
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;
    access_log    /var/log/nginx/access.log;
    sendfile      on;
    keepalive_timeout  65;
    include /etc/nginx/conf.d/*.conf;
}

stream {
    log_format kube_lb '$remote_addr [$time_local] '
                      '$protocol $status $bytes_sent $bytes_received '
                      '$session_time "$upstream_addr" '
                      '"$upstream_connect_time"';

    include /etc/nginx/stream.d/*.conf;
}

Now the apiserver-specific config. Replace the three upstream IPs with your actual control plane node addresses:

sudo vi /etc/nginx/stream.d/kube-apiserver.conf

Drop in the following stanza:

upstream kube_apiservers {
    least_conn;

    server 10.0.1.10:6443  max_fails=3 fail_timeout=10s;
    server 10.0.1.11:6443  max_fails=3 fail_timeout=10s;
    server 10.0.1.12:6443  max_fails=3 fail_timeout=10s;
}

server {
    listen 6443;

    proxy_pass            kube_apiservers;
    proxy_connect_timeout 2s;
    proxy_timeout         10m;

    proxy_next_upstream        on;
    proxy_next_upstream_tries  2;

    access_log /var/log/nginx/kube-lb.log kube_lb;
}

A few details worth understanding before you move on:

  • least_conn sends the next connection to the apiserver with the fewest active sessions. kubelets hold a long-lived watch stream to the API, so round-robin can pin a hot node. least_conn spreads the load more evenly in practice.
  • max_fails and fail_timeout are passive health checks. If three connection attempts to an upstream fail within 10 seconds, nginx takes it out of rotation for 10 seconds, then tries again. Active health checks are an Nginx Plus feature; if you need active probing in the OSS world, use HAProxy instead (see the sketch after this list).
  • proxy_connect_timeout 2s fails fast on a dead apiserver rather than making clients wait the default 60 seconds.
  • proxy_timeout 10m keeps long-lived watch connections alive. Anything shorter and kubelets will reconnect constantly, which spams the audit log and adds unnecessary apiserver load.
  • proxy_next_upstream lets nginx retry the next upstream if a connection attempt fails. Without it, a client sees the error directly.
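
For comparison, here is a minimal HAProxy stanza with active TCP checks, should you go that route instead. The upstream IPs match the example config above; the check interval and fall/rise counts are illustrative:

frontend k8s_api
    bind *:6443
    mode tcp
    default_backend kube_apiservers

backend kube_apiservers
    mode tcp
    balance leastconn
    server cp01 10.0.1.10:6443 check inter 3s fall 3 rise 2
    server cp02 10.0.1.11:6443 check inter 3s fall 3 rise 2
    server cp03 10.0.1.12:6443 check inter 3s fall 3 rise 2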

Validate the combined config:

sudo nginx -t

A clean test prints both lines:

nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful

Open port 6443 on the firewall

On firewalld (Rocky/Alma/RHEL):

sudo firewall-cmd --permanent --add-port=6443/tcp
sudo firewall-cmd --reload
sudo firewall-cmd --list-ports

On UFW (Ubuntu/Debian):

sudo ufw allow 6443/tcp
sudo ufw reload
sudo ufw status
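
Port 6443 only needs to be reachable from cluster nodes and admin workstations. If those live in a known subnet, scope the rule instead of opening the port to the world. Both variants below use the article's 10.0.1.0/24 example range:

# UFW
sudo ufw allow from 10.0.1.0/24 to any port 6443 proto tcp

# firewalld
sudo firewall-cmd --permanent --add-rich-rule='rule family=ipv4 source address=10.0.1.0/24 port port=6443 protocol=tcp accept'
sudo firewall-cmd --reload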

Label the port for SELinux

This step is Rocky/Alma/RHEL only. Ubuntu and Debian use AppArmor, which in stock installs does not confine the TCP ports nginx may bind. On SELinux-enforcing systems, nginx cannot bind to a port outside the http_port_t label set, and 6443 is not in there by default. Skip this and you will see an AVC denial in /var/log/audit/audit.log and nginx will fail to start.

Install the policy tools, add the port label, and confirm:

sudo dnf install -y policycoreutils-python-utils
sudo semanage port -a -t http_port_t -p tcp 6443
sudo semanage port -l | grep '^http_port_t' | head -1

The grep confirms 6443 now lives in the HTTP port set:

http_port_t                    tcp      6443, 80, 81, 443, 488, 8008, 8009, 8443, 9000

Nginx also needs permission to make outbound network connections so it can reach the apiserver upstreams:

sudo setsebool -P httpd_can_network_connect 1
sudo getsebool httpd_can_network_connect

Expected output: httpd_can_network_connect --> on.

Start nginx and verify it is listening

Enable and start the service:

sudo systemctl enable --now nginx
systemctl is-active nginx

The second command should print active. Confirm nginx is actually bound to 6443 rather than just running:

sudo ss -tlnp | grep 6443

On the test host:

LISTEN 0  511  0.0.0.0:6443  0.0.0.0:*  users:(("nginx",pid=9245,fd=7),("nginx",pid=9244,fd=7),("nginx",pid=9243,fd=7))

Three nginx workers holding fd=7 on 0.0.0.0:6443. That is what a working stream listener looks like. If only the master process is listed, the stream config never actually loaded. Check nginx -T and confirm your stream.d/ file was included.
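
nginx -T dumps every file the running binary parsed, each prefixed with a "# configuration file" comment, so a grep settles whether the include fired:

sudo nginx -T 2>/dev/null | grep 'configuration file'

/etc/nginx/stream.d/kube-apiserver.conf should appear in that list.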

Sanity-check the LB against a real apiserver

Assuming at least one control plane node is up, hit the apiserver through the LB from a client that has the cluster CA:

curl --cacert /etc/kubernetes/pki/ca.crt 'https://k8s-api.example.com:6443/livez?verbose'

A healthy apiserver returns a list of subsystem checks, each ending in [+]...ok, and a trailing livez check passed. If you get a TLS error about the hostname, your kubeadm control-plane-endpoint was set to an IP instead of the DNS name. Add the name to certSANs in the kubeadm ClusterConfiguration, move the old apiserver certificate and key out of /etc/kubernetes/pki, and regenerate with kubeadm init phase certs apiserver (kubeadm certs renew reuses the existing certificate's SANs, so renewal alone will not add the name).
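
To see which names the served certificate actually carries before editing anything, pull the SAN list straight off the wire (needs OpenSSL 1.1.1 or newer for -ext; hostname as in the prerequisites):

echo | openssl s_client -connect k8s-api.example.com:6443 2>/dev/null | \
  openssl x509 -noout -ext subjectAltName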

To watch real traffic flow, tail the stream access log:

sudo tail -f /var/log/nginx/kube-lb.log

Each connection shows remote IP, status, bytes in and out, session time, and which upstream served it. When you kill an apiserver, watch the upstream field shift to the survivors within 10 seconds, which is the fail_timeout window.
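
To stage that failover deliberately, stop one apiserver. Under kubeadm it runs as a static pod, so moving the manifest out of the watched directory is the cleanest kill switch (cp01 and the path are the kubeadm defaults; adjust to your hosts):

ssh cp01 'sudo mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/'
# watch kube-lb.log shift to the survivors, then restore:
ssh cp01 'sudo mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/'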

Point kubeadm at the load balancer

On a fresh cluster, initialize the first control plane node with the LB DNS name as the control-plane endpoint:

sudo kubeadm init \
  --control-plane-endpoint "k8s-api.example.com:6443" \
  --upload-certs \
  --pod-network-cidr=192.168.0.0/16

The --upload-certs flag stages the control plane certificates in a cluster secret so the other two control planes can pull them on join. For the full HA topology walkthrough, our HA Kubernetes cluster with kubeadm guide covers the join commands, etcd topology choices, and post-bootstrap checks.
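
For orientation, the join on the remaining control plane nodes has this shape; the token, CA cert hash, and certificate key are placeholders that kubeadm init prints at the end of its run:

sudo kubeadm join k8s-api.example.com:6443 \
  --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash> \
  --control-plane \
  --certificate-key <cert-key>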

Remove the LB as a single point of failure

Lose a single nginx load balancer and, as far as clients are concerned, you have zero control planes: every kubelet blocks until the LB host comes back. For anything beyond a home lab, run two LB hosts and put a floating VIP in front of them with keepalived. The short version:

  1. Install the same nginx stream config on two LB hosts. They are identical and stateless.
  2. Install keepalived on both. Configure a VRRP instance with a priority of 150 on the primary and 100 on the secondary, sharing a virtual IP on the LAN.
  3. Add a vrrp_script that fails the instance if nginx is not listening on 6443, so keepalived moves the VIP to the healthy peer within a few seconds.
  4. Point DNS (k8s-api.example.com) and all kubelet configs at the VIP, not either individual host.

The nginx process on each host still load-balances across the same three apiservers. keepalived only decides which of the two LB hosts owns the client-facing VIP at any moment. An alternative if you already run a hardware load balancer or a managed NLB (F5, MetalLB BGP, AWS NLB, and similar) is to point kubeadm at that and skip nginx entirely. On-prem with plain servers, nginx plus keepalived is the path of least surprise.
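
A minimal keepalived.conf sketch for the primary, assuming eth0 as the LAN interface and 10.0.1.100 as the VIP (both placeholders). The secondary is identical apart from state BACKUP and priority 100:

vrrp_script chk_nginx_6443 {
    # succeeds only while something accepts TCP on 6443 locally
    script "/usr/bin/timeout 2 /usr/bin/bash -c '</dev/tcp/127.0.0.1/6443'"
    interval 2
    fall 2
    rise 2
}

vrrp_instance K8S_API {
    state MASTER
    interface eth0               # placeholder: your LAN interface
    virtual_router_id 51
    priority 150                 # 100 on the secondary
    advert_int 1
    virtual_ipaddress {
        10.0.1.100/24            # placeholder: the IP k8s-api.example.com points at
    }
    track_script {
        chk_nginx_6443
    }
}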

Troubleshooting

Error: “nginx: [emerg] unknown directive ‘stream’”

The nginx build in use was not compiled with --with-stream. This usually means the distro’s stripped-down package is installed. Run nginx -V and check the flags. The fix is either apt install libnginx-mod-stream (on Ubuntu/Debian if you stayed on the distro package) or switching to the official nginx.org repo and reinstalling.

Error: “bind() to 0.0.0.0:6443 failed (13: Permission denied)”

SELinux denied the bind because the port was never labeled http_port_t. Confirm with ausearch -m avc -ts recent and add the label:

sudo semanage port -a -t http_port_t -p tcp 6443
sudo systemctl restart nginx

Error: “bind() to 0.0.0.0:80 failed (98: Address already in use)”

Something else on the host is on port 80 (Apache, another nginx instance, a web app). If this LB is a dedicated host you do not need an HTTP listener at all. Delete /etc/nginx/conf.d/default.conf so nginx stops trying to bind port 80, then restart.

kubelets connect but every API call returns “connection refused” or times out

nginx accepted the TCP connection but could not reach any apiserver upstream. Check reachability from the LB host directly:

for ip in 10.0.1.10 10.0.1.11 10.0.1.12; do
  echo -n "$ip: "
  timeout 2 bash -c "</dev/tcp/$ip/6443" && echo OK || echo FAIL
done

If any of those fail, the issue is upstream firewall or SELinux on the control plane, not nginx. When every upstream is down, nginx still accepts the client TCP connection and then fails the proxy hop, which clients see as a reset.

nginx -t warns “worker_connections exceed open file resource limit”

Default systemd file-descriptor cap is 1024. We set worker_rlimit_nofile 65535 earlier, which raises it inside the worker processes. If the warning persists, also drop a systemd override:

sudo systemctl edit nginx

Add:

[Service]
LimitNOFILE=65535

Then sudo systemctl daemon-reload && sudo systemctl restart nginx.
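
Confirm the override took effect:

systemctl show nginx -p LimitNOFILE

Expected output: LimitNOFILE=65535.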

The LB is now passing kube-apiserver traffic across three control plane nodes with passive health checks, TLS left intact end-to-end, firewalld and SELinux both configured for 6443, and a clear path to full HA with keepalived. From here, two follow-ups pay off immediately: wire up a periodic etcd backup and restore schedule so the cluster state is actually recoverable, and run a failover drill where you drop a control plane node and confirm kubelets stay connected through the LB. For the full HA kubeadm topology that sits behind this load balancer, see our Kubernetes on Rocky/AlmaLinux kubeadm guide and the multi-cluster kubectl + kubectx walkthrough.
