Kubernetes YAML is one of the most common things engineers ask AI to generate. The manifests are verbose, the schema is strict, and getting resource limits, probes, and selectors right on the first try is rare. Claude Code handles this well because Kubernetes objects follow predictable patterns. It also diagnoses failing deployments by reading kubectl describe output and pod logs, which is where most debugging time goes.
This guide is part of the Claude Code for DevOps Engineers series. Every demo ran on a real Kubernetes 1.35.0 cluster (kind on Docker Desktop). The pod names, events, error messages, and Helm history are from actual execution. For kubectl reference, the kubectl cheat sheet covers the commands used throughout this guide.
Verified working: March 2026 on Kubernetes 1.35.0 (kind), Helm 3.19.1, nginx:1.27-alpine
What You Need
- Claude Code installed and authenticated
- A Kubernetes cluster (kind, minikube, Docker Desktop, or a cloud cluster)
- kubectl 1.28+ and Helm 3.12+ installed
- Use a non-production cluster for these demos (they create and delete deployments)
To spin up a disposable cluster with kind:
kind create cluster --name demo
This creates a single-node Kubernetes cluster inside Docker in about 30 seconds.
Deploy a Full Application Stack
The first demo generates a complete application deployment: namespace, ConfigMap, Deployment with probes and resource limits, Service, and NetworkPolicy. One prompt produces production-grade manifests.
Generate Kubernetes manifests for a web application:
- Namespace called "demo"
- ConfigMap with a custom HTML index page
- Deployment with 3 replicas of nginx:1.27-alpine
- Resource requests (50m CPU, 64Mi memory) and limits (200m, 128Mi)
- Liveness and readiness probes on port 80
- ClusterIP Service on port 80
- NetworkPolicy allowing only port 80 ingress
Validate with --dry-run=client, then apply.
Claude Code generates five resources in a single YAML file. The dry run confirms they’re valid:
kubectl apply -f app.yaml --dry-run=client
All five resources pass validation:
namespace/demo configured (dry run)
configmap/nginx-config created (dry run)
deployment.apps/web created (dry run)
service/web created (dry run)
networkpolicy.networking.k8s.io/web-allow-http created (dry run)
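Of the five resources, the NetworkPolicy is the one engineers most often get wrong. A minimal sketch of a port-80-only ingress policy matching the prompt (the name comes from the dry-run output above; the selector labels are an assumption about the generated manifests):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: web-allow-http
  namespace: demo
spec:
  podSelector:
    matchLabels:
      app: web        # assumed label on the web pods
  policyTypes:
    - Ingress
  ingress:
    - ports:
        - protocol: TCP
          port: 80
```

Because the policy lists only Ingress with a single port rule, traffic to any other port on the selected pods is dropped (assuming the cluster's CNI enforces NetworkPolicies; kind's default kindnet does not, so this is a no-op there).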
After applying, all three replicas come up and pass their readiness probes:
kubectl get pods -n demo
Three pods running:
NAME                   READY   STATUS    RESTARTS   AGE
web-55dd865988-6m47w   1/1     Running   0          38s
web-55dd865988-8r24q   1/1     Running   0          38s
web-55dd865988-ns7f7   1/1     Running   0          38s
Port-forwarding to the service confirms the custom HTML is served:
kubectl port-forward -n demo svc/web 8888:80 &
curl -s http://localhost:8888/
The response shows our ConfigMap-mounted HTML:
<html>
  <head><title>Claude Code K8s Demo</title></head>
  <body>
    <h1>Deployed by Claude Code on Kubernetes</h1>
    <p>Cluster: kind-claude-demo</p>
    <p>Namespace: demo</p>
  </body>
</html>
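That page comes from the ConfigMap mounted into the nginx pods. A sketch of the ConfigMap and the pod-spec fragment that mounts it (the ConfigMap name matches the dry-run output earlier; the volume name and mount details are illustrative, not the exact generated manifest):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-config        # name seen in the dry-run output
  namespace: demo
data:
  index.html: |
    <html>
      <head><title>Claude Code K8s Demo</title></head>
      <body>
        <h1>Deployed by Claude Code on Kubernetes</h1>
        <p>Cluster: kind-claude-demo</p>
        <p>Namespace: demo</p>
      </body>
    </html>
---
# Pod-spec fragment of the Deployment (illustrative volume name)
spec:
  containers:
    - name: nginx
      volumeMounts:
        - name: html
          mountPath: /usr/share/nginx/html
          readOnly: true
  volumes:
    - name: html
      configMap:
        name: nginx-config
```

Mounting the ConfigMap over nginx's default docroot is what makes the custom index.html appear without rebuilding the image.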
Notice what Claude Code included by default: resource requests AND limits (most hand-written manifests skip requests), both probe types with sensible timing, a dedicated namespace (not default), and a NetworkPolicy restricting ingress to port 80. These are production best practices that most engineers add after the first incident, not during initial deployment.
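For reference, the probe and resource stanza matching the prompt's numbers looks roughly like this (a sketch: the CPU and memory figures come from the prompt, while the probe timing values are typical defaults, not the exact generated output):

```yaml
resources:
  requests:
    cpu: 50m
    memory: 64Mi
  limits:
    cpu: 200m
    memory: 128Mi
livenessProbe:
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 5
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 2
  periodSeconds: 5
```

Requests drive scheduling decisions; limits cap runtime usage. Setting both keeps the pods in the Burstable QoS class rather than BestEffort.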
Debug a Failing Deployment
These two real failure scenarios account for most Kubernetes debugging time. Both were captured from an actual cluster.
ImagePullBackOff: Wrong image tag
A deployment references nginx:99.99-nonexistent. The pod gets stuck:
kubectl get pods -n demo -l app=broken-app
The status says it all:
NAME                          READY   STATUS             RESTARTS   AGE
broken-app-7cbf6548f8-sf76x   0/1     ImagePullBackOff   0          15s
Tell Claude Code “my pod is stuck in ImagePullBackOff” and it runs kubectl describe pod to read the events:
Events:
  Warning  Failed   13s  kubelet  Failed to pull image "nginx:99.99-nonexistent":
                                  rpc error: code = NotFound desc = failed to pull and unpack image
                                  "docker.io/library/nginx:99.99-nonexistent": not found
  Warning  Failed   13s  kubelet  Error: ErrImagePull
  Normal   BackOff  13s  kubelet  Back-off pulling image "nginx:99.99-nonexistent"
  Warning  Failed   13s  kubelet  Error: ImagePullBackOff
The error is unambiguous: the image tag doesn’t exist on Docker Hub. Claude Code fixes the tag to a valid version (nginx:1.27-alpine), patches the deployment, and waits for the rollout to complete. The entire diagnosis takes under 10 seconds because the events contain the full error path.
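The fix amounts to a one-field change: kubectl set image works, as does a strategic-merge patch like this sketch (the container name nginx is an assumption about the generated deployment):

```yaml
# fix.yaml - swap the nonexistent tag for a real one
spec:
  template:
    spec:
      containers:
        - name: nginx              # assumed container name
          image: nginx:1.27-alpine
```

Applied with kubectl patch deployment broken-app -n demo --patch-file fix.yaml, followed by kubectl rollout status deployment/broken-app -n demo to confirm the new ReplicaSet comes up.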
CrashLoopBackOff: Container exits on startup
A container starts, prints an error, and exits. Kubernetes restarts it, it crashes again, and the back-off timer increases. The pod cycles between Error and CrashLoopBackOff.
NAME                         READY   STATUS   RESTARTS      AGE
crash-app-6d99cd5cf8-tkkfl   0/1     Error    4 (66s ago)   105s
Claude Code reads the pod logs to find the actual error:
kubectl logs crash-app-6d99cd5cf8-tkkfl -n demo
The log reveals the root cause:
Missing config: /etc/app/config.yml not found
The events confirm the crash loop pattern:
Events:
  Normal   Pulled   56s (x5 over 2m32s)  kubelet  Container image already present
  Normal   Created  56s (x5 over 2m32s)  kubelet  Container created
  Normal   Started  56s (x5 over 2m32s)  kubelet  Container started
  Warning  BackOff  56s (x4 over 2m30s)  kubelet  Back-off restarting failed container
Five creates, five starts, four back-offs. The container starts successfully but immediately exits because it can’t find its config file. Claude Code fixes this by creating a ConfigMap with the expected config and mounting it at /etc/app/config.yml. The deployment rolls out with the volume mount, and the crash loop stops.
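The shape of that fix, sketched below. Only the mount path /etc/app/config.yml comes from the error message; the ConfigMap name, container name, and config contents are illustrative:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: crash-app-config   # illustrative name
  namespace: demo
data:
  config.yml: |
    # whatever the app expects; contents are illustrative
    listen: 8080
---
# Pod-spec fragment mounting it where the app looks
spec:
  containers:
    - name: app              # assumed container name
      volumeMounts:
        - name: config
          mountPath: /etc/app
          readOnly: true
  volumes:
    - name: config
      configMap:
        name: crash-app-config
```

Mounting the ConfigMap at /etc/app makes each data key a file, so config.yml lands exactly at /etc/app/config.yml.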
The debugging pattern is always the same: kubectl get pods (see the status), kubectl describe pod (read events), kubectl logs (find the application error). Claude Code runs all three in sequence without being told.
Generate and Manage Helm Charts
Raw manifests work for one-off deployments. Helm charts work for repeatable deployments across environments. Claude Code converts raw manifests to Helm charts and manages the release lifecycle.
Convert the web deployment manifests into a Helm chart with
configurable replicas, image tag, and resource limits.
Install it, then upgrade to 4 replicas.
Claude Code creates the chart structure with Chart.yaml, values.yaml, and templated manifests under templates/. The values file exposes the parameters you’ll change between environments:
replicaCount: 2
image:
  repository: nginx
  tag: "1.27-alpine"
service:
  type: ClusterIP
  port: 80
resources:
  requests:
    cpu: 50m
    memory: 64Mi
  limits:
    cpu: 200m
    memory: 128Mi
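Inside templates/, those values are consumed with standard Helm templating. A representative deployment excerpt (a sketch, not the exact generated template; labels and selectors omitted for brevity):

```yaml
# templates/deployment.yaml (excerpt)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}
spec:
  replicas: {{ .Values.replicaCount }}
  template:
    spec:
      containers:
        - name: web
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          ports:
            - containerPort: {{ .Values.service.port }}
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
```

The toYaml | nindent idiom renders the whole resources map from values.yaml at the right indentation, so requests and limits stay overridable per environment with -f or --set.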
Installing the chart:
helm install my-web ./web-chart -n demo
Helm confirms the deployment:
NAME: my-web
LAST DEPLOYED: Sat Mar 28 10:39:03 2026
NAMESPACE: demo
STATUS: deployed
REVISION: 1
Upgrading to 4 replicas with a single flag:
helm upgrade my-web ./web-chart -n demo --set replicaCount=4
Four pods now running:
NAME                      READY   STATUS    RESTARTS   AGE
my-web-64ddbdb848-8bjj2   1/1     Running   0          8s
my-web-64ddbdb848-gwxmm   1/1     Running   0          15s
my-web-64ddbdb848-pzs67   1/1     Running   0          8s
my-web-64ddbdb848-qn8vs   1/1     Running   0          15s
Rollback a bad Helm upgrade
Push a broken image tag through Helm and the replacement pod fails with ErrImagePull while the old ones keep serving (Kubernetes' rolling update protects you). Claude Code detects the failure and rolls back:
helm upgrade my-web ./web-chart -n demo --set image.tag=99.99-broken
The new pod fails while old pods continue serving:
NAME                      READY   STATUS         RESTARTS   AGE
my-web-574474d994-8mpdd   0/1     ErrImagePull   0          11s
my-web-64ddbdb848-8bjj2   1/1     Running        0          29s
my-web-64ddbdb848-pzs67   1/1     Running        0          29s
Claude Code identifies the ErrImagePull and rolls back to the last working revision:
helm rollback my-web 2 -n demo
Rollback confirmed:
Rollback was a success! Happy Helming!
Running helm history my-web -n demo shows the full lifecycle:
REVISION   UPDATED                    STATUS       CHART           DESCRIPTION
1          Sat Mar 28 10:39:03 2026   superseded   web-app-0.1.0   Install complete
2          Sat Mar 28 10:39:11 2026   superseded   web-app-0.1.0   Upgrade complete
3          Sat Mar 28 10:39:29 2026   superseded   web-app-0.1.0   Upgrade complete
4          Sat Mar 28 10:39:40 2026   deployed     web-app-0.1.0   Rollback to 2
Four revisions: initial install, scale to 4 replicas, broken upgrade, rollback to revision 2. Every state is tracked and reversible. This is why Helm matters for production: kubectl apply has no built-in rollback.
Kubernetes Debugging Cheat Sheet
These status patterns, distilled from debugging dozens of deployments through Claude Code, cover what each state means and where to look first:
| Status | Meaning | First Command to Run |
|---|---|---|
| ImagePullBackOff | Image tag doesn’t exist or registry auth failed | kubectl describe pod (check Events) |
| CrashLoopBackOff | Container starts then crashes repeatedly | kubectl logs POD (find the app error) |
| Pending | No node can schedule the pod (resource limits, taints, affinity) | kubectl describe pod (check Conditions) |
| ContainerCreating | Image pulling or volume mounting (stuck = check events) | kubectl describe pod (check Events) |
| OOMKilled | Container exceeded memory limit | Increase resources.limits.memory |
| Evicted | Node ran out of disk or memory | kubectl describe node (check Conditions) |
| CreateContainerError | Missing ConfigMap, Secret, or volume mount | kubectl describe pod (check Events) |
Claude Code follows this same decision tree. When you say “my pod is failing,” it checks the status first, then runs the appropriate describe or logs command based on what it finds.
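That decision tree is simple enough to sketch as a tiny shell helper, purely illustrative, mapping a pod status to the first diagnostic step from the table above:

```shell
#!/bin/sh
# first_command: given a pod status, print the first diagnostic step.
# Mirrors the cheat-sheet table; a sketch, not a real tool.
first_command() {
  case "$1" in
    ImagePullBackOff|ContainerCreating|CreateContainerError|Pending)
      echo "kubectl describe pod" ;;
    CrashLoopBackOff)
      echo "kubectl logs" ;;
    Evicted)
      echo "kubectl describe node" ;;
    OOMKilled)
      echo "raise resources.limits.memory" ;;
    *)
      echo "kubectl get pods" ;;
  esac
}

first_command CrashLoopBackOff   # prints: kubectl logs
```

The point is not the script itself but the ordering: status first, then events, then logs, because each step narrows where the failure lives.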
What Claude Code Gets Right (and Wrong) with Kubernetes
| Task | Quality | Notes |
|---|---|---|
| Deployments + Services | Excellent | Correct selectors, labels, resource specs every time |
| Probes | Excellent | Adds both liveness and readiness with sensible timing |
| Resource limits | Excellent | Includes both requests and limits (rare in hand-written manifests) |
| ConfigMaps and Secrets | Good | Correct mounting, sometimes forgets readOnly: true |
| Helm chart generation | Good | Proper templating, sensible values.yaml defaults |
| Debugging (describe + logs) | Excellent | Follows the right diagnostic sequence every time |
| NetworkPolicies | Fair | Gets basic ingress rules right, complex egress needs review |
| RBAC | Fair | Generates overly permissive roles, always tighten manually |
| Ingress with TLS | Good | Knows cert-manager annotations, sometimes outdated API versions |
| StatefulSets | Good | Correct volumeClaimTemplates, sometimes misses headless service |
Practical Kubernetes Prompts
Specify the namespace. Claude Code uses default if you don’t specify. Always include “in namespace X” to keep demo resources isolated from production workloads.
Ask for both probes and resource limits upfront. “Deploy nginx with liveness probe, readiness probe, resource requests and limits” produces a complete manifest. Without this, Claude Code sometimes generates minimal manifests that pass validation but lack production essentials.
For debugging, paste the pod name and namespace. “Pod web-55dd865988-6m47w in namespace demo is in CrashLoopBackOff” gives Claude Code everything it needs to run kubectl describe and kubectl logs with the correct arguments. Just saying “my pod is crashing” forces it to list pods first.
Use Helm for anything you’ll deploy more than once. Ask Claude Code to “convert these manifests to a Helm chart” and it creates the chart structure with parameterized values. Helm gives you rollback, revision history, and environment-specific overrides that raw kubectl apply lacks.
Part of the Claude Code for DevOps Series
This Kubernetes spoke connects to the broader series. The Docker guide covers building the container images that Kubernetes deploys, the Terraform guide covers provisioning the clusters themselves, and the Kubernetes cluster deployment with Rancher guide covers standing up a full cluster from scratch.
- Set Up Claude Code for DevOps Engineers (pillar with safety rules and permissions)
- Manage Servers with Claude Code via SSH
- Build and Debug Docker Containers with Claude Code
- Deploy Infrastructure with Claude Code and Terraform
- Generate and Debug Ansible Playbooks with Claude Code
- Claude Code + GitHub Actions: automated PR review, infrastructure validation
The Claude Code cheat sheet covers every command and shortcut for quick reference.