Kubernetes YAML is one of the most common things engineers ask AI to generate. The manifests are verbose, the schema is strict, and getting resource limits, probes, and selectors right on the first try is rare. Claude Code handles this well because Kubernetes objects follow predictable patterns. It also diagnoses failing deployments by reading kubectl describe output and pod logs, which is where most debugging time goes.
This guide is part of the Claude Code for DevOps Engineers series. Every demo ran on a real Kubernetes 1.35.0 cluster (kind on Docker Desktop). The pod names, events, error messages, and Helm history are from actual execution. For kubectl reference, the kubectl cheat sheet covers the commands used throughout this guide.
Verified working: March 2026 on Kubernetes 1.35.0 (kind), Helm 3.19.1, nginx:1.27-alpine
What You Need
- Claude Code installed and authenticated
- A Kubernetes cluster (kind, minikube, Docker Desktop, or a cloud cluster)
- kubectl 1.28+ and Helm 3.12+ installed
- Use a non-production cluster for these demos (they create and delete deployments)
To spin up a disposable cluster with kind:
kind create cluster --name demo
This creates a single-node Kubernetes cluster inside Docker in about 30 seconds.
Deploy a Full Application Stack
The first demo generates a complete application deployment: namespace, ConfigMap, Deployment with probes and resource limits, Service, and NetworkPolicy. One prompt produces production-grade manifests.
Generate Kubernetes manifests for a web application:
- Namespace called "demo"
- ConfigMap with a custom HTML index page
- Deployment with 3 replicas of nginx:1.27-alpine
- Resource requests (50m CPU, 64Mi memory) and limits (200m, 128Mi)
- Liveness and readiness probes on port 80
- ClusterIP Service on port 80
- NetworkPolicy allowing only port 80 ingress
Validate with --dry-run=client, then apply.
Claude Code generates five resources in a single YAML file. The dry run confirms they’re valid:
kubectl apply -f app.yaml --dry-run=client
All five resources pass validation:
namespace/demo configured (dry run)
configmap/nginx-config created (dry run)
deployment.apps/web created (dry run)
service/web created (dry run)
networkpolicy.networking.k8s.io/web-allow-http created (dry run)
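Of the five resources, the NetworkPolicy is the one engineers most often get wrong. A minimal sketch of a port-80-only ingress policy matching the prompt (the name comes from the dry-run output above; the selector labels are an assumption about the generated manifests):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: web-allow-http
  namespace: demo
spec:
  podSelector:
    matchLabels:
      app: web        # assumed label on the web pods
  policyTypes:
    - Ingress
  ingress:
    - ports:
        - protocol: TCP
          port: 80
```

Because the policy lists only Ingress with a single port rule, traffic to any other port on the selected pods is dropped (assuming the cluster's CNI enforces NetworkPolicies; kind's default kindnet does not, so this is a no-op there).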
After applying, all three replicas come up and pass their readiness probes:
kubectl get pods -n demo
Three pods running:
NAME                   READY   STATUS    RESTARTS   AGE
web-55dd865988-6m47w   1/1     Running   0          38s
web-55dd865988-8r24q   1/1     Running   0          38s
web-55dd865988-ns7f7   1/1     Running   0          38s
Port-forwarding to the service confirms the custom HTML is served:
kubectl port-forward -n demo svc/web 8888:80 &
curl -s http://localhost:8888/
The response shows our ConfigMap-mounted HTML:
<html>
  <head><title>Claude Code K8s Demo</title></head>
  <body>
    <h1>Deployed by Claude Code on Kubernetes</h1>
    <p>Cluster: kind-claude-demo</p>
    <p>Namespace: demo</p>
  </body>
</html>
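That page comes from the ConfigMap mounted into the nginx pods. A sketch of the ConfigMap and the pod-spec fragment that mounts it (the ConfigMap name matches the dry-run output earlier; the volume name and mount details are illustrative, not the exact generated manifest):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-config        # name seen in the dry-run output
  namespace: demo
data:
  index.html: |
    <html>
      <head><title>Claude Code K8s Demo</title></head>
      <body>
        <h1>Deployed by Claude Code on Kubernetes</h1>
        <p>Cluster: kind-claude-demo</p>
        <p>Namespace: demo</p>
      </body>
    </html>
---
# Pod-spec fragment of the Deployment (illustrative volume name)
spec:
  containers:
    - name: nginx
      volumeMounts:
        - name: html
          mountPath: /usr/share/nginx/html
          readOnly: true
  volumes:
    - name: html
      configMap:
        name: nginx-config
```

Mounting the ConfigMap over nginx's default docroot is what makes the custom index.html appear without rebuilding the image.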
Notice what Claude Code included by default: resource requests AND limits (most hand-written manifests skip requests), both probe types with sensible timing, a dedicated namespace (not default), and a NetworkPolicy restricting ingress to port 80. These are production best practices that most engineers add after the first incident, not during initial deployment.
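For reference, the probe and resource stanza matching the prompt's numbers looks roughly like this (a sketch: the CPU and memory figures come from the prompt, while the probe timing values are typical defaults, not the exact generated output):

```yaml
resources:
  requests:
    cpu: 50m
    memory: 64Mi
  limits:
    cpu: 200m
    memory: 128Mi
livenessProbe:
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 5
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 2
  periodSeconds: 5
```

Requests drive scheduling decisions; limits cap runtime usage. Setting both keeps the pods in the Burstable QoS class rather than BestEffort.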
Debug a Failing Deployment
These two real failure scenarios account for most Kubernetes debugging time. Both were captured from an actual cluster.
ImagePullBackOff: Wrong image tag
A deployment references nginx:99.99-nonexistent. The pod gets stuck:
kubectl get pods -n demo -l app=broken-app
The status says it all:
NAME                          READY   STATUS             RESTARTS   AGE
broken-app-7cbf6548f8-sf76x   0/1     ImagePullBackOff   0          15s
Tell Claude Code “my pod is stuck in ImagePullBackOff” and it runs kubectl describe pod to read the events:
Events:
  Warning  Failed   13s  kubelet  Failed to pull image "nginx:99.99-nonexistent":
                                  rpc error: code = NotFound desc = failed to pull and unpack image
                                  "docker.io/library/nginx:99.99-nonexistent": not found
  Warning  Failed   13s  kubelet  Error: ErrImagePull
  Normal   BackOff  13s  kubelet  Back-off pulling image "nginx:99.99-nonexistent"
  Warning  Failed   13s  kubelet  Error: ImagePullBackOff
The error is unambiguous: the image tag doesn’t exist on Docker Hub. Claude Code fixes the tag to a valid version (nginx:1.27-alpine), patches the deployment, and waits for the rollout to complete. The entire diagnosis takes under 10 seconds because the events contain the full error path.
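The fix amounts to a one-field change: kubectl set image works, as does a strategic-merge patch like this sketch (the container name nginx is an assumption about the generated deployment):

```yaml
# fix.yaml - swap the nonexistent tag for a real one
spec:
  template:
    spec:
      containers:
        - name: nginx              # assumed container name
          image: nginx:1.27-alpine
```

Applied with kubectl patch deployment broken-app -n demo --patch-file fix.yaml, followed by kubectl rollout status deployment/broken-app -n demo to confirm the new ReplicaSet comes up.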
CrashLoopBackOff: Container exits on startup
A container starts, prints an error, and exits. Kubernetes restarts it, it crashes again, and the back-off timer increases. The pod cycles between Error and CrashLoopBackOff.
NAME                         READY   STATUS   RESTARTS      AGE
crash-app-6d99cd5cf8-tkkfl   0/1     Error    4 (66s ago)   105s
Claude Code reads the pod logs to find the actual error:
kubectl logs crash-app-6d99cd5cf8-tkkfl -n demo
The log reveals the root cause:
Missing config: /etc/app/config.yml not found
The events confirm the crash loop pattern:
Events:
  Normal   Pulled   56s (x5 over 2m32s)  kubelet  Container image already present
  Normal   Created  56s (x5 over 2m32s)  kubelet  Container created
  Normal   Started  56s (x5 over 2m32s)  kubelet  Container started
  Warning  BackOff  56s (x4 over 2m30s)  kubelet  Back-off restarting failed container
Five creates, five starts, four back-offs. The container starts successfully but immediately exits because it can’t find its config file. Claude Code fixes this by creating a ConfigMap with the expected config and mounting it at /etc/app/config.yml. The deployment rolls out with the volume mount, and the crash loop stops.
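The shape of that fix, sketched below. Only the mount path /etc/app/config.yml comes from the error message; the ConfigMap name, container name, and config contents are illustrative:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: crash-app-config   # illustrative name
  namespace: demo
data:
  config.yml: |
    # whatever the app expects; contents are illustrative
    listen: 8080
---
# Pod-spec fragment mounting it where the app looks
spec:
  containers:
    - name: app              # assumed container name
      volumeMounts:
        - name: config
          mountPath: /etc/app
          readOnly: true
  volumes:
    - name: config
      configMap:
        name: crash-app-config
```

Mounting the ConfigMap at /etc/app makes each data key a file, so config.yml lands exactly at /etc/app/config.yml.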
The debugging pattern is always the same: kubectl get pods (see the status), kubectl describe pod (read events), kubectl logs (find the application error). Claude Code runs all three in sequence without being told.
Generate and Manage Helm Charts
Raw manifests work for one-off deployments. Helm charts work for repeatable deployments across environments. Claude Code converts raw manifests to Helm charts and manages the release lifecycle.
Convert the web deployment manifests into a Helm chart with
configurable replicas, image tag, and resource limits.
Install it, then upgrade to 4 replicas.
Claude Code creates the chart structure with Chart.yaml, values.yaml, and templated manifests under templates/. The values file exposes the parameters you’ll change between environments:
replicaCount: 2
image:
  repository: nginx
  tag: "1.27-alpine"
service:
  type: ClusterIP
  port: 80
resources:
  requests:
    cpu: 50m
    memory: 64Mi
  limits:
    cpu: 200m
    memory: 128Mi
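Inside templates/, those values are consumed with standard Helm templating. A representative deployment excerpt (a sketch, not the exact generated template; labels and selectors omitted for brevity):

```yaml
# templates/deployment.yaml (excerpt)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}
spec:
  replicas: {{ .Values.replicaCount }}
  template:
    spec:
      containers:
        - name: web
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          ports:
            - containerPort: {{ .Values.service.port }}
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
```

The toYaml | nindent idiom renders the whole resources map from values.yaml at the right indentation, so requests and limits stay overridable per environment with -f or --set.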
Installing the chart:
helm install my-web ./web-chart -n demo
Helm confirms the deployment:
NAME: my-web
LAST DEPLOYED: Sat Mar 28 10:39:03 2026
NAMESPACE: demo
STATUS: deployed
REVISION: 1
Upgrading to 4 replicas with a single flag:
helm upgrade my-web ./web-chart -n demo --set replicaCount=4
Four pods now running:
NAME                      READY   STATUS    RESTARTS   AGE
my-web-64ddbdb848-8bjj2   1/1     Running   0          8s
my-web-64ddbdb848-gwxmm   1/1     Running   0          15s
my-web-64ddbdb848-pzs67   1/1     Running   0          8s
my-web-64ddbdb848-qn8vs   1/1     Running   0          15s
Rollback a bad Helm upgrade
Push a broken image tag through Helm and the replacement pod fails with ErrImagePull while the old ones keep serving (Kubernetes' rolling update protects you). Claude Code detects the failure and rolls back:
helm upgrade my-web ./web-chart -n demo --set image.tag=99.99-broken
The new pod fails while old pods continue serving:
NAME                      READY   STATUS         RESTARTS   AGE
my-web-574474d994-8mpdd   0/1     ErrImagePull   0          11s
my-web-64ddbdb848-8bjj2   1/1     Running        0          29s
my-web-64ddbdb848-pzs67   1/1     Running        0          29s
Claude Code identifies the ErrImagePull and rolls back to the last working revision:
helm rollback my-web 2 -n demo
Rollback confirmed:
Rollback was a success! Happy Helming!
Running helm history my-web -n demo shows the full lifecycle:
REVISION   UPDATED                    STATUS       CHART           DESCRIPTION
1          Sat Mar 28 10:39:03 2026   superseded   web-app-0.1.0   Install complete
2          Sat Mar 28 10:39:11 2026   superseded   web-app-0.1.0   Upgrade complete
3          Sat Mar 28 10:39:29 2026   superseded   web-app-0.1.0   Upgrade complete
4          Sat Mar 28 10:39:40 2026   deployed     web-app-0.1.0   Rollback to 2
Four revisions: initial install, scale to 4 replicas, broken upgrade, rollback to revision 2. Every state is tracked and reversible. This is why Helm matters for production: kubectl apply has no built-in rollback.
Kubernetes Debugging Cheat Sheet
These status patterns, distilled from debugging dozens of deployments through Claude Code, cover what each state means and where to look first:
| Status | Meaning | First Command to Run |
|---|---|---|
| ImagePullBackOff | Image tag doesn’t exist or registry auth failed | kubectl describe pod (check Events) |
| CrashLoopBackOff | Container starts then crashes repeatedly | kubectl logs POD (find the app error) |
| Pending | No node can schedule the pod (resource limits, taints, affinity) | kubectl describe pod (check Conditions) |
| ContainerCreating | Image pulling or volume mounting (stuck = check events) | kubectl describe pod (check Events) |
| OOMKilled | Container exceeded memory limit | Increase resources.limits.memory |
| Evicted | Node ran out of disk or memory | kubectl describe node (check Conditions) |
| CreateContainerError | Missing ConfigMap, Secret, or volume mount | kubectl describe pod (check Events) |
Claude Code follows this same decision tree. When you say “my pod is failing,” it checks the status first, then runs the appropriate describe or logs command based on what it finds.
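That decision tree is simple enough to sketch as a tiny shell helper, purely illustrative, mapping a pod status to the first diagnostic step from the table above:

```shell
#!/bin/sh
# first_command: given a pod status, print the first diagnostic step.
# Mirrors the cheat-sheet table; a sketch, not a real tool.
first_command() {
  case "$1" in
    ImagePullBackOff|ContainerCreating|CreateContainerError|Pending)
      echo "kubectl describe pod" ;;
    CrashLoopBackOff)
      echo "kubectl logs" ;;
    Evicted)
      echo "kubectl describe node" ;;
    OOMKilled)
      echo "raise resources.limits.memory" ;;
    *)
      echo "kubectl get pods" ;;
  esac
}

first_command CrashLoopBackOff   # prints: kubectl logs
```

The point is not the script itself but the ordering: status first, then events, then logs, because each step narrows where the failure lives.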
What Claude Code Gets Right (and Wrong) with Kubernetes
| Task | Quality | Notes |
|---|---|---|
| Deployments + Services | Excellent | Correct selectors, labels, resource specs every time |
| Probes | Excellent | Adds both liveness and readiness with sensible timing |
| Resource limits | Excellent | Includes both requests and limits (rare in hand-written manifests) |
| ConfigMaps and Secrets | Good | Correct mounting, sometimes forgets readOnly: true |
| Helm chart generation | Good | Proper templating, sensible values.yaml defaults |
| Debugging (describe + logs) | Excellent | Follows the right diagnostic sequence every time |
| NetworkPolicies | Fair | Gets basic ingress rules right, complex egress needs review |
| RBAC | Fair | Generates overly permissive roles, always tighten manually |
| Ingress with TLS | Good | Knows cert-manager annotations, sometimes outdated API versions |
| StatefulSets | Good | Correct volumeClaimTemplates, sometimes misses headless service |
Practical Kubernetes Prompts
Specify the namespace. Claude Code uses default if you don’t specify. Always include “in namespace X” to keep demo resources isolated from production workloads.
Ask for both probes and resource limits upfront. “Deploy nginx with liveness probe, readiness probe, resource requests and limits” produces a complete manifest. Without this, Claude Code sometimes generates minimal manifests that pass validation but lack production essentials.
For debugging, paste the pod name and namespace. “Pod web-55dd865988-6m47w in namespace demo is in CrashLoopBackOff” gives Claude Code everything it needs to run kubectl describe and kubectl logs with the correct arguments. Just saying “my pod is crashing” forces it to list pods first.
Use Helm for anything you’ll deploy more than once. Ask Claude Code to “convert these manifests to a Helm chart” and it creates the chart structure with parameterized values. Helm gives you rollback, revision history, and environment-specific overrides that raw kubectl apply lacks.
Part of the Claude Code for DevOps Series
This Kubernetes spoke connects to the broader series. The Docker guide covers building the container images that Kubernetes deploys, the Terraform guide covers provisioning the clusters themselves, and the Kubernetes cluster deployment with Rancher guide covers standing up a full cluster from scratch.
- Set Up Claude Code for DevOps Engineers (pillar with safety rules and permissions)
- Manage Servers with Claude Code via SSH
- Build and Debug Docker Containers with Claude Code
- Deploy Infrastructure with Claude Code and Terraform
- Generate and Debug Ansible Playbooks with Claude Code
- Claude Code + GitHub Actions: automated PR review, infrastructure validation
The Claude Code cheat sheet covers every command and shortcut for quick reference.