GKE Autopilot removes the node management overhead entirely. Google provisions, scales, and patches the nodes. You deploy pods, specify resource requests, and pay per pod (not per node). The tradeoff is less control: no DaemonSets, no privileged containers, no SSH to nodes. For most production workloads, that tradeoff is worth it.
This guide provisions a production-grade Autopilot cluster with Terraform, including a custom VPC with secondary ranges, Cloud NAT for outbound traffic, private cluster networking, and Workload Identity Federation. We cover the resource mutation behavior that catches most teams off guard (especially with GitOps tools like ArgoCD), plus the real cost model with numbers from our test cluster.
Verified working: April 2026 on GKE Autopilot 1.35.1-gke.1396002, Regular channel, europe-west1, Terraform 1.12
Autopilot vs Standard: When to Choose What
The choice depends on how much node-level control you actually need. Here is a comparison based on real production usage:
| Criteria | Autopilot | Standard |
|---|---|---|
| Node management | Fully managed by Google | You manage node pools |
| Pricing model | Per-pod CPU/memory/GPU + $0.10/hr management | Per-node VM cost + $0.10/hr management |
| Free tier | $74.40/mo management fee credit | One zonal cluster free (no management fee) |
| DaemonSets | Not supported | Supported |
| Privileged containers | Not allowed | Allowed |
| Node SSH | Not available | Available |
| Resource requests | Required on every container | Optional (but recommended) |
| Min pod resources | 250m CPU, 512Mi memory | No minimum |
| Workload Identity | Always enabled, cannot disable | Optional |
| Scale to zero | Yes (nodes disappear when no pods) | Min 1 node per node pool |
| Best for | Stateless apps, APIs, batch jobs | Stateful workloads, custom node configs, GPU ML |
For a detailed breakdown of GKE pricing components, including egress, persistent disks, and load balancers, see our GCP costs guide.
Prerequisites
- A GCP project with billing enabled
- Terraform 1.5+ and the gcloud CLI installed
- container.googleapis.com and compute.googleapis.com APIs enabled
- IAM permissions: roles/container.admin, roles/compute.networkAdmin, roles/iam.serviceAccountAdmin
- Tested on: Terraform 1.12, google provider 6.x, GKE 1.35.1
Enable the required APIs:
gcloud services enable container.googleapis.com compute.googleapis.com
Create the VPC and Subnets
GKE Autopilot requires a VPC with secondary IP ranges for pods and services. This Terraform configuration creates the network foundation with Cloud NAT for outbound internet access (required for private clusters) and Private Google Access so pods can reach Google APIs without a public IP.
Create vpc.tf:
resource "google_compute_network" "main" {
name = "cfg-lab-vpc"
auto_create_subnetworks = false
project = var.project_id
}
resource "google_compute_subnetwork" "gke" {
name = "cfg-lab-gke-subnet"
ip_cidr_range = "10.10.0.0/20"
region = var.region
network = google_compute_network.main.id
project = var.project_id
private_ip_google_access = true
secondary_ip_range {
range_name = "pods"
ip_cidr_range = "10.20.0.0/14"
}
secondary_ip_range {
range_name = "services"
ip_cidr_range = "10.24.0.0/20"
}
}
resource "google_compute_router" "router" {
name = "cfg-lab-router"
region = var.region
network = google_compute_network.main.id
project = var.project_id
}
resource "google_compute_router_nat" "nat" {
name = "cfg-lab-nat"
router = google_compute_router.router.name
region = var.region
project = var.project_id
nat_ip_allocate_option = "AUTO_ONLY"
source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"
log_config {
enable = false
filter = "ERRORS_ONLY"
}
}
The secondary ranges are sized generously: /14 for pods gives roughly 260,000 pod IPs, and /20 for services gives 4,096. Autopilot handles IP allocation within these ranges automatically.
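To double-check that the subnet came up with the ranges you defined, you can describe it with gcloud. This is a read-only sanity check; substitute your own PROJECT_ID:

```shell
# Inspect the subnet: primary range, secondary ranges, and Private Google Access
gcloud compute networks subnets describe cfg-lab-gke-subnet \
  --region europe-west1 --project PROJECT_ID \
  --format="yaml(ipCidrRange, secondaryIpRanges, privateIpGoogleAccess)"
```

The output should show 10.10.0.0/20 as the primary range, the pods and services secondary ranges, and privateIpGoogleAccess: true.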
Create the Autopilot Cluster
Create gke.tf with the cluster definition. The key setting is enable_autopilot = true:
resource "google_container_cluster" "autopilot" {
name = "cfg-lab-gke"
location = var.region
project = var.project_id
enable_autopilot = true
network = google_compute_network.main.id
subnetwork = google_compute_subnetwork.gke.id
ip_allocation_policy {
cluster_secondary_range_name = "pods"
services_secondary_range_name = "services"
}
private_cluster_config {
enable_private_nodes = true
enable_private_endpoint = false
master_ipv4_cidr_block = "172.16.0.0/28"
}
release_channel {
channel = "REGULAR"
}
master_authorized_networks_config {
cidr_blocks {
cidr_block = "10.0.0.0/8"
display_name = "Internal"
}
cidr_blocks {
cidr_block = "0.0.0.0/0"
display_name = "Public access"
}
}
deletion_protection = false
}
A few things to note about this configuration. Setting enable_private_endpoint = false keeps the API server accessible from the internet (filtered by master_authorized_networks_config). In a production setup, you would restrict cidr_blocks to your VPN or office IP ranges. The deletion_protection = false flag allows terraform destroy to work cleanly during testing.
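For reference, a production-hardened version of the authorized networks block might look like the sketch below. The 203.0.113.0/24 range is a placeholder (TEST-NET-3); replace it with your actual VPN or office egress range:

```hcl
# Sketch: drop the open 0.0.0.0/0 block and allow only known ranges
master_authorized_networks_config {
  cidr_blocks {
    cidr_block   = "203.0.113.0/24" # hypothetical office/VPN range
    display_name = "Office"
  }
  cidr_blocks {
    cidr_block   = "10.0.0.0/8"
    display_name = "Internal"
  }
}
```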
Add the variables and provider configuration. Create variables.tf:
variable "project_id" {
description = "GCP project ID"
type = string
}
variable "region" {
description = "GCP region"
type = string
default = "europe-west1"
}
Create providers.tf:
terraform {
required_version = ">= 1.5"
required_providers {
google = {
source = "hashicorp/google"
version = "~> 6.0"
}
}
}
provider "google" {
project = var.project_id
region = var.region
}
Initialize and apply:
terraform init
terraform plan -var="project_id=PROJECT_ID"
terraform apply -var="project_id=PROJECT_ID" -auto-approve
The cluster takes about 8 to 10 minutes to provision. Once it is ready, configure kubectl:
gcloud container clusters get-credentials cfg-lab-gke --region europe-west1 --project PROJECT_ID
Verify the cluster is running:
kubectl cluster-info
The output confirms the cluster endpoint and CoreDNS are accessible:
Kubernetes control plane is running at https://172.16.0.2
GLBCDefaultBackend is running at https://172.16.0.2/api/v1/namespaces/kube-system/services/default-http-backend:http/proxy
KubeDNS is running at https://172.16.0.2/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
Deploy an Application with Explicit Resources
Autopilot requires every container to declare CPU and memory requests. If you omit them, the Autopilot admission webhook injects defaults (250m CPU, 512Mi memory), which is usually not what you want. Always set them explicitly.
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-demo
namespace: default
spec:
replicas: 2
selector:
matchLabels:
app: nginx-demo
template:
metadata:
labels:
app: nginx-demo
spec:
containers:
- name: nginx
image: nginx:1.27
ports:
- containerPort: 80
resources:
requests:
cpu: 250m
memory: 512Mi
limits:
cpu: 500m
memory: 1Gi
---
apiVersion: v1
kind: Service
metadata:
name: nginx-demo
namespace: default
spec:
type: ClusterIP
selector:
app: nginx-demo
ports:
- port: 80
targetPort: 80
EOF
Autopilot provisions a node when pods are scheduled. On our test cluster, the node appeared in about 90 seconds:
kubectl get nodes
The node was provisioned on demand:
NAME STATUS ROLES AGE VERSION
gk3-cfg-lab-gke-pool-1-1d05a4de-pbmt Ready <none> 92s v1.35.1-gke.1396002
Confirm the pods are running:
kubectl get pods -l app=nginx-demo
Both replicas should show Running:
NAME READY STATUS RESTARTS AGE
nginx-demo-7d4f8b6c9f-k8x2p 1/1 Running 0 2m
nginx-demo-7d4f8b6c9f-m3n7q 1/1 Running 0 2m
Horizontal Pod Autoscaler
HPA works on Autopilot the same way as Standard. When HPA scales up and new pods cannot fit on existing nodes, Autopilot provisions additional nodes automatically.
kubectl autoscale deployment nginx-demo --cpu-percent=50 --min=2 --max=5
Check the HPA status:
kubectl get hpa
The initial CPU metric shows as unknown until the metrics server collects data (usually within 60 seconds):
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
nginx-demo Deployment/nginx-demo cpu: <unknown>/50% 2 5 2 15s
After a minute, the actual CPU utilization appears. The key difference from Standard mode is that Autopilot handles the node scaling transparently. You never create or configure a cluster autoscaler because it is built into the Autopilot control plane.
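To see the HPA react, you can generate load against the service. This is a rough sketch using a busybox pod (the pod name and image are arbitrary; Autopilot will inject default resource requests into it):

```shell
# Hypothetical load generator hammering the nginx-demo service in a loop
kubectl run load-generator --image=busybox:1.36 --restart=Never -- \
  /bin/sh -c "while true; do wget -q -O- http://nginx-demo.default.svc.cluster.local; done"

# Watch replicas climb as CPU crosses the 50% target
kubectl get hpa nginx-demo --watch
```

Delete the load generator with kubectl delete pod load-generator when done; the HPA scales the deployment back down after the stabilization window.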
Workload Identity Federation
On Autopilot, Workload Identity is always enabled and cannot be turned off. The workload identity pool for our test cluster is PROJECT_ID.svc.id.goog. This means every pod automatically gets a Kubernetes service account identity that can be mapped to a GCP IAM service account.
To grant a pod access to GCP resources (Cloud Storage, BigQuery, Pub/Sub), bind a Kubernetes service account to a GCP service account:
gcloud iam service-accounts create app-sa --project=PROJECT_ID
gcloud iam service-accounts add-iam-policy-binding app-sa@PROJECT_ID.iam.gserviceaccount.com \
--role roles/iam.workloadIdentityUser \
--member "serviceAccount:PROJECT_ID.svc.id.goog[default/app-ksa]" \
--project=PROJECT_ID
Then annotate the Kubernetes service account:
kubectl create serviceaccount app-ksa
kubectl annotate serviceaccount app-ksa \
iam.gke.io/gcp-service-account=app-sa@PROJECT_ID.iam.gserviceaccount.com
Reference the service account in your Pod spec with serviceAccountName: app-ksa. For a deeper walkthrough of Workload Identity, including testing and debugging, see our GKE Workload Identity Federation guide.
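A minimal pod spec wiring this together might look like the following sketch. The pod name and image are illustrative; the resource requests are required because this is Autopilot:

```yaml
# Sketch: a pod that authenticates to GCP as app-sa via Workload Identity
apiVersion: v1
kind: Pod
metadata:
  name: wi-test
  namespace: default
spec:
  serviceAccountName: app-ksa
  containers:
  - name: app
    image: google/cloud-sdk:slim
    command: ["sleep", "3600"]
    resources:
      requests:
        cpu: 250m
        memory: 512Mi
```

Inside the pod, gcloud auth list should show app-sa@PROJECT_ID.iam.gserviceaccount.com as the active identity.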
The Autopilot Resource Mutation Trap
This catches almost every team that deploys to Autopilot with GitOps. When you submit a pod spec, Autopilot’s admission webhook mutates it. It adjusts resource requests to fit into its bin-packing model, adds tolerations, and modifies the security context. The running pod spec no longer matches what you defined in your YAML.
For manual kubectl apply, this is invisible. But for ArgoCD and Flux, the controller sees the running spec differs from the desired spec in Git and reports the application as “OutOfSync” perpetually. It tries to reconcile, Autopilot mutates again, and the cycle repeats.
The fix is to tell your GitOps tool to ignore the mutated fields. In ArgoCD, add this to the Application spec:
spec:
ignoreDifferences:
- group: apps
kind: Deployment
jqPathExpressions:
- .spec.template.spec.containers[].resources
- .spec.template.spec.tolerations
- .spec.template.spec.securityContext
Without this, ArgoCD will constantly show drift on every Autopilot deployment. We documented this pattern in detail in our ArgoCD GitOps guide.
Private Cluster Networking
Our Terraform config creates a private cluster where nodes have no public IPs. Pods reach the internet through Cloud NAT, and they access Google APIs through Private Google Access (enabled on the subnet). This means container images from gcr.io and Artifact Registry are pulled over Google’s internal network, not the public internet.
The master_ipv4_cidr_block = "172.16.0.0/28" reserves a /28 for the control plane’s VPC peering connection. This range must not overlap with any subnet or secondary range in your VPC. The control plane endpoint remains publicly accessible (filtered by master_authorized_networks_config) because enable_private_endpoint = false.
For a fully private setup (no public API endpoint), set enable_private_endpoint = true and access the cluster through a bastion host, Cloud Interconnect, or Cloud VPN. Keep in mind that terraform apply itself needs API access, so you would need to run it from within the VPC or through a VPN tunnel.
Autopilot Cost Model
Autopilot pricing has three components:
- Management fee: $0.10/hr ($74.40/mo), but Google gives a $74.40/mo credit per billing account. So the first Autopilot cluster’s management fee is effectively free
- Pod compute: you pay for the CPU, memory, and ephemeral storage your pods request (not what they use). In europe-west1, Autopilot vCPU is approximately $0.0445/hr and memory is $0.0049/hr per GB
- Everything else: persistent disks, load balancers, egress, and Cloud NAT are billed separately at standard GCP rates
A practical example from our test deployment: 2 nginx pods requesting 250m CPU and 512Mi memory each add up to 0.5 vCPU and 1 GiB, which at the rates above comes to roughly $0.027/hr, or about $20/mo in compute, plus the (credited) management fee. Compare this to Standard mode, where a single e2-medium node costs about $25/mo but sits idle most of the time.
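The arithmetic is easy to reproduce. This one-liner uses the europe-west1 rates quoted above (ephemeral storage and the management fee are ignored):

```shell
# Rough Autopilot compute cost for 2 pods x (250m CPU, 512Mi memory)
awk 'BEGIN {
  cpu_hr = 0.0445; mem_hr = 0.0049     # $/vCPU-hr and $/GiB-hr (rates from this guide)
  vcpu   = 2 * 0.25; gib = 2 * 0.5     # total requested resources
  hourly = vcpu * cpu_hr + gib * mem_hr
  printf "hourly=$%.3f monthly=$%.2f\n", hourly, hourly * 730
}'
# prints roughly $0.027/hr, about $20/mo
```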
The real savings come from scale-to-zero. When no pods are scheduled, Autopilot removes all nodes and you pay only the management fee (which is credited). Standard mode always runs at least one node per pool. For bursty or dev/staging workloads, Autopilot can be significantly cheaper.
Troubleshooting
Error: “Autopilot can not schedule Pod: resource request too large”
Autopilot has per-pod resource limits. As of April 2026, the maximum is 222 vCPUs and 851 GB memory for regular compute classes. If you request more than the allowed maximum, the pod stays in Pending. Check the current limits in the GKE documentation. For GPU workloads, the limits are different and depend on the accelerator type.
Pods stuck in Pending with “no nodes available to schedule”
Autopilot provisions nodes on demand, so brief Pending states (30 to 120 seconds) are normal. If pods stay Pending beyond 5 minutes, check for quota exhaustion:
kubectl describe pod POD_NAME | grep -A 5 Events
Common causes: regional CPU quota reached, the requested machine shape does not exist in the selected zones, or a pod anti-affinity rule that cannot be satisfied. Check your project quotas in the GCP console under IAM & Admin > Quotas.
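You can also inspect regional quotas from the CLI rather than the console. This is a sketch; the --flatten/--format combination assumes the standard gcloud projection syntax:

```shell
# Show CPU-related quota usage for the region
gcloud compute regions describe europe-west1 --project PROJECT_ID \
  --flatten="quotas" \
  --format="table(quotas.metric,quotas.usage,quotas.limit)" \
  | grep -i cpu
```

If usage equals limit for CPUS, Autopilot cannot provision new nodes and pods stay Pending until you request a quota increase.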
Error: “scheduling.k8s.io/default-scheduler: 0/1 nodes are available: Insufficient cpu”
On Autopilot, this usually means your pod’s resource requests exceed what fits on the available node shape. Autopilot bins pods into predefined machine types. If a pod requests 7 vCPUs, Autopilot must provision an 8-vCPU machine, and the remaining 1 vCPU might be consumed by system pods. Adjust your resource requests to align with common machine shapes (1, 2, 4, 8, 16 vCPU).
Terraform destroy fails with “cluster has deletion protection enabled”
GKE clusters created through the console have deletion protection enabled by default. Our Terraform config sets deletion_protection = false, but if someone toggled it in the console, Terraform cannot delete the cluster. Disable it first:
gcloud container clusters update cfg-lab-gke --region europe-west1 --no-deletion-protection
Then run terraform destroy again.
Cleanup
Destroy all resources with Terraform to avoid ongoing charges. The Cloud NAT alone costs about $0.045/hr ($32/mo) if left running:
terraform destroy -var="project_id=PROJECT_ID" -auto-approve
Verify that the cluster, VPC, and NAT gateway are gone:
gcloud container clusters list --region europe-west1 --project PROJECT_ID
gcloud compute networks list --project PROJECT_ID --filter="name=cfg-lab-vpc"
Both commands should return empty results. If the VPC deletion fails because of firewall rules auto-created by GKE, delete those first with gcloud compute firewall-rules list --filter="network=cfg-lab-vpc" and remove them. Terraform handles this in most cases, but GKE sometimes leaves behind firewall rules that were created outside the Terraform state. For teams running multiple GKE clusters, tracking these orphaned resources is critical to controlling GCP costs.
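The listing-and-deleting step can be combined into one pass. This is a sketch; review the listed rule names before running the delete, since the filter matches everything on the VPC:

```shell
# List GKE-created firewall rules still attached to the VPC
gcloud compute firewall-rules list --project PROJECT_ID \
  --filter="network=cfg-lab-vpc" --format="value(name)"

# After reviewing, delete them all (-r skips the delete if the list is empty)
gcloud compute firewall-rules list --project PROJECT_ID \
  --filter="network=cfg-lab-vpc" --format="value(name)" \
  | xargs -r gcloud compute firewall-rules delete --project PROJECT_ID --quiet
```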
Frequently Asked Questions
Can I run DaemonSets on GKE Autopilot?
No. Autopilot does not support DaemonSets because you do not control the nodes. For logging and monitoring agents, use Google’s managed services (Cloud Logging, Cloud Monitoring) which are pre-installed. For custom agents, deploy them as sidecar containers in your application pods.
What happens to my pods if Autopilot cannot provision a node?
Pods stay in Pending state until capacity becomes available. Autopilot retries provisioning based on resource quotas and zone availability. If quotas are exhausted, the pods remain Pending indefinitely until quota is increased. There is no automatic fallback to other regions.
Is GKE Autopilot cheaper than Standard for production workloads?
It depends on utilization. Autopilot charges per pod resource request, so you pay for what you request. Standard charges per node, so underutilized nodes waste money. For workloads with high, steady utilization (above 70% of node capacity), Standard with committed use discounts is usually cheaper. For bursty, dev/staging, or variable workloads, Autopilot wins because of scale-to-zero and no idle node costs.
Can I use Spot (preemptible) pods on Autopilot?
Yes. Add nodeSelector: {"cloud.google.com/gke-spot": "true"} to your pod spec and Autopilot schedules the pod on Spot capacity (GKE adds the matching toleration automatically). Spot pods are 60% to 91% cheaper but can be evicted with 30 seconds notice, which makes them suitable for batch jobs, CI/CD runners, and other fault-tolerant workloads.
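In a Deployment, the selector goes on the pod template. A minimal sketch:

```yaml
# Sketch: request Spot capacity for a Deployment's pods on Autopilot
spec:
  template:
    spec:
      nodeSelector:
        cloud.google.com/gke-spot: "true"
```

Everything else (resource requests, HPA behavior) works the same as on regular Autopilot capacity; only the eviction semantics differ.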
How does Autopilot handle Terraform drift from resource mutation?
Autopilot’s admission webhook mutates pod specs by adjusting resource requests, adding tolerations, and setting security contexts. Terraform does not manage pod specs directly (Kubernetes provider handles that), so this is not a Terraform drift issue. It affects GitOps tools like ArgoCD and Flux that compare running state to Git. The fix is configuring your GitOps tool to ignore the mutated fields in pod specs.
For deploying applications to GKE with a serverless model, see our guide on deploying Google Cloud Run with Terraform, which covers an even simpler compute model for stateless HTTP services.