GKE Autopilot removes the node management overhead entirely. Google provisions, scales, and patches the nodes. You deploy pods, specify resource requests, and pay per pod (not per node). The tradeoff is less control: no DaemonSets, no privileged containers, no SSH to nodes. For most production workloads, that tradeoff is worth it.
This guide provisions a production-grade Autopilot cluster with Terraform, including a custom VPC with secondary ranges, Cloud NAT for outbound traffic, private cluster networking, and Workload Identity Federation. We cover the resource mutation behavior that catches most teams off guard (especially with GitOps tools like ArgoCD), plus the real cost model with numbers from our test cluster.
Verified working: April 2026 on GKE Autopilot 1.35.1-gke.1396002, Regular channel, europe-west1, Terraform 1.12
Autopilot vs Standard: When to Choose What
The choice depends on how much node-level control you actually need. Here is a comparison based on real production usage:
| Criteria | Autopilot | Standard |
|---|---|---|
| Node management | Fully managed by Google | You manage node pools |
| Pricing model | Per-pod CPU/memory/GPU + $0.10/hr management | Per-node VM cost + $0.10/hr management |
| Free tier | $74.40/mo management fee credit | One zonal cluster free (no management fee) |
| DaemonSets | Not supported | Supported |
| Privileged containers | Not allowed | Allowed |
| Node SSH | Not available | Available |
| Resource requests | Required on every container | Optional (but recommended) |
| Min pod resources | 250m CPU, 512Mi memory | No minimum |
| Workload Identity | Always enabled, cannot disable | Optional |
| Scale to zero | Yes (nodes disappear when no pods) | Min 1 node per node pool |
| Best for | Stateless apps, APIs, batch jobs | Stateful workloads, custom node configs, GPU ML |
For a detailed breakdown of GKE pricing components, including egress, persistent disks, and load balancers, see our GCP costs guide.
Prerequisites
- A GCP project with billing enabled
- Terraform 1.5+ and the gcloud CLI installed
- container.googleapis.com and compute.googleapis.com APIs enabled
- IAM permissions: roles/container.admin, roles/compute.networkAdmin, roles/iam.serviceAccountAdmin
- Tested on: Terraform 1.12, google provider 6.x, GKE 1.35.1
Enable the required APIs:
gcloud services enable container.googleapis.com compute.googleapis.com
Create the VPC and Subnets
GKE Autopilot requires a VPC with secondary IP ranges for pods and services. This Terraform configuration creates the network foundation with Cloud NAT for outbound internet access (required for private clusters) and Private Google Access so pods can reach Google APIs without a public IP.
Create vpc.tf:
resource "google_compute_network" "main" {
name = "cfg-lab-vpc"
auto_create_subnetworks = false
project = var.project_id
}
resource "google_compute_subnetwork" "gke" {
name = "cfg-lab-gke-subnet"
ip_cidr_range = "10.10.0.0/20"
region = var.region
network = google_compute_network.main.id
project = var.project_id
private_ip_google_access = true
secondary_ip_range {
range_name = "pods"
ip_cidr_range = "10.20.0.0/14"
}
secondary_ip_range {
range_name = "services"
ip_cidr_range = "10.24.0.0/20"
}
}
resource "google_compute_router" "router" {
name = "cfg-lab-router"
region = var.region
network = google_compute_network.main.id
project = var.project_id
}
resource "google_compute_router_nat" "nat" {
name = "cfg-lab-nat"
router = google_compute_router.router.name
region = var.region
project = var.project_id
nat_ip_allocate_option = "AUTO_ONLY"
source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"
log_config {
enable = false
filter = "ERRORS_ONLY"
}
}
The secondary ranges are sized generously: /14 for pods gives roughly 260,000 pod IPs, and /20 for services gives 4,096. Autopilot handles IP allocation within these ranges automatically.
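To double-check that the subnet came up with the ranges you defined, you can describe it with gcloud. This is a read-only sanity check; substitute your own PROJECT_ID:

```shell
# Inspect the subnet: primary range, secondary ranges, and Private Google Access
gcloud compute networks subnets describe cfg-lab-gke-subnet \
  --region europe-west1 --project PROJECT_ID \
  --format="yaml(ipCidrRange, secondaryIpRanges, privateIpGoogleAccess)"
```

The output should show 10.10.0.0/20 as the primary range, the pods and services secondary ranges, and privateIpGoogleAccess: true.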
Create the Autopilot Cluster
Create gke.tf with the cluster definition. The key setting is enable_autopilot = true:
resource "google_container_cluster" "autopilot" {
name = "cfg-lab-gke"
location = var.region
project = var.project_id
enable_autopilot = true
network = google_compute_network.main.id
subnetwork = google_compute_subnetwork.gke.id
ip_allocation_policy {
cluster_secondary_range_name = "pods"
services_secondary_range_name = "services"
}
private_cluster_config {
enable_private_nodes = true
enable_private_endpoint = false
master_ipv4_cidr_block = "172.16.0.0/28"
}
release_channel {
channel = "REGULAR"
}
master_authorized_networks_config {
cidr_blocks {
cidr_block = "10.0.0.0/8"
display_name = "Internal"
}
cidr_blocks {
cidr_block = "0.0.0.0/0"
display_name = "Public access"
}
}
deletion_protection = false
}
A few things to note about this configuration. Setting enable_private_endpoint = false keeps the API server accessible from the internet (filtered by master_authorized_networks_config). In a production setup, you would restrict cidr_blocks to your VPN or office IP ranges. The deletion_protection = false flag allows terraform destroy to work cleanly during testing.
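For reference, a production-hardened version of the authorized networks block might look like the sketch below. The 203.0.113.0/24 range is a placeholder (TEST-NET-3); replace it with your actual VPN or office egress range:

```hcl
# Sketch: drop the open 0.0.0.0/0 block and allow only known ranges
master_authorized_networks_config {
  cidr_blocks {
    cidr_block   = "203.0.113.0/24" # hypothetical office/VPN range
    display_name = "Office"
  }
  cidr_blocks {
    cidr_block   = "10.0.0.0/8"
    display_name = "Internal"
  }
}
```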
Add the variables and provider configuration. Create variables.tf:
variable "project_id" {
description = "GCP project ID"
type = string
}
variable "region" {
description = "GCP region"
type = string
default = "europe-west1"
}
Create providers.tf:
terraform {
required_version = ">= 1.5"
required_providers {
google = {
source = "hashicorp/google"
version = "~> 6.0"
}
}
}
provider "google" {
project = var.project_id
region = var.region
}
Initialize and apply:
terraform init
terraform plan -var="project_id=PROJECT_ID"
terraform apply -var="project_id=PROJECT_ID" -auto-approve
The cluster takes about 8 to 10 minutes to provision. Once it is ready, configure kubectl:
gcloud container clusters get-credentials cfg-lab-gke --region europe-west1 --project PROJECT_ID
Verify the cluster is running:
kubectl cluster-info
The output confirms the cluster endpoint and CoreDNS are accessible:
Kubernetes control plane is running at https://172.16.0.2
GLBCDefaultBackend is running at https://172.16.0.2/api/v1/namespaces/kube-system/services/default-http-backend:http/proxy
KubeDNS is running at https://172.16.0.2/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
Deploy an Application with Explicit Resources
Autopilot requires every container to declare CPU and memory requests. If you omit them, the Autopilot admission webhook injects defaults (250m CPU, 512Mi memory), which is usually not what you want. Always set them explicitly.
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-demo
namespace: default
spec:
replicas: 2
selector:
matchLabels:
app: nginx-demo
template:
metadata:
labels:
app: nginx-demo
spec:
containers:
- name: nginx
image: nginx:1.27
ports:
- containerPort: 80
resources:
requests:
cpu: 250m
memory: 512Mi
limits:
cpu: 500m
memory: 1Gi
---
apiVersion: v1
kind: Service
metadata:
name: nginx-demo
namespace: default
spec:
type: ClusterIP
selector:
app: nginx-demo
ports:
- port: 80
targetPort: 80
EOF
Autopilot provisions a node when pods are scheduled. On our test cluster, the node appeared in about 90 seconds:
kubectl get nodes
The node was provisioned on demand:
NAME STATUS ROLES AGE VERSION
gk3-cfg-lab-gke-pool-1-1d05a4de-pbmt Ready <none> 92s v1.35.1-gke.1396002
Confirm the pods are running:
kubectl get pods -l app=nginx-demo
Both replicas should show Running:
NAME READY STATUS RESTARTS AGE
nginx-demo-7d4f8b6c9f-k8x2p 1/1 Running 0 2m
nginx-demo-7d4f8b6c9f-m3n7q 1/1 Running 0 2m
Horizontal Pod Autoscaler
HPA works on Autopilot the same way as Standard. When HPA scales up and new pods cannot fit on existing nodes, Autopilot provisions additional nodes automatically.
kubectl autoscale deployment nginx-demo --cpu-percent=50 --min=2 --max=5
Check the HPA status:
kubectl get hpa
The initial CPU metric shows as unknown until the metrics server collects data (usually within 60 seconds):
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
nginx-demo Deployment/nginx-demo cpu: <unknown>/50% 2 5 2 15s
After a minute, the actual CPU utilization appears. The key difference from Standard mode is that Autopilot handles the node scaling transparently. You never create or configure a cluster autoscaler because it is built into the Autopilot control plane.
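To see the HPA react, you can generate load against the service. This is a rough sketch using a busybox pod (the pod name and image are arbitrary; Autopilot will inject default resource requests into it):

```shell
# Hypothetical load generator hammering the nginx-demo service in a loop
kubectl run load-generator --image=busybox:1.36 --restart=Never -- \
  /bin/sh -c "while true; do wget -q -O- http://nginx-demo.default.svc.cluster.local; done"

# Watch replicas climb as CPU crosses the 50% target
kubectl get hpa nginx-demo --watch
```

Delete the load generator with kubectl delete pod load-generator when done; the HPA scales the deployment back down after the stabilization window.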
Workload Identity Federation
On Autopilot, Workload Identity is always enabled and cannot be turned off. The workload identity pool for our test cluster is PROJECT_ID.svc.id.goog. This means every pod automatically gets a Kubernetes service account identity that can be mapped to a GCP IAM service account.
To grant a pod access to GCP resources (Cloud Storage, BigQuery, Pub/Sub), bind a Kubernetes service account to a GCP service account:
gcloud iam service-accounts create app-sa --project=PROJECT_ID
gcloud iam service-accounts add-iam-policy-binding app-sa@PROJECT_ID.iam.gserviceaccount.com \
--role roles/iam.workloadIdentityUser \
--member "serviceAccount:PROJECT_ID.svc.id.goog[default/app-ksa]" \
--project=PROJECT_ID
Then annotate the Kubernetes service account:
kubectl create serviceaccount app-ksa
kubectl annotate serviceaccount app-ksa \
iam.gke.io/gcp-service-account=app-sa@PROJECT_ID.iam.gserviceaccount.com
Reference the service account in your Pod spec with serviceAccountName: app-ksa. For a deeper walkthrough of Workload Identity, including testing and debugging, see our GKE Workload Identity Federation guide.
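A minimal pod spec wiring this together might look like the following sketch. The pod name and image are illustrative; the resource requests are required because this is Autopilot:

```yaml
# Sketch: a pod that authenticates to GCP as app-sa via Workload Identity
apiVersion: v1
kind: Pod
metadata:
  name: wi-test
  namespace: default
spec:
  serviceAccountName: app-ksa
  containers:
  - name: app
    image: google/cloud-sdk:slim
    command: ["sleep", "3600"]
    resources:
      requests:
        cpu: 250m
        memory: 512Mi
```

Inside the pod, gcloud auth list should show app-sa@PROJECT_ID.iam.gserviceaccount.com as the active identity.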
The Autopilot Resource Mutation Trap
This catches almost every team that deploys to Autopilot with GitOps. When you submit a pod spec, Autopilot’s admission webhook mutates it. It adjusts resource requests to fit into its bin-packing model, adds tolerations, and modifies the security context. The running pod spec no longer matches what you defined in your YAML.
For manual kubectl apply, this is invisible. But for ArgoCD and Flux, the controller sees the running spec differs from the desired spec in Git and reports the application as “OutOfSync” perpetually. It tries to reconcile, Autopilot mutates again, and the cycle repeats.
The fix is to tell your GitOps tool to ignore the mutated fields. In ArgoCD, add this to the Application spec:
spec:
ignoreDifferences:
- group: apps
kind: Deployment
jqPathExpressions:
- .spec.template.spec.containers[].resources
- .spec.template.spec.tolerations
- .spec.template.spec.securityContext
Without this, ArgoCD will constantly show drift on every Autopilot deployment. We documented this pattern in detail in our ArgoCD GitOps guide.
Private Cluster Networking
Our Terraform config creates a private cluster where nodes have no public IPs. Pods reach the internet through Cloud NAT, and they access Google APIs through Private Google Access (enabled on the subnet). This means container images from gcr.io and Artifact Registry are pulled over Google’s internal network, not the public internet.
The master_ipv4_cidr_block = "172.16.0.0/28" reserves a /28 for the control plane’s VPC peering connection. This range must not overlap with any subnet or secondary range in your VPC. The control plane endpoint remains publicly accessible (filtered by master_authorized_networks_config) because enable_private_endpoint = false.
For a fully private setup (no public API endpoint), set enable_private_endpoint = true and access the cluster through a bastion host, Cloud Interconnect, or Cloud VPN. Keep in mind that terraform apply itself needs API access, so you would need to run it from within the VPC or through a VPN tunnel.
Autopilot Cost Model
Autopilot pricing has three components:
- Management fee: $0.10/hr ($74.40/mo), but Google gives a $74.40/mo credit per billing account. So the first Autopilot cluster’s management fee is effectively free
- Pod compute: you pay for the CPU, memory, and ephemeral storage your pods request (not what they use). In europe-west1, Autopilot vCPU is approximately $0.0445/hr and memory is $0.0049/hr per GB
- Everything else: persistent disks, load balancers, egress, and Cloud NAT are billed separately at standard GCP rates
A practical example from our test deployment: 2 nginx pods requesting 250m CPU and 512Mi memory each add up to 0.5 vCPU and 1 GiB, which at the rates above comes to roughly $0.027/hr, or about $20/mo in compute, plus the (credited) management fee. Compare this to Standard mode, where a single e2-medium node costs about $25/mo but sits idle most of the time.
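The arithmetic is easy to reproduce. This one-liner uses the europe-west1 rates quoted above (ephemeral storage and the management fee are ignored):

```shell
# Rough Autopilot compute cost for 2 pods x (250m CPU, 512Mi memory)
awk 'BEGIN {
  cpu_hr = 0.0445; mem_hr = 0.0049     # $/vCPU-hr and $/GiB-hr (rates from this guide)
  vcpu   = 2 * 0.25; gib = 2 * 0.5     # total requested resources
  hourly = vcpu * cpu_hr + gib * mem_hr
  printf "hourly=$%.3f monthly=$%.2f\n", hourly, hourly * 730
}'
# prints roughly $0.027/hr, about $20/mo
```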
The real savings come from scale-to-zero. When no pods are scheduled, Autopilot removes all nodes and you pay only the management fee (which is credited). Standard mode always runs at least one node per pool. For bursty or dev/staging workloads, Autopilot can be significantly cheaper.
Troubleshooting
Error: “Autopilot can not schedule Pod: resource request too large”
Autopilot has per-pod resource limits. As of April 2026, the maximum is 222 vCPUs and 851 GB memory for regular compute classes. If you request more than the allowed maximum, the pod stays in Pending. Check the current limits in the GKE documentation. For GPU workloads, the limits are different and depend on the accelerator type.
Pods stuck in Pending with “no nodes available to schedule”
Autopilot provisions nodes on demand, so brief Pending states (30 to 120 seconds) are normal. If pods stay Pending beyond 5 minutes, check for quota exhaustion:
kubectl describe pod POD_NAME | grep -A 5 Events
Common causes: regional CPU quota reached, the requested machine shape does not exist in the selected zones, or a pod anti-affinity rule that cannot be satisfied. Check your project quotas in the GCP console under IAM & Admin > Quotas.
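You can also inspect regional quotas from the CLI rather than the console. This is a sketch; the --flatten/--format combination assumes the standard gcloud projection syntax:

```shell
# Show CPU-related quota usage for the region
gcloud compute regions describe europe-west1 --project PROJECT_ID \
  --flatten="quotas" \
  --format="table(quotas.metric,quotas.usage,quotas.limit)" \
  | grep -i cpu
```

If usage equals limit for CPUS, Autopilot cannot provision new nodes and pods stay Pending until you request a quota increase.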
Error: “scheduling.k8s.io/default-scheduler: 0/1 nodes are available: Insufficient cpu”
On Autopilot, this usually means your pod’s resource requests exceed what fits on the available node shape. Autopilot bins pods into predefined machine types. If a pod requests 7 vCPUs, Autopilot must provision an 8-vCPU machine, and the remaining 1 vCPU might be consumed by system pods. Adjust your resource requests to align with common machine shapes (1, 2, 4, 8, 16 vCPU).
Terraform destroy fails with “cluster has deletion protection enabled”
GKE clusters created through the console have deletion protection enabled by default. Our Terraform config sets deletion_protection = false, but if someone toggled it in the console, Terraform cannot delete the cluster. Disable it first:
gcloud container clusters update cfg-lab-gke --region europe-west1 --no-deletion-protection
Then run terraform destroy again.
Cleanup
Destroy all resources with Terraform to avoid ongoing charges. The Cloud NAT alone costs about $0.045/hr ($32/mo) if left running:
terraform destroy -var="project_id=PROJECT_ID" -auto-approve
Verify that the cluster, VPC, and NAT gateway are gone:
gcloud container clusters list --region europe-west1 --project PROJECT_ID
gcloud compute networks list --project PROJECT_ID --filter="name=cfg-lab-vpc"
Both commands should return empty results. If the VPC deletion fails because of firewall rules auto-created by GKE, delete those first with gcloud compute firewall-rules list --filter="network=cfg-lab-vpc" and remove them. Terraform handles this in most cases, but GKE sometimes leaves behind firewall rules that were created outside the Terraform state. For teams running multiple GKE clusters, tracking these orphaned resources is critical to controlling GCP costs.
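The listing-and-deleting step can be combined into one pass. This is a sketch; review the listed rule names before running the delete, since the filter matches everything on the VPC:

```shell
# List GKE-created firewall rules still attached to the VPC
gcloud compute firewall-rules list --project PROJECT_ID \
  --filter="network=cfg-lab-vpc" --format="value(name)"

# After reviewing, delete them all (-r skips the delete if the list is empty)
gcloud compute firewall-rules list --project PROJECT_ID \
  --filter="network=cfg-lab-vpc" --format="value(name)" \
  | xargs -r gcloud compute firewall-rules delete --project PROJECT_ID --quiet
```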
Frequently Asked Questions
Can I run DaemonSets on GKE Autopilot?
No. Autopilot does not support DaemonSets because you do not control the nodes. For logging and monitoring agents, use Google’s managed services (Cloud Logging, Cloud Monitoring) which are pre-installed. For custom agents, deploy them as sidecar containers in your application pods.
What happens to my pods if Autopilot cannot provision a node?
Pods stay in Pending state until capacity becomes available. Autopilot retries provisioning based on resource quotas and zone availability. If quotas are exhausted, the pods remain Pending indefinitely until quota is increased. There is no automatic fallback to other regions.
Is GKE Autopilot cheaper than Standard for production workloads?
It depends on utilization. Autopilot charges per pod resource request, so you pay for what you request. Standard charges per node, so underutilized nodes waste money. For workloads with high, steady utilization (above 70% of node capacity), Standard with committed use discounts is usually cheaper. For bursty, dev/staging, or variable workloads, Autopilot wins because of scale-to-zero and no idle node costs.
Can I use Spot (preemptible) pods on Autopilot?
Yes. Add nodeSelector: {"cloud.google.com/gke-spot": "true"} to your pod spec and Autopilot schedules the pod on Spot capacity (GKE adds the matching toleration automatically). Spot pods are 60% to 91% cheaper but can be evicted with 30 seconds notice, which makes them suitable for batch jobs, CI/CD runners, and other fault-tolerant workloads.
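In a Deployment, the selector goes on the pod template. A minimal sketch:

```yaml
# Sketch: request Spot capacity for a Deployment's pods on Autopilot
spec:
  template:
    spec:
      nodeSelector:
        cloud.google.com/gke-spot: "true"
```

Everything else (resource requests, HPA behavior) works the same as on regular Autopilot capacity; only the eviction semantics differ.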
How does Autopilot handle Terraform drift from resource mutation?
Autopilot’s admission webhook mutates pod specs by adjusting resource requests, adding tolerations, and setting security contexts. Terraform does not manage pod specs directly (Kubernetes provider handles that), so this is not a Terraform drift issue. It affects GitOps tools like ArgoCD and Flux that compare running state to Git. The fix is configuring your GitOps tool to ignore the mutated fields in pod specs.
For deploying applications to GKE with a serverless model, see our guide on deploying Google Cloud Run with Terraform, which covers an even simpler compute model for stateless HTTP services.