
Deploy to Google Cloud Run with Terraform (2026 Guide)

Cloud Run is Google’s answer to “I have a container and I want it on the internet in ninety seconds, with HTTPS, with autoscaling, and without thinking about servers.” You push an image, Cloud Run provisions the infrastructure, scales to zero when nobody is calling it, scales up when traffic arrives, and you pay per request plus per second of compute time actually consumed. No Kubernetes cluster to manage, no node pools to size, no idle VMs burning money overnight. For many workloads that would have lived on a small GKE cluster or a Compute Engine VM behind a load balancer, Cloud Run is the simpler and cheaper answer.


This guide walks through building a container image with Cloud Build (no local Docker required), pushing it to Artifact Registry, deploying to Cloud Run via gcloud, and testing canary releases with tagged revisions and traffic splitting. It then covers the full Terraform equivalent for teams who want everything in code, environment variables and Secret Manager integration, VPC connectors for private backend access, the cost model so you know exactly what you are paying for, and cleanup. Every command was run on a real GCP project in europe-west1 with the real output captured. If you also run workloads on AWS, the closest equivalent is a Fargate-backed ECS service behind an ALB, which needs roughly ten times more configuration to achieve the same result.

Tested April 2026 on Google Cloud Run (managed) in europe-west1 with gcloud 521 and Cloud Build

Cloud Run vs GKE vs Cloud Functions: When to Pick Cloud Run

Before deploying anything, settle whether Cloud Run is the right fit. Three GCP compute products overlap, and the decision depends on your workload shape.

| Dimension | Cloud Run | GKE Autopilot | Cloud Functions |
|---|---|---|---|
| Deployment unit | Container image | Container image (pods) | Source code (zip/repo) |
| Scale to zero | Yes (default) | No (minimum 1 pod unless using scale-to-zero add-ons) | Yes |
| Cold start | ~500ms to 2s (first request) | None (pods always warm) | ~200ms to 5s depending on runtime |
| Request timeout | Up to 60 minutes | No limit | Up to 60 minutes (v2) |
| Concurrency | Up to 1000 concurrent requests per instance | Pod-level, no per-instance limit | 1 request per instance (v1), up to 1000 (v2) |
| Persistent connections | WebSockets, gRPC, SSE supported | Full Kubernetes networking | HTTP only (v1), gRPC (v2) |
| Pricing model | Per request + per second of vCPU/memory | Per vCPU/memory-second (always on) | Per invocation + per second of vCPU/memory |
| Best for | HTTP APIs, web apps, microservices, async workers | Stateful workloads, complex networking, multi-container pods | Event handlers, webhooks, small transformations |

Cloud Run is the default pick for any containerized workload that responds to HTTP, gRPC, or event triggers and does not need persistent local state. GKE wins when you need sidecar containers, service mesh, StatefulSets, or the full Kubernetes API surface. Cloud Functions is for the small event-driven jobs where you do not want to think about containers at all. For an overview of all three with a real cost comparison, see our GCP Costs Explained guide.

Prerequisites

  • A GCP project with billing enabled
  • gcloud CLI 520+ authenticated (tested on v521.0.0)
  • Cloud Run, Cloud Build, and Artifact Registry APIs enabled

Enable the APIs in one shot:

gcloud services enable \
  run.googleapis.com \
  cloudbuild.googleapis.com \
  artifactregistry.googleapis.com \
  --project=PROJECT_ID

Build the Container Image with Cloud Build

You do not need Docker installed locally. Cloud Build runs in the cloud, builds the image from your source, and pushes it straight to Artifact Registry. First, create an Artifact Registry repository to hold the images:

gcloud artifacts repositories create my-app-repo \
  --repository-format=docker \
  --location=europe-west1

Write a minimal Go application. The only requirement Cloud Run has is that the container listens on the port specified by the PORT environment variable (defaults to 8080):

vim main.go
package main

import (
	"fmt"
	"net/http"
	"os"
)

func main() {
	port := os.Getenv("PORT")
	if port == "" {
		port = "8080"
	}
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "Hello from Cloud Run! Version: 1.0.0\n")
	})
	http.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(200)
		fmt.Fprintf(w, "ok\n")
	})
	fmt.Printf("Listening on port %s\n", port)
	if err := http.ListenAndServe(":"+port, nil); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}

The Dockerfile uses a multi-stage build that produces a distroless image under 10 MB. Smaller images mean faster cold starts on Cloud Run because the image pull happens on every scale-from-zero event:

vim Dockerfile
FROM golang:1.24-alpine AS build
WORKDIR /app
COPY main.go .
RUN CGO_ENABLED=0 go build -o server main.go

FROM gcr.io/distroless/static-debian12
COPY --from=build /app/server /server
CMD ["/server"]

Submit the build. Cloud Build uploads the source, runs the Dockerfile, and pushes the resulting image to Artifact Registry. The entire process happens in GCP, no local Docker daemon involved:

gcloud builds submit \
  --tag europe-west1-docker.pkg.dev/PROJECT_ID/my-app-repo/hello-app:v1

Build output ends with a SUCCESS line and the full image path:

ID                                    CREATE_TIME                DURATION  SOURCE     IMAGES                                                                STATUS
ad1faee9-0d2f-4171-b177-5e7617a2debf  2026-04-12T08:20:06+00:00  1M8S      gs://...   europe-west1-docker.pkg.dev/PROJECT_ID/my-app-repo/hello-app:v1          SUCCESS

Deploy to Cloud Run

One command creates the service, provisions the HTTPS endpoint, and routes traffic:

gcloud run deploy hello-app \
  --image=europe-west1-docker.pkg.dev/PROJECT_ID/my-app-repo/hello-app:v1 \
  --platform=managed \
  --region=europe-west1 \
  --port=8080 \
  --allow-unauthenticated \
  --min-instances=0 \
  --max-instances=3 \
  --memory=256Mi \
  --cpu=1

A few flags are worth understanding. --allow-unauthenticated makes the endpoint public (skip it for internal APIs and add IAM invoker bindings instead). --min-instances=0 enables scale-to-zero, which is the default and what makes Cloud Run cheap for low-traffic services. --max-instances=3 caps the horizontal scale so a traffic spike does not produce an unexpected bill. Deployment output:

Service [hello-app] revision [hello-app-00001-9xv] has been deployed and is serving 100 percent of traffic.
Service URL: https://hello-app-PROJECT_NUMBER.europe-west1.run.app

Test the endpoint:

curl https://hello-app-PROJECT_NUMBER.europe-west1.run.app/

Response from the live service:

Hello from Cloud Run! Version: 1.0.0

HTTPS with a valid Google-managed certificate, autoscaling, and a public URL. Total time from gcloud run deploy to a responding endpoint: about thirty seconds.

Canary Deployments with Tagged Revisions

Cloud Run’s traffic splitting is one of its strongest features and it works without any external tool. Each deploy creates a new revision. You can tag revisions and route a percentage of traffic to each, which is the simplest canary deployment pattern in any cloud. Update the app to v2 and build a new image:

gcloud builds submit \
  --tag europe-west1-docker.pkg.dev/PROJECT_ID/my-app-repo/hello-app:v2

Deploy v2 without routing any traffic to it yet. The --no-traffic flag creates the revision but leaves all traffic on v1. The --tag=canary flag assigns a named tag so you can address the revision directly via a dedicated URL:

gcloud run deploy hello-app \
  --image=europe-west1-docker.pkg.dev/PROJECT_ID/my-app-repo/hello-app:v2 \
  --region=europe-west1 \
  --no-traffic \
  --tag=canary

The canary revision gets its own URL that you can test before sending real traffic to it:

The revision can be reached directly at https://canary---hello-app-4humrjh2wq-ew.a.run.app

Test the canary endpoint and verify v2 responds correctly:

curl https://canary---hello-app-4humrjh2wq-ew.a.run.app/
Hello from Cloud Run! Version: 2.0.0 (canary)

Now split traffic. Send 80% to v1 and 20% to the canary:

gcloud run services update-traffic hello-app \
  --region=europe-west1 \
  --to-tags=canary=20

The traffic configuration is now:

Traffic:
  80% hello-app-00001-9xv
  20% hello-app-00002-zuh
       canary: https://canary---hello-app-4humrjh2wq-ew.a.run.app

If the canary looks good, promote it to 100%:

gcloud run services update-traffic hello-app \
  --region=europe-west1 \
  --to-latest

If the canary is broken, roll back by sending all traffic to v1:

gcloud run services update-traffic hello-app \
  --region=europe-west1 \
  --to-revisions=hello-app-00001-9xv=100

The rollback is instant. No new build, no redeploy, just a traffic routing change that takes effect in under two seconds. This is faster than any Kubernetes rollback because there is no pod scheduling involved.

Environment Variables and Secrets

Most real applications need configuration (database URLs, API keys, feature flags). Cloud Run injects environment variables at deploy time and can mount secrets from Google Cloud Secret Manager as either environment variables or mounted files:

gcloud run deploy hello-app \
  --image=europe-west1-docker.pkg.dev/PROJECT_ID/my-app-repo/hello-app:v1 \
  --region=europe-west1 \
  --set-env-vars="APP_ENV=production,LOG_LEVEL=info" \
  --set-secrets="DB_PASSWORD=db-password:latest"

The --set-secrets flag pulls the value from Secret Manager at runtime and injects it as an environment variable. The format is ENV_VAR_NAME=SECRET_NAME:VERSION. The Cloud Run service agent must have roles/secretmanager.secretAccessor on the secret. For production, prefer specific version numbers over latest to avoid the alias gotcha documented in our Secret Manager tutorial.

VPC Connector: Access Private Resources

By default Cloud Run services can only reach the public internet. If your service needs to talk to a Cloud SQL instance on a private IP, a Memorystore Redis cluster, or any VPC-internal resource, you need a Serverless VPC Access connector. The connector bridges Cloud Run into your VPC:

gcloud compute networks vpc-access connectors create run-connector \
  --region=europe-west1 \
  --network=default \
  --range=10.8.0.0/28 \
  --min-instances=2 \
  --max-instances=3

Then attach the connector to the service:

gcloud run deploy hello-app \
  --image=europe-west1-docker.pkg.dev/PROJECT_ID/my-app-repo/hello-app:v1 \
  --region=europe-west1 \
  --vpc-connector=run-connector \
  --vpc-egress=private-ranges-only

The --vpc-egress=private-ranges-only flag routes RFC1918 traffic through the connector and everything else directly to the internet. Use all-traffic if you also want public internet egress to go through your VPC (for Cloud NAT or firewall control). The connector has a small monthly cost ($0.01 per hour per instance, so roughly $15 to $22 per month for a 2-3 instance connector). Free tier does not cover it.

Terraform Version

Here is the full Cloud Run deployment in Terraform for teams who want everything in code. It uses the google_cloud_run_v2_service resource, which is the current v2 API:

vim cloudrun.tf
variable "project_id" {
  type = string
}

variable "region" {
  type    = string
  default = "europe-west1"
}

provider "google" {
  project = var.project_id
  region  = var.region
}

resource "google_artifact_registry_repository" "app" {
  location      = var.region
  repository_id = "my-app-repo"
  format        = "DOCKER"
}

resource "google_cloud_run_v2_service" "hello" {
  name     = "hello-app"
  location = var.region

  deletion_protection = false

  template {
    containers {
      image = "${var.region}-docker.pkg.dev/${var.project_id}/${google_artifact_registry_repository.app.repository_id}/hello-app:v1"

      ports {
        container_port = 8080
      }

      resources {
        limits = {
          cpu    = "1"
          memory = "256Mi"
        }
      }
    }

    scaling {
      min_instance_count = 0
      max_instance_count = 3
    }
  }

  traffic {
    type    = "TRAFFIC_TARGET_ALLOCATION_TYPE_LATEST"
    percent = 100
  }
}

resource "google_cloud_run_v2_service_iam_member" "public" {
  name     = google_cloud_run_v2_service.hello.name
  location = google_cloud_run_v2_service.hello.location
  role     = "roles/run.invoker"
  member   = "allUsers"
}

output "service_url" {
  value = google_cloud_run_v2_service.hello.uri
}

Apply with terraform init && terraform apply -var project_id=PROJECT_ID. The important detail is deletion_protection = false. Without it, terraform destroy refuses to delete the service, which is a safety feature in production but annoying in labs. The google_cloud_run_v2_service_iam_member resource with allUsers is what makes the endpoint publicly accessible, equivalent to the --allow-unauthenticated flag in gcloud.

What Cloud Run Actually Costs

Cloud Run pricing has three components, and a generous free tier that covers most demos and low-traffic production APIs forever.

| Component | Price (europe-west1) | Free tier (monthly) |
|---|---|---|
| CPU | $0.00002400 per vCPU-second | 180,000 vCPU-seconds |
| Memory | $0.00000250 per GiB-second | 360,000 GiB-seconds |
| Requests | $0.40 per million | 2 million requests |

In practice the free tier means an app with 1 vCPU and 256 MiB serving 2 million requests per month, each averaging 90ms of CPU time, costs nothing. Past the free tier, the same workload costs single-digit dollars per month. Cloud Run only bills while a request is being processed (or when min-instances > 0, in which case idle instances bill at a reduced CPU rate of 10% of the active rate). For a detailed breakdown of all the GCP cost traps including Cloud Run’s connector costs, see our GCP Costs Explained guide.

Cleanup

Cloud Run services with min-instances=0 cost nothing when idle, but the Artifact Registry repository stores images that bill for storage. Clean everything up:

gcloud run services delete hello-app --region=europe-west1 --quiet
gcloud artifacts repositories delete my-app-repo --location=europe-west1 --quiet

Both deletions are instant. No cluster to tear down, no nodes to drain, no NAT gateway to wait for.

FAQ

Does Cloud Run support WebSockets?

Yes. Cloud Run supports WebSockets, gRPC, and server-sent events (SSE) on the managed platform. The request timeout is configurable up to 60 minutes. For long-lived connections like WebSockets, set --session-affinity so subsequent requests from the same client hit the same instance, and increase the timeout accordingly.

How do I connect Cloud Run to Cloud SQL on a private IP?

Two options. The recommended approach is a Serverless VPC Access connector that bridges Cloud Run into your VPC. The legacy approach is the Cloud SQL Auth Proxy sidecar, which Cloud Run supports as a second container in the service spec. The VPC connector is simpler and avoids the Auth Proxy overhead. For the full Cloud SQL setup, see our Cloud SQL PostgreSQL guide.

Can Cloud Run pull secrets from Secret Manager?

Yes. Use the --set-secrets flag to inject Secret Manager values as environment variables or mounted files at deploy time. The Cloud Run service agent needs roles/secretmanager.secretAccessor on the target secret. Pin to a specific version number rather than latest to avoid the alias gotcha documented in our Secret Manager tutorial.

What is the AWS equivalent of Cloud Run?

The closest match is an ECS Fargate service behind an Application Load Balancer. Both run containers without managing servers. Cloud Run is significantly simpler to set up (one gcloud run deploy command vs Fargate + ALB + target groups + task definition + service), has built-in HTTPS with managed certificates, and includes traffic splitting for canary deployments. Fargate offers more networking flexibility and integrates deeper with the AWS ecosystem. For workloads that need scale-to-zero, Cloud Run’s model is also comparable to AWS Lambda container images, though Lambda has a 15-minute execution timeout while Cloud Run allows up to 60 minutes.

How fast is the cold start?

With a distroless Go image under 10 MB, cold starts typically range from 500ms to 1.5 seconds including image pull and container initialization. Larger images (Python with dependencies, Java with Spring Boot) can take 3 to 8 seconds on a cold start. To eliminate cold starts for latency-sensitive services, set --min-instances=1 and accept the always-on cost for that one instance.

Where to Go Next

Cloud Run is usually the starting point for containerized workloads on GCP because it removes the infrastructure overhead entirely. The natural next steps are connecting it to a database (Secret Manager for credentials, Cloud SQL Auth Proxy or VPC connector for the connection), setting up CI/CD with GitHub Actions and Workload Identity Federation so deploys happen on push without JSON keys, and adding a custom domain with a Google-managed SSL certificate. When the workload outgrows Cloud Run (you need sidecars, StatefulSets, or the full Kubernetes API), the upgrade path is GKE Autopilot with the same container image. The official Cloud Run documentation is the reference worth bookmarking.
