Cloud Run is Google’s answer to “I have a container and I want it on the internet in ninety seconds, with HTTPS, with autoscaling, and without thinking about servers.” You push an image, Cloud Run provisions the infrastructure, scales to zero when nobody is calling it, scales up when traffic arrives, and you pay per request plus per second of compute time actually consumed. No Kubernetes cluster to manage, no node pools to size, no idle VMs burning money overnight. For many workloads that would have lived on a small GKE cluster or a Compute Engine VM behind a load balancer, Cloud Run is the simpler and cheaper answer.
This guide walks through building a container image with Cloud Build (no local Docker required), pushing it to Artifact Registry, deploying to Cloud Run via gcloud, testing canary releases with tagged revisions and traffic splitting, the full Terraform equivalent for teams who want everything in code, environment variables and secrets integration, VPC connectors for private backend access, the cost model so you know exactly what you are paying for, and cleanup. Every command was run on a real GCP project in europe-west1 with the real output captured. If you also run workloads on AWS, the closest AWS equivalent is a Fargate-backed ECS service behind an ALB, which needs about ten times more configuration to achieve the same result.
Tested April 2026 on Google Cloud Run (managed) in europe-west1 with gcloud 521 and Cloud Build
Cloud Run vs GKE vs Cloud Functions: When to Pick Cloud Run
Before deploying anything, settle whether Cloud Run is the right fit. Three GCP compute products overlap, and the decision depends on your workload shape.
| Dimension | Cloud Run | GKE Autopilot | Cloud Functions |
|---|---|---|---|
| Deployment unit | Container image | Container image (pods) | Source code (zip/repo) |
| Scale to zero | Yes (default) | No (minimum 1 pod unless using scale-to-zero add-ons) | Yes |
| Cold start | ~500ms to 2s (first request) | No cold start (pods always warm) | ~200ms to 5s depending on runtime |
| Request timeout | Up to 60 minutes | No limit | Up to 60 minutes (v2) |
| Concurrency | Up to 1000 concurrent requests per instance | Pod-level, no per-instance limit | 1 request per instance (v1), 1000 (v2) |
| Persistent connections | WebSockets, gRPC, SSE supported | Full Kubernetes networking | HTTP only (v1), gRPC (v2) |
| Pricing model | Per request + per second of vCPU/memory | Per vCPU/memory/second (always on) | Per invocation + per second of vCPU/memory |
| Best for | HTTP APIs, web apps, microservices, async workers | Stateful workloads, complex networking, multi-container pods | Event handlers, webhooks, small transformations |
Cloud Run is the default pick for any containerized workload that responds to HTTP, gRPC, or event triggers and does not need persistent local state. GKE wins when you need sidecar containers, service mesh, StatefulSets, or the full Kubernetes API surface. Cloud Functions is for the small event-driven jobs where you do not want to think about containers at all. For an overview of all three with a real cost comparison, see our GCP Costs Explained guide.
Prerequisites
- A GCP project with billing enabled
- gcloud CLI 520+ authenticated (tested on v521.0.0)
- Cloud Run, Cloud Build, and Artifact Registry APIs enabled
Enable the APIs in one shot:
gcloud services enable \
run.googleapis.com \
cloudbuild.googleapis.com \
artifactregistry.googleapis.com \
--project=PROJECT_ID
Build the Container Image with Cloud Build
You do not need Docker installed locally. Cloud Build runs in the cloud, builds the image from your source, and pushes it straight to Artifact Registry. First, create an Artifact Registry repository to hold the images:
gcloud artifacts repositories create my-app-repo \
--repository-format=docker \
--location=europe-west1
Write a minimal Go application. The only requirement Cloud Run has is that the container listens on the port specified by the PORT environment variable (defaults to 8080):
vim main.go
package main
import (
"fmt"
"net/http"
"os"
)
func main() {
port := os.Getenv("PORT")
if port == "" {
port = "8080"
}
http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
fmt.Fprintf(w, "Hello from Cloud Run! Version: 1.0.0\n")
})
http.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(200)
fmt.Fprintf(w, "ok\n")
})
fmt.Printf("Listening on port %s\n", port)
http.ListenAndServe(":"+port, nil)
}
The Dockerfile uses a multi-stage build that produces a distroless image under 10 MB. Smaller images mean faster cold starts on Cloud Run because the image pull happens on every scale-from-zero event:
vim Dockerfile
FROM golang:1.24-alpine AS build
WORKDIR /app
COPY main.go .
RUN CGO_ENABLED=0 go build -o server main.go
FROM gcr.io/distroless/static-debian12
COPY --from=build /app/server /server
CMD ["/server"]
Submit the build. Cloud Build uploads the source, runs the Dockerfile, and pushes the resulting image to Artifact Registry. The entire process happens in GCP, no local Docker daemon involved:
gcloud builds submit \
--tag europe-west1-docker.pkg.dev/PROJECT_ID/my-app-repo/hello-app:v1
Build output ends with a SUCCESS line and the full image path:
ID CREATE_TIME DURATION SOURCE IMAGES STATUS
ad1faee9-0d2f-4171-b177-5e7617a2debf 2026-04-12T08:20:06+00:00 1M8S gs://... europe-west1-docker.pkg.dev/PROJECT_ID/my-app-repo/hello-app:v1 SUCCESS
Deploy to Cloud Run
One command creates the service, provisions the HTTPS endpoint, and routes traffic:
gcloud run deploy hello-app \
--image=europe-west1-docker.pkg.dev/PROJECT_ID/my-app-repo/hello-app:v1 \
--platform=managed \
--region=europe-west1 \
--port=8080 \
--allow-unauthenticated \
--min-instances=0 \
--max-instances=3 \
--memory=256Mi \
--cpu=1
Three of those flags are worth understanding. --allow-unauthenticated makes the endpoint public (skip it for internal APIs and grant IAM invoker bindings instead). --min-instances=0 enables scale-to-zero, which is the default and what makes Cloud Run cheap for low-traffic services. --max-instances=3 caps the horizontal scale so a traffic spike cannot produce an unexpected bill. Deployment output:
Service [hello-app] revision [hello-app-00001-9xv] has been deployed and is serving 100 percent of traffic.
Service URL: https://hello-app-PROJECT_NUMBER.europe-west1.run.app
Test the endpoint:
curl https://hello-app-PROJECT_NUMBER.europe-west1.run.app/
Response from the live service:
Hello from Cloud Run! Version: 1.0.0
HTTPS with a valid Google-managed certificate, autoscaling, and a public URL. Total time from gcloud run deploy to a responding endpoint: about thirty seconds.
Canary Deployments with Tagged Revisions
Cloud Run’s traffic splitting is one of its strongest features, and it works without any external tool. Each deploy creates a new revision; you can tag revisions and route a percentage of traffic to each, which is the simplest canary deployment pattern in any cloud. Update the app to v2 (change the root handler's response string to report version 2.0.0) and build a new image:
gcloud builds submit \
--tag europe-west1-docker.pkg.dev/PROJECT_ID/my-app-repo/hello-app:v2
Deploy v2 without routing any traffic to it yet. The --no-traffic flag creates the revision but leaves all traffic on v1. The --tag=canary flag assigns a named tag so you can address the revision directly via a dedicated URL:
gcloud run deploy hello-app \
--image=europe-west1-docker.pkg.dev/PROJECT_ID/my-app-repo/hello-app:v2 \
--region=europe-west1 \
--no-traffic \
--tag=canary
The canary revision gets its own URL that you can test before sending real traffic to it:
The revision can be reached directly at https://canary---hello-app-4humrjh2wq-ew.a.run.app
Test the canary endpoint and verify v2 responds correctly:
curl https://canary---hello-app-4humrjh2wq-ew.a.run.app/
Hello from Cloud Run! Version: 2.0.0 (canary)
Now split traffic. Send 80% to v1 and 20% to the canary:
gcloud run services update-traffic hello-app \
--region=europe-west1 \
--to-tags=canary=20
The traffic configuration is now:
Traffic:
80% hello-app-00001-9xv
20% hello-app-00002-zuh
canary: https://canary---hello-app-4humrjh2wq-ew.a.run.app
If the canary looks good, promote it to 100%:
gcloud run services update-traffic hello-app \
--region=europe-west1 \
--to-latest
If the canary is broken, roll back by sending all traffic to v1:
gcloud run services update-traffic hello-app \
--region=europe-west1 \
--to-revisions=hello-app-00001-9xv=100
The rollback is instant. No new build, no redeploy, just a traffic routing change that takes effect in under two seconds. This is faster than a typical Kubernetes rollback because no pod scheduling is involved.
Environment Variables and Secrets
Most real applications need configuration (database URLs, API keys, feature flags). Cloud Run injects environment variables at deploy time and can mount secrets from Google Cloud Secret Manager as either environment variables or mounted files:
gcloud run deploy hello-app \
--image=europe-west1-docker.pkg.dev/PROJECT_ID/my-app-repo/hello-app:v1 \
--region=europe-west1 \
--set-env-vars="APP_ENV=production,LOG_LEVEL=info" \
--set-secrets="DB_PASSWORD=db-password:latest"
The --set-secrets flag pulls the value from Secret Manager at runtime and injects it as an environment variable. The format is ENV_VAR_NAME=SECRET_NAME:VERSION. The Cloud Run service agent must have roles/secretmanager.secretAccessor on the secret. For production, prefer specific version numbers over latest to avoid the alias gotcha documented in our Secret Manager tutorial.
VPC Connector: Access Private Resources
By default Cloud Run services can only reach the public internet. If your service needs to talk to a Cloud SQL instance on a private IP, a Memorystore Redis cluster, or any VPC-internal resource, you need a Serverless VPC Access connector. The connector bridges Cloud Run into your VPC:
gcloud compute networks vpc-access connectors create run-connector \
--region=europe-west1 \
--network=default \
--range=10.8.0.0/28 \
--min-instances=2 \
--max-instances=3
Then attach the connector to the service:
gcloud run deploy hello-app \
--image=europe-west1-docker.pkg.dev/PROJECT_ID/my-app-repo/hello-app:v1 \
--region=europe-west1 \
--vpc-connector=run-connector \
--vpc-egress=private-ranges-only
The --vpc-egress=private-ranges-only flag routes RFC1918 traffic through the connector and everything else directly to the internet. Use all-traffic if you also want public internet egress to go through your VPC (for Cloud NAT or firewall control). The connector has a small monthly cost ($0.01 per hour per instance, so roughly $15 to $22 per month for a 2-3 instance connector). Free tier does not cover it.
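If you manage the service with Terraform (see the Terraform Version section below), the connector and its attachment can be sketched as follows; the resource name and CIDR mirror the gcloud commands above, and this is an assumption-laden sketch rather than a tested configuration:

```hcl
resource "google_vpc_access_connector" "run" {
  name          = "run-connector"
  region        = var.region
  network       = "default"
  ip_cidr_range = "10.8.0.0/28"
  min_instances = 2
  max_instances = 3
}

# Then, inside the template block of google_cloud_run_v2_service:
#   vpc_access {
#     connector = google_vpc_access_connector.run.id
#     egress    = "PRIVATE_RANGES_ONLY"
#   }
```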
Terraform Version
Below is the full Cloud Run deployment in Terraform for teams who want everything in code. It uses the google_cloud_run_v2_service resource, which targets the current v2 API:
vim cloudrun.tf
variable "project_id" {
  type = string
}

variable "region" {
  type    = string
  default = "europe-west1"
}

resource "google_artifact_registry_repository" "app" {
location = var.region
repository_id = "my-app-repo"
format = "DOCKER"
}
resource "google_cloud_run_v2_service" "hello" {
name = "hello-app"
location = var.region
deletion_protection = false
template {
containers {
image = "${var.region}-docker.pkg.dev/${var.project_id}/${google_artifact_registry_repository.app.repository_id}/hello-app:v1"
ports {
container_port = 8080
}
resources {
limits = {
cpu = "1"
memory = "256Mi"
}
}
}
scaling {
min_instance_count = 0
max_instance_count = 3
}
}
traffic {
type = "TRAFFIC_TARGET_ALLOCATION_TYPE_LATEST"
percent = 100
}
}
resource "google_cloud_run_v2_service_iam_member" "public" {
name = google_cloud_run_v2_service.hello.name
location = google_cloud_run_v2_service.hello.location
role = "roles/run.invoker"
member = "allUsers"
}
output "service_url" {
value = google_cloud_run_v2_service.hello.uri
}
Apply with terraform init && terraform apply -var project_id=PROJECT_ID. The important detail is deletion_protection = false. Without it, terraform destroy refuses to delete the service, which is a safety feature in production but annoying in labs. The google_cloud_run_v2_service_iam_member resource with allUsers is what makes the endpoint publicly accessible, equivalent to the --allow-unauthenticated flag in gcloud.
What Cloud Run Actually Costs
Cloud Run pricing has three components, and a generous free tier that covers most demos and low-traffic production APIs forever.
| Component | Price (europe-west1) | Free tier (monthly) |
|---|---|---|
| CPU | $0.00002400 per vCPU-second | 180,000 vCPU-seconds |
| Memory | $0.00000250 per GiB-second | 360,000 GiB-seconds |
| Requests | $0.40 per million | 2 million requests |
The free tier translates to roughly: an app with 1 vCPU and 256 MiB serving 2 million requests per month with each request averaging 90ms of CPU time costs nothing. Past the free tier, the same workload costs single-digit dollars per month. Cloud Run only bills when a request is being processed (or when min-instances > 0, in which case idle instances bill at a reduced CPU rate of 10% of the active rate). For a detailed breakdown of all the GCP cost traps including Cloud Run’s connector costs, see our GCP Costs Explained guide.
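The free-tier arithmetic is easy to check. A quick sketch using the table's europe-west1 prices for the 1 vCPU / 256 MiB / 2-million-request workload; the billable helper is illustrative:

```go
package main

import "fmt"

// billable returns the cost of usage beyond the monthly free tier.
func billable(used, free, pricePerUnit float64) float64 {
	if used <= free {
		return 0
	}
	return (used - free) * pricePerUnit
}

func main() {
	const (
		requests  = 2_000_000.0
		cpuPerReq = 0.09 // 90 ms of CPU time per request
		vcpu      = 1.0
		memGiB    = 0.25 // 256 MiB

		priceCPU = 0.0000240  // $ per vCPU-second (europe-west1)
		priceMem = 0.0000025  // $ per GiB-second
		priceReq = 0.40 / 1e6 // $ per request
	)

	cpuSeconds := requests * cpuPerReq * vcpu   // 180,000 vCPU-seconds
	memSeconds := requests * cpuPerReq * memGiB // 45,000 GiB-seconds

	total := billable(cpuSeconds, 180_000, priceCPU) +
		billable(memSeconds, 360_000, priceMem) +
		billable(requests, 2_000_000, priceReq)

	fmt.Printf("cpu=%.0f vCPU-s, mem=%.0f GiB-s, monthly cost=$%.2f\n",
		cpuSeconds, memSeconds, total)
}
```

The CPU usage lands exactly on the 180,000 vCPU-second free tier and memory well under its 360,000 GiB-second allowance, so the total prints as $0.00.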
Cleanup
Cloud Run services with min-instances=0 cost nothing when idle, but the Artifact Registry repository stores images that bill for storage. Clean everything up:
gcloud run services delete hello-app --region=europe-west1 --quiet
gcloud artifacts repositories delete my-app-repo --location=europe-west1 --quiet
Both deletions are instant. No cluster to tear down, no nodes to drain, no NAT gateway to wait for.
FAQ
Does Cloud Run support WebSockets?
Yes. Cloud Run supports WebSockets, gRPC, and server-sent events (SSE) on the managed platform. The request timeout is configurable up to 60 minutes. For long-lived connections like WebSockets, set --session-affinity so subsequent requests from the same client hit the same instance, and increase the timeout accordingly.
How do I connect Cloud Run to Cloud SQL on a private IP?
Two options. The recommended approach is a Serverless VPC Access connector that bridges Cloud Run into your VPC. The legacy approach is the Cloud SQL Auth Proxy sidecar, which Cloud Run supports as a second container in the service spec. The VPC connector is simpler and avoids the Auth Proxy overhead. For the full Cloud SQL setup, see our Cloud SQL PostgreSQL guide.
Can Cloud Run pull secrets from Secret Manager?
Yes. Use the --set-secrets flag to inject Secret Manager values as environment variables or mounted files at deploy time. The Cloud Run service agent needs roles/secretmanager.secretAccessor on the target secret. Pin to a specific version number rather than latest to avoid the alias gotcha documented in our Secret Manager tutorial.
What is the AWS equivalent of Cloud Run?
The closest match is an ECS Fargate service behind an Application Load Balancer. Both run containers without managing servers. Cloud Run is significantly simpler to set up (one gcloud run deploy command vs Fargate + ALB + target groups + task definition + service), has built-in HTTPS with managed certificates, and includes traffic splitting for canary deployments. Fargate offers more networking flexibility and integrates deeper with the AWS ecosystem. For workloads that need scale-to-zero, Cloud Run’s model is also comparable to AWS Lambda container images, though Lambda has a 15-minute execution timeout while Cloud Run allows up to 60 minutes.
How fast is the cold start?
With a distroless Go image under 10 MB, cold starts typically range from 500ms to 1.5 seconds including image pull and container initialization. Larger images (Python with dependencies, Java with Spring Boot) can take 3 to 8 seconds on a cold start. To eliminate cold starts for latency-sensitive services, set --min-instances=1 and accept the always-on cost for that one instance.
Where to Go Next
Cloud Run is usually the starting point for containerized workloads on GCP because it removes the infrastructure overhead entirely. The natural next steps are connecting it to a database (Secret Manager for credentials, Cloud SQL Auth Proxy or VPC connector for the connection), setting up CI/CD with GitHub Actions and Workload Identity Federation so deploys happen on push without JSON keys, and adding a custom domain with a Google-managed SSL certificate. When the workload outgrows Cloud Run (you need sidecars, StatefulSets, or the full Kubernetes API), the upgrade path is GKE Autopilot with the same container image. The official Cloud Run documentation is the reference worth bookmarking.