Managing PostgreSQL on a VM gives you full control, but it also gives you full responsibility: patching, backups, failover, connection pooling, and the 3 AM pager when replication breaks. Cloud SQL takes those off your plate. You get a managed PostgreSQL instance with automated backups, point-in-time recovery, and optional high availability, all provisioned through Terraform so the configuration lives in version control where it belongs.
This guide covers creating a Cloud SQL PostgreSQL 17 instance using both gcloud and Terraform, configuring private IP networking via VPC peering, enabling IAM database authentication, setting up read replicas, and connecting from GKE using the Cloud SQL Auth Proxy. For securing database credentials in GCP, see the Secret Manager tutorial. If your workloads run on GKE, the Workload Identity guide explains how the Auth Proxy authenticates without exporting service account keys.
Verified working: April 2026. Cloud SQL PostgreSQL 17, Enterprise edition, Terraform google provider 6.x
Cloud SQL vs Self-Managed PostgreSQL
The trade-off is cost versus operational burden. Cloud SQL costs more per vCPU-hour than a Compute Engine VM running PostgreSQL, but you do not spend engineering time on patching, backup validation, or failover testing.
| Feature | Cloud SQL (Enterprise) | Self-Managed on GCE |
|---|---|---|
| Patching | Automated (maintenance window) | Manual, your responsibility |
| Backups | Automated daily + PITR, 7-day default retention | pg_dump / pgBackRest, self-managed |
| High Availability | One checkbox (regional HA with automatic failover) | Patroni/repmgr + load balancer, significant setup |
| Read Replicas | API call or Terraform resource | Manual streaming replication config |
| Connection Pooling | Managed connection pooling, or self-deployed PgBouncer | PgBouncer/Pgpool, self-managed |
| IAM Auth | Native (no passwords in connection strings) | Not available (password/cert auth only) |
| Scale to Zero | Not supported (minimum 1 vCPU always running) | Stop the VM manually when idle |
| Max Storage | 64 TB | Limited by disk size |
| PostgreSQL Versions | 14, 15, 16, 17 | Any version you compile |
Pricing
Cloud SQL Enterprise edition charges per vCPU-hour and per GiB-hour of memory. There is no free tier for production instances (the free trial gives $300 in credits). Here are the real numbers for us-central1 as of April 2026:
| Resource | Rate | Minimal Instance (2 vCPU, 8 GiB) |
|---|---|---|
| vCPU | $0.0413/hr | $60.30/month |
| Memory | $0.007/GiB-hr | $40.88/month |
| SSD Storage | $0.170/GiB-month | $1.70/month (10 GiB) |
| Total (single zone) | | ~$103/month |
| Total (HA, 2 zones) | | ~$206/month |
HA roughly doubles the bill: GCP runs a synchronous standby instance in another zone and charges for its vCPUs, memory, and replicated storage. For a full breakdown of how GCP services accumulate cost, the GCP costs guide covers all the common gotchas.
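The table's single-zone total can be sanity-checked with a quick shell calculation, using the example rates above and 730 hours per month (these are illustrative figures, not live pricing):

```shell
# Sanity-check the single-zone estimate from the table (730 hrs/month).
total=$(awk 'BEGIN {
  vcpu = 2 * 0.0413 * 730    # 2 vCPUs, per vCPU-hour
  mem  = 8 * 0.007  * 730    # 8 GiB, per GiB-hour
  disk = 10 * 0.170          # 10 GiB SSD, per GiB-month
  printf "%.2f", vcpu + mem + disk
}')
echo "Single-zone estimate: \$${total}/month"   # → Single-zone estimate: $102.88/month
```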
Prerequisites
- GCP project with billing enabled
- APIs enabled: `sqladmin.googleapis.com`, `compute.googleapis.com`, `servicenetworking.googleapis.com`
- `gcloud` CLI authenticated (`gcloud auth application-default login`)
- Terraform 1.5+ with `google` provider 6.x
- A VPC network (default or custom)
Enable the required APIs:
gcloud services enable sqladmin.googleapis.com \
compute.googleapis.com \
servicenetworking.googleapis.com
Create an Instance with gcloud
For a quick test or one-off instance, gcloud is the fastest path.
gcloud sql instances create pg-demo \
--database-version=POSTGRES_17 \
--tier=db-custom-2-8192 \
--region=us-central1 \
--storage-size=10GB \
--storage-type=SSD \
--storage-auto-increase \
--backup-start-time=03:00 \
--enable-point-in-time-recovery \
--maintenance-window-day=SUN \
--maintenance-window-hour=4 \
--deletion-protection
Instance creation takes 3-5 minutes. Once ready, set the postgres user password:
gcloud sql users set-password postgres \
--instance=pg-demo \
--password='YourSecurePassword2026!'
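Rather than inventing a password by hand, you can generate one; a minimal sketch using `openssl` (the `gcloud` line is shown commented so you can review the value first):

```shell
# Generate a random 32-character password instead of hand-typing a weak one.
PW=$(openssl rand -base64 24)
echo "${#PW}"   # 32 characters
# gcloud sql users set-password postgres --instance=pg-demo --password="$PW"
```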
Create a database:
gcloud sql databases create appdb --instance=pg-demo
Create with Terraform
Terraform gives you reproducible, version-controlled infrastructure. The configuration below creates the instance, database, and user.
resource "google_sql_database_instance" "postgres" {
name = "pg-demo"
database_version = "POSTGRES_17"
region = "us-central1"
project = "PROJECT_ID" # replace with your project ID
deletion_protection = false # Set true in production
settings {
tier = "db-custom-2-8192"
disk_size = 10
disk_type = "PD_SSD"
disk_autoresize = true
availability_type = "ZONAL" # "REGIONAL" for HA
backup_configuration {
enabled = true
start_time = "03:00"
point_in_time_recovery_enabled = true
transaction_log_retention_days = 7
backup_retention_settings {
retained_backups = 7
}
}
maintenance_window {
day = 7 # Sunday
hour = 4
update_track = "stable"
}
ip_configuration {
ipv4_enabled = false
private_network = google_compute_network.vpc.id
}
database_flags {
name = "cloudsql.iam_authentication"
value = "on"
}
}
depends_on = [google_service_networking_connection.private_vpc_connection]
}
resource "google_sql_database" "appdb" {
name = "appdb"
instance = google_sql_database_instance.postgres.name
}
resource "google_sql_user" "app_user" {
name = "appuser"
instance = google_sql_database_instance.postgres.name
password = var.db_password
}
Store the db_password variable in Secret Manager or a terraform.tfvars file excluded from version control. Never hardcode passwords in Terraform configs.
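The variable itself can be declared with `sensitive = true`, which redacts the value from `plan` and `apply` output (a sketch; the description text is my own):

```hcl
variable "db_password" {
  description = "Password for the application database user"
  type        = string
  sensitive   = true # redacted from `terraform plan` / `terraform apply` output
}
```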
Private IP Networking
By default, Cloud SQL gets a public IP. For production, disable the public IP and use private IP via VPC peering. This keeps database traffic off the internet entirely.
The private IP setup requires three resources: a reserved IP range, a VPC peering connection to Google’s service networking, and the Cloud SQL instance configured with ipv4_enabled = false.
resource "google_compute_global_address" "private_ip_range" {
name = "cloudsql-private-ip"
purpose = "VPC_PEERING"
address_type = "INTERNAL"
prefix_length = 16
network = google_compute_network.vpc.id
}
resource "google_service_networking_connection" "private_vpc_connection" {
network = google_compute_network.vpc.id
service = "servicenetworking.googleapis.com"
reserved_peering_ranges = [google_compute_global_address.private_ip_range.name]
}
With this in place, the Cloud SQL instance gets an internal IP from your VPC range. GKE pods and Compute Engine VMs in the same VPC can reach it directly. No public endpoint, no Cloud SQL Auth Proxy needed for basic connectivity (though the proxy still adds IAM authentication and automatic TLS, which are valuable).
IAM Database Authentication
IAM auth eliminates passwords for database connections. Instead, the connecting service account gets a short-lived OAuth2 token that Cloud SQL validates. This is the recommended approach for GKE workloads using Workload Identity.
Enable IAM auth on the instance (we already set the database flag in Terraform). Create an IAM database user:
resource "google_sql_user" "iam_user" {
name = "app-sa@PROJECT_ID.iam" # for PostgreSQL, the email with ".gserviceaccount.com" trimmed
instance = google_sql_database_instance.postgres.name
type = "CLOUD_IAM_SERVICE_ACCOUNT"
}
Grant the service account the roles/cloudsql.instanceUser role and roles/cloudsql.client role. The instanceUser role allows login, while client allows connecting via the Auth Proxy.
gcloud projects add-iam-policy-binding PROJECT_ID \
--member="serviceAccount:app-sa@PROJECT_ID.iam.gserviceaccount.com" \
--role="roles/cloudsql.instanceUser"
gcloud projects add-iam-policy-binding PROJECT_ID \
--member="serviceAccount:app-sa@PROJECT_ID.iam.gserviceaccount.com" \
--role="roles/cloudsql.client"
Then grant database-level permissions inside PostgreSQL:
GRANT ALL PRIVILEGES ON DATABASE appdb TO "app-sa@PROJECT_ID.iam";
Backups and Point-in-Time Recovery
Cloud SQL automated backups are enabled by default with a 7-day retention window. The backup runs daily at the time you specify (03:00 in our config). Point-in-time recovery uses the WAL (write-ahead log) to restore to any second within the retention window.
Restore to a specific timestamp:
gcloud sql instances clone pg-demo pg-demo-restored \
--point-in-time="2026-04-10T14:30:00.000Z"
This creates a new instance from the backup. Cloud SQL does not support in-place PITR because that would require downtime on the running instance. The clone approach lets you validate the restore before switching traffic.
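The `--point-in-time` flag takes an RFC 3339 UTC timestamp. A sketch that computes one five minutes in the past, using GNU `date` as found on most Linux VMs (the clone command is shown commented):

```shell
# Build an RFC 3339 UTC timestamp for --point-in-time (GNU date syntax).
TS=$(date -u -d '5 minutes ago' +%Y-%m-%dT%H:%M:%S.000Z)
echo "$TS"   # e.g. 2026-04-10T14:25:00.000Z
# gcloud sql instances clone pg-demo pg-demo-restored --point-in-time="$TS"
```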
Read Replicas
Read replicas use PostgreSQL streaming replication under the hood. They are eventually consistent (replication lag is typically under 1 second) and can be in the same region or a different one for disaster recovery.
resource "google_sql_database_instance" "read_replica" {
name = "pg-demo-replica"
master_instance_name = google_sql_database_instance.postgres.name
region = "us-central1"
database_version = "POSTGRES_17"
replica_configuration {
failover_target = false
}
settings {
tier = "db-custom-2-8192"
disk_autoresize = true
disk_type = "PD_SSD"
ip_configuration {
ipv4_enabled = false
private_network = google_compute_network.vpc.id
}
}
}
Point read-heavy application queries at the replica’s IP to offload the primary. Connection strings in your application should distinguish between write (primary) and read (replica) endpoints.
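One way to keep the two endpoints straight is to export both from Terraform; a sketch using the provider's `private_ip_address` attribute (the output names are my own):

```hcl
output "primary_private_ip" {
  description = "Write endpoint (primary)"
  value       = google_sql_database_instance.postgres.private_ip_address
}

output "replica_private_ip" {
  description = "Read endpoint (replica)"
  value       = google_sql_database_instance.read_replica.private_ip_address
}
```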
Connect from GKE with the Auth Proxy
The Cloud SQL Auth Proxy handles encryption, IAM auth, and connection management. On GKE, run it as a sidecar container in the same pod as your application. This pattern means your app connects to localhost:5432 and the proxy handles everything else.
containers:
- name: app
  image: your-app:latest
  env:
  - name: DB_HOST
    value: "127.0.0.1"
  - name: DB_PORT
    value: "5432"
  - name: DB_NAME
    value: "appdb"
- name: cloud-sql-proxy
  image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.14.3
  args:
  - "--structured-logs"
  - "--auto-iam-authn"
  - "PROJECT_ID:us-central1:pg-demo"
  securityContext:
    runAsNonRoot: true
The --auto-iam-authn flag enables automatic IAM authentication. Combined with GKE Workload Identity, no service account key file is needed. The Kubernetes service account maps to a GCP service account that has roles/cloudsql.client, and the proxy uses it transparently.
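That mapping lives on the Kubernetes ServiceAccount. A minimal sketch, assuming a KSA named `app-ksa` bound to the `app-sa` GCP service account (see the Workload Identity guide for the full binding):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-ksa # reference this via serviceAccountName in the pod spec
  annotations:
    # Maps the KSA to the GCP service account that holds roles/cloudsql.client
    iam.gke.io/gcp-service-account: app-sa@PROJECT_ID.iam.gserviceaccount.com
```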
Connect from Compute Engine
If your application runs on a Compute Engine VM in the same VPC, you can connect directly to the private IP without the Auth Proxy. Install the PostgreSQL client:
sudo apt install -y postgresql-client
Connect using the private IP (find it in the Cloud SQL instance details or Terraform output):
psql -h 10.0.1.50 -U appuser -d appdb
For production, still use the Auth Proxy even on Compute Engine. It enforces TLS encryption, supports IAM authentication, and handles certificate rotation automatically.
Monitoring
Cloud SQL exposes metrics natively in Cloud Monitoring (no agent installation needed). The most important metrics to watch:
- database/cpu/utilization: sustained usage above 80% means it is time to scale up vCPUs
- database/memory/utilization: PostgreSQL uses shared_buffers aggressively, so 70-80% usage is normal
- database/disk/utilization: enable auto-resize and alert at 85%
- database/postgresql/num_backends: connection count approaching `max_connections` means you need connection pooling
- database/replication/replica_byte_lag: for read replicas, sustained lag above 1 MB indicates the replica cannot keep up
Create an alert policy for high CPU:
gcloud alpha monitoring policies create \
--notification-channels=CHANNEL_ID \
--display-name="Cloud SQL CPU > 80%" \
--condition-display-name="High CPU" \
--condition-filter='resource.type="cloudsql_database" AND metric.type="cloudsql.googleapis.com/database/cpu/utilization"' \
--condition-threshold-value=0.8 \
--condition-threshold-comparison=COMPARISON_GT \
--condition-threshold-duration=300s
Production Checklist
Before going live, verify these settings:
- High Availability: set `availability_type = "REGIONAL"` in Terraform. This creates a standby in another zone with automatic failover (doubles compute cost)
- Maintenance window: schedule during lowest-traffic hours. Maintenance can cause a brief restart
- Storage auto-resize: enabled by default, but set a storage auto-resize limit to prevent runaway growth from a bug flooding the database
- Deletion protection: set `deletion_protection = true` in Terraform. Without it, a `terraform destroy` deletes the database with no confirmation
- Private IP only: disable the public IP (`ipv4_enabled = false`). If you need occasional public access for debugging, use the Auth Proxy from your local machine instead
- Database flags: tune `shared_buffers` (25% of RAM), `work_mem`, and `max_connections` based on workload. Cloud SQL exposes these as database flags
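As a sketch, flag tuning for the 2 vCPU / 8 GiB instance above might look like the fragment below. The values are illustrative starting points, and the units follow PostgreSQL conventions (`shared_buffers` in 8 kB pages, `work_mem` in kB); verify both against the Cloud SQL flags reference before applying:

```hcl
settings {
  # Illustrative starting points for a 2 vCPU / 8 GiB instance; benchmark first.
  database_flags {
    name  = "shared_buffers"
    value = "262144" # 262144 x 8 kB pages = 2 GiB (~25% of RAM)
  }
  database_flags {
    name  = "work_mem"
    value = "16384" # kB per sort/hash operation
  }
  database_flags {
    name  = "max_connections"
    value = "200"
  }
}
```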
Cloud SQL vs AWS RDS PostgreSQL
If you are evaluating both clouds, this comparison covers the differences that actually matter in practice.
| Feature | GCP Cloud SQL | AWS RDS PostgreSQL |
|---|---|---|
| Pricing model | Per vCPU-hour + per GiB-hour | Per instance-hour (fixed tiers) |
| Minimal instance | ~$103/month (2 vCPU, 8 GiB) | ~$49/month (db.t4g.medium, 2 vCPU, 4 GiB) |
| HA architecture | Regional (standby in another zone) | Multi-AZ (synchronous standby) |
| Read replicas | Same or cross-region | Same or cross-region, up to 15 |
| Serverless scale-to-zero | Not supported | Aurora Serverless v2 (scales to 0.5 ACU) |
| Private networking | VPC peering (google_service_networking) | Subnet placement (no peering needed) |
| IAM auth | Native (IAM database users) | Supported (RDS IAM auth tokens) |
| Connection proxy | Cloud SQL Auth Proxy (sidecar) | RDS Proxy (managed, $$$) |
| Backup retention | 1-365 days | 0-35 days (automated), manual snapshots unlimited |
| Max storage | 64 TB | 64 TB (128 TB with io2) |
RDS wins on entry-level pricing because of smaller instance types (t4g.micro starts at ~$12/month). Cloud SQL wins on IAM integration depth and the Auth Proxy’s zero-config connection handling. Both are solid choices; pick the one that fits your existing cloud footprint.
Troubleshooting
Error: “Failed to create subnetwork. Couldn’t find free blocks in allocated IP ranges”
The reserved IP range for VPC peering is exhausted. Either the allocated range is too small or another Cloud SQL instance already consumed it. Allocate a larger block by lowering the prefix_length (a /16 holds more addresses than a /24) or create an additional reserved range.
Error: “Connection timed out” from GKE pods
The most common cause is a missing VPC peering route. Verify the peering connection is active:
gcloud compute networks peerings list --network=NETWORK_NAME
If the peering shows ACTIVE but connections still time out, check that the GKE cluster’s node network can route to the Cloud SQL private IP range. On Shared VPC setups, the host project must have the peering, not the service project.
Error: “FATAL: Cloud SQL IAM user authentication failed”
The IAM database user does not match the connecting service account. For service accounts, the PostgreSQL username is the IAM email with the .gserviceaccount.com suffix removed (app-sa@PROJECT_ID.iam); human IAM users keep their full email address. Double-check the google_sql_user resource type: it should be CLOUD_IAM_SERVICE_ACCOUNT for service accounts and CLOUD_IAM_USER for human users.
Terraform destroy fails with “deletion_protection is enabled”
Set deletion_protection = false in the Terraform config, run terraform apply to update the instance, then run terraform destroy. This two-step process is intentional: it prevents accidental destruction of production databases.
Cleanup
Remove all resources. If using Terraform:
terraform destroy
If the instance has deletion_protection enabled, disable it first:
gcloud sql instances patch pg-demo --no-deletion-protection
gcloud sql instances delete pg-demo
Delete the read replica separately (replicas must be deleted before the primary if using gcloud):
gcloud sql instances delete pg-demo-replica
FAQ
Can Cloud SQL PostgreSQL scale to zero?
No. Cloud SQL always runs at least one instance with the configured vCPU and memory. There is no serverless mode that scales to zero. If you need scale-to-zero for development databases, consider AlloyDB Omni (self-hosted) or Aurora Serverless v2 on AWS.
What PostgreSQL versions does Cloud SQL support?
As of April 2026, Cloud SQL supports PostgreSQL 14, 15, 16, and 17. Version 17 is the latest available. Major version upgrades are supported in-place, but test the upgrade on a clone first because some extensions may need recompilation.
How does high availability work?
Regional HA creates a standby instance in a different zone within the same region. Replication is synchronous: every write is confirmed on both the primary and standby before being acknowledged to the client. Failover is automatic and typically completes in under 60 seconds. The IP address stays the same after failover.
Is the Cloud SQL Auth Proxy required?
Not required, but strongly recommended. Without the proxy, you connect directly to the private IP using a password. The proxy adds automatic SSL/TLS encryption, IAM-based authentication (no passwords), and connection health checks. On GKE, run it as a sidecar. On Compute Engine, run it as a systemd service.
How do I migrate from self-managed PostgreSQL to Cloud SQL?
Use the Database Migration Service (DMS). Create a migration job with the source as your self-managed instance and the destination as a new Cloud SQL instance. DMS handles the initial full dump and then continuous replication via logical decoding until you are ready to cut over. Test the migration with a dry run first.