Understand GCP Costs: 13 Hidden Billing Traps (2026)

Every GCP user has had some version of the same conversation with their manager. “Why is the bill this month $4,000 when I thought we were using free tier?” The answer is almost never “one massive VM we forgot about.” It is a collection of smaller line items that nobody flagged because nobody knew to look for them. Cloud NAT data processing on image pulls. Inter-zone egress between Kafka replicas. A flow log that generated 500 GiB a day. A reserved IP address attached to a VM that was deleted six months ago. A BigQuery dashboard that scans the full partitioned table every fifteen minutes because somebody forgot the WHERE clause. Each one looks small, none of them trip an alarm on their own, and together they turn what should be a $200 lab into a $4,000 surprise.

This guide walks through the thirteen most common GCP cost traps with verified April 2026 pricing, how each one shows up in billing data, and the fix or mitigation that keeps them from happening again. Every price comes from the official Google Cloud pricing pages. Every gotcha has burned real money on a real project, including ours while testing this series. If you are coming from AWS, many of the patterns will feel familiar but the pricing shapes are different enough that assumptions carry over badly. Read this before you spin up anything substantial in a new GCP project.

Prices verified April 2026 against the official Google Cloud pricing pages. All figures in USD, us-central1 unless noted otherwise.

The Mental Model: Where the Bill Actually Comes From

A typical well-behaved GCP project breaks down spend roughly like this once you look at the billing export: compute (VMs or GKE nodes) 40-60%, egress and networking 15-25%, storage 10-20%, managed services (Cloud SQL, BigQuery, Cloud Run) 10-20%, and a long tail of everything else. The surprises almost always live in the networking and storage slices, not in the compute slice where people expect to find them. If your bill feels wrong, the first place to look is egress SKUs and any service that processes per-GB data volumes.

The single best investment any GCP team can make in cost visibility is enabling billing export to BigQuery. The export itself is free. You pay pennies per month for the table storage and nothing for the data transfer. Every line item shows up as a row within about four hours of being billed, including credits, SKU descriptions, resource names, and labels. Once you have the export, the SQL queries at the end of this guide become your friend for every “why is this bill weird” investigation.

1. Cloud NAT Data Processing

Cloud NAT pricing has two components. The gateway fee is trivial ($0.0014 per VM per hour up to 32 VMs, capped at $0.044/hour beyond that). The shock is the data processing: $0.045 per GiB, every region, applied to both directions of traffic. A single GKE node pulling one TiB of container images through Cloud NAT on a rebuild day generates about $46 in data processing alone, on top of whatever the egress bill adds.
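Those two components are easy to model before you commit to an architecture. The sketch below is a planning estimate using the prices quoted above; `nat_monthly_cost` is an illustrative helper, not anything in the Google SDK:

```python
def nat_monthly_cost(vm_count, gib_processed, hours=730):
    """Estimate monthly Cloud NAT spend from its two billed components.

    Prices as quoted above: gateway $0.0014 per VM-hour, capped at
    $0.044/hour; data processing $0.045 per GiB in either direction.
    """
    hourly_gateway = min(vm_count * 0.0014, 0.044)  # cap kicks in past 32 VMs
    return hourly_gateway * hours + gib_processed * 0.045

# One GKE node pulling 1 TiB of container images through Cloud NAT:
cost = nat_monthly_cost(vm_count=1, gib_processed=1024)
```

The gateway fee is noise; the data processing term dominates as soon as real traffic moves, which is exactly why Private Google Access matters.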

The fix is Private Google Access. Enable it on the subnet and traffic to *.googleapis.com, Container Registry, Artifact Registry, Cloud Storage, and most other first-party Google services routes through Google’s internal backbone without traversing Cloud NAT at all. That means image pulls, Secret Manager reads, and BigQuery queries stop consuming Cloud NAT data processing. For third-party internet traffic (Ubuntu package mirrors, npm, Docker Hub if you still use it) you still pay, but those are usually a small fraction of total egress once Private Google Access is on.

Billing SKU to grep for in BigQuery export: look for descriptions containing Cloud NAT and especially Cloud NAT Data Processing. Anything above a few dollars per month on a small project is a signal to enable Private Google Access.

2. Egress Bandwidth (The Number One GCP Bill Killer)

Egress is where the largest surprises live. Current pricing from the VPC network pricing page:

Inter-zone, same region: $0.01 per GiB
Inter-region, North America to North America: $0.02 per GiB
Inter-region, NA to EU: $0.05 per GiB
Inter-region, NA to Asia: $0.08 per GiB
To South America: $0.14 per GiB
Premium internet egress, 1 GiB to 1 TiB: $0.12 per GiB
Premium internet egress, 1 TiB to 10 TiB: $0.11 per GiB
Premium internet egress, above 10 TiB: $0.08 per GiB
To China destinations: $0.23 per GiB

Two things catch people by surprise. First, inter-zone traffic inside the same region is billed. A multi-zone Kafka cluster with replicas spread across zones generates inter-zone traffic on every message replication. Ten TiB per month of inter-zone traffic costs around $100, which is small unless you are running dozens of such clusters, in which case it is the biggest line item on the bill. Second, hitting a VM by its external IPv4 address always counts as leaving the zone, even if the target VM is in the exact same zone. Use internal IPs whenever you can.
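The premium internet egress ladder above is tiered, so the blended per-GiB cost falls as volume grows. A quick calculator (a sketch of the pricing ladder as listed; it ignores free allowances and destination-specific rates):

```python
def premium_internet_egress_cost(gib):
    """Tiered premium-tier internet egress per the ladder above.

    Simplification: ignores free allowances and destination-specific rates.
    """
    tiers = [(1024, 0.12), (10 * 1024, 0.11), (float("inf"), 0.08)]
    cost, prev_cap = 0.0, 0.0
    for cap, rate in tiers:
        if gib <= prev_cap:
            break
        cost += (min(gib, cap) - prev_cap) * rate  # GiB billed in this tier
        prev_cap = cap
    return cost

# Inter-zone is a flat $0.01/GiB: 10 TiB of Kafka replication a month.
interzone = 10 * 1024 * 0.01
```

Two TiB of internet egress bills about $235 while the same 10 TiB of inter-zone chatter bills about $102, which is why the replication traffic hides so well on the invoice.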

Detect with a BigQuery export query filtering SKU descriptions for Network Internet, Inter-zone, and Inter-region. The cleanest way to cut egress is to co-locate chatty workloads in one zone, use VPC-native GKE (so Pod-to-Pod traffic is not billed as inter-zone), and put Cloud CDN in front of anything that serves repeated content to the internet.

3. Persistent Disk Snapshots

Standard regional snapshots cost $0.000068493 per GiB-hour, which works out to approximately $0.05 per GiB per month. Multi-regional standard snapshots are $0.083 per GiB per month, a 66% premium. Archive snapshots are $0.019 per GiB per month with a 90-day minimum retention.

Snapshots are incremental, which lulls teams into thinking they are cheap. They are, until a scheduled daily snapshot runs on a 500 GB database disk for a year. The first snapshot stores 500 GB. Each subsequent day stores only the changed blocks, but a busy database changes a lot of blocks, and twelve months later the chain has accumulated two or three TB of snapshot storage. A forgotten hourly schedule on twenty VMs quietly drifts past $400 per month.
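A rough model of chain growth makes the drift visible. The 8 GiB of daily churn below is an assumed figure for illustration; real deltas depend entirely on write patterns:

```python
def snapshot_chain_gib(base_gib, daily_change_gib, retention_days):
    """Approximate stored GiB for a daily incremental snapshot chain:
    one full copy plus one delta per retained day. Real deduplication
    varies, so treat this as a planning estimate, not a billing formula."""
    return base_gib + daily_change_gib * retention_days

RATE = 0.05  # $/GiB-month, standard regional snapshots

# 500 GiB database disk, assumed 8 GiB of changed blocks per day:
unbounded = snapshot_chain_gib(500, 8, 365) * RATE  # a year, no retention cap
capped = snapshot_chain_gib(500, 8, 30) * RATE      # 30-day retention policy
```

Under these assumptions the uncapped chain holds about 3.4 TiB after a year and bills around $171 per month, while the 30-day cap keeps it near $37.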

The fix is to always set a retention policy when creating a snapshot schedule. The gcloud compute resource-policies create snapshot-schedule command accepts a --max-retention-days flag. Thirty days is a sensible default for dev environments. Production backups usually want a combination of daily for thirty days, weekly for thirteen weeks, and monthly for twelve months, which is still cheaper than unbounded retention. Audit with gcloud compute snapshots list --sort-by=~creationTimestamp and look for anything older than your documented retention policy.

4. Idle Static External IPs

This one deserves a special mention because the pricing is counterintuitive. A static external IP address assigned to a running VM costs $0.005 per hour (about $3.65 per month). A static external IP address reserved but not attached to anything costs $0.01 per hour, or $7.30 per month, double the cost of an in-use IP. Google’s theory is that reserved IPs are a scarce resource, so unused ones should be discouraged. The result is that teams who delete VMs but leave reserved IP addresses behind keep paying for something nobody uses.

Ten orphaned IPs is $73 per month in pure waste. Audit with:

gcloud compute addresses list --filter="status=RESERVED"

Every entry in that output is actively burning money. Delete with gcloud compute addresses delete. Set up a cleanup cron that flags any RESERVED address older than seven days and either sends a Slack alert or auto-deletes based on a label. Label every reserved IP with the owning application at creation time so nothing is orphaned without a paper trail.
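The cleanup cron can be a few lines of Python over `gcloud compute addresses list --format=json`. The field names below (`name`, `status`, `creationTimestamp`) match gcloud's JSON output; the sample records are fabricated for illustration:

```python
import json
from datetime import datetime, timedelta, timezone

def stale_reserved_addresses(addresses_json, max_age_days=7, now=None):
    """Flag RESERVED (unattached) addresses older than max_age_days.

    Expects the JSON emitted by `gcloud compute addresses list --format=json`.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for addr in json.loads(addresses_json):
        created = datetime.fromisoformat(addr["creationTimestamp"])
        if addr["status"] == "RESERVED" and created < cutoff:
            stale.append(addr["name"])
    return stale

# Sample records shaped like gcloud's JSON output (fields trimmed):
sample = json.dumps([
    {"name": "old-orphan", "status": "RESERVED",
     "creationTimestamp": "2026-01-01T00:00:00+00:00"},
    {"name": "attached-ip", "status": "IN_USE",
     "creationTimestamp": "2026-01-01T00:00:00+00:00"},
])
flagged = stale_reserved_addresses(
    sample, now=datetime(2026, 4, 1, tzinfo=timezone.utc))
```

Wire the `flagged` list into a Slack webhook or a `gcloud compute addresses delete` loop depending on how aggressive your cleanup policy is.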

5. BigQuery On-Demand: The $312 Query

BigQuery on-demand pricing is $6.25 per TiB scanned, with the first 1 TiB per month free. The trap is that on-demand queries have no spending cap by default. One SELECT * against a 50 TiB partitioned table without a filter costs $312 for the single query. A Looker dashboard that refreshes hourly against a fat table and scans 200 GiB on each refresh costs roughly $29 per day, which becomes close to $880 per month that shows up as “analytics spend” and nobody questions it.
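The arithmetic is worth internalizing before reaching for the defenses. A sketch of the on-demand meter (an illustrative helper, not a BigQuery API):

```python
def bq_on_demand_cost(tib_scanned_month, rate=6.25, free_tib=1.0):
    """On-demand query cost: $6.25 per TiB scanned, first 1 TiB/month free."""
    return max(tib_scanned_month - free_tib, 0.0) * rate

# One unfiltered scan of a 50 TiB table, free tier already consumed:
single_query = bq_on_demand_cost(50, free_tib=0)
# Hourly dashboard scanning 200 GiB per refresh, over a 30-day month:
dashboard = bq_on_demand_cost(200 / 1024 * 24 * 30)
```

The dashboard alone scans over 140 TiB a month and bills roughly $870, which is why per-query byte limits and user quotas exist.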

Three defenses, in order of effectiveness. First, set maximum_bytes_billed on every query either via session default or query hint. Queries that would exceed the limit fail loudly instead of silently charging. Second, configure custom query quotas at the project and user level via the GCP quota management page. The “Query usage per day” quota lets you cap the total TiB per user per day. Third, partition and cluster every large table so well-written queries read a fraction of the total bytes instead of the whole table.

For steady workloads, capacity pricing (BigQuery editions) is often thirty to fifty percent cheaper than on-demand. Standard edition is $0.04 per slot-hour on-demand, with 1-year commit at $0.036 and 3-year commit at $0.032. If your BigQuery spend is predictable, do the math on capacity and switch. If your spend is spiky, on-demand with strict quotas is usually the right answer.

6. Cloud Logging Ingestion (The Data Access Logs Trap)

Cloud Logging ingestion is $0.50 per GiB beyond the first 50 GiB free per project per month. That sounds generous until somebody enables Cloud Audit Data Access logs on a busy Cloud Storage bucket. Data Access logs on a public-facing bucket with a lot of reads can produce 100 GiB of log entries per day, which at $0.50 per GiB beyond the free tier works out to about $50 per day per bucket. A well-intentioned compliance toggle becomes a $1,500 per month surprise overnight.

Vended logs (VPC Flow Logs, Firewall Logs, Cloud NAT Logs) are billed separately at $0.25 per GiB ingestion with no free tier. The free 50 GiB only applies to application logs written via the Cloud Logging API, not to infrastructure logs emitted by Google’s own services. This matters because teams assume they have a 50 GiB cushion on flow logs and then get surprised.
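The two meters combine into one estimate, and the free tier only offsets the application-log term (a sketch with the rates quoted above):

```python
def logging_monthly_cost(app_log_gib, vended_log_gib):
    """Cloud Logging ingestion for one project-month.

    Application logs: first 50 GiB free, then $0.50/GiB.
    Vended logs (VPC Flow, Firewall, NAT): $0.25/GiB from the first byte.
    """
    return max(app_log_gib - 50, 0) * 0.50 + vended_log_gib * 0.25

# Data Access logs on one busy bucket: 100 GiB/day for 30 days.
bucket_trap = logging_monthly_cost(100 * 30, 0)
```

That single bucket bills about $1,475 for the month after the 50 GiB free tier, which is within rounding distance of the $1,500 surprise described above.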

Fixes: keep Data Access logs scoped to specific sensitive resources, never project-wide. Write exclusion filters in the _Default log sink to drop noisy log types before they count against ingestion. Route compliance logs directly to Cloud Storage via a sink, where long-term retention is cheaper than keeping logs in Cloud Logging. Audit active sinks with gcloud logging sinks list and review exclusion filters quarterly.

7. GKE Cluster Management Fee

Every GKE cluster, Standard or Autopilot, regional or zonal, is billed a flat $0.10 per cluster per hour for the control plane. That is $73 per month per cluster before any nodes are provisioned. Ten dev clusters “with no nodes running” still cost $730 per month just in management fees.

There is one free credit worth knowing. Every GCP billing account gets a $74.40 per month GKE management fee credit, which covers exactly one zonal Standard cluster or one Autopilot cluster. Regional Standard clusters are not eligible for the credit, so if you want the free cluster you need a zonal or Autopilot setup. This is why our article testing lab uses a single Autopilot cluster and tears it down after each session: the management fee is effectively free for one cluster at a time, and the Autopilot compute bill scales to the exact pods we request.

For teams that actually need dev clusters: consolidate. One shared cluster with namespaces and RBAC costs the same as one cluster, whereas ten per-developer clusters cost ten times the management fee before any workload runs. Delete clusters in CI/CD tear-down hooks. If you want something that survives an article lab session, check our GKE Workload Identity guide where the teardown command is literally the last step of the tested walkthrough.

8. Cloud SQL Idle Instances

Cloud SQL pricing is per-second, billed for vCPU and memory independently. Enterprise edition general-purpose costs $0.0413 per vCPU per hour and $0.007 per GiB of memory per hour. High availability doubles both. There is no “stop equals free” state: a stopped Cloud SQL instance still bills for storage and for the instance itself unless you fully delete it.

A minimal 2 vCPU / 8 GiB PostgreSQL instance costs roughly (2 * 0.0413) + (8 * 0.007) per hour, or about $101 per month with zero connections. High availability doubles that to $202. This is the same instance you might spin up on a developer’s machine for $0. For dev and staging workloads, the right pattern is often either a single shared Cloud SQL instance with multiple logical databases (one Cloud SQL, ten teams), or a small Compute Engine VM running PostgreSQL directly, which costs a fraction of the managed price for the same footprint.
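That $101 figure falls straight out of the per-hour rates, and a small helper makes the math reusable for sizing (prices as quoted above; storage is billed separately and excluded):

```python
def cloud_sql_monthly(vcpus, mem_gib, ha=False, hours=730):
    """Enterprise edition general-purpose: $0.0413/vCPU-hr + $0.007/GiB-hr.

    High availability doubles both. Storage is billed separately.
    """
    hourly = vcpus * 0.0413 + mem_gib * 0.007
    if ha:
        hourly *= 2
    return hourly * hours

dev = cloud_sql_monthly(2, 8)             # idle dev instance
prod = cloud_sql_monthly(2, 8, ha=True)   # same footprint with HA
```

Run the numbers before provisioning: a 4 vCPU / 16 GiB HA pair is already past $400 per month before a single query runs.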

Cloud SQL does not currently offer a serverless scale-to-zero mode for PostgreSQL or MySQL. If your dev instances can tolerate a 30-second cold start, stop them outside business hours with gcloud sql instances patch INSTANCE --activation-policy=NEVER (the API equivalent is instances.patch with activationPolicy set to NEVER). This pauses vCPU and memory billing but still charges for storage. Not ideal, but better than 24/7 billing.

9. VPC Flow Logs Ingestion

VPC Flow Logs are billed through Cloud Logging as vended logs at $0.25 per GiB ingestion with no free tier. A single busy GKE subnet with default flow log settings (5-second aggregation, 0.5 sample rate) can emit 500 GiB per day, which is $125 per day or about $3,750 per month per subnet. Multiply by every subnet if somebody enabled flow logs “for security” across the entire VPC and you are looking at the largest single line item on the bill.

The fix is tuning. Set aggregation interval to 15 minutes, sampling rate to 0.1 or 0.01, and enable flow logs only on subnets that actually need the audit data. Route them to BigQuery via a log sink where storage is cheaper than retaining in Cloud Logging’s default buckets. For most teams, “flow logs on every subnet by default” is overkill; flow logs on the DMZ subnet and nowhere else catches real security incidents at a fraction of the cost.
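Because flow-log volume scales roughly linearly with the sampling rate, the tuning payoff is easy to model. The 1,000 GiB/day full-sample baseline below is an assumed figure for illustration; real volume also depends on the aggregation interval and traffic shape:

```python
def flow_log_monthly(gib_per_day_full_sample, sample_rate, days=30):
    """Vended-log ingestion at $0.25/GiB, volume scaled by sampling rate.

    Linear scaling is a simplification: aggregation interval and traffic
    shape also change the emitted volume.
    """
    return gib_per_day_full_sample * sample_rate * days * 0.25

default_cfg = flow_log_monthly(1000, 0.5)   # default 0.5 sampling
tuned_cfg = flow_log_monthly(1000, 0.01)    # 0.01 sampling
```

Under these assumptions the default configuration bills $3,750 per month while 0.01 sampling bills $75, a fifty-fold difference for a setting most teams never look at.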

10. Committed Use Discounts You Forgot to Buy

GCP offers two overlapping discount programs. Sustained Use Discounts apply automatically after 25% of the month of continuous use, up to 30% off at full-month utilization. They sound great until you notice the fine print: E2, Tau T2D, C3, and most newer machine families have no SUD at all. If your fleet is mostly E2 or C3, you are paying full list price regardless of uptime. Committed Use Discounts are the explicit contract: commit to a fixed spend or machine count for 1 or 3 years in exchange for roughly 25% (1 year) to 55% (3 years) off the on-demand rate. Flexible CUDs are the safer default because they apply to a dollar amount of spend rather than specific machine types, which means they follow you across machine family changes.

Audit via the Billing Console FinOps Hub. The “Commitment analysis” report shows your steady-state baseline (P20 daily spend over the last 90 days) and tells you exactly how much to commit. For any workload running steady at a predictable level, flexible CUDs pay for themselves within a month and the 1-year term is basically risk-free. The three-year term is only worth it if you are certain the workload will still be on GCP at the end of the commit period.
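The P20-baseline idea is simple enough to reproduce against your own billing export. Below is a nearest-rank sketch of my reading of the methodology described above, not Google's exact formula:

```python
def commit_baseline(daily_spend, percentile=0.20):
    """P20 of recent daily spend: a conservative steady-state floor
    to size a flexible CUD against (nearest-rank, no interpolation)."""
    ordered = sorted(daily_spend)
    idx = max(int(len(ordered) * percentile) - 1, 0)
    return ordered[idx]

# Ten days of spend with two bursty outliers: commit to the floor,
# not the average, so the commitment never goes unused.
spend = [100, 102, 98, 250, 101, 99, 103, 400, 100, 97]
baseline = commit_baseline(spend)
```

Here the average is inflated past $145 by two burst days, but the P20 floor of $98 is the level the workload actually sustains, which is the number worth committing to.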

11. Load Balancer Forwarding Rules

Load balancer pricing has a per-rule base charge that surprises teams building microservice architectures the wrong way. The first five forwarding rules in a project are billed as a single block at $0.025 per hour, which is $18.25 per month whether you use one rule or five. Each rule beyond five adds $0.01 per hour, or $7.30 per month. On top of that, the L7 external Application Load Balancer bills $0.008 per GiB for both inbound and outbound data processing.

Ten microservices each with their own external HTTPS forwarding rule cost $0.025 + (5 * $0.01) = $0.075 per hour, about $55 per month in forwarding rules alone, plus data processing, and the meter keeps climbing with every new service. The same ten microservices behind one shared global External Application Load Balancer with URL maps routing api.example.com, app.example.com, and admin.example.com to different backend services cost $18.25 for the single forwarding rule plus data processing. The URL map feature is free and handles exactly this pattern.
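The tiering means consolidation pays off quickly. A sketch of the rule meter, assuming the block-of-five schedule on the VPC pricing page (first five rules billed as one $0.025/hour block, $0.01/hour each beyond):

```python
def forwarding_rules_monthly(rule_count, hours=730):
    """First five forwarding rules bill as one $0.025/hour block;
    each rule past five adds $0.01/hour."""
    if rule_count == 0:
        return 0.0
    return (0.025 + max(rule_count - 5, 0) * 0.01) * hours

ten_separate = forwarding_rules_monthly(10)  # ten LBs, one rule each
one_shared = forwarding_rules_monthly(1)     # one LB plus a free URL map
```

Ten separate rules bill about $55 per month against $18.25 for the single shared rule, before any data processing on either side.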

Audit with gcloud compute forwarding-rules list --global and --regions=all. Put Cloud CDN in front of anything serving static content to cut outbound data processing by whatever percentage of traffic is cacheable.

12. Premium vs Standard Network Tier

Every VM you create on GCP defaults to Premium network tier. Premium routes traffic over Google’s global backbone from end to end, which is the right choice for global services that need consistent low latency across continents. For a single-region workload serving users in the same continent, Standard tier is about 30% cheaper on the first TiB and up to 60% cheaper at scale, and the latency difference is usually imperceptible.

Standard tier egress to North America destinations starts at $0.085 per GiB (first 200 GiB free in some configurations), drops to $0.065 per GiB in the 10 to 150 TiB range, and $0.045 per GiB above 150 TiB. Compared to Premium’s $0.12 / $0.11 / $0.08 ladder, the savings are real.
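The comparison is easiest to see with both ladders in one tiered calculator (rates as quoted in this article; free allowances ignored):

```python
def tiered_cost(gib, tiers):
    """Generic tiered per-GiB pricing; tiers = [(upper_bound_gib, rate), ...]."""
    cost, prev = 0.0, 0.0
    for cap, rate in tiers:
        if gib <= prev:
            break
        cost += (min(gib, cap) - prev) * rate  # GiB billed in this tier
        prev = cap
    return cost

STANDARD = [(10 * 1024, 0.085), (150 * 1024, 0.065), (float("inf"), 0.045)]
PREMIUM = [(1024, 0.12), (10 * 1024, 0.11), (float("inf"), 0.08)]

# 5 TiB/month of same-continent egress under each tier:
saving = tiered_cost(5 * 1024, PREMIUM) - tiered_cost(5 * 1024, STANDARD)
```

At 5 TiB per month that works out to roughly $573 on Premium against $435 on Standard, about $138 saved for a latency difference most same-continent users will never notice.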

Switch the project default with one command:

gcloud compute project-info update --default-network-tier=STANDARD

For any workload that is not genuinely global, this is a pure win. Keep Premium only for services that actively benefit from the global anycast routing.

13. The “Always Free” e2-micro Regional Trap

The Always Free tier on GCP includes one e2-micro instance per month. The fine print that trips people up: the free tier is only available in us-west1 (Oregon), us-central1 (Iowa), and us-east1 (South Carolina). Spin up an e2-micro in europe-west1 “just for low latency to my users” and it bills at the full rate, which is roughly $6 to $7 per month depending on disk size. The free tier is not region-agnostic, and the free allotment does not cover e2-small or anything larger.

Other free-tier gotchas: the free 30 GB persistent disk only covers one VM. Any secondary disk or disk larger than 30 GB is billed normally. Egress is free up to 1 GiB per month to most destinations except China and Australia, so a VM with steady traffic will still see an egress line item. The free tier is a learning aid, not a production hosting plan.

Billing Export to BigQuery: The Cost-Tracking Query Every Team Needs

Enable billing export to BigQuery from the Billing Console (Billing → Billing export → BigQuery export). Select a dataset and turn on both Standard usage cost and Detailed usage cost. The export is free and data starts flowing within a few hours, then refreshes every four hours or so.

The basic “where is my money going” query, grouped by service:

SELECT
  service.description AS service,
  SUM(cost) AS cost,
  SUM(CAST(IFNULL((SELECT SUM(c.amount) FROM UNNEST(credits) c), 0) AS NUMERIC)) AS credits,
  SUM(cost) + SUM(CAST(IFNULL((SELECT SUM(c.amount) FROM UNNEST(credits) c), 0) AS NUMERIC)) AS net_cost
FROM `PROJECT.DATASET.gcp_billing_export_v1_BILLING_ACCOUNT_ID`
WHERE _PARTITIONTIME BETWEEN TIMESTAMP('2026-03-01') AND TIMESTAMP('2026-04-01')
GROUP BY service
ORDER BY net_cost DESC;

Drill into a specific trap (Cloud NAT for example) by SKU description:

SELECT sku.description,
       SUM(usage.amount_in_pricing_units) AS units,
       usage.pricing_unit,
       SUM(cost) AS cost
FROM `PROJECT.DATASET.gcp_billing_export_v1_BILLING_ACCOUNT_ID`
WHERE _PARTITIONTIME >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
  AND sku.description LIKE '%NAT%'
GROUP BY sku.description, usage.pricing_unit
ORDER BY cost DESC;

Swap the LIKE clause for %Snapshot%, %Flow Log%, %Inter-zone%, or any of the other traps in this guide. Schedule a weekly query that writes the top ten SKUs by cost to a Slack or email notification and any surprise line item surfaces before it becomes a $4,000 problem.

Setting Up Budget Alerts (Do This Before Anything Else)

The single most important thing any new GCP project should do on day one is set up a budget with threshold alerts. Budgets are free, they take one command, and they are the difference between “we had a runaway BigQuery query” and “we had a runaway BigQuery query that nobody noticed until the credit card bounced.” Create a budget with alerts at 50%, 75%, 90%, and 100% of the expected monthly spend:

gcloud billing budgets create \
  --billing-account=BILLING_ACCOUNT_ID \
  --display-name="Monthly project budget" \
  --budget-amount=500 \
  --threshold-rule=percent=0.5,basis=current-spend \
  --threshold-rule=percent=0.75,basis=current-spend \
  --threshold-rule=percent=0.9,basis=current-spend \
  --threshold-rule=percent=1.0,basis=current-spend

Hook the budget to a Pub/Sub topic and a Cloud Function that pages whoever is on-call when the 90% threshold fires. Budgets fire email notifications by default, but emails get ignored. A pager integration forces acknowledgment and dramatically shortens the “something is wrong with our bill” response time.

FAQ

What is the single biggest GCP cost surprise?

Egress bandwidth. It is the most common “why is this bill so high” root cause on a real project and the hardest to fix after the fact because application architecture dictates how much cross-zone and internet egress the workload generates. Fixing it means changing where services talk to each other, which is a bigger change than anything else on this list. Enable Private Google Access, use internal IPs everywhere, co-locate chatty services in one zone, put Cloud CDN in front of anything internet-facing, and measure everything via the BigQuery billing export.

Is GKE Autopilot cheaper than GKE Standard?

For most workloads, yes. Both cost the same $0.10 per hour cluster management fee. Autopilot bills per pod request (vCPU, memory, ephemeral storage) so you pay for exactly what your workloads ask for, with no wasted node capacity. Standard bills per node regardless of pod packing efficiency, which on small or bursty workloads leaves significant unused capacity. Autopilot becomes more expensive than Standard only when you are running workloads that perfectly fill nodes and have no spare capacity, which is rare in practice.

Are Sustained Use Discounts still worth anything in 2026?

On N1 and N2 machine families they still apply automatically up to 30% off at full-month utilization. On E2, Tau T2D, C3, and most newer families they have been eliminated. For workloads on newer machine types, Committed Use Discounts (Flexible CUD at the spend level) are the primary discount mechanism and you need to buy them explicitly. The FinOps Hub in the Billing Console shows exactly which discounts you are already getting and which ones you are leaving on the table.

How do I find out which resource is generating the most egress?

The detailed billing export to BigQuery includes resource labels when available. Query for SKUs matching Network Internet, Inter-zone, or Inter-region, group by the resource labels, and sort by cost. For GKE-specific breakdown, install kubecost or use the Cost Allocation feature in GKE which attributes egress to namespaces and workloads. For VMs outside GKE, the label-based attribution in the billing export is the cleanest answer.

Does GCP have the equivalent of AWS Savings Plans?

Flexible Committed Use Discounts are the closest equivalent. You commit to a dollar-per-hour spend for one or three years and get roughly 25% (1 year) to 55% (3 years) off the on-demand rate. Unlike the older per-machine-type CUDs, Flexible CUDs follow you across machine families and regions, which is how AWS Savings Plans work. For any workload with a predictable steady-state baseline, Flexible CUDs at the 1-year term are essentially risk-free and should be the default. For the AWS version of this conversation, see our AWS Costs Explained guide.

Should I use Standard or Premium network tier?

Standard for anything that serves users in the same continent as the workload. The cost savings are 30 to 60% on egress and the latency difference is usually imperceptible for same-continent traffic. Premium for genuinely global services where users on three continents hit the same endpoint and you want consistent low latency from every region. Default to Standard, upgrade specific services to Premium only when you can measure a latency difference that matters.

Where to Go Next

Cost discipline is a habit, not a one-time project. The habits that matter most: enable billing export to BigQuery on every project, set budget alerts with pager integration, run the SKU query weekly, and audit for orphaned resources monthly. For the AWS counterpart of this guide, see our AWS Costs Explained. For the GCP-specific services that show up repeatedly in these traps, our GKE Workload Identity guide and Google Cloud Secret Manager tutorial are the tested walkthroughs that match this article’s approach of “real commands, real output, real costs.” The billing export documentation and the VPC network pricing page are the two reference docs worth bookmarking for anyone with financial accountability for a GCP project.
