Consolidation patterns that depend on good intentions decay fast. One PR at 5pm on a Friday that adds a google_compute_managed_ssl_certificate in a new project, and the cert inventory starts sprawling again. The only durable fix is to make the shared path the path of least resistance, and to block the sprawl path with CI. This article publishes a reusable Terraform module that onboards a new service onto the shared LB in a 5-line Terragrunt file, plus an OPA/Conftest policy that fails terraform plan when someone tries to create a per-service cert outside the module. The two together turn the consolidation outcome from “an ongoing initiative” into “the default, enforced.”
Tested April 2026 with Terragrunt 0.69, OpenTofu 1.10, Conftest 0.54, google provider 6.12. Module is published under infra-gcp/modules/service-onboarding/ and stands alongside the earlier modules in this series.
What the Module Replaces
Every service onboarding used to be: create a DNS A record, issue a ManagedCertificate, create an Ingress with its own static IP, wait for the cert to provision. Four to six files per service, spread across two or three repositories, with enough surface area for a typo to produce a broken cert three days later.
After consolidation, the shared LB + cert map from the previous articles already covers every *.cfg-lab hostname. A new service needs three things: a DNS record pointing its hostname at the shared LB, a URL-map host rule routing that hostname to the new backend, and the backend itself. The cert is already there. The LB is already there. Onboarding is glue, not new infrastructure.
The service-onboarding module is exactly that glue.
Module Inputs and Shape
The target API looks like this:
```hcl
module "market_onboarding" {
  source        = "../../modules/service-onboarding"
  name          = "market"
  hostname      = "market.cfg-lab.computingforgeeks.com"
  dns_zone_name = "cfg-lab"
  shared_lb_ip  = dependency.gxlb.outputs.ip_address
  url_map_name  = dependency.gxlb.outputs.url_map_name
  backend_type  = "cloud_run" # or "gke_neg", "internet_neg", "backend_bucket"
  backend_ref   = google_cloud_run_service.market.name
  environment   = "lab"
}
```
One module call. A handful of inputs. The output is a DNS A record, a backend resource of the requested type, a backend service, and an exported host-rule stub that routes the hostname to that backend through the shared URL map. The cert map and LB are consumed by dependency, not created here; the module enforces the shared path by not exposing a cert-creation input at all.
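Wrapped in Terragrunt, that call becomes the small onboarding file the introduction promised. A sketch, assuming this series' live-repo layout and dependency names (the paths and the `backend_ref` value are hypothetical):

```hcl
# live/lab/market/terragrunt.hcl -- hypothetical path
include "root" {
  path = find_in_parent_folders()
}

# Pulls the shared LB's IP and URL-map name from the gxlb stack's outputs.
dependency "gxlb" {
  config_path = "../gxlb"
}

terraform {
  source = "../../../modules/service-onboarding"
}

inputs = {
  name          = "market"
  hostname      = "market.cfg-lab.computingforgeeks.com"
  dns_zone_name = "cfg-lab"
  shared_lb_ip  = dependency.gxlb.outputs.ip_address
  url_map_name  = dependency.gxlb.outputs.url_map_name
  backend_type  = "cloud_run"
  backend_ref   = "market" # Cloud Run service name; a string here, not a resource reference
  environment   = "lab"
}
```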
The Module Body (Core)
The module's main file wires a DNS A record pointing at the shared LB, a backend dispatched on the requested backend type, and a backend service that the shared URL map consumes. A structured description ties the backend service back to the module for auditing (backend services don't accept GCP labels).
```hcl
resource "google_dns_record_set" "a" {
  project      = var.project_id
  managed_zone = var.dns_zone_name
  name         = "${var.hostname}."
  type         = "A"
  ttl          = 300
  rrdatas      = [var.shared_lb_ip]
}
```
```hcl
locals {
  # All values in an HCL map are evaluated eagerly, so indexing [0] on a
  # zero-count resource would error for the branches that weren't created.
  # one() over a splat returns null instead when the resource list is empty.
  backend_id = {
    cloud_run      = one(google_compute_region_network_endpoint_group.cloud_run[*].id)
    gke_neg        = var.backend_ref
    internet_neg   = one(google_compute_global_network_endpoint_group.internet[*].id)
    backend_bucket = one(google_compute_backend_bucket.bucket[*].id)
  }[var.backend_type]
}
```
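A type-constrained input with a validation block keeps the dispatch honest and fails fast at plan time instead of deep inside the module. A sketch, assuming the input names used above:

```hcl
variable "backend_type" {
  type        = string
  description = "Which backend resource the module should create or reference."

  validation {
    condition     = contains(["cloud_run", "gke_neg", "internet_neg", "backend_bucket"], var.backend_type)
    error_message = "backend_type must be one of: cloud_run, gke_neg, internet_neg, backend_bucket."
  }
}
```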
```hcl
resource "google_compute_backend_service" "this" {
  project               = var.project_id
  name                  = "${var.name}-backend"
  load_balancing_scheme = "EXTERNAL_MANAGED"

  backend {
    group = local.backend_id
  }

  # Backend services don't support GCP labels; record provenance in the
  # description so audits can trace the resource back to this module.
  description = "service=${var.name} environment=${var.environment} managed_by=service-onboarding-module"
}
```
Three things are deliberately missing. No google_compute_managed_ssl_certificate. No google_compute_target_https_proxy. No google_compute_global_forwarding_rule. The module cannot produce per-service cert sprawl because the resources that produce it are not in the module. If someone wants to fork the module and add those, the OPA policy blocks the PR.
URL Map Host Rule Registration
The shared LB’s URL map holds the routing logic. The module can’t overwrite the URL map (it’s managed by the gxlb stack), but it can export a host-rule stub that the gxlb module consumes:
```hcl
output "url_map_host_rule" {
  value = {
    hosts        = [var.hostname]
    path_matcher = var.name
    service_id   = google_compute_backend_service.this.id
  }
}
```
The gxlb module iterates over a map of these stubs, one per service, and composes the URL map. Adding a new service means adding an entry to the shared LB’s host_rules input; the actual host-rule construction still lives in the gxlb module.
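On the consuming side, that composition can be sketched with dynamic blocks. The variable names (`var.host_rules`, `var.default_backend_id`) are assumptions; the stub shape matches the output above:

```hcl
# Hypothetical gxlb-side composition: var.host_rules is a map of the
# url_map_host_rule stubs exported by each service-onboarding call.
resource "google_compute_url_map" "shared" {
  project         = var.project_id
  name            = "shared-lb"
  default_service = var.default_backend_id

  dynamic "host_rule" {
    for_each = var.host_rules
    content {
      hosts        = host_rule.value.hosts
      path_matcher = host_rule.value.path_matcher
    }
  }

  dynamic "path_matcher" {
    for_each = var.host_rules
    content {
      name            = path_matcher.value.path_matcher
      default_service = path_matcher.value.service_id
    }
  }
}
```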
The OPA/Conftest Policy
The policy is plain Rego. It inspects the JSON form of the plan (terraform plan -out=tfplan.binary followed by terraform show -json tfplan.binary) and fails on any creation of the forbidden types outside the sanctioned module paths:
```rego
package main

deny[msg] {
  resource := input.resource_changes[_]
  resource.type == "google_compute_managed_ssl_certificate"
  resource.change.actions[_] == "create"
  not contains(resource.address, "module.shared_certs")
  msg := sprintf(
    "Per-service managed cert '%s' is not allowed. Onboard via service-onboarding module.",
    [resource.address])
}

deny[msg] {
  resource := input.resource_changes[_]
  resource.type == "google_certificate_manager_certificate"
  resource.change.actions[_] == "create"
  not contains(resource.address, "module.shared_certs")
  msg := sprintf(
    "Certificate Manager cert '%s' created outside the shared-certs module. Reuse existing wildcard.",
    [resource.address])
}

deny[msg] {
  resource := input.resource_changes[_]
  resource.type == "google_compute_global_forwarding_rule"
  resource.change.actions[_] == "create"
  not contains(resource.address, "module.gxlb")
  msg := sprintf(
    "New global forwarding rule '%s' outside the shared LB module. Use service-onboarding instead.",
    [resource.address])
}
```
Three rules, each fails on a specific sprawl pattern. A PR that introduces any of them produces a clear error with a directive to use the shared path.
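The rules can be unit-tested with `conftest verify` before they gate anything. A sketch test file (the fixture addresses are hypothetical; the input shape mirrors `terraform show -json` output):

```rego
package main

# A bare per-service cert must be denied.
test_denies_per_service_managed_cert {
  deny[_] with input as {"resource_changes": [{
    "address": "google_compute_managed_ssl_certificate.market",
    "type": "google_compute_managed_ssl_certificate",
    "change": {"actions": ["create"]}
  }]}
}

# A cert created inside the shared-certs module must pass.
test_allows_shared_certs_module {
  count(deny) == 0 with input as {"resource_changes": [{
    "address": "module.shared_certs.google_certificate_manager_certificate.wildcard",
    "type": "google_certificate_manager_certificate",
    "change": {"actions": ["create"]}
  }]}
}
```

Run with `conftest verify --policy policy/` in the same directory as the deny rules.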
Wiring the Policy in CI
Terragrunt's plan output feeds Conftest on every PR. The GitHub Actions gate looks like this (assuming the conftest binary is already installed on the runner):

```yaml
- name: Terraform plan
  run: |
    terragrunt run-all plan -out=tfplan.binary --terragrunt-non-interactive
    # run-all show would interleave one JSON document per stack into a
    # single stream, which Conftest can't parse; convert each plan in place.
    find . -name tfplan.binary \
      -execdir sh -c 'terraform show -json tfplan.binary > tfplan.json' \;

- name: Conftest policy check
  run: conftest test --policy policy/ $(find . -name tfplan.json)
```
The job runs on every PR. A policy violation fails the job and surfaces the denial message as a CI error; the developer fixes it either by using the service-onboarding module or by adding an explicit exemption (a comment on the resource with a justification and an approver). Exemption handling is out of scope for this article, but the pattern is straightforward: a second rule that matches a sprawl-exempt = "<reason>" label on the resource and skips the denial.
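The exemption pattern can be sketched as a helper rule that the deny rules consult. The label name follows the text above; the helper's shape is an assumption, not the article's shipped policy:

```rego
package main

# Hypothetical exemption helper: a resource escapes denial when its planned
# labels carry a non-empty sprawl-exempt justification.
exempt(resource) {
  resource.change.after.labels["sprawl-exempt"] != ""
}

# Each deny rule then gains one line before its msg assignment:
#   not exempt(resource)
```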
ArgoCD for the Runtime Side
Terraform handles the infra layer — LB, cert map, DNS, foundational stacks. ArgoCD handles the GKE layer — Deployments, Services, HTTPRoutes, Gateway configs. The clean split keeps the two systems’ blast radii separate: a broken HTTPRoute doesn’t break the LB; a broken URL map doesn’t break a Deployment.
Install ArgoCD on the Autopilot cluster from earlier in the series:
```shell
kubectl create namespace argocd
kubectl apply -n argocd -f \
  https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
```
Wire it to a demo repo holding one ArgoCD Application per service under argocd/apps/:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: food-web
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/cfg-labs/gcp-shared-traffic-demo.git
    targetRevision: HEAD
    path: apps/article-05/cfg-demo
  destination:
    server: https://kubernetes.default.svc
    namespace: cfg-demo
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```
A new service gets: a Terragrunt directory with the service-onboarding module call (infra), plus an ArgoCD Application CR in the demo repo pointing at the service’s K8s manifests (runtime). Two files touched per service. Everything else — cert, LB, DNS zone, CAA policy, monitoring, runbook — is already shared.
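Concretely, the per-service footprint across the two repos looks something like this (hypothetical paths, following the series' layout):

```
infra-live/
  lab/
    market/
      terragrunt.hcl        # infra: calls the service-onboarding module
gcp-shared-traffic-demo/
  argocd/apps/
    market.yaml             # runtime: ArgoCD Application for the K8s manifests
```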
App-of-Apps
Instead of one Application per service, one root Application that manages every service’s Application. ArgoCD calls this app-of-apps. The root lives in argocd/bootstrap/:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: platform-apps
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/cfg-labs/gcp-shared-traffic-demo
    targetRevision: HEAD
    path: argocd/apps
    directory:
      recurse: true
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
```
Bootstrap the root once. Every new service added to argocd/apps/ gets picked up by the root, synced, materialized. Onboarding collapses to: add a Terragrunt file, add an ArgoCD Application manifest, commit. CI checks the Terragrunt plan against the OPA policy; if green, merging deploys.
Rolling Back
Terraform rollback: revert the PR, run terragrunt apply. The service-onboarding module removes the backend service, URL-map host rule, and DNS record. Cert map stays intact because the module never touched it.
ArgoCD rollback: revert the commit in the GitOps repo. Auto-sync picks up the revert, removes the Deployment/Service/HTTPRoute. Rollback latency is measured in the ArgoCD sync interval (default 3 minutes).
Both rollback paths are proportional: you removed one service, the blast radius of the rollback is exactly one service. This is the operational property the consolidation pattern buys.
Publishing the Module
Internal Terraform registry or a Git-sourced shared modules repo — both work. The pattern for shared modules:
- Repo: `github.com/<org>/infra-modules`
- Tag releases with SemVer: `v1.0.0`, `v1.1.0`, etc.
- Every module has a `versions.tf` locking the provider versions
- Every module ships with a `README.md` showing the minimal Terragrunt call
- Breaking changes bump the major version; additive changes bump the minor; bug fixes bump the patch
Consumers pin the version in their Terragrunt source line:
```hcl
terraform {
  source = "git::https://github.com/c4geeks/infra-modules.git//service-onboarding?ref=v1.2.0"
}
```
Pin rigorously. An unpinned module gets the default branch, which turns a random commit into a production change for every consumer. Pin to tags, bump deliberately.
CI Gate Stacking
OPA/Conftest is one layer. A complete CI gate for this pattern includes:
- tflint: catches provider-specific anti-patterns and typos
- terraform validate + fmt: syntax and style
- OPA/Conftest: the policy rules above
- tfsec or checkov: generic security misconfiguration scan
- Drift check: `terragrunt plan` against prod on merge, post to Slack if non-empty
The policy layer is the one that enforces the consolidation outcome specifically. The others catch generic infra mistakes that would exist regardless of the consolidation pattern.
What This Makes Possible
Before this article: “consolidation is a great idea, please remember to do it.” After this article: consolidation is the default path and non-consolidated PRs fail CI. That’s the step-change that makes the outcome durable. Every article before this one was optional improvement; this one is the guardrail that stops regression.
The next article in this series is the capstone: a full demo app running on every module built so far, with a zero-incident rotation demonstrated live. That’s where every individual outcome from this consolidation plan plays together as a single story.
Cleanup
The service-onboarding module is a reusable asset, not ephemeral infra. Leave it in version control. The ArgoCD install on the cluster can be destroyed with kubectl delete namespace argocd at session end; reinstall takes under a minute for the next session. The root app-of-apps Application cascades a delete to every child Application when you kubectl delete it (provided the Applications carry ArgoCD's resources-finalizer), so the order matters: root first, then namespace.