
Enforce GCP Cert Consolidation with Terraform and ArgoCD

Consolidation patterns that depend on good intentions decay fast. One PR at 5pm on a Friday that adds a google_compute_managed_ssl_certificate in a new project, and the cert inventory starts sprawling again. The only durable fix is to make the shared path the path of least resistance, and to block the sprawl path with CI. This article publishes a reusable Terraform module that onboards a new service onto the shared LB in a 5-line Terragrunt file, plus an OPA/Conftest policy that fails terraform plan when someone tries to create a per-service cert outside the module. The two together turn the consolidation outcome from “an ongoing initiative” into “the default, enforced.”

Original content from computingforgeeks.com - post 166204

Tested April 2026 with Terragrunt 0.69, OpenTofu 1.10, Conftest 0.54, google provider 6.12. Module is published under infra-gcp/modules/service-onboarding/ and stands alongside the earlier modules in this series.

What the Module Replaces

Every service onboarding used to be: create a DNS A record, issue a ManagedCertificate, create an Ingress with its own static IP, wait for the cert to provision. Four to six files per service, spread across two or three repositories, with enough surface area for a typo to produce a broken cert three days later.

After consolidation, the shared LB + cert map from the previous articles already covers every *.cfg-lab hostname. A new service needs: a hostname registered in DNS pointing at the shared LB, a URL-map host rule routing to the new backend, and a pointer to the backend itself. The cert is already there. The LB is already there. Onboarding is glue, not new infrastructure.

The service-onboarding module is exactly that glue.

Module Inputs and Shape

The target API looks like this:

# market/terragrunt.hcl (directory layout assumed)
terraform {
  source = "../../modules/service-onboarding"
}

dependency "gxlb" {
  config_path = "../gxlb"
}

inputs = {
  name          = "market"
  hostname      = "market.cfg-lab.computingforgeeks.com"
  dns_zone_name = "cfg-lab"
  shared_lb_ip  = dependency.gxlb.outputs.ip_address
  url_map_name  = dependency.gxlb.outputs.url_map_name
  backend_type  = "cloud_run" # or "gke_neg", "internet_neg", "backend_bucket"
  backend_ref   = "market"    # name of the Cloud Run service to route to
  environment   = "lab"
}

One Terragrunt file. A handful of inputs. The result is a DNS A record, a backend resource of the requested type, and a host-rule stub that the shared URL map consumes to route the hostname to the new backend. The cert map and LB are consumed by dependency, not created here; the module enforces the shared path by not exposing a cert-creation input at all.
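A validation block on backend_type keeps the dispatch closed. A sketch of the relevant variable (the name matches the call above; the validation wording is an assumption):

```hcl
variable "backend_type" {
  type        = string
  description = "Backend resource the module creates: cloud_run, gke_neg, internet_neg, or backend_bucket."

  validation {
    condition     = contains(["cloud_run", "gke_neg", "internet_neg", "backend_bucket"], var.backend_type)
    error_message = "backend_type must be one of: cloud_run, gke_neg, internet_neg, backend_bucket."
  }
}
```

An invalid backend_type then fails at plan time with a readable error instead of a null backend reference downstream.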

The Module Body (Core)

The module’s main file wires up a DNS A record pointing at the shared LB, a backend dispatched on the requested backend type, and a backend service for the shared URL map to consume. Metadata on every resource ties it back to the module for auditing.

resource "google_dns_record_set" "a" {
  project      = var.project_id
  managed_zone = var.dns_zone_name
  name         = "${var.hostname}."
  type         = "A"
  ttl          = 300
  rrdatas      = [var.shared_lb_ip]
}
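The dispatched backends are conditional resources, created only when their type is requested. A sketch of the Cloud Run variant (var.region is an assumed module input):

```hcl
resource "google_compute_region_network_endpoint_group" "cloud_run" {
  # Created only when this service's backend_type asks for it.
  count   = var.backend_type == "cloud_run" ? 1 : 0
  project = var.project_id
  name    = "${var.name}-cloudrun-neg"
  region  = var.region

  network_endpoint_type = "SERVERLESS"

  cloud_run {
    service = var.backend_ref # the Cloud Run service name
  }
}
```

The internet_neg and backend_bucket variants follow the same count pattern; gke_neg reuses a NEG that GKE already created, which is why the locals below pass var.backend_ref through directly.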

locals {
  # Splat + one() so the branches whose conditional resources were not
  # created (count = 0) resolve to null instead of failing on a [0] index;
  # every value in the map literal is evaluated regardless of the key chosen.
  backend_id = {
    cloud_run      = one(google_compute_region_network_endpoint_group.cloud_run[*].id)
    gke_neg        = var.backend_ref
    internet_neg   = one(google_compute_global_network_endpoint_group.internet[*].id)
    backend_bucket = one(google_compute_backend_bucket.bucket[*].id)
  }[var.backend_type]
}

resource "google_compute_backend_service" "this" {
  project = var.project_id
  name    = "${var.name}-backend"

  backend {
    group = local.backend_id
  }

  load_balancing_scheme = "EXTERNAL_MANAGED"

  # Backend services do not take labels, so the auditing metadata goes in
  # the description instead.
  description = "service=${var.name} env=${var.environment} managed_by=service-onboarding-module"
}

Three things are deliberately missing. No google_compute_managed_ssl_certificate. No google_compute_target_https_proxy. No google_compute_global_forwarding_rule. The module cannot produce per-service cert sprawl because the resources that produce it are not in the module. If someone wants to fork the module and add those, the OPA policy blocks the PR.

URL Map Host Rule Registration

The shared LB’s URL map holds the routing logic. The module can’t overwrite the URL map (it’s managed by the gxlb stack), but it can export a host-rule stub that the gxlb module consumes:

output "url_map_host_rule" {
  value = {
    hosts        = [var.hostname]
    path_matcher = var.name
    service_id   = google_compute_backend_service.this.id
  }
}

The gxlb module iterates over a map of these stubs, one per service, and composes the URL map. Adding a new service means adding an entry to the shared LB’s host_rules input; the actual host-rule construction still lives in the gxlb module.
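On the consuming side, the composition can be sketched with dynamic blocks (the variable and resource names here are assumptions, not the actual gxlb module):

```hcl
variable "host_rules" {
  # One stub per service, as exported by service-onboarding.
  type = map(object({
    hosts        = list(string)
    path_matcher = string
    service_id   = string
  }))
}

resource "google_compute_url_map" "shared" {
  name            = "shared-lb"
  default_service = var.default_service_id # assumed input

  # One host rule and one path matcher per onboarded service.
  dynamic "host_rule" {
    for_each = var.host_rules
    content {
      hosts        = host_rule.value.hosts
      path_matcher = host_rule.value.path_matcher
    }
  }

  dynamic "path_matcher" {
    for_each = var.host_rules
    content {
      name            = path_matcher.value.path_matcher
      default_service = path_matcher.value.service_id
    }
  }
}
```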

The OPA/Conftest Policy

The policy is plain Rego. It inspects the output of terraform plan -out=tfplan.binary, converted to JSON via terraform show -json tfplan.binary, and fails on any resource of the forbidden types outside the approved module paths:

package main

deny[msg] {
  resource := input.resource_changes[_]
  resource.type == "google_compute_managed_ssl_certificate"
  resource.change.actions[_] == "create"
  not contains(resource.address, "module.shared_certs")
  msg := sprintf(
    "Per-service managed cert '%s' is not allowed. Onboard via service-onboarding module.",
    [resource.address])
}

deny[msg] {
  resource := input.resource_changes[_]
  resource.type == "google_certificate_manager_certificate"
  resource.change.actions[_] == "create"
  not contains(resource.address, "module.shared_certs")
  msg := sprintf(
    "Certificate Manager cert '%s' created outside the shared-certs module. Reuse existing wildcard.",
    [resource.address])
}

deny[msg] {
  resource := input.resource_changes[_]
  resource.type == "google_compute_global_forwarding_rule"
  not contains(resource.address, "module.gxlb")
  msg := sprintf(
    "New global forwarding rule '%s' outside the shared LB module. Use service-onboarding instead.",
    [resource.address])
}

Three rules, each failing on a specific sprawl pattern. A PR that introduces any of them gets a clear error with a directive to use the shared path.

Wiring the Policy in CI

Terragrunt’s plan output feeds Conftest directly. The GitHub Actions workflow gate looks like:

- name: Terraform plan
  run: |
    # Run per stack: `run-all show` would concatenate every stack's JSON
    # into one invalid document. Loop over changed stacks in real pipelines.
    terragrunt plan -out=tfplan.binary \
      --terragrunt-non-interactive
    terragrunt show -json tfplan.binary > tfplan.json

- name: Conftest policy check
  uses: instrumenta/conftest-action@master
  with:
    files: tfplan.json
    policy: policy/

The job runs on every PR. A policy violation fails the build and surfaces the denial message as a CI error; the developer fixes it by either using the service-onboarding module or adding an explicit exemption (a comment on the resource with a justification and an approver). Exemption handling is out of scope for this article, but the pattern is straightforward: a second rule that matches a sprawl-exempt = "<reason>" label on the resource and skips denial.
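As a sketch only (the label name comes from the text above; the plan-JSON path is an assumption), the exemption helper could look like:

```rego
package main

# A resource carrying a non-empty sprawl-exempt label is considered exempt.
# Each deny rule would add `not exempt(resource)` to its body.
exempt(resource) {
  resource.change.after.labels["sprawl-exempt"] != ""
}
```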

ArgoCD for the Runtime Side

Terraform handles the infra layer — LB, cert map, DNS, foundational stacks. ArgoCD handles the GKE layer — Deployments, Services, HTTPRoutes, Gateway configs. The clean split keeps the two systems’ blast radii separate: a broken HTTPRoute doesn’t break the LB; a broken URL map doesn’t break a Deployment.

Install ArgoCD on the Autopilot cluster from earlier in the series:

kubectl create namespace argocd
kubectl apply -n argocd -f \
  https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

Wire it to a demo repo holding one ArgoCD Application per service under argocd/apps/:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: food-web
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/cfg-labs/gcp-shared-traffic-demo.git
    targetRevision: HEAD
    path: apps/article-05/cfg-demo
  destination:
    server: https://kubernetes.default.svc
    namespace: cfg-demo
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

A new service gets: a Terragrunt directory with the service-onboarding module call (infra), plus an ArgoCD Application CR in the demo repo pointing at the service’s K8s manifests (runtime). Two files touched per service. Everything else — cert, LB, DNS zone, CAA policy, monitoring, runbook — is already shared.

App-of-Apps

Instead of one Application per service, one root Application that manages every service’s Application. ArgoCD calls this app-of-apps. The root lives in argocd/bootstrap/:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: platform-apps
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/cfg-labs/gcp-shared-traffic-demo
    targetRevision: HEAD
    path: argocd/apps
    directory:
      recurse: true
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      prune: true

Bootstrap the root once. Every new service added to argocd/apps/ gets picked up by the root, synced, materialized. Onboarding collapses to: add a Terragrunt file, add an ArgoCD Application manifest, commit. CI checks the Terragrunt plan against the OPA policy; if green, merging deploys.

Rolling Back

Terraform rollback: revert the PR, run terragrunt apply. The service-onboarding module removes the backend service, URL-map host rule, and DNS record. Cert map stays intact because the module never touched it.

ArgoCD rollback: revert the commit in the GitOps repo. Auto-sync picks up the revert, removes the Deployment/Service/HTTPRoute. Rollback latency is measured in the ArgoCD sync interval (default 3 minutes).

Both rollback paths are proportional: you removed one service, the blast radius of the rollback is exactly one service. This is the operational property the consolidation pattern buys.

Publishing the Module

Internal Terraform registry or a Git-sourced shared modules repo — both work. The pattern for shared modules:

  1. Repo: github.com/<org>/infra-modules
  2. Tag releases with SemVer: v1.0.0, v1.1.0, etc.
  3. Every module has a versions.tf locking the provider versions
  4. Every module ships with a README.md showing the minimal Terragrunt call
  5. Breaking changes bump the major version; additive changes bump the minor; bug fixes bump the patch
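A minimal versions.tf for item 3, pinning to the provider series from the test matrix at the top of this article (the exact bounds are a judgment call):

```hcl
terraform {
  required_version = ">= 1.10"

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = ">= 6.12, < 7.0"
    }
  }
}
```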

Consumers pin the version in their Terragrunt source line:

terraform {
  source = "git::https://github.com/c4geeks/infra-modules.git//service-onboarding?ref=v1.2.0"
}

Pin rigorously. An unpinned module gets the default branch, which turns a random commit into a production change for every consumer. Pin to tags, bump deliberately.

CI Gate Stacking

OPA/Conftest is one layer. A complete CI gate for this pattern includes:

  • tflint: catches provider-specific anti-patterns and typos
  • terraform validate + fmt: syntax and style
  • OPA/Conftest: the policy rules above
  • tfsec or checkov: generic security misconfiguration scan
  • Drift check: terragrunt plan against prod on merge, post to Slack if non-empty

The policy layer is the one that enforces the consolidation outcome specifically. The others catch generic infra mistakes that would exist regardless of the consolidation pattern.
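As GitHub Actions steps, the non-policy layers might stack like this (tool installation steps omitted; flags are common defaults, not verified against a specific pipeline):

```yaml
- name: Format and validate
  run: |
    terraform fmt -check -recursive
    terragrunt run-all validate --terragrunt-non-interactive

- name: Lint
  run: tflint --recursive

- name: Security scan
  run: tfsec .
```

The Conftest step from the workflow earlier in this article slots in after these; the drift check runs on merge rather than on the PR.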

What This Makes Possible

Before this article: “consolidation is a great idea, please remember to do it.” After this article: consolidation is the default path and non-consolidated PRs fail CI. That’s the step-change that makes the outcome durable. Every article before this one was optional improvement; this one is the guardrail that stops regression.

The next article in this series is the capstone: a full demo app running on every module built so far, with a zero-incident rotation demonstrated live. That’s where every individual outcome from this consolidation plan plays together as a single story.

Cleanup

The service-onboarding module is a reusable asset, not ephemeral infra. Leave it in version control. The ArgoCD install on the cluster can be destroyed with kubectl delete namespace argocd at session end; reinstall takes under a minute for the next session. The root app-of-apps Application cascades a delete of every child Application when you kubectl delete it, so the order matters: root first, then namespace.
