Zayn | Cloud Engineer & AI DevOps

Why Landing Zones Matter

A "Landing Zone" is a pre-configured, secure environment where you deploy your workloads. Think of it as the foundation of your cloud infrastructure - get it wrong, and everything built on top inherits those problems.

Too many organizations rely on "ClickOps" - manually configuring infrastructure through the cloud console. This leads to:

Configuration drift: Prod doesn't match staging doesn't match dev
Security gaps: Forgotten firewall rules, overly permissive IAM
No audit trail: Who changed what, when?
Slow recovery: Can you rebuild from scratch if needed?

The Architecture

My landing zone is built in layers, each managed by modular Terraform:

1. Network Layer

module "network" {
  source = "./modules/network"
  
  project_id    = var.project_id
  region        = var.region
  
  # Zero Trust: no external IPs; Private Google Access keeps Google APIs reachable
  enable_private_google_access = true
}

Custom VPC: No default VPC, no surprises
Private Subnets: Separate ranges for nodes, pods, and services
Cloud NAT: Controlled egress for patching and updates
Deny-All Firewall: Nothing in, nothing out by default
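
To make the four pieces above concrete, here is a minimal sketch of what a network module along these lines might contain. It is not the actual contents of ./modules/network: the resource names, the CIDR ranges, and the choice to model pods and services as secondary ranges on the node subnet are all assumptions made for illustration.

```hcl
# Hypothetical modules/network/main.tf. Names and CIDR ranges are illustrative.
variable "project_id" { type = string }
variable "region" { type = string }
variable "enable_private_google_access" {
  type    = bool
  default = true
}

# Custom-mode VPC: no auto-created subnets.
resource "google_compute_network" "vpc" {
  project                 = var.project_id
  name                    = "landing-zone-vpc"
  auto_create_subnetworks = false
}

# One subnet for nodes, with secondary ranges for pods and services.
resource "google_compute_subnetwork" "nodes" {
  project                  = var.project_id
  name                     = "gke-nodes"
  region                   = var.region
  network                  = google_compute_network.vpc.id
  ip_cidr_range            = "10.10.0.0/20"
  private_ip_google_access = var.enable_private_google_access

  secondary_ip_range {
    range_name    = "pods"
    ip_cidr_range = "10.20.0.0/16"
  }

  secondary_ip_range {
    range_name    = "services"
    ip_cidr_range = "10.30.0.0/20"
  }
}

# Cloud NAT needs a Cloud Router in the same region.
resource "google_compute_router" "router" {
  project = var.project_id
  name    = "nat-router"
  region  = var.region
  network = google_compute_network.vpc.id
}

# Outbound-only internet access for instances without external IPs.
resource "google_compute_router_nat" "nat" {
  project                            = var.project_id
  name                               = "egress-nat"
  router                             = google_compute_router.router.name
  region                             = var.region
  nat_ip_allocate_option             = "AUTO_ONLY"
  source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"
}
```

The ranges here were picked so they do not overlap with the 10.0.0.0/28 control plane range used in the compute layer; any non-overlapping private ranges would do.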

2. Compute Layer (GKE Autopilot)

resource "google_container_cluster" "autopilot" {
  name     = "secure-cluster"
  location = var.region
  
  enable_autopilot = true
  
  private_cluster_config {
    enable_private_nodes    = true
    enable_private_endpoint = false # control plane keeps a public endpoint
    master_ipv4_cidr_block  = "10.0.0.0/28" # private control plane range
  }
}

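GKE Autopilot handles node management, scaling, and security hardening automatically. With enable_private_nodes = true, the worker nodes have no public IPs, so they cannot be reached directly from the internet, and their outbound internet traffic has to go through Cloud NAT. Note that enable_private_endpoint = false leaves the control plane's public endpoint enabled; only the nodes are private in this configuration.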

3. Orchestration Layer

The live/dev directory ties everything together, managing API enablement and module composition. This separation makes it easy to create identical staging and production environments.
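
As a rough illustration of that composition, an environment root might look something like the sketch below. The file name, the list of APIs, and the relative module path are assumptions, not taken from the actual repository.

```hcl
# Hypothetical live/dev/main.tf. API list and module path are illustrative.
variable "project_id" { type = string }
variable "region" { type = string }

locals {
  required_apis = [
    "compute.googleapis.com",
    "container.googleapis.com",
  ]
}

# Enable the APIs the modules depend on.
resource "google_project_service" "apis" {
  for_each = toset(local.required_apis)

  project            = var.project_id
  service            = each.value
  disable_on_destroy = false # keep APIs enabled if this environment is destroyed
}

# Compose the network module only after the APIs are available.
module "network" {
  source = "../../modules/network"

  project_id = var.project_id
  region     = var.region

  depends_on = [google_project_service.apis]
}
```

Under this layout, a staging or production environment would be another directory with the same module calls and different variable values.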

Zero Trust Networking

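The core principle: never trust, always verify. The VPC is locked down with a deny-all egress rule at the lowest priority (65535), so any allow rule with a lower priority number takes precedence over it. Internet-bound traffic can only leave through Cloud NAT, and only to destinations that an explicit allow rule permits.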

resource "google_compute_firewall" "deny_all_egress" {
  name      = "deny-all-egress"
  network   = google_compute_network.vpc.name
  direction = "EGRESS"
  priority  = 65535
  
  deny {
    protocol = "all"
  }
  
  destination_ranges = ["0.0.0.0/0"]
}
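
A deny-all rule on its own blocks everything, so specific destinations need explicit allow rules with a lower priority number. The sketch below is one possible example, not a rule taken from the original project: it assumes the same google_compute_network.vpc resource as the rule above and permits HTTPS to the private.googleapis.com range (199.36.153.8/30). The rule name and the priority of 1000 are illustrative.

```hcl
# Hypothetical allow rule that takes precedence over deny-all-egress.
resource "google_compute_firewall" "allow_google_apis_egress" {
  name      = "allow-google-apis-egress"
  network   = google_compute_network.vpc.name
  direction = "EGRESS"
  priority  = 1000 # lower number = higher precedence than 65535

  allow {
    protocol = "tcp"
    ports    = ["443"]
  }

  # private.googleapis.com virtual IP range
  destination_ranges = ["199.36.153.8/30"]
}
```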

Automated State Management

One pain point with Terraform is bootstrapping the remote state backend. You need a GCS bucket to store state, but Terraform initializes the backend before it creates any resources, so a configuration can't create the bucket that holds its own state on the first run.

I solved this with a shell script that:

Checks if the state bucket exists
Creates it with versioning enabled if not
Initializes Terraform with the backend config

This makes the entire deployment truly "one command" from a fresh GCP project.
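
On the Terraform side, a bootstrap script like this pairs naturally with a partial backend configuration, where the bucket name is left out of the code and passed in at init time. The following is a sketch under that assumption; the prefix value is illustrative and the bucket placeholder is deliberately left unfilled.

```hcl
terraform {
  required_version = ">= 1.0"

  # Partial configuration: the bucket is supplied by the bootstrap script, e.g.
  #   terraform init -backend-config="bucket=<state-bucket>"
  backend "gcs" {
    prefix = "landing-zone/dev" # illustrative path inside the bucket
  }
}
```

Keeping the bucket name out of the configuration lets the same code run against different projects without edits.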

Key Takeaways

Infrastructure as Code is non-negotiable for production environments
Private nodes + Cloud NAT give you outbound internet access without inbound exposure
Modular Terraform makes environments reproducible and composable
Zero Trust isn't just a buzzword - it's a default-deny security posture

Secure GCP Landing Zone: Zero Trust Infrastructure with Terraform