Infrastructure as Code Isn't Enough
Infrastructure as Code was a genuine leap forward. I remember what it was like before — hand-crafted servers, snowflakes all the way down, “it works on prod” as a punchline that was also deeply unfunny.
Terraform, Pulumi, CDK — these tools gave us reproducibility. And reproducibility is powerful.
But there’s a gap that IaC doesn’t fill, and I keep watching teams fall into it.
What IaC actually solves
IaC solves the declaration problem: “what should the infrastructure look like?” You write code, you run it, the cloud looks like your code. Revolutionary.
It also helps with the drift problem: if someone changes a managed resource by hand, the next plan will show the difference, and the next apply will overwrite it, at least for the attributes your code actually manages.
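A minimal sketch of how that flagging can be automated in CI, assuming Terraform and a plan exported as JSON (`terraform plan -out=plan.out`, then `terraform show -json plan.out`). Each entry under `resource_changes` lists the actions the plan would take. The resource addresses and the sample plan below are made up for illustration, and a non-empty result can also mean an unapplied code change rather than manual drift.

```python
import json


def changed_addresses(plan: dict) -> list[str]:
    """Return addresses of resources the plan would create, update, or delete."""
    changed = []
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        # "no-op" and "read" do not modify managed infrastructure.
        if any(action not in ("no-op", "read") for action in actions):
            changed.append(rc["address"])
    return changed


# Hand-written sample shaped like `terraform show -json` output (not real data).
SAMPLE_PLAN = json.loads("""
{
  "resource_changes": [
    {"address": "google_cloud_run_v2_service.api",
     "change": {"actions": ["update"]}},
    {"address": "google_sql_database_instance.main",
     "change": {"actions": ["no-op"]}}
  ]
}
""")

if __name__ == "__main__":
    for address in changed_addresses(SAMPLE_PLAN):
        print(f"plan would change: {address}")
```

Run on a schedule against an unchanged main branch, a check like this turns "someone clicked in the console" into a visible alert instead of a surprise during the next deploy.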
And it helps with the duplication problem: once you’ve codified a pattern (a Cloud Run service, a VPC configuration, a Cloud SQL instance), you can replicate it.
These are real wins. Do not under-appreciate them.
If you are still doing manual console clicks to provision production infrastructure in 2026, stop reading this and go set up Terraform first. Come back after.
What IaC doesn’t solve
IaC doesn’t tell you:
- Why this infrastructure exists
- Who is responsible for it
- When it should be decommissioned
- What it costs, and whether that cost is justified
- How it relates to the systems around it
These are organizational problems, not tooling problems. And they’re the problems I see most often when I’m called in to help a team whose infrastructure has grown messy.
The documentation gap
A Terraform file describes state. It does not describe intent.
I can read your main.tf and know that you have a Cloud Run service with a minimum of 2 instances, 4 vCPUs, and a specific image tag. I cannot know why the minimum is 2 instead of 1 (cost vs. cold-start tradeoff?), why it has 4 vCPUs (memory-bound or CPU-bound workload?), or who decided this and when.
This matters when you’re on call at 2 a.m. and need to make a change quickly. It matters when you’re offboarding someone who built a system. It matters when you’re doing a cost review and trying to decide what’s safe to scale down.
A terraform.tfvars file full of undocumented magic numbers is worse than no IaC at all, because it gives you false confidence that the infrastructure is understood.
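One cheap guardrail is a lint step that refuses top-level values with no comment attached. The sketch below is a heuristic, not a parser: it assumes top-level assignments start at column 0, treats any `#` after the `=` as an inline comment (so a `#` inside a string will fool it), and only looks at the line directly above. The variable names and the reasons in the sample are hypothetical.

```python
import re

# Top-level assignment such as `min_instances = 2`; indented map entries are skipped.
ASSIGNMENT = re.compile(r"^([A-Za-z_][A-Za-z0-9_-]*)\s*=")


def undocumented_variables(tfvars_text: str) -> list[str]:
    """Return names of top-level variables with no comment above or beside them."""
    missing = []
    previous = ""
    for line in tfvars_text.splitlines():
        match = ASSIGNMENT.match(line)
        if match:
            value_part = line.split("=", 1)[1]
            has_inline = "#" in value_part
            has_above = previous.strip().startswith(("#", "//"))
            if not (has_inline or has_above):
                missing.append(match.group(1))
        previous = line
    return missing


# Hypothetical file contents; the stated reasons are invented for the example.
SAMPLE_TFVARS = """
# Two warm instances: cold starts were breaking the checkout latency target.
min_instances = 2
cpu = 4
image_tag = "v1.4.2"  # pinned until the database migration finishes
"""

if __name__ == "__main__":
    for name in undocumented_variables(SAMPLE_TFVARS):
        print(f"no stated reason for: {name}")
```

A comment does not guarantee a good reason, but it forces someone to write one down at the moment they still remember it.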
The ownership gap
Related: IaC makes it easy to create infrastructure, but it doesn’t enforce accountability for it.
In a healthy platform team, every piece of infrastructure has an owner. Not a team — a person (or rotation) who can answer questions about it. IaC tools don’t track this. Your cloud billing console doesn’t track this. Most tagging strategies are aspirational rather than enforced.
The result is what I call “orphaned infrastructure”: resources that exist, cost money, and have no one who understands them.
I’ve seen $40,000/month GCP bills where 30% of the spend was on resources nobody could explain. That’s not a Terraform problem — that’s a process and ownership problem.
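To put numbers on that: 30% of $40,000 is $12,000 a month, or $144,000 a year, spent on things nobody can explain. A rough sketch of how to measure the same ratio yourself, assuming you can get cost rows with an owner field out of your billing data. The row shape, resource names, and amounts here are invented to match the figures above, not taken from any real export schema.

```python
from collections import defaultdict


def spend_by_owner(rows: list[dict]) -> dict[str, float]:
    """Sum monthly cost per owner; rows with no owner go into an UNOWNED bucket."""
    totals: dict[str, float] = defaultdict(float)
    for row in rows:
        owner = row.get("owner") or "UNOWNED"
        totals[owner] += row["monthly_cost_usd"]
    return dict(totals)


def unowned_share(rows: list[dict]) -> float:
    """Fraction of total spend that has no owner attached."""
    totals = spend_by_owner(rows)
    total = sum(totals.values())
    return totals.get("UNOWNED", 0.0) / total if total else 0.0


# Hypothetical rows: $40,000/month in total, $12,000 of it unowned.
SAMPLE_ROWS = [
    {"resource": "run-api", "owner": "alice", "monthly_cost_usd": 18000.0},
    {"resource": "sql-main", "owner": "payments-oncall", "monthly_cost_usd": 10000.0},
    {"resource": "gke-old-cluster", "owner": None, "monthly_cost_usd": 9000.0},
    {"resource": "disk-snapshots", "monthly_cost_usd": 3000.0},
]

if __name__ == "__main__":
    share = unowned_share(SAMPLE_ROWS)
    unowned = spend_by_owner(SAMPLE_ROWS)["UNOWNED"]
    print(f"unowned: ${unowned:,.0f}/month ({share:.0%}), ${unowned * 12:,.0f}/year")
```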
What to actually do about it
A few practices that help:
Tag everything at creation time, not retroactively. (On GCP, the key-value metadata that flows through to billing exports is called labels.) Include owner, team, service, and environment, plus an expires tag for anything temporary. Make untagged resource creation fail in your CI pipeline.
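A sketch of that CI gate, assuming Terraform with the plan exported via `terraform show -json` and resource types that expose a `labels` attribute (not every type does, so a real check needs an allowlist). The required keys mirror the list above; `expires` is left out because it only applies to temporary resources. The sample plan is hand-written.

```python
REQUIRED_LABELS = {"owner", "team", "service", "environment"}


def missing_labels(plan: dict) -> dict[str, list[str]]:
    """Map each to-be-created resource address to the required labels it lacks."""
    problems = {}
    for rc in plan.get("resource_changes", []):
        change = rc.get("change", {})
        # Replacements also contain "create", so they are checked too.
        if "create" not in change.get("actions", []):
            continue
        labels = (change.get("after") or {}).get("labels") or {}
        missing = sorted(REQUIRED_LABELS - set(labels))
        if missing:
            problems[rc["address"]] = missing
    return problems


def exit_code(plan: dict) -> int:
    """Return 1 if any new resource is under-labeled; pass this to sys.exit in CI."""
    return 1 if missing_labels(plan) else 0


# Hand-written sample shaped like `terraform show -json` output (not real data).
SAMPLE_PLAN = {
    "resource_changes": [
        {"address": "google_storage_bucket.reports",
         "change": {"actions": ["create"],
                    "after": {"labels": {"owner": "alice", "team": "data"}}}},
        {"address": "google_storage_bucket.assets",
         "change": {"actions": ["create"],
                    "after": {"labels": {"owner": "bob", "team": "web",
                                         "service": "cdn", "environment": "prod"}}}},
    ]
}

if __name__ == "__main__":
    for address, missing in missing_labels(SAMPLE_PLAN).items():
        print(f"{address}: missing labels {', '.join(missing)}")
```

Checking the plan rather than the live project means the failure lands on the pull request that introduced the resource, while the author is still paying attention.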
Write ADRs alongside your IaC. An Architecture Decision Record is just a short document explaining why you made a decision. Put them in the same repo as your Terraform. Future-you will be grateful.
Build a service catalog. Even a simple Notion page or Confluence space that lists services, their owners, their cloud resources, and their status is enormously valuable. GCP’s Service Catalog and tools like Backstage can help formalize this.
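Whatever tool holds it, the catalog is a small data structure, and the most useful query against it is "what is deployed that the catalog has never heard of?" A minimal sketch, with a hypothetical entry shape and made-up resource names; in practice the deployed list would come from your cloud inventory or Terraform state.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    service: str
    owner: str   # a person or rotation, not just a team name
    status: str  # e.g. "active", "deprecated"
    resources: list[str] = field(default_factory=list)


def uncataloged(catalog: list[CatalogEntry], deployed: list[str]) -> list[str]:
    """Return deployed resources that no catalog entry claims."""
    known = {resource for entry in catalog for resource in entry.resources}
    return sorted(set(deployed) - known)


# Hypothetical catalog and inventory for illustration.
CATALOG = [
    CatalogEntry("checkout-api", "alice", "active",
                 ["run/checkout-api", "sql/checkout-db"]),
    CatalogEntry("reports", "data-oncall", "deprecated",
                 ["bucket/reports-archive"]),
]
DEPLOYED = ["run/checkout-api", "sql/checkout-db",
            "bucket/reports-archive", "gke/old-experiments"]

if __name__ == "__main__":
    for resource in uncataloged(CATALOG, DEPLOYED):
        print(f"no catalog entry claims: {resource}")
```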
Review costs weekly. Not monthly, not quarterly. Weekly. At the team level, not just the finance level. Engineers should see the cost impact of their decisions in near-real-time.
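The weekly review does not need a dashboard to start. A sketch of the core comparison, assuming you already have per-team totals for two consecutive weeks; the team names, amounts, and the 20% threshold are arbitrary choices for the example.

```python
def week_over_week(previous: dict[str, float],
                   current: dict[str, float],
                   threshold: float = 0.20) -> dict[str, float]:
    """Return teams whose spend grew by more than `threshold` (as a fraction).

    Teams with no spend last week are reported with a growth of infinity.
    """
    flagged = {}
    for team, cost in current.items():
        before = previous.get(team, 0.0)
        if before == 0.0:
            if cost > 0:
                flagged[team] = float("inf")  # brand-new spend
            continue
        change = (cost - before) / before
        if change > threshold:
            flagged[team] = round(change, 4)
    return flagged


# Hypothetical weekly totals in USD.
LAST_WEEK = {"payments": 2000.0, "search": 1500.0}
THIS_WEEK = {"payments": 2100.0, "search": 2100.0, "ml-experiments": 400.0}

if __name__ == "__main__":
    for team, change in week_over_week(LAST_WEEK, THIS_WEEK).items():
        label = "new spend" if change == float("inf") else f"+{change:.0%}"
        print(f"{team}: {label}")
```

Posting the flagged teams to the channel where those teams already talk is usually enough to start the conversation about whether the increase was intended.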
GCP’s budget alerts are free and take 10 minutes to set up. There is no excuse not to have them on every project.
The punchline
IaC is necessary. It is not sufficient.
The teams that thrive with cloud infrastructure are the ones who treat it as a sociotechnical system: they have the tooling, the processes, and the culture of ownership. Terraform alone gets you one leg of that three-legged stool.
Build all three legs. The stool will be more stable.