One of the most common challenges in infrastructure management is running multiple environments (dev, staging, production) without drowning in duplicated configuration. Terraform solves this elegantly when structured correctly. Here's the approach we've refined over two years of managing AWS infrastructure for e-commerce platforms.
The Problem with Copy-Paste Infrastructure
The naive approach is to create separate Terraform directories for each environment. This works initially but quickly becomes a maintenance nightmare:
- A change needs to be applied to every environment manually
- Environments drift apart over time
- No guarantee that what works in staging will work in production
- Reviewing infrastructure changes becomes tedious
If your staging environment doesn't match production, it's not actually testing anything. It's just giving you a false sense of confidence.
Project Structure
We use a modular structure with environment-specific variable files. One codebase, multiple configurations:
```
infrastructure/
  modules/
    networking/
      main.tf
      variables.tf
      outputs.tf
    compute/
      main.tf
      variables.tf
      outputs.tf
    database/
      main.tf
      variables.tf
      outputs.tf
  environments/
    dev.tfvars
    staging.tfvars
    production.tfvars
  main.tf
  variables.tf
  outputs.tf
  backend.tf
  versions.tf
```
Module Design
Each module encapsulates a logical group of resources. The key principle: modules should be environment-agnostic. They accept parameters and produce outputs, but never contain environment-specific values.
```hcl
# modules/networking/main.tf
resource "aws_vpc" "main" {
  cidr_block           = var.vpc_cidr
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = merge(var.common_tags, {
    Name = "${var.project}-${var.environment}-vpc"
  })
}

resource "aws_subnet" "private" {
  count = length(var.private_subnet_cidrs)

  vpc_id            = aws_vpc.main.id
  cidr_block        = var.private_subnet_cidrs[count.index]
  availability_zone = var.availability_zones[count.index]

  tags = merge(var.common_tags, {
    Name = "${var.project}-${var.environment}-private-${count.index + 1}"
    Tier = "private"
  })
}
```
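The module's inputs are plain variables with no environment-specific defaults. A minimal `modules/networking/variables.tf` for the resources above might look like this (a sketch covering only the variables used so far, not the full module interface):

```hcl
# modules/networking/variables.tf (sketch -- only the variables used above)
variable "project" {
  description = "Project name used in resource naming"
  type        = string
}

variable "environment" {
  description = "Environment name (dev, staging, production)"
  type        = string
}

variable "vpc_cidr" {
  description = "CIDR block for the VPC"
  type        = string
}

variable "private_subnet_cidrs" {
  description = "CIDR blocks for private subnets, one per AZ"
  type        = list(string)
}

variable "availability_zones" {
  description = "Availability zones to spread subnets across"
  type        = list(string)
}

variable "common_tags" {
  description = "Tags applied to every resource"
  type        = map(string)
  default     = {}
}
```

Note that nothing here knows which environment it is deployed into; the environment name arrives as just another string.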
Environment Variables
Each environment gets its own .tfvars file that defines the specific values:
```hcl
# environments/production.tfvars
environment = "production"
region      = "ap-southeast-2"

# Networking
vpc_cidr             = "10.0.0.0/16"
private_subnet_cidrs = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
public_subnet_cidrs  = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]

# Compute
instance_type    = "m6i.xlarge"
min_capacity     = 3
max_capacity     = 12
desired_capacity = 3

# Database
db_instance_class   = "db.r6g.xlarge"
db_multi_az         = true
db_backup_retention = 30
```
```hcl
# environments/dev.tfvars
environment = "dev"
region      = "ap-southeast-2"

# Networking
vpc_cidr             = "10.10.0.0/16"
private_subnet_cidrs = ["10.10.1.0/24", "10.10.2.0/24"]
public_subnet_cidrs  = ["10.10.101.0/24", "10.10.102.0/24"]

# Compute
instance_type    = "t3.medium"
min_capacity     = 1
max_capacity     = 2
desired_capacity = 1

# Database
db_instance_class   = "db.t3.medium"
db_multi_az         = false
db_backup_retention = 7
```
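The root `main.tf` is where these values flow into the modules. A sketch of the wiring (the `compute` module's argument names, the `private_subnet_ids` output, and the `local.common_tags` map are assumptions about the full interface, not shown earlier):

```hcl
# main.tf (sketch -- wiring tfvars values into the modules)
locals {
  common_tags = {
    Project     = var.project
    Environment = var.environment
    ManagedBy   = "terraform"
  }
}

module "networking" {
  source = "./modules/networking"

  project              = var.project
  environment          = var.environment
  vpc_cidr             = var.vpc_cidr
  private_subnet_cidrs = var.private_subnet_cidrs
  availability_zones   = var.availability_zones
  common_tags          = local.common_tags
}

module "compute" {
  source = "./modules/compute"

  project          = var.project
  environment      = var.environment
  instance_type    = var.instance_type
  min_capacity     = var.min_capacity
  max_capacity     = var.max_capacity
  desired_capacity = var.desired_capacity
  subnet_ids       = module.networking.private_subnet_ids
  common_tags      = local.common_tags
}
```

Because the root module is identical for every environment, the only thing that changes between a dev and a production deploy is which `.tfvars` file you pass in.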
State Isolation
State isolation between environments is critical: a mistake in dev must never be able to touch production state. We use an S3 backend with DynamoDB locking, keeping a separate state file per environment:
```hcl
# backend.tf
terraform {
  backend "s3" {
    bucket         = "mycompany-terraform-state"
    key            = "infrastructure/terraform.tfstate"
    region         = "ap-southeast-2"
    dynamodb_table = "terraform-locks"
    encrypt        = true
  }
}
```
We use Terraform workspaces to separate state. With the S3 backend, each non-default workspace stores its state under an `env:/<workspace>/` prefix in the bucket, so every environment gets its own state file without duplicating backend configuration:
```shell
# Switch to the production workspace
terraform workspace select production

# Plan with production variables
terraform plan -var-file=environments/production.tfvars

# Apply
terraform apply -var-file=environments/production.tfvars
```
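Typing the workspace and var-file pair by hand invites mistakes, such as planning dev changes against the production workspace. A small wrapper can keep the two in lockstep. This sketch only echoes the commands it would run (a dry run), so the environment-name guard can be exercised without Terraform installed; drop the `echo`s to make it real:

```shell
#!/usr/bin/env sh
# Guard against running Terraform with a mistyped or unknown environment name.
# The workspace and the var-file are always derived from the same argument.
tf_plan() {
  env="$1"
  case "$env" in
    dev|staging|production) ;;
    *) echo "unknown environment: $env" >&2; return 1 ;;
  esac
  echo "terraform workspace select $env"
  echo "terraform plan -var-file=environments/$env.tfvars"
}
```

With this in place, `tf_plan production` always selects the production workspace and the production var-file together, and a typo fails loudly instead of planning against the wrong state.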
CI/CD Integration
Infrastructure changes go through the same review process as application code. Our GitHub Actions workflow:
- On Pull Request -- `terraform plan` runs and posts the output as a PR comment
- Review -- the team reviews the plan diff before approving
- On Merge -- `terraform apply` runs automatically for the target environment
```yaml
plan:
  runs-on: ubuntu-latest
  # The original workflow expressions were lost in formatting; ENVIRONMENT is
  # assumed to be supplied elsewhere (e.g. a job matrix or repository variable).
  env:
    ENVIRONMENT: ${{ vars.ENVIRONMENT }}
  steps:
    - uses: actions/checkout@v4
    - uses: hashicorp/setup-terraform@v3
    - name: Terraform Init
      run: terraform init
    - name: Select Workspace
      run: terraform workspace select "$ENVIRONMENT"
    - name: Terraform Plan
      id: plan
      run: terraform plan -var-file="environments/$ENVIRONMENT.tfvars" -no-color
      continue-on-error: true
    - name: Comment Plan on PR
      uses: actions/github-script@v7
      with:
        script: |
          const plan = `${{ steps.plan.outputs.stdout }}`;
          github.rest.issues.createComment({
            issue_number: context.issue.number,
            owner: context.repo.owner,
            repo: context.repo.repo,
            body: `### Terraform Plan\n\`\`\`\n${plan}\n\`\`\``
          });
```
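The apply side is a second job gated on the merge itself. A hedged sketch (the job name, the `main` branch name, and the `ENVIRONMENT` variable are assumptions to be adapted to your workflow triggers):

```yaml
apply:
  # Only run after a merge to main, never on pull request events.
  if: github.event_name == 'push' && github.ref == 'refs/heads/main'
  runs-on: ubuntu-latest
  env:
    ENVIRONMENT: ${{ vars.ENVIRONMENT }}
  steps:
    - uses: actions/checkout@v4
    - uses: hashicorp/setup-terraform@v3
    - name: Terraform Init
      run: terraform init
    - name: Select Workspace
      run: terraform workspace select "$ENVIRONMENT"
    - name: Terraform Apply
      run: terraform apply -auto-approve -var-file="environments/$ENVIRONMENT.tfvars"
```

The `-auto-approve` flag is safe here precisely because the plan was already reviewed on the pull request.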
Lessons Learned
After managing this setup across multiple projects, here's what I wish I knew from the start:
- Start with modules early -- refactoring a flat Terraform config into modules later is painful
- Use consistent tagging -- every resource should have `Environment`, `Project`, and `ManagedBy` tags
- Version pin everything -- providers, modules, and Terraform itself
- Use `terraform fmt` and `terraform validate` in CI -- catches issues early
- Document your modules -- future you will thank present you
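Version pinning in practice is a small `versions.tf` at the root (the version numbers below are illustrative, not a recommendation):

```hcl
# versions.tf -- pin Terraform and providers (versions are illustrative)
terraform {
  required_version = "~> 1.7.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.40"
    }
  }
}
```

The `~>` constraint allows patch and minor updates while blocking the major-version jumps that tend to break configurations unannounced.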
Infrastructure as Code isn't just about automation -- it's about making your infrastructure reviewable, testable, and reproducible. The initial setup takes effort, but the payoff in reliability and team confidence is worth it.