Production-grade platform engineering handbook — Kubernetes, Terraform, Flux CD, GitHub Actions, AWS, and more.
Use Terraform for:
Do not use Terraform for high-churn application runtime configuration that Flux or Argo CD can reconcile more safely inside the cluster.
Prefer this split:
modules/
  networking/
  kubernetes-cluster/
  identity/
live/
  aws/
    prod/
    staging/
  azure/
    prod/
    staging/
modules/ contains reusable abstractions with narrow scope.
live/ contains environment or tenant compositions.
terraform fmt
terraform fmt rewrites Terraform configuration to canonical HCL style. Run it as a check in CI so formatting drift never reaches a PR review:
terraform fmt -check -recursive
-check exits non-zero if any file would be changed. -recursive covers all subdirectories. Fix locally with:
terraform fmt -recursive
Enforce in pre-commit using the community-maintained pre-commit-terraform hook:
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/antonbabenko/pre-commit-terraform
    rev: v1.96.1 # pin to a release tag
    hooks:
      - id: terraform_fmt
terraform validate
terraform validate checks configuration syntax and internal references without contacting any cloud API. It requires terraform init to have run first (providers must be installed).
terraform init -backend=false # skip real backend for CI
terraform validate
-backend=false skips remote state configuration so validate works in CI without credentials. This catches type mismatches, missing required arguments, and invalid references before a plan is attempted.
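To make the scope of validate concrete, here is a small sketch of configuration it should reject without any cloud credentials. The file name, resource names, and variable names are hypothetical and chosen only for illustration.

```hcl
# modules/networking/main.tf (hypothetical file used only for illustration)

variable "az_count" {
  type    = number
  default = "two" # validate: invalid default value, "two" cannot convert to number
}

resource "aws_subnet" "private" {
  count = var.az_count

  # validate: reference to undeclared resource, no aws_vpc.main exists in this module
  vpc_id = aws_vpc.main.id

  # validate: reference to undeclared input variable, var.vpc_cidr is never declared
  cidr_block = cidrsubnet(var.vpc_cidr, 8, count.index)
}
```

All three problems are internal to the configuration, which is why they can surface before a plan ever runs.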
tflint enforces provider-specific rules that terraform validate cannot — deprecated instance types, invalid AMI filters, unsupported argument names per provider version.
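As a rough illustration of the gap between the two tools, the sketch below is intended to be well-formed HCL that validate would generally accept, while tflint rules from the terraform and aws rulesets report findings. The resource, variable names, and AMI placeholder are assumptions, and whether a given rule fires depends on the ruleset version and which rules are enabled.

```hcl
# Hypothetical module file; every name here is illustrative.

terraform {
  # Removing this line is what the terraform_required_version rule reports.
  required_version = ">= 1.7.0"
}

variable "subnet_ids" {
  type = list(string)
}

variable "unused_flag" {
  # Never referenced anywhere: terraform_unused_declarations reports it.
  type    = bool
  default = false
}

resource "aws_instance" "example" {
  ami           = "ami-00000000000000000" # placeholder, not a real AMI
  instance_type = "t2.mikro"              # typo in the type: aws_instance_invalid_type reports it
  subnet_id     = var.subnet_ids.0        # legacy dot index: terraform_deprecated_index reports it
}
```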
Minimal config at repo root:
# .tflint.hcl
config {
  call_module_type = "local"
}
plugin "aws" {
  enabled = true
  version = "0.38.0"
  source  = "github.com/terraform-linters/tflint-ruleset-aws"
}
plugin "azurerm" {
  enabled = true
  version = "0.27.0"
  source  = "github.com/terraform-linters/tflint-ruleset-azurerm"
}
Run it per module:
tflint --init # download plugins (once)
tflint --recursive # lint all modules from repo root
Key rules to never suppress:
- terraform_deprecated_index: catches legacy dot-index syntax such as list.0, deprecated since Terraform 0.12 in favor of list[0]
- terraform_unused_declarations: variables and locals with no references
- terraform_required_version: missing required_version constraint
- aws_instance_invalid_type / azurerm_virtual_machine_invalid_vm_size: invalid resource sizes
Use tfsec or checkov to catch misconfigurations before plan:
# tfsec (fast, Terraform-only)
tfsec . --minimum-severity HIGH
# checkov (broader, covers Terraform + other IaC)
checkov -d . --framework terraform --compact --quiet
Both tools identify patterns like:
0.0.0.0/0
Fail CI on HIGH and CRITICAL severity. Review MEDIUM findings in PR but do not block on them until a baseline is established.
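For the open-CIDR pattern (0.0.0.0/0) mentioned above, a minimal sketch of an ingress rule scoped to an internal range may help. The resource name, variable, and CIDR are hypothetical, and the exact rule IDs and severities that tfsec or checkov assign vary by tool version.

```hcl
# Hypothetical security group; names and CIDR ranges are illustrative only.

variable "vpc_id" {
  type = string
}

resource "aws_security_group" "app" {
  name_prefix = "app-"
  vpc_id      = var.vpc_id

  ingress {
    description = "HTTPS from the internal network only"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"

    # Using ["0.0.0.0/0"] here instead is the kind of rule scanners typically flag.
    cidr_blocks = ["10.0.0.0/8"]
  }
}
```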
fmt -check -recursive → validate → tflint → tfsec/checkov → plan → apply
Run format and validate in parallel with lint and security checks; none of them require a remote API call. Gate the plan step until all four pass. Gate apply behind plan approval and protected environment.
Minimum CI for Terraform changes:
- fmt and validate: syntax, style, reference integrity
- tflint: provider-specific rule enforcement
- tfsec or checkov on HIGH+
- plan with reviewable output
- apply through protected environments
If the task involves module quality, add tests or example validation. If the task involves platform rollout, focus on safe composition, state isolation, and promotion gates before writing module internals.
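One way to picture safe composition and state isolation in the modules/ and live/ layout is a per-environment root that wires modules together and owns its own state key. This is only a sketch: the bucket name, region, module inputs, and the private_subnet_ids output are all assumptions, not part of the handbook.

```hcl
# live/aws/prod/main.tf (hypothetical composition root)

terraform {
  required_version = ">= 1.7.0"

  backend "s3" {
    bucket = "example-terraform-state"   # hypothetical bucket name
    key    = "aws/prod/platform.tfstate" # one state key per environment keeps blast radius small
    region = "eu-west-1"                 # hypothetical region
  }
}

module "networking" {
  source = "../../../modules/networking"

  environment = "prod"         # hypothetical input
  vpc_cidr    = "10.20.0.0/16" # hypothetical input
}

module "kubernetes_cluster" {
  source = "../../../modules/kubernetes-cluster"

  environment = "prod"
  subnet_ids  = module.networking.private_subnet_ids # hypothetical module output
}
```

A staging root under live/aws/staging/ would follow the same shape with its own state key, so promotion becomes a reviewed change to that directory instead of a shared state file.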
When a dynamic block's for_each condition depends on a sensitive nullable variable, none of the common single-block patterns work in Terraform 1.7:
| Pattern | Failure |
|---|---|
| for_each = condition ? [1] : [] | list literal rejected in some nested contexts |
| for_each = toset([var.secret]) | sensitive value cannot be a set element (used as key) |
| for_each = { k = var.secret } | sensitive map value rejected |
| for_each = { k = true } if condition | bool map rejected inside nested dynamic |
Reliable pattern: two sibling dynamic blocks
Use one block for each branch of the condition, with a literal non-sensitive map key:
# Branch A: secret not set
dynamic "origin" {
  for_each = local.use_alb && var.cloudfront_origin_secret == null ? { alb = true } : {}
  content {
    domain_name = var.custom_origin_domain
    origin_id   = local.alb_origin_id
    custom_origin_config {
      http_port              = 80
      https_port             = 443
      origin_protocol_policy = "https-only"
      origin_ssl_protocols   = ["TLSv1.2"]
    }
  }
}
# Branch B: secret present
dynamic "origin" {
  for_each = local.use_alb && var.cloudfront_origin_secret != null ? { alb = true } : {}
  content {
    domain_name = var.custom_origin_domain
    origin_id   = local.alb_origin_id
    custom_origin_config {
      http_port              = 80
      https_port             = 443
      origin_protocol_policy = "https-only"
      origin_ssl_protocols   = ["TLSv1.2"]
    }
    # The secret appears only as an attribute value, never as a for_each key.
    custom_header {
      name  = "X-CloudFront-Secret"
      value = var.cloudfront_origin_secret
    }
  }
}
The null check (== null / != null) keeps the secret itself out of the for_each collection, leaving only a map with a literal, non-sensitive key ({ alb = true }). Terraform accepts this in all nesting contexts.
Apply this pattern any time a block is conditionally rendered based on whether a sensitive nullable variable is set.
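The two-block snippet references several names it does not define. The sketch below shows declarations it appears to assume; the types, defaults, and the way local.use_alb is derived are guesses for illustration, not the handbook's actual module. The origin_secret_set local is an optional extra, not part of the original pattern, and assumes the null comparison inherits the sensitive mark from the variable.

```hcl
# Hypothetical inputs for the two sibling dynamic "origin" blocks.

variable "cloudfront_origin_secret" {
  description = "Optional shared secret sent to the origin in a custom header."
  type        = string
  default     = null
  sensitive   = true
}

variable "custom_origin_domain" {
  description = "DNS name of the ALB origin, or null when no ALB origin is used."
  type        = string
  default     = null
}

locals {
  use_alb       = var.custom_origin_domain != null # assumed derivation
  alb_origin_id = "alb-origin"                     # assumed literal ID

  # Optional helper: whether a secret was supplied is not itself secret, so the
  # sensitive mark can be dropped from the boolean only, never from the value.
  origin_secret_set = nonsensitive(var.cloudfront_origin_secret != null)
}
```

If Terraform still reports that a for_each condition is derived from a sensitive value, substituting local.origin_secret_set for the inline null checks is one option to try.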