Production-grade platform engineering handbook — Kubernetes, Terraform, Flux CD, GitHub Actions, AWS, and more.
Status: Stable
Reference implementations for Terraform module patterns and best practices. eks-cluster/ is a runnable production EKS module. multi-env-structure/ is a reference layout. Testing and CI/CD sections point to handbook patterns.
eks-cluster/ - Production-ready EKS cluster module
Use case: Reusable EKS cluster with opinionated defaults
Pattern: Composition module wrapping AWS provider resources
Components:
multi-env-structure/ - Repository layout for multiple environments
Use case: Managing dev, staging, and production environments
Pattern: Separate state per environment with shared modules
Components:
modules/ - Reusable components
live/ - Environment-specific configurations
Handbook pattern: testing strategies for Terraform modules
Use case: Validating module behavior
Pattern: Native tests, examples, and validation
Components:
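The native-test part of this pattern can be sketched as a single test file. This is a minimal sketch, not the handbook's committed test: it assumes the module under test declares an `environment` variable whose validation only accepts dev, staging, or production, and the run names are hypothetical. It needs Terraform 1.6 or later, and a module that declares AWS resources would also need credentials or a mocked provider for `plan` to succeed.

```hcl
# tests/basic.tftest.hcl
# Assumption: the module declares variable "environment" with a validation
# block restricting it to dev, staging, or production.

# A known environment should plan cleanly.
run "accepts_known_environment" {
  command = plan

  variables {
    environment = "dev"
  }

  assert {
    condition     = var.environment == "dev"
    error_message = "environment should be passed through unchanged"
  }
}

# An unknown environment should trip the variable validation.
run "rejects_unknown_environment" {
  command = plan

  variables {
    environment = "qa"
  }

  # The run passes only if validation on var.environment fails.
  expect_failures = [var.environment]
}
```

Running `terraform test` from the module root picks up files under `tests/` by default.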
Handbook pattern: complete Terraform CI/CD workflow
Use case: Automated plan, review, and apply
Pattern: GitHub Actions with OIDC and protected environments
Components:
See ../github-actions/terraform-cicd.yml for the committed workflow example.
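On the AWS side, the OIDC pattern usually comes down to a role whose trust policy only accepts tokens from one repository and one protected environment. The following is a hedged sketch rather than the handbook's actual configuration: the repository `example-org/example-repo`, the environment name `production`, and the role name are all hypothetical, and it assumes the GitHub OIDC provider is already registered in the account.

```hcl
# Assumption: the GitHub OIDC provider already exists in this AWS account.
data "aws_iam_openid_connect_provider" "github" {
  url = "https://token.actions.githubusercontent.com"
}

data "aws_iam_policy_document" "github_assume_role" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [data.aws_iam_openid_connect_provider.github.arn]
    }

    # Only accept tokens minted for AWS STS.
    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }

    # Only accept jobs that run in the protected "production" environment
    # of the (hypothetical) example-org/example-repo repository.
    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:example-org/example-repo:environment:production"]
    }
  }
}

# Hypothetical role name; attach least-privilege policies separately.
resource "aws_iam_role" "terraform_apply" {
  name               = "github-actions-terraform-apply"
  assume_role_policy = data.aws_iam_policy_document.github_assume_role.json
}
```

Scoping the `sub` claim to an environment means the apply role can only be assumed after the environment's protection rules (for example required reviewers) have been satisfied.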
# Initialize
terraform init
# Format code
terraform fmt -recursive
# Validate syntax
terraform validate
# Plan changes
terraform plan -out=tfplan
# Apply changes
terraform apply tfplan
# Show current state
terraform show
# Import existing resource
terraform import aws_s3_bucket.example my-bucket
# Remove resource from state without destroying
terraform state rm aws_s3_bucket.example

module-name/
├── README.md                 # Usage documentation
├── main.tf                   # Primary resources
├── variables.tf              # Input variables
├── outputs.tf                # Output values
├── versions.tf               # Provider version constraints
├── examples/
│   ├── basic/                # Simple example
│   └── complete/             # Full-featured example
└── tests/
    └── basic.tftest.hcl      # Native tests

# Good - descriptive and scoped
variable "cluster_name" {
  description = "Name of the EKS cluster"
  type        = string
}

# Bad - too generic
variable "name" {
  description = "Name"
  type        = string
}

# Good - includes resource type context
output "cluster_endpoint" {
  description = "EKS cluster endpoint URL"
  value       = aws_eks_cluster.this.endpoint
}

# Bad - ambiguous
output "endpoint" {
  value = aws_eks_cluster.this.endpoint
}

variable "environment" {
  description = "Environment name"
  type        = string

  validation {
    condition     = contains(["dev", "staging", "production"], var.environment)
    error_message = "Environment must be dev, staging, or production"
  }
}

terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "production/eks/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}

Separate state files by:
# List resources in state
terraform state list
# Show specific resource
terraform state show aws_eks_cluster.main
# Rename a resource's address within the state
terraform state mv aws_s3_bucket.old aws_s3_bucket.new
# Pull state to local file
terraform state pull > terraform.tfstate.backup

# ❌ Overly permissive
resource "aws_iam_policy" "bad" {
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = "s3:*"
      Resource = "*"
    }]
  })
}

# ✅ Least privilege
resource "aws_iam_policy" "good" {
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = [
        "s3:GetObject",
        "s3:ListBucket"
      ]
      Resource = [
        aws_s3_bucket.this.arn,
        "${aws_s3_bucket.this.arn}/*"
      ]
    }]
  })
}

# ❌ Never commit secrets
variable "database_password" {
  default = "hardcoded-secret-bad"
}

# ✅ Use external secret management
data "aws_secretsmanager_secret_version" "db_password" {
  secret_id = "production/database/password"
}

resource "aws_db_instance" "this" {
  password = data.aws_secretsmanager_secret_version.db_password.secret_string
  # ... other config
}

See references/terraform.md for detailed patterns.
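Where a secret still has to pass through a variable, the hardcoded-default example above can be inverted: no default, and the variable marked sensitive. This is a minimal sketch, assuming the value is injected at runtime (for example through the `TF_VAR_database_password` environment variable); the 16-character minimum is an illustrative assumption, not a handbook rule. Note that `sensitive` only redacts the value in CLI output. It is still written to state, so the encrypted backend remains important.

```hcl
variable "database_password" {
  description = "Database password, injected at runtime; no default, so nothing is committed"
  type        = string
  sensitive   = true # redacted in plan and apply output, but still stored in state

  validation {
    # Illustrative minimum length; adjust to your own policy.
    condition     = length(var.database_password) >= 16
    error_message = "The database password must be at least 16 characters long."
  }
}
```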
State lock timeout:
# Check who has the lock (LockID is <bucket>/<key>; the item with the -md5 suffix holds only the state checksum)
aws dynamodb get-item \
  --table-name terraform-locks \
  --key '{"LockID":{"S":"my-state-bucket/path/terraform.tfstate"}}'
# Force unlock (dangerous - verify no other process is running)
terraform force-unlock <lock-id>

Resource already exists:
# Import into state
terraform import aws_s3_bucket.example existing-bucket-name

Drift detection:
# Refresh state and show drift
terraform plan -refresh-only
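For the "Resource already exists" case above, Terraform 1.5 and later also accept a declarative `import` block, which lets the import go through the normal plan and review cycle instead of a one-off CLI command. A minimal sketch, reusing the placeholder bucket name from the troubleshooting example:

```hcl
# Declarative equivalent of:
#   terraform import aws_s3_bucket.example existing-bucket-name
import {
  to = aws_s3_bucket.example
  id = "existing-bucket-name"
}

# The target resource must be declared; its arguments should match the real bucket.
resource "aws_s3_bucket" "example" {
  bucket = "existing-bucket-name"
}
```

The next `terraform plan` shows the resource as "will be imported", and the block can be deleted once the apply has succeeded.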