CtrlK
BlogDocsLog inGet started
Tessl Logo

iac-terraform

Infrastructure as Code with Terraform and Terragrunt. Use this skill whenever the user mentions Terraform, Terragrunt, HCL, or infrastructure as code. Triggers include writing or reviewing .tf files, creating reusable modules, debugging terraform plan/apply errors, managing remote state and locks, fixing state drift, setting up CI/CD for Terraform, scaffolding new modules, validating module structure, and implementing Terragrunt DRY patterns across environments.

94

Quality

92%

Does it follow best practices?

Impact

Pending

No eval scenarios have been run

SecuritybySnyk

Passed

No known issues

SKILL.md
Quality
Evals
Security

Infrastructure as Code - Terraform & Terragrunt

Comprehensive guidance for infrastructure as code using Terraform and Terragrunt, from development through production deployment.

Core Workflows

1. New Infrastructure Development

Workflow Decision Tree:

Is this reusable across environments/projects?
├─ Yes → Create a Terraform module
│   └─ See "Creating Terraform Modules" below
└─ No → Create environment-specific configuration
    └─ See "Environment Configuration" below

Creating Terraform Modules

When building reusable infrastructure:

  1. Scaffold new module with script:
python3 scripts/init_module.py my-module-name

This automatically creates:

  • Standard module file structure
  • Template files with proper formatting
  • Examples directory
  • README with documentation
  1. Use module template structure:

    • See assets/templates/MODULE_TEMPLATE.md for complete structure
    • Required files: main.tf, variables.tf, outputs.tf, versions.tf, README.md
    • Recommended: examples/ directory with working examples
  2. Follow module best practices:

    • Single responsibility - one module, one purpose
    • Sensible defaults for optional variables
    • Complete descriptions for all variables and outputs
    • Input validation using validation blocks
    • Mark sensitive values with sensitive = true
  3. Validate module:

python3 scripts/validate_module.py /path/to/module

This checks for:

  • Required files present
  • Variables have descriptions and types
  • Outputs have descriptions
  • README exists and is complete
  • Naming conventions followed
  • Sensitive values properly marked
  1. Test module:
cd examples/complete
terraform init
terraform plan
  1. Document module:
    • Use terraform-docs to auto-generate: terraform-docs markdown . > README.md
    • Include usage examples
    • Document all inputs and outputs

Key Module Patterns:

See references/best_practices.md "Module Design" section for:

  • Composability patterns
  • Variable organization
  • Output design
  • Module versioning strategies

Environment Configuration

For environment-specific infrastructure:

  1. Structure by environment:
environments/
├── dev/
├── staging/
└── prod/
  1. Use consistent file organization:
environment/
├── main.tf           # Resource definitions
├── variables.tf      # Variable declarations
├── terraform.tfvars  # Default values (committed)
├── secrets.auto.tfvars  # Sensitive values (.gitignore)
├── backend.tf        # State configuration
├── outputs.tf        # Output values
└── versions.tf       # Version constraints
  1. Reference modules:
module "vpc" {
  source = "git::https://github.com/company/terraform-modules.git//vpc?ref=v1.2.0"
  
  name        = "${var.environment}-vpc"
  vpc_cidr    = var.vpc_cidr
  environment = var.environment
}

2. State Management & Inspection

When to inspect state:

  • Before major changes
  • Investigating drift
  • Debugging resource issues
  • Auditing infrastructure

Inspect state and check health:

# List all managed resources
terraform state list

# Show detailed state for a specific resource
terraform state show <resource_address>

# Show full state summary (all resources, outputs, providers)
terraform show

Check for drift:

# Exit code 0 = no changes, 1 = error, 2 = drift detected
terraform plan -detailed-exitcode

State operations:

# List all resources
terraform state list

# Show specific resource
terraform state show aws_instance.web

# Remove from state (doesn't destroy)
terraform state rm aws_instance.web

# Move/rename resource
terraform state mv aws_instance.web aws_instance.web_server

# Import existing resource
terraform import aws_instance.web i-1234567890abcdef0

State best practices: See references/best_practices.md "State Management" section for:

  • Remote backend setup (S3 + DynamoDB)
  • State file organization strategies
  • Encryption and security
  • Backup and recovery procedures

3. Standard Terraform Workflow

# 1. Initialize (first time or after module changes)
terraform init

# 2. Format code
terraform fmt -recursive

# 3. Validate syntax
terraform validate

# 4. Plan changes (always review!)
terraform plan -out=tfplan

# 5. Apply changes
terraform apply tfplan

# 6. Verify outputs
terraform output

With Terragrunt:

# Run for single module
terragrunt plan
terragrunt apply

# Run for all modules in directory tree
terragrunt run-all plan
terragrunt run-all apply

4. Troubleshooting Issues

When encountering errors:

  1. Read the complete error message - Don't skip details

  2. Consult references/troubleshooting.md which covers:

    • State lock errors
    • State drift/corruption
    • Provider authentication failures
    • Resource errors (already exists, dependency errors, timeouts)
    • Module source issues
    • Terragrunt-specific issues (dependency cycles, hooks)
    • Performance problems
  3. Enable debug logging if needed:

export TF_LOG=DEBUG
export TF_LOG_PATH=terraform-debug.log
terraform plan
  1. Isolate the problem:
# Test specific resource
terraform plan -target=aws_instance.web
terraform apply -target=aws_instance.web
  1. Common quick fixes:

State locked (full resolution guide: references/troubleshooting.md → State Lock Error):

# Verify no one else running, then:
terraform force-unlock <lock-id>

Provider cache issues:

rm -rf .terraform
terraform init -upgrade

Module cache issues:

rm -rf .terraform/modules
terraform init

5. Code Review & Quality

Before committing:

  1. Format code:
terraform fmt -recursive
  1. Validate syntax:
terraform validate
  1. Lint with tflint:
tflint --module
  1. Security scan with checkov:
checkov -d .
  1. Validate modules:
python3 scripts/validate_module.py modules/vpc
  1. Generate documentation:
terraform-docs markdown modules/vpc > modules/vpc/README.md

Review checklist:

  • All variables have descriptions
  • Sensitive values marked as sensitive
  • Outputs have descriptions
  • Resources follow naming conventions
  • No hardcoded values (use variables)
  • README is complete and current
  • Examples directory exists and works
  • Version constraints specified
  • Security best practices followed

See references/best_practices.md for comprehensive guidelines.

Terragrunt Patterns

Project Structure

terragrunt-project/
├── terragrunt.hcl              # Root config
├── account.hcl                 # Account-level vars
├── region.hcl                  # Region-level vars
└── environments/
    ├── dev/
    │   ├── env.hcl            # Environment vars
    │   └── us-east-1/
    │       ├── vpc/
    │       │   └── terragrunt.hcl
    │       └── eks/
    │           └── terragrunt.hcl
    └── prod/
        └── us-east-1/
            ├── vpc/
            └── eks/

Dependency Management

# In eks/terragrunt.hcl
dependency "vpc" {
  config_path = "../vpc"
  
  # Mock outputs for plan/validate
  mock_outputs = {
    vpc_id         = "vpc-mock"
    subnet_ids     = ["subnet-mock"]
  }
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

inputs = {
  vpc_id     = dependency.vpc.outputs.vpc_id
  subnet_ids = dependency.vpc.outputs.private_subnet_ids
}

Common Patterns

See assets/templates/MODULE_TEMPLATE.md for complete Terragrunt configuration templates including:

  • Root terragrunt.hcl with provider generation
  • Remote state configuration
  • Module-level terragrunt.hcl patterns
  • Dependency handling

Reference Documentation

  • references/best_practices.md — Project structure, state management, module design, security, CI/CD integration
  • references/troubleshooting.md — State lock errors, drift, provider issues, resource errors, Terragrunt-specific problems
  • references/cost_optimization.md — Right-sizing, Spot/RI strategies, storage optimization, cost tagging, multi-cloud

CI/CD Workflows

Ready-to-use templates in assets/workflows/:

TemplatePlatformFeatures
github-actions-terraform.ymlGitHub ActionsValidation, TFLint, Checkov, plan on PRs, apply on main, OIDC
github-actions-terragrunt.ymlGitHub ActionsChanged module detection, parallel planning, dependency-aware apply
gitlab-ci-terraform.ymlGitLab CIMulti-stage pipeline, artifact management, manual gates

Scripts

ScriptPurposeUsage
init_module.pyScaffold new module with standard structurepython3 scripts/init_module.py <name> [--path ./modules] [--json]
validate_module.pyValidate module against best practicespython3 scripts/validate_module.py <path>

Assets

  • templates/MODULE_TEMPLATE.md — Complete module template with file structure, examples, and Terragrunt configs

Quick Reference

Essential Commands

# Initialize
terraform init
terraform init -upgrade  # Update providers

# Validate
terraform validate
terraform fmt -recursive

# Plan
terraform plan
terraform plan -out=tfplan

# Apply
terraform apply
terraform apply tfplan
terraform apply -auto-approve  # CI/CD only

# State
terraform state list
terraform state show <resource>
terraform state rm <resource>
terraform state mv <old> <new>

# Import
terraform import <resource_address> <resource_id>

# Destroy
terraform destroy
terraform destroy -target=<resource>

# Outputs
terraform output
terraform output <output_name>

Terragrunt Commands

# Single module
terragrunt init
terragrunt plan
terragrunt apply

# All modules
terragrunt run-all plan
terragrunt run-all apply
terragrunt run-all destroy

# With specific modules
terragrunt run-all apply --terragrunt-include-dir vpc --terragrunt-include-dir eks
Repository
ahmedasmar/devops-claude-skills
Last updated
Created

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.