Name: pantheon-ai/cfn-template-compare
Rating: 95.8 (1 review)
Author: pantheon-ai

Compares deployed CloudFormation templates with locally synthesized CDK templates to detect drift, validate changes, and ensure consistency before deployment. Use when the user wants to compare CDK output with a deployed stack, check for infrastructure drift, run a pre-deployment validation, audit IAM or security changes, investigate a failing deployment, or perform a 'cdk diff'-style review. Triggered by phrases like 'compare templates', 'check for drift', 'cfn drift', 'stack comparison', 'infrastructure drift detection', 'safe to deploy', or 'what changed in my CDK stack'.

Quality: 93% (Does it follow best practices?)

Impact: 100%, 1.08x (average score across 5 eval scenarios)

Security: Passed (no known issues)

{
  "context": "Tests whether the agent knows to switch from line diff to hierarchical comparison for large templates and understands the threshold for this decision.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Problem threshold identified",
      "description": "large-template-strategy.md states a specific line-count threshold (e.g., >5000 lines) beyond which a line diff becomes unmanageable",
      "max_score": 15
    },
    {
      "name": "Hierarchical approach recommended",
      "description": "Document explicitly recommends hierarchical comparison instead of line diff for large templates",
      "max_score": 15
    },
    {
      "name": "Structure comparison first",
      "description": "Comparison steps include checking top-level structure (keys) first",
      "max_score": 10
    },
    {
      "name": "Resource count comparison",
      "description": "Comparison steps include checking resource counts (jq '.Resources | length')",
      "max_score": 10
    },
    {
      "name": "Added/removed resources",
      "description": "Comparison steps include identifying which resources were added/removed",
      "max_score": 10
    },
    {
      "name": "Avoid line diff",
      "description": "large-template-compare.sh uses jq, comm, or diff with process substitution rather than raw 'diff file1 file2'",
      "max_score": 12
    },
    {
      "name": "Summarized output",
      "description": "Script commands produce concise output (counts, lists) rather than full template diffs",
      "max_score": 10
    },
    {
      "name": "Decision criteria clear",
      "description": "Document explains when to use hierarchical vs line diff (based on size/complexity)",
      "max_score": 10
    },
    {
      "name": "Security focused subset",
      "description": "Strategy mentions focusing on security-sensitive changes (IAM, CDK Nag) rather than all resources",
      "max_score": 8
    }
  ]
}
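
The checklist above describes a hierarchical comparison: pick a strategy by template size, compare top-level keys, compare resource counts, list added and removed resources, and narrow attention to security-sensitive changes. A minimal Python sketch of those steps follows. It is an illustration only, not the skill's actual large-template-compare.sh; the names `choose_strategy`, `summarize`, `LINE_THRESHOLD`, and `SECURITY_PREFIXES` are hypothetical, the 5000-line threshold is borrowed from the rubric's example, and the list of security-sensitive type prefixes is an assumption.

```python
import json

# Assumed threshold, taken from the rubric's ">5000 lines" example.
LINE_THRESHOLD = 5000

# Hypothetical set of resource-type prefixes treated as security-sensitive.
SECURITY_PREFIXES = ("AWS::IAM::", "AWS::KMS::", "AWS::EC2::SecurityGroup")


def choose_strategy(template_text: str, threshold: int = LINE_THRESHOLD) -> str:
    """Return 'hierarchical' for large templates and 'line-diff' otherwise."""
    line_count = len(template_text.splitlines())
    return "hierarchical" if line_count > threshold else "line-diff"


def summarize(deployed: dict, local: dict) -> dict:
    """Produce a concise summary (counts and lists) instead of a full diff."""
    dep_res = deployed.get("Resources", {})
    loc_res = local.get("Resources", {})

    # Step 1: top-level structure (which sections exist on each side).
    only_deployed = sorted(set(deployed) - set(local))
    only_local = sorted(set(local) - set(deployed))

    # Steps 2 and 3: resource counts, then added/removed/changed logical IDs.
    added = sorted(set(loc_res) - set(dep_res))
    removed = sorted(set(dep_res) - set(loc_res))
    changed = sorted(
        rid for rid in set(dep_res) & set(loc_res) if dep_res[rid] != loc_res[rid]
    )

    # Step 4: keep only the touched resources with a security-sensitive type.
    def is_security(rid: str) -> bool:
        resource = loc_res.get(rid) or dep_res.get(rid) or {}
        return str(resource.get("Type", "")).startswith(SECURITY_PREFIXES)

    security = sorted(rid for rid in added + removed + changed if is_security(rid))

    return {
        "top_level_only_deployed": only_deployed,
        "top_level_only_local": only_local,
        "resource_counts": {"deployed": len(dep_res), "local": len(loc_res)},
        "added": added,
        "removed": removed,
        "changed": changed,
        "security_sensitive": security,
    }


if __name__ == "__main__":
    # Tiny in-memory templates stand in for the deployed and synthesized files.
    deployed = {"Resources": {"Role": {"Type": "AWS::IAM::Role"}}}
    local = {"Resources": {"Queue": {"Type": "AWS::SQS::Queue"}}}
    print(json.dumps(summarize(deployed, local), indent=2))
```

In practice the two dictionaries would come from `json.load` on the deployed template and on the synthesized template file; the summary stays short regardless of template size because it reports only counts and logical IDs.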