Skills for building AEM Edge Delivery Services sites — block development, content modeling, code review, testing, and page import.
82
76%
Does it follow best practices?
Impact
88%
1.04xAverage score across 6 eval scenarios
Advisory
Suggest reviewing before use
{
"context": "Tests whether the agent produces structured acceptance criteria for a new Pricing Table block without writing code. Covers functional requirements, edge cases, responsive behavior, author experience, and testable definition of done.",
"type": "weighted_checklist",
"checklist": [
{
"name": "Functional requirements listed",
"description": "Covers rendering of all 3 tiers (Basic, Pro, Enterprise), feature comparison list, pricing display, CTA buttons per tier, and visually highlighted 'recommended' Pro tier",
"max_score": 15
},
{
"name": "Pricing toggle addressed",
"description": "Acceptance criteria mention the monthly/annual pricing toggle behavior and state management (e.g., which is the default state, how prices update on toggle)",
"max_score": 10
},
{
"name": "Edge cases identified",
"description": "Addresses edge cases such as empty feature rows, missing CTA, uneven feature counts across tiers, very long tier names, or missing pricing for one billing period",
"max_score": 15
},
{
"name": "Responsive behavior specified",
"description": "Mobile stacking and desktop side-by-side layout are explicitly defined; tablet behavior is mentioned or addressed",
"max_score": 12
},
{
"name": "Author experience documented",
"description": "Describes what content inputs authors provide (NOT table/cell structure or implementation details); distinguishes required vs optional fields",
"max_score": 12
},
{
"name": "No premature implementation details",
"description": "Does NOT prescribe HTML structure, CSS classes, or JS implementation; stays at the requirements/acceptance-criteria level",
"max_score": 10
},
{
"name": "Definition of done is testable",
"description": "Completion conditions are specific and verifiable (e.g., 'all 3 tiers render with correct pricing'), not vague (e.g., 'block works correctly')",
"max_score": 10
},
{
"name": "Ambiguities flagged",
"description": "Identifies open questions or ambiguities (e.g., what happens when a tier has no CTA? Is the toggle state persisted across page navigation? What currency format is used?)",
"max_score": 8
},
{
"name": "Structured format used",
"description": "Output uses a structured template with clear sections such as functional requirements, edge cases, responsive behavior, author experience, and definition of done",
"max_score": 8
}
]
}evals
scenario-1
scenario-2
scenario-3
scenario-4
scenario-5
scenario-6
skills
analyze-and-plan
block-collection-and-party
block-inventory
building-blocks
code-review
content-driven-development
content-modeling
docs-search
find-test-content
generate-import-html
identify-page-structure
page-decomposition
page-import
preview-import
scrape-webpage
testing-blocks