Design, build, or audit professional UI design systems across strategy, product language, foundations, tokens, components, patterns, accessibility, content, Figma/code libraries, documentation, QA, governance, adoption, measurement, theming, releases, and migration. Use when the user wants to create a design-system blueprint, review an existing design system, fix design-system drift, plan Figma/code parity, define token or component architecture, evaluate accessibility and governance maturity, or sequence design-system adoption and migration work.
100
100%
Does it follow best practices?
Impact
100%
1.75x Average score across 3 eval scenarios
Passed
No known issues
{
"context": "Tests whether the agent follows the blueprint workflow for a new design system, including: completing the First Actions checklist, selecting the correct mode, producing all required sections, defining decision-grade principles with test questions and tradeoffs, using a layered token model, treating components as public APIs, framing ROI as an estimate, and ordering work correctly before building components.",
"type": "weighted_checklist",
"checklist": [
{
"name": "First Actions checklist",
"description": "The output includes a filled First Actions checklist with Mode, Inputs Checked, and Assumptions and Evidence Gaps sections completed before recommendations begin.",
"max_score": 10
},
{
"name": "Mode stated as blueprint",
"description": "The mode is explicitly identified as 'blueprint' (not 'audit') with a brief reason given.",
"max_score": 5
},
{
"name": "Maturity target classified",
"description": "The blueprint classifies a maturity target stage (MVP system, Scaling system, Enterprise platform, or Audit/recovery) and uses it to shape scope and emphasis.",
"max_score": 5
},
{
"name": "Required blueprint sections present",
"description": "The blueprint contains at least 15 of the 20 required sections: Executive Summary, Scope and Assumptions, Strategy and Operating Model, Design Principles and Product Language, Foundations and Token Architecture, Component Architecture, Pattern and Flow System, Accessibility and Inclusive Design, Content Design and UX Writing, Figma Library Architecture, Code Library Architecture, Documentation and Education, Quality Assurance and Testing, Governance and Lifecycle, Adoption and Change Management, Measurement and Health Analytics, Multi-Brand and Theming Strategy, Releases, Maintenance, and Migration, Roadmap, Open Decisions.",
"max_score": 10
},
{
"name": "Decision-grade principles",
"description": "The blueprint defines between 3 and 5 design principles, each stated in a way that can be used to make or evaluate product decisions (not vague words like 'clean' or 'modern' without concrete elaboration).",
"max_score": 8
},
{
"name": "Principle test questions and tradeoffs",
"description": "Each design principle includes a practical test question (how you would apply it in critique) AND at least one example tradeoff.",
"max_score": 8
},
{
"name": "Layered token model (four layers)",
"description": "The token architecture uses all four layers: (1) primitive/reference tokens for raw values, (2) semantic/system tokens for intent, (3) component tokens only where a component part or state needs a stable contract, (4) mode/theme tokens.",
"max_score": 10
},
{
"name": "Primitive token leak prohibition",
"description": "The blueprint explicitly states that primitive tokens must not be used directly in product UI — they may only appear inside token definitions.",
"max_score": 8
},
{
"name": "Component as public API",
"description": "Component architecture includes at least 6 of these required elements per component: anatomy, variants, states, content rules, accessibility behavior, code API (props/slots/events), testing requirements, lifecycle status.",
"max_score": 8
},
{
"name": "ROI as estimate with assumptions",
"description": "Any ROI or business case claims are qualified as estimates — the document states assumptions and does NOT present ROI as exact or guaranteed.",
"max_score": 8
},
{
"name": "System framed as product infrastructure",
"description": "The strategy or executive summary frames the design system as product infrastructure (not as a style guide, a Figma kit, or a component gallery).",
"max_score": 6
},
{
"name": "Assumptions clearly labeled",
"description": "Where inputs were missing or inferred, the document labels them as assumptions rather than asking open-ended questions or leaving gaps unaddressed.",
"max_score": 7
},
{
"name": "Foundations before component gallery",
"description": "The roadmap or sequencing places strategy, foundations, and token architecture before component catalog expansion — NOT starting with a component gallery.",
"max_score": 7
}
]
}
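
The `weighted_checklist` above assigns each criterion a `max_score`, and the thirteen values sum to 100. The sketch below shows one plausible way a grader could turn per-criterion awards into a percentage. The aggregation rule (points earned divided by points available) and the names `MAX_SCORES` and `weighted_score` are assumptions made for illustration; the source does not say how the evaluator combines scores.

```python
# Criterion names and max_score values are copied from the checklist above.
# The aggregation rule below is an assumption, not the evaluator's documented behavior.
MAX_SCORES = {
    "First Actions checklist": 10,
    "Mode stated as blueprint": 5,
    "Maturity target classified": 5,
    "Required blueprint sections present": 10,
    "Decision-grade principles": 8,
    "Principle test questions and tradeoffs": 8,
    "Layered token model (four layers)": 10,
    "Primitive token leak prohibition": 8,
    "Component as public API": 8,
    "ROI as estimate with assumptions": 8,
    "System framed as product infrastructure": 6,
    "Assumptions clearly labeled": 7,
    "Foundations before component gallery": 7,
}


def weighted_score(awarded):
    """Return a percentage score for a dict of {criterion name: points awarded}.

    Criteria missing from `awarded` count as 0, and each award is clamped
    to the range [0, max_score] for its criterion.
    """
    total = sum(MAX_SCORES.values())
    earned = 0
    for name, max_score in MAX_SCORES.items():
        points = awarded.get(name, 0)
        earned += min(max(points, 0), max_score)
    return 100 * earned / total


if __name__ == "__main__":
    # Hypothetical grading: full marks everywhere except one criterion scored 0.
    awards = dict(MAX_SCORES)
    awards["Primitive token leak prohibition"] = 0
    print(f"{weighted_score(awards):.1f}%")
```

Because the weights total 100, a criterion's `max_score` is also the number of percentage points it contributes under this assumed rule, which makes it easy to see that the First Actions checklist, the required sections, and the four-layer token model carry the most weight.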