Use when the user wants to design, critique, or refine a website theme, visual style, look and feel, branding, CSS theme, design system, tokens.css file, or CSS token system. Produces theme directions, typography/color/composition/motion systems, implementation-ready CSS custom properties, component motif rules, and light/dark/density variant strategy.
{
  "context": "Tests whether the agent can repair a drifted CSS handoff into a production-grade theme system. The strongest answers preserve the provided civic field-ledger direction, remove generic SaaS/dashboard drift, convert raw component values into semantic tokens, define exactly four reusable motifs, and cover accessibility, variants, responsiveness, performance, and developer extension rules.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Direction-preserving audit",
      "description": "The notes identify what drifted from the provided direction and preserve the core civic field-ledger contract instead of inventing a fresh brand. The answer should connect repairs to the adjectives, primary archetype, anti-target, and archive use cases.",
      "max_score": 8
    },
    {
      "name": "Semantic token architecture is complete",
      "description": "The repaired CSS defines a stable semantic token architecture covering color roles, RGB or alpha helpers, typography roles, type scale, spacing rhythm, borders, radii, surface/depth, focus offsets, hit sizes, font weights, durations, easings, and transform distances.",
      "max_score": 14
    },
    {
      "name": "Raw reusable values are eliminated",
      "description": "Reusable component guidance no longer contains raw hex colors, rgba() values, gradients, font family names, fixed spacing, numeric font weights, border widths, transform distances, focus offsets, opacity values, or border radii unless those values are defined as tokens first. Do not penalize unitless arithmetic multipliers inside calc() expressions, structural zeroes, CSS keywords, or local mechanics that are not reusable theme decisions.",
      "max_score": 14
    },
    {
      "name": "State and accessibility contract is explicit",
      "description": "The repaired system tokenizes hover, active, selected, disabled, focus-visible, invalid/error, loading, and empty states; includes 44px minimum control targets; and provides real prefers-reduced-motion plus forced-colors or prefers-contrast overrides.",
      "max_score": 14
    },
    {
      "name": "Signature motifs exactly four",
      "description": "The handoff notes replace the six prototype motifs with exactly four named motifs grounded in the climate archive direction. Palette, type, geometry, data marks, and motion may support those motifs but should not appear as extra motif-like items.",
      "max_score": 10
    },
    {
      "name": "Motifs map to archive components",
      "description": "Each motif is mapped to repeated behavior across multiple archive components such as navigation, search, filters, record rows, metadata tables, evidence cards, chart or map panels, source callouts, and download actions. Score the combination of motif notes and component CSS; the answer does not need to mention every example component in every motif block, and component CSS coverage is sufficient for chart/map panels when motif behavior is otherwise clear.",
      "max_score": 10
    },
    {
      "name": "Variant and responsive rules are stable",
      "description": "The repaired system states the default mode and includes stable-name light/dark or theme overrides, density overrides for compact archive browsing, and responsive token or grid rules that preserve the field-ledger identity on mobile. If notes mention a larger breakpoint ladder, do not require every named stop to be implemented when implemented breakpoints plus extension rules are coherent and no component depends on the omitted stop.",
      "max_score": 10
    },
    {
      "name": "Migration delta is actionable",
      "description": "The notes explain the before-to-after implementation delta: which prototype decisions were preserved, which were removed, what replaced them, and how developers should extend the system without adding local exceptions.",
      "max_score": 8
    },
    {
      "name": "Performance constraints are concrete",
      "description": "The answer addresses at least two performance or Core Web Vitals concerns such as LCP, INP, CLS, font loading, image or map loading, backdrop-filter removal, paint cost, composited animation, or layout shift prevention.",
      "max_score": 6
    },
    {
      "name": "Deliverables are buildable",
      "description": "The solution creates both requested files, and the CSS plus notes are specific enough for developers to apply across new archive components without returning to the original prose brief.",
      "max_score": 6
    }
  ]
}
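
The ten max_score values in the checklist add up to 100, so one plausible reading of a weighted_checklist is a percentage of points earned. The following is a minimal sketch under that assumption, not the grader's actual implementation: the function name score_percent and the awarded-points mapping are hypothetical, and the rubric is abbreviated to names and max_score values copied from the checklist above.

```python
import json

# Abbreviated copy of the rubric: names and max_score values only.
RUBRIC_JSON = """
{
  "type": "weighted_checklist",
  "checklist": [
    {"name": "Direction-preserving audit", "max_score": 8},
    {"name": "Semantic token architecture is complete", "max_score": 14},
    {"name": "Raw reusable values are eliminated", "max_score": 14},
    {"name": "State and accessibility contract is explicit", "max_score": 14},
    {"name": "Signature motifs exactly four", "max_score": 10},
    {"name": "Motifs map to archive components", "max_score": 10},
    {"name": "Variant and responsive rules are stable", "max_score": 10},
    {"name": "Migration delta is actionable", "max_score": 8},
    {"name": "Performance constraints are concrete", "max_score": 6},
    {"name": "Deliverables are buildable", "max_score": 6}
  ]
}
"""


def score_percent(rubric, awarded):
    """Return earned points as a percentage of the rubric's total max_score.

    `awarded` maps checklist item names to points (hypothetical input shape).
    Items missing from `awarded` count as 0; out-of-range points raise.
    """
    total_max = sum(item["max_score"] for item in rubric["checklist"])
    earned = 0
    for item in rubric["checklist"]:
        points = awarded.get(item["name"], 0)
        if not 0 <= points <= item["max_score"]:
            raise ValueError(f"{item['name']}: {points} is outside 0..{item['max_score']}")
        earned += points
    return 100 * earned / total_max


rubric = json.loads(RUBRIC_JSON)

# Example: full marks everywhere except half credit on the performance item.
awarded = {item["name"]: item["max_score"] for item in rubric["checklist"]}
awarded["Performance constraints are concrete"] = 3
print(score_percent(rubric, awarded))
```

Rejecting out-of-range points keeps a single overscored item from masking gaps elsewhere; a real grader might clamp or use partial-credit rules instead.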