Interactive skill creation and eval-driven optimization. Triggers: create a skill, make a skill, new skill, scaffold skill, optimize skill, run evals, improve skill. Uses AskUserQuestion for interview; WebSearch for research; Bash for eval execution. Outputs: complete skill directory with SKILL.md, tile.json, evals, and repo integration.
93
94%
Does it follow best practices?
Impact
91%
1.26x
Average score across 3 eval scenarios
Passed
No known issues
{
"context": "Tests whether the agent produces a correctly structured skill scaffold when given a complete decision map. Focuses on tile.json shape, SKILL.md frontmatter and structure, wording quality, and mandatory repo/CI integration — all of which are prescribed by the skill but would vary widely without it.",
"type": "weighted_checklist",
"checklist": [
{
"name": "tile.json namespace",
"description": "tile.json name field uses the oh-my-ai/<skill-name> namespace (not a bare name or other prefix)",
"max_score": 8
},
{
"name": "tile.json required fields",
"description": "tile.json contains all four required top-level fields: name, version, private, summary, and a skills mapping with a path key",
"max_score": 8
},
{
"name": "metadata.version present",
"description": "SKILL.md frontmatter includes a metadata.version field (e.g. metadata:\\n version: \"1.0.0\")",
"max_score": 8
},
{
"name": "Trigger terms in description",
"description": "The SKILL.md frontmatter description field contains at least 3 concrete trigger terms — words a user would actually say to invoke the skill",
"max_score": 8
},
{
"name": "Non-negotiables section",
"description": "SKILL.md contains a numbered non-negotiables section near the top of the document body (before process steps)",
"max_score": 8
},
{
"name": "Non-negotiables imperative wording",
"description": "Non-negotiables use strict imperative language (e.g. 'always', 'never', 'must', 'do') — does NOT use fuzzy language ('consider', 'may want to', 'try to', 'optionally') for required behaviors",
"max_score": 8
},
{
"name": "All five body sections",
"description": "SKILL.md body includes all five required sections: title/one-liner, non-negotiables, process, integrated example, and anti-patterns",
"max_score": 8
},
{
"name": "SKILL.md length",
"description": "SKILL.md is between 150 and 400 lines long",
"max_score": 6
},
{
"name": "Integrated example realism",
"description": "The integrated example in SKILL.md starts from a realistic input (not a toy/placeholder), includes at least one critical decision point, and shows the final output format",
"max_score": 8
},
{
"name": "README row added",
"description": "README.md is updated with a new row for the skill in the skills table, inserted in alphabetical order",
"max_score": 10
},
{
"name": "CI workflow updated",
"description": ".github/workflows/tessl-publish.yml (or equivalent CI file) has the new skill's path appended to the tile/skills array",
"max_score": 10
},
{
"name": "Anti-patterns section",
"description": "SKILL.md contains an anti-patterns section listing at least 3 specific pitfalls to avoid",
"max_score": 8
},
{
"name": "Critical rules not buried",
"description": "The non-negotiables section contains at least as many items as the process section has phases — key constraints appear at the top, NOT only deep in process steps or anti-patterns",
"max_score": 2
}
]
}
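A few of the criteria above are mechanical enough to check in code. The following is a minimal sketch, not the evaluator's actual implementation. The helper names (`check_tile`, `check_skill_length`, `weighted_score`), the slug pattern for `<skill-name>`, and the reading of "a skills mapping with a path key" as `{"<skill>": {"path": ...}}` are all assumptions made for illustration.

```python
import re

# Field names come from the "tile.json required fields" criterion.
REQUIRED_TILE_FIELDS = ("name", "version", "private", "summary")
# Assumption: skill names are lowercase slugs; the checklist only fixes the prefix.
NAMESPACE_PATTERN = re.compile(r"^oh-my-ai/[a-z0-9][a-z0-9-]*$")


def check_tile(tile: dict) -> dict:
    """Return pass/fail for the two tile.json criteria (hypothetical helper)."""
    name = tile.get("name", "")
    skills = tile.get("skills", {})
    # Assumed shape: every entry in the skills mapping is an object with a "path" key.
    has_path = isinstance(skills, dict) and bool(skills) and all(
        isinstance(entry, dict) and "path" in entry for entry in skills.values()
    )
    return {
        "tile.json namespace": isinstance(name, str)
        and bool(NAMESPACE_PATTERN.match(name)),
        "tile.json required fields": all(f in tile for f in REQUIRED_TILE_FIELDS)
        and has_path,
    }


def check_skill_length(skill_md: str, low: int = 150, high: int = 400) -> bool:
    """True when SKILL.md has between `low` and `high` lines, inclusive."""
    return low <= len(skill_md.splitlines()) <= high


def weighted_score(checklist: list, awarded: dict) -> float:
    """Percentage of available points earned; awards are capped at max_score."""
    total = sum(item["max_score"] for item in checklist)
    earned = sum(
        min(awarded.get(item["name"], 0), item["max_score"]) for item in checklist
    )
    return 100.0 * earned / total
```

Criteria such as "Integrated example realism" need judgment and are left out of the sketch. The point weights in the checklist sum to 100, so for this file the earned points and the percentage from `weighted_score` are the same number.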