Skills are the new Code by Guy Podjarny
89
90%
Does it follow best practices?
Impact
87%
1.38x
Average score across 4 eval scenarios
Passed
No known issues
{
  "context": "Tests whether the agent uses Guy's own framing and verbatim quotes first when teaching a concept, cites transcript line ranges, preserves transcription artifacts verbatim, marks any additions beyond the talk as 'not from the talk', and does not attribute fabricated claims to Guy.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Verbatim quote with line citation",
      "description": "harness-explainer.md includes at least one quoted passage copied character-for-character from transcript.md, accompanied by a transcript line citation (e.g., 'transcript.md L99–L138' or a sub-range)",
      "max_score": 20
    },
    {
      "name": "Guy's framing appears first",
      "description": "The definition and core explanation of harnesses uses Guy's own words or framing before any comparisons, analogies, or additions contributed by the agent",
      "max_score": 20
    },
    {
      "name": "Additions marked as not from the talk",
      "description": "Any content that goes beyond what the transcript says is explicitly labelled as the agent's own addition (e.g., 'not from the talk', 'my own note', or equivalent marker) — not presented as Guy's position",
      "max_score": 25
    },
    {
      "name": "Transcription artifact preserved",
      "description": "If the explainer quotes the passage where Guy says 'Claude code is a hardness', the word 'hardness' is reproduced verbatim (not silently corrected to 'harness'), optionally with a parenthetical note about the likely intended word",
      "max_score": 15
    },
    {
      "name": "No fabricated Guy quotes",
      "description": "No statement inside quotation marks attributed to Guy appears in harness-explainer.md that cannot be found verbatim in transcript.md",
      "max_score": 20
    }
  ]
}
evals
scenario-1
scenario-2
scenario-3
scenario-4
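As a rough illustration of how a `weighted_checklist` like the one above might be turned into a single percentage, here is a minimal Python sketch. The criterion names and `max_score` values are copied from the scenario. The aggregation rule (points awarded divided by points available) is an assumption, since the page does not say how scores are combined, and the `HYPOTHETICAL_AWARDED` values are invented for the example, not real eval results.

```python
# Criterion names and max_score weights copied from the scenario JSON above.
CHECKLIST = [
    {"name": "Verbatim quote with line citation", "max_score": 20},
    {"name": "Guy's framing appears first", "max_score": 20},
    {"name": "Additions marked as not from the talk", "max_score": 25},
    {"name": "Transcription artifact preserved", "max_score": 15},
    {"name": "No fabricated Guy quotes", "max_score": 20},
]

# Hypothetical points awarded by a grader; illustrative only, not real results.
HYPOTHETICAL_AWARDED = {
    "Verbatim quote with line citation": 20,
    "Guy's framing appears first": 15,
    "Additions marked as not from the talk": 25,
    "Transcription artifact preserved": 0,
    "No fabricated Guy quotes": 20,
}


def score_checklist(checklist, awarded):
    """Return the percentage of available points that were earned.

    Assumption: the score is sum(awarded) / sum(max_score) * 100.
    Criteria missing from `awarded` count as zero.
    """
    total_max = sum(item["max_score"] for item in checklist)
    total_awarded = 0
    for item in checklist:
        points = awarded.get(item["name"], 0)
        # Reject scores outside the range the checklist allows.
        if not 0 <= points <= item["max_score"]:
            raise ValueError(f"score out of range for {item['name']!r}: {points}")
        total_awarded += points
    return 100.0 * total_awarded / total_max


print(score_checklist(CHECKLIST, HYPOTHETICAL_AWARDED))  # prints 80.0
```

The five weights sum to 100, so under this assumed rule the points awarded read directly as a percentage.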