tessl-labs/tessl-skill-eval-scenarios

Convert skills to Tessl tiles and create eval scenarios to measure skill effectiveness.

Overall score: 92%

Does it follow best practices?

Validation for skill structure

creating-eval-scenarios

creating-eval-scenarios/SKILL.md

Activation: 100%

This is a well-crafted skill description that excels across all dimensions. It provides specific concrete actions, includes natural trigger terms users would actually say, explicitly addresses both what and when, and has distinctive terminology that minimizes conflict risk with other skills.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific concrete actions: 'Generate evaluation scenarios', 'Creates inventory of instructions', 'test cases with success criteria', and 'validates skill coverage'. These are clear, actionable capabilities. | 3 / 3 |
| Completeness | Clearly answers both what (generate evaluation scenarios, create inventory, test cases, validate coverage) AND when (explicit 'Use when...' clause with five specific trigger phrases). | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural trigger terms users would say: 'generate evals', 'create evaluation scenarios', 'test this skill', 'measure skill value', 'prepare for tessl publish'. These are realistic user phrases. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive with domain-specific terms like 'Tessl tiles', 'tessl publish', and 'skill effectiveness'. The combination of evaluation generation for a specific platform creates a clear niche unlikely to conflict with general testing or evaluation skills. | 3 / 3 |
| Total | | 12 / 12 (Passed) |

Implementation: 73%

This skill is well-structured and concise, with good progressive disclosure to external references. However, it relies heavily on an external file for the core workflow, reducing immediate actionability. The workflow would benefit from explicit validation steps before running evals.

Suggestions

Add a brief inline summary of the key steps from scenario-generation.md so Claude can act without immediately reading the reference

Include an explicit validation checkpoint before running evals (e.g., 'Verify instructions.json contains all skill instructions before proceeding')
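The validation checkpoint suggested above could look roughly like the following sketch. The review does not show the layout of instructions.json, so the format used here (a JSON list of objects, each with an "id" key) and both helper names are assumptions made for illustration only, not part of the skill or of Tessl's tooling.

```python
import json
from pathlib import Path


def load_inventory_ids(path):
    """Read instruction ids from an inventory file.

    Assumed layout (hypothetical): a JSON list of objects,
    each carrying an "id" key.
    """
    entries = json.loads(Path(path).read_text(encoding="utf-8"))
    return {entry["id"] for entry in entries}


def missing_instructions(inventory_ids, expected_ids):
    """Return the expected ids that the inventory does not contain, sorted."""
    return sorted(set(expected_ids) - set(inventory_ids))


if __name__ == "__main__":
    # Hypothetical ids; a real run would take the inventory from
    # load_inventory_ids("instructions.json").
    gaps = missing_instructions(
        {"use-tabs", "run-lint"},
        ["use-tabs", "run-lint", "add-tests"],
    )
    if gaps:
        print("Do not run evals yet; missing instructions:", gaps)
```

An empty result would mean every expected instruction is present in the inventory, at which point running the evals is reasonable.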

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | Content is lean and efficient. No unnecessary explanations of concepts Claude knows. Every section serves a purpose with minimal padding. | 3 / 3 |
| Actionability | Provides structure and commands but delegates core workflow to an external reference file. The output structure is concrete, but the actual scenario generation process requires reading another file rather than being directly actionable here. | 2 / 3 |
| Workflow Clarity | The workflow is implied (read reference → generate files → run evals) but lacks explicit sequencing and validation checkpoints. No feedback loop for verifying generated scenarios before running evals. | 2 / 3 |
| Progressive Disclosure | Appropriate structure with clear overview and well-signaled one-level-deep reference to scenario-generation.md. Output structure is inline (appropriate), detailed workflow is externalized (appropriate). | 3 / 3 |
| Total | | 10 / 12 (Passed) |

Validation: 81% (13 / 16 passed)

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| metadata_version | 'metadata' field is not a dictionary | Warning |
| license_field | 'license' field is missing | Warning |
| body_steps | No step-by-step structure detected (no ordered list); consider adding a simple workflow | Warning |
| Total | | 13 / 16 (Passed) |
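To make the three warnings concrete, here is a minimal sketch of checks that would flag the same conditions. It is not Tessl's validator: the function name, the assumption that the SKILL.md frontmatter has already been parsed into a dictionary, and the ordered-list pattern are all illustrative guesses based only on the warning descriptions above.

```python
import re


def structure_warnings(frontmatter, body):
    """Return the names of the criteria that would raise a warning.

    frontmatter: already-parsed SKILL.md frontmatter (assumed to be a dict).
    body: the markdown text that follows the frontmatter.
    """
    warnings = []
    # metadata_version: the 'metadata' field should be a dictionary.
    if not isinstance(frontmatter.get("metadata"), dict):
        warnings.append("metadata_version")
    # license_field: the 'license' field should be present.
    if "license" not in frontmatter:
        warnings.append("license_field")
    # body_steps: look for at least one ordered-list item such as "1. Do X".
    if not re.search(r"^\s*\d+[.)]\s+\S", body, flags=re.MULTILINE):
        warnings.append("body_steps")
    return warnings
```

Under these assumptions, adding a dictionary-valued `metadata` field, a `license` field, and a numbered workflow to the skill body would clear all three checks in this sketch.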