tessl-labs/tessl-skill-eval-scenarios

Convert skills to Tessl tiles and create eval scenarios to measure skill effectiveness.

Overall score

92%

Review — 92%

Does it follow best practices?

Validation — 26 / 32 Passed

Validation for skill structure

creating-eval-scenarios

creating-eval-scenarios/SKILL.md

Activation

100%

This is a well-crafted skill description that excels across all dimensions. It provides specific concrete actions, includes natural trigger terms users would actually say, explicitly addresses both what and when, and has distinctive terminology that minimizes conflict risk with other skills.

Dimension	Reasoning	Score
Specificity	Lists multiple specific concrete actions: 'Generate evaluation scenarios', 'Creates inventory of instructions', 'test cases with success criteria', and 'validates skill coverage'. These are clear, actionable capabilities.	3 / 3
Completeness	Clearly answers both what (generate evaluation scenarios, create inventory, test cases, validate coverage) AND when (explicit 'Use when...' clause with five specific trigger phrases).	3 / 3
Trigger Term Quality	Excellent coverage of natural trigger terms users would say: 'generate evals', 'create evaluation scenarios', 'test this skill', 'measure skill value', 'prepare for tessl publish'. These are realistic user phrases.	3 / 3
Distinctiveness / Conflict Risk	Highly distinctive with domain-specific terms like 'Tessl tiles', 'tessl publish', and 'skill effectiveness'. The combination of evaluation generation for a specific platform creates a clear niche unlikely to conflict with general testing or evaluation skills.	3 / 3
	Total	12 / 12 Passed

Implementation

73%

This skill is well-structured and concise, with good progressive disclosure to external references. However, it relies heavily on an external file for the core workflow, reducing immediate actionability. The workflow would benefit from explicit validation steps before running evals.

Suggestions

Add a brief inline summary of the key steps from scenario-generation.md so Claude can act without first reading the reference.

Include an explicit validation checkpoint before running evals (e.g., 'Verify instructions.json contains all skill instructions before proceeding').
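
A validation checkpoint of the kind suggested here might look like the following minimal sketch. The review does not show the schema of instructions.json, so the layout assumed below (a JSON list of objects, each with an `id` key), the helper name `missing_instructions`, and the sample IDs are all hypothetical.

```python
import json


def missing_instructions(instructions_json: str, expected_ids: set[str]) -> set[str]:
    """Return the expected instruction IDs that are absent from the inventory.

    Assumed (hypothetical) layout: a JSON list of objects, each with an "id" key.
    """
    entries = json.loads(instructions_json)
    if not isinstance(entries, list):
        raise ValueError("instructions.json should contain a JSON list")
    found = {
        entry["id"]
        for entry in entries
        if isinstance(entry, dict) and "id" in entry
    }
    return expected_ids - found


# Example: one instruction was never inventoried, so the checkpoint stops the run.
# The IDs are made-up placeholders, not taken from the skill itself.
sample = json.dumps([{"id": "use-ordered-steps"}, {"id": "cite-reference-file"}])
expected = {"use-ordered-steps", "cite-reference-file", "validate-before-eval"}
gaps = missing_instructions(sample, expected)
if gaps:
    print(f"Do not run evals yet; missing instructions: {sorted(gaps)}")
```

Returning the set of gaps rather than a boolean lets the skill report exactly which instructions still need scenarios before evals are run.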

Dimension	Reasoning	Score
Conciseness	Content is lean and efficient. No unnecessary explanations of concepts Claude knows. Every section serves a purpose with minimal padding.	3 / 3
Actionability	Provides structure and commands but delegates core workflow to an external reference file. The output structure is concrete, but the actual scenario generation process requires reading another file rather than being directly actionable here.	2 / 3
Workflow Clarity	The workflow is implied (read reference → generate files → run evals) but lacks explicit sequencing and validation checkpoints. No feedback loop for verifying generated scenarios before running evals.	2 / 3
Progressive Disclosure	Appropriate structure with clear overview and well-signaled one-level-deep reference to scenario-generation.md. Output structure is inline (appropriate), detailed workflow is externalized (appropriate).	3 / 3
	Total	10 / 12 Passed

Validation

81%

Warnings & errors only

Validation — 13 / 16 Passed

Validation for skill structure

Criteria	Description	Result
metadata_version	'metadata' field is not a dictionary	Warning
license_field	'license' field is missing	Warning
body_steps	No step-by-step structure detected (no ordered list); consider adding a simple workflow	Warning

	Total	13 / 16 Passed
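
The three warnings above can be approximated locally before publishing. The sketch below is an assumption about what such checks could look like, not Tessl's actual validator: it takes frontmatter that has already been parsed into a dict plus the Markdown body, and the function name `structure_warnings` is hypothetical. Only the criterion names come from the table.

```python
import re


def structure_warnings(frontmatter: dict, body: str) -> list[str]:
    """Return the criterion names (as listed in the table) that would warn."""
    warnings = []
    # metadata_version: if a 'metadata' field is present, it should be a dictionary.
    if "metadata" in frontmatter and not isinstance(frontmatter["metadata"], dict):
        warnings.append("metadata_version")
    # license_field: a 'license' field should be present.
    if "license" not in frontmatter:
        warnings.append("license_field")
    # body_steps: look for at least one ordered-list item such as "1. Do X".
    if not re.search(r"^\s*\d+\.\s+\S", body, flags=re.MULTILINE):
        warnings.append("body_steps")
    return warnings


# Before: a string-valued metadata field, no license, and no ordered list.
before = structure_warnings(
    {"name": "creating-eval-scenarios", "metadata": "1.0"},
    "Read the reference, then generate files.",
)

# After: placeholder values only; the real license and metadata are not shown in the review.
after = structure_warnings(
    {"name": "creating-eval-scenarios", "metadata": {"version": "1.0"}, "license": "MIT"},
    "1. Read the reference\n2. Generate files\n3. Run evals",
)

print(before)
print(after)
```

The ordered list in the second example follows the read reference, generate files, run evals sequence that the Workflow Clarity row describes as implied.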