
pantheon-ai/tessl-publish-public

Ensure Tessl tiles meet all requirements for public registry publishing with comprehensive validation, quality gates, and evaluation scenarios. Use when preparing skills for public Tessl release, validating tile.json configuration, creating evaluation scenarios, enforcing quality thresholds, or checking agent-agnostic compliance. Keywords: tessl, tile, publishing, public-registry, validation, quality-gates, tile.json, evaluation-scenarios, skill-publishing

Quality: 94% (Does it follow best practices?)

Impact: Pending (No eval scenarios have been run)

Security by Snyk: Passed (No known issues)


evaluation-scenarios/scenario-02.md

Scenario 02: Creating Missing Evaluation Scenarios

User Prompt

"Create evaluation scenarios for the docker-containerization skill before publishing publicly."

Expected Behavior

  1. Agent locates the skill directory at skills/infrastructure/docker-containerization/
  2. Creates evaluation-scenarios/ directory if it doesn't exist
  3. Analyzes the SKILL.md content to understand skill purpose and workflows
  4. Generates 5-8 comprehensive scenario files (scenario-01.md through scenario-08.md)
  5. Each scenario includes: user prompt, expected behavior, success criteria, failure conditions
  6. Scenarios cover diverse use cases from the skill (basic, intermediate, edge cases)
  7. Uses concrete examples with measurable outcomes
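The directory creation and file generation steps above could be scaffolded with a short script. This is a minimal sketch, not the skill's actual implementation; the template headings follow the four required sections named in this scenario, and the `scaffold` helper and its placeholder prompt are hypothetical.

```python
from pathlib import Path

# Skeleton for one scenario file; the four headings match the
# required sections: prompt, behavior, success, failure.
TEMPLATE = """# Scenario {n:02d}: {title}

## User Prompt

"{prompt}"

## Expected Behavior

## Success Criteria

## Failure Conditions
"""


def scaffold(scenario_dir: Path, titles: list[str]) -> None:
    """Create evaluation-scenarios/ (if missing) and one skeleton file per title."""
    scenario_dir.mkdir(exist_ok=True)  # step 2: create the directory if absent
    for n, title in enumerate(titles, start=1):
        path = scenario_dir / f"scenario-{n:02d}.md"
        # Placeholder prompt; a real scenario would carry a concrete user prompt.
        path.write_text(TEMPLATE.format(n=n, title=title, prompt="..."))
```

An agent (or author) would then fill in each section with concrete, measurable content rather than leaving the skeletons empty.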

Success Criteria

  • Agent creates evaluation-scenarios/ directory
  • Agent generates minimum 5 scenario files (scenario-01.md to scenario-05.md)
  • Each scenario file has all four required sections (prompt, behavior, success, failure)
  • Scenarios are specific to the skill domain (Docker containerization)
  • Success criteria are measurable (files created, commands run, outputs verified)
  • Failure conditions clearly indicate when skill was not applied correctly
  • Scenarios cover breadth of skill capabilities (basic → advanced)

Failure Conditions

  • Agent creates fewer than 5 scenarios
  • Agent uses generic scenarios not specific to Docker containerization
  • Scenarios lack measurable success criteria (e.g., a vague "agent does well")
  • Missing any required section (prompt, behavior, success, failure)
  • Scenarios test implementation details rather than skill application
  • Agent copies scenarios from another skill without adaptation
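Several of the checks above (minimum file count, all four required sections present) are mechanical and could be automated before publishing. A minimal sketch, assuming scenario files live under evaluation-scenarios/ and use the four section headings shown in this scenario; the `validate_scenarios` function is illustrative, not part of the published skill:

```python
from pathlib import Path

# The four sections every scenario file must contain.
REQUIRED_SECTIONS = [
    "User Prompt",
    "Expected Behavior",
    "Success Criteria",
    "Failure Conditions",
]
MIN_SCENARIOS = 5  # success criteria require at least 5 scenario files


def validate_scenarios(scenario_dir: Path) -> list[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems: list[str] = []
    files = sorted(scenario_dir.glob("scenario-*.md"))
    if len(files) < MIN_SCENARIOS:
        problems.append(
            f"only {len(files)} scenario files; need at least {MIN_SCENARIOS}"
        )
    for f in files:
        text = f.read_text()
        for section in REQUIRED_SECTIONS:
            if section not in text:
                problems.append(f"{f.name}: missing section '{section}'")
    return problems
```

This only catches the structural failure conditions; domain specificity and measurability of criteria still need human (or agent) review.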

evaluation-scenarios/

scenario-01.md

scenario-02.md

scenario-03.md

scenario-04.md

scenario-05.md

scenario-06.md

scenario-07.md

scenario-08.md

scenario-09.md

SKILL.md

tile.json