
tree-of-thoughts

Execute tasks through systematic exploration, pruning, and expansion using Tree of Thoughts methodology with meta-judge evaluation specifications and multi-agent evaluation

34

Quality

31%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Passed

No known issues

Fix and improve this skill with Tessl

tessl review fix ./plugins/sadd/skills/tree-of-thoughts/SKILL.md

Quality

Content

55%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill provides an exceptionally detailed and actionable Tree of Thoughts methodology with clear workflow sequencing, adaptive strategy selection, and concrete prompt templates. However, it is severely over-engineered for a single SKILL.md file — the massive inline prompt templates, repeated instructions, and explanatory content that Claude already understands make it extremely token-inefficient. The content desperately needs decomposition into separate reference files with a concise overview in the main SKILL.md.

Suggestions

Extract all prompt templates into separate bundle files (e.g., prompts/explorer.md, prompts/pruning-judge.md, prompts/synthesizer.md) and reference them from the main SKILL.md with one-line descriptions.

Remove metacognitive scaffolding from the prompts (e.g., 'Let's approach this systematically', 'Think step by step'). Claude already works this way without being told, so these lines cost tokens without adding value.

Condense the main SKILL.md to a ~100-line overview covering the phase diagram, decision logic, file naming conventions, and links to detailed prompt templates and examples in bundle files.

Remove the extensive worked example section or move it to a separate EXAMPLES.md file, keeping only a brief 5-line summary of expected outputs in the main file.
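As a rough illustration of the first suggestion, the sketch below keeps each prompt template in its own file and leaves only one-line references for the overview. The file paths come from the examples in the suggestion; the `load_prompt` and `overview_links` helpers and the role names are hypothetical and are not part of the skill or of Tessl.

```python
from pathlib import Path

# Paths taken from the examples in the suggestion above; the role names
# used as keys are hypothetical.
PROMPT_FILES = {
    "explorer": "prompts/explorer.md",
    "pruning-judge": "prompts/pruning-judge.md",
    "synthesizer": "prompts/synthesizer.md",
}


def load_prompt(skill_dir: Path, role: str) -> str:
    """Read one prompt template on demand instead of inlining it in SKILL.md."""
    if role not in PROMPT_FILES:
        raise KeyError(f"unknown prompt role: {role!r}")
    return (skill_dir / PROMPT_FILES[role]).read_text(encoding="utf-8")


def overview_links() -> list[str]:
    """Build the one-line references a condensed SKILL.md could carry."""
    return [f"- {role}: see {path}" for role, path in PROMPT_FILES.items()]
```

With this split, the main file only pays for three short reference lines, and a template is read only when its phase actually runs.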

Dimension | Reasoning | Score
Conciseness | Extremely verbose at ~600+ lines. Massive prompt templates are repeated with extensive inline instructions for things Claude already knows how to do (e.g., 'Let's approach this systematically', 'Break down the task'). The ASCII diagram, extensive examples, and repeated CRITICAL notes add significant token overhead. Much of this could be condensed to a fraction of the size. | 1 / 3
Actionability | Highly actionable with complete prompt templates, specific file naming conventions, concrete dispatch instructions, exact output formats, decision logic with thresholds, and a worked end-to-end example. Every phase has copy-paste ready prompts and clear tool invocation patterns. | 3 / 3
Workflow Clarity | Excellent multi-step workflow with clear phase sequencing, explicit parallelism instructions (e.g., 'Launch meta-judge in parallel with Phase 1'), validation checkpoints ('Wait for BOTH Phase 1 AND Phase 1.5 to complete'), adaptive branching logic with concrete thresholds (scores <3.0 → REDESIGN), and feedback loops (redesign returns to Phase 3, escalate after two failures). | 3 / 3
Progressive Disclosure | Monolithic wall of text with no bundle files or external references. All prompt templates, examples, and detailed instructions are inline. The prompt templates alone could be separate files referenced from a concise overview. No bundle files are provided despite the content clearly warranting decomposition into separate reference files. | 1 / 3

Total: 8 / 12 (Passed)
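The branching logic praised under Workflow Clarity can be sketched as a small decision function. Only the 3.0 threshold, the REDESIGN outcome, and the escalation after two failures come from the review. The `next_action` name, the `PROCEED` label, and the choice to apply the threshold to the best score are assumptions made for illustration.

```python
REDESIGN_THRESHOLD = 3.0   # from the review: scores < 3.0 -> REDESIGN
MAX_REDESIGN_FAILURES = 2  # from the review: escalate after two failures


def next_action(scores: list[float], failed_redesigns: int) -> str:
    """Pick the next step after the judges have scored the candidates.

    Assumption: REDESIGN triggers only when every candidate scores below
    the threshold. The review does not say whether the skill applies the
    rule to all scores, any score, or an average.
    """
    if not scores:
        raise ValueError("need at least one score")
    if max(scores) >= REDESIGN_THRESHOLD:
        return "PROCEED"  # hypothetical label, not taken from the skill
    if failed_redesigns >= MAX_REDESIGN_FAILURES:
        return "ESCALATE"
    return "REDESIGN"  # per the review, this loops back to Phase 3


print(next_action([2.5, 2.9], failed_redesigns=0))
print(next_action([2.5, 2.9], failed_redesigns=2))
```

Counting failed redesigns explicitly is what keeps the feedback loop bounded: without the counter, a task that never clears the threshold would cycle through Phase 3 indefinitely.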

Description

7%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This description is heavily laden with technical jargon and abstract methodology terms without providing concrete actions, natural trigger terms, or explicit usage guidance. It would be very difficult for Claude to correctly select this skill from a pool of options because it doesn't clearly communicate what it does in practical terms or when it should be invoked.

Suggestions

Replace abstract methodology terms with concrete actions describing what the skill actually produces (e.g., 'Generates and evaluates multiple solution paths for complex problems, ranking alternatives by quality').

Add an explicit 'Use when...' clause with natural trigger terms users would say, such as 'Use when the user asks to explore multiple approaches, compare solutions, or needs structured brainstorming for complex decisions'.

Remove or minimize jargon like 'meta-judge evaluation specifications' and 'multi-agent evaluation' — instead describe the practical benefit these provide in plain language.

Dimension | Reasoning | Score
Specificity | The description uses abstract, buzzword-heavy language ('systematic exploration', 'pruning', 'expansion', 'meta-judge evaluation specifications', 'multi-agent evaluation') without listing any concrete actions a user would recognize. It does not describe what specific tasks are performed or what outputs are produced. | 1 / 3
Completeness | The description vaguely addresses 'what' (execute tasks through Tree of Thoughts) but provides no 'when' clause or explicit trigger guidance. There is no 'Use when...' or equivalent, and the 'what' itself is too abstract to be useful. | 1 / 3
Trigger Term Quality | The terms used are highly technical jargon ('Tree of Thoughts methodology', 'meta-judge evaluation specifications', 'multi-agent evaluation') that users would almost never naturally say. A user needing this capability would likely say something like 'brainstorm', 'explore options', 'compare approaches', or 'think step by step', none of which appear here. | 1 / 3
Distinctiveness Conflict Risk | The mention of 'Tree of Thoughts' and 'meta-judge evaluation' gives it some distinctiveness from generic skills, but the phrase 'execute tasks' is extremely broad and could overlap with virtually any task-execution skill. The niche is somewhat identifiable but poorly bounded. | 2 / 3

Total: 5 / 12 (Passed)
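The Completeness and Trigger Term Quality findings can be reproduced with a naive substring check. This is only a sketch: the description string and the four trigger terms are quoted from this page, while `matched_terms`, `has_use_when_clause`, and the plain substring matching are assumptions and say nothing about how the actual review scores a description.

```python
# The skill's current description, as shown at the top of this page.
DESCRIPTION = (
    "Execute tasks through systematic exploration, pruning, and expansion "
    "using Tree of Thoughts methodology with meta-judge evaluation "
    "specifications and multi-agent evaluation"
)

# Natural phrases the review says a user would likely use.
TRIGGER_TERMS = [
    "brainstorm",
    "explore options",
    "compare approaches",
    "think step by step",
]


def matched_terms(description: str, terms: list[str]) -> list[str]:
    """Return the trigger terms that occur verbatim (case-insensitive)."""
    text = description.lower()
    return [term for term in terms if term in text]


def has_use_when_clause(description: str) -> bool:
    """Check for an explicit 'Use when...' clause."""
    return "use when" in description.lower()


print(matched_terms(DESCRIPTION, TRIGGER_TERMS))
print(has_use_when_clause(DESCRIPTION))
```

A description assembled from the two example phrasings in the suggestions above would pass the 'Use when' check and match 'brainstorm' through 'brainstorming', which shows both the direction of the fix and how crude substring matching is.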

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 passed

Validation for skill structure

Criteria | Description | Result
skill_md_line_count | SKILL.md is long (943 lines); consider splitting into references/ and linking | Warning
frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning

Total: 9 / 11 (Passed)
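The two warnings above could be approximated locally with a check like the one below. It is a sketch under stated assumptions: the 500-line limit and the set of known frontmatter keys are guesses, since this page does not state the validator's actual threshold or key list, and `frontmatter_keys` and `validate` are hypothetical helpers that only handle simple top-level `key: value` frontmatter.

```python
MAX_LINES = 500  # assumed limit; the page only says 943 lines is "long"
# Assumed set of accepted keys; the validator's real list is not shown here.
KNOWN_KEYS = {"name", "description", "license", "metadata", "allowed-tools"}


def frontmatter_keys(text: str) -> list[str]:
    """Collect top-level keys from a leading '---' delimited block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    keys = []
    for line in lines[1:]:
        if line.strip() == "---":
            break
        # Skip blank lines, nested or indented lines, comments, list items.
        if not line or line[0].isspace() or line[0] in "#-":
            continue
        if ":" in line:
            keys.append(line.split(":", 1)[0].strip())
    return keys


def validate(text: str) -> list[str]:
    """Return warnings named after the two criteria in the table above."""
    warnings = []
    line_count = len(text.splitlines())
    if line_count > MAX_LINES:
        warnings.append(
            f"skill_md_line_count: SKILL.md is long ({line_count} lines)"
        )
    unknown = [k for k in frontmatter_keys(text) if k not in KNOWN_KEYS]
    if unknown:
        warnings.append("frontmatter_unknown_keys: " + ", ".join(unknown))
    return warnings
```

Calling `validate(Path("SKILL.md").read_text())` (with `pathlib.Path` imported) before publishing would surface both warnings early; moving unrecognized keys under `metadata`, as the validator suggests, clears the second one.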

Repository
NeoLabHQ/context-engineering-kit
Reviewed
