
reflect

Reflect on previus response and output, based on Self-refinement framework for iterative improvement with complexity triage and verification

23

Quality

13%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Passed

No known issues

Fix and improve this skill with Tessl

tessl review fix ./plugins/reflexion/skills/reflect/SKILL.md

Quality

Content

27%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill is massively over-engineered and verbose, explaining numerous software engineering concepts Claude already knows (SOLID, Clean Architecture, DDD, common libraries, testing patterns). The theatrical 'identity' framing wastes significant tokens. While it provides some structure (triage levels, checklists, report template), the actionable content is buried in hundreds of lines of generic advice that should either be omitted or split into referenced files.

Suggestions

Reduce content by 70-80%: Remove explanations of concepts Claude already knows (SOLID, DDD, code smells, common libraries, testing patterns) and focus only on the specific reflection workflow steps and report format.

Split into multiple files: Extract the code-specific criteria, fact-checking checklist, anti-patterns catalog, and report template into separate referenced files, keeping SKILL.md as a concise overview with clear navigation.

Remove the theatrical identity/threat framing entirely - it wastes tokens and doesn't improve output quality. Replace with a single sentence about maintaining high standards.

Make the triage system actionable with concrete examples: instead of vague categories like 'simple tasks', provide specific triggers (e.g., 'if diff touches <3 files and <50 lines → Quick Path').
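The last suggestion can be made concrete with a small rule function. The following is a minimal sketch, not the skill's actual logic: the Quick Path trigger comes from the example in the suggestion, while `DiffStats`, the `triage` function, and the Standard/Deep cut-off are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class DiffStats:
    """Size of the change under review (hypothetical helper type)."""
    files_changed: int
    lines_changed: int


def triage(stats: DiffStats) -> str:
    """Pick a reflection path from diff size. Thresholds are illustrative."""
    # Quick Path trigger taken from the suggestion: <3 files and <50 lines.
    if stats.files_changed < 3 and stats.lines_changed < 50:
        return "quick"
    # Assumed cut-off between Standard and Deep; not stated in the review.
    if stats.files_changed < 10 and stats.lines_changed < 500:
        return "standard"
    return "deep"
```

Explicit numeric triggers like these remove the subjectivity the review flags: two agents looking at the same diff would pick the same path.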

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | Extremely verbose at ~500+ lines. Explains concepts Claude already knows extensively (what Clean Architecture is, what SOLID principles are, what code smells are, common libraries like lodash/date-fns, what AAA testing pattern is). Massive amounts of padding with generic software engineering advice that doesn't earn its token cost. The 'identity' section with threats ('you will be killed') is wasteful theatrical framing. | 1 / 3 |
| Actionability | Provides checklists and a report format template which are somewhat concrete, but the code examples are trivial illustrations (date formatting) rather than executable guidance for the actual task of self-reflection. The skill is mostly abstract meta-instructions ('evaluate your output against these criteria') rather than specific, copy-paste-ready procedures. The pseudocode decision framework ('IF common utility → Use established library') is vague. | 2 / 3 |
| Workflow Clarity | There is a multi-step workflow (triage → assessment → refinement → verification) with some sequencing, but the steps are bloated and lack clear validation checkpoints between them. The triage system (Quick/Standard/Deep) is defined but the thresholds are subjective. The 'decision point' at Step 2 is good but the overall flow is buried in excessive detail, making it hard to follow as an actual workflow. | 2 / 3 |
| Progressive Disclosure | Monolithic wall of text with no references to external files and no bundle files provided. Content that could be split into separate reference files (code-specific criteria, fact-checking guidelines, report template, anti-patterns catalog) is all inlined, creating an enormous single document. No navigation aids or clear signposting between major sections. | 1 / 3 |
| Total | | 6 / 12 |

Passed
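The Workflow Clarity finding notes that the triage → assessment → refinement → verification sequence lacks validation checkpoints between steps. One way to picture such checkpoints is sketched below. This is an assumption-laden illustration, not the skill's implementation: `run_reflection`, `STEPS`, and every action and check in it are hypothetical placeholders.

```python
from typing import Callable

# Each step pairs an action with a checkpoint that must pass
# before the next step is allowed to run.
Step = tuple[str, Callable[[dict], dict], Callable[[dict], bool]]


def run_reflection(state: dict, steps: list[Step]) -> dict:
    """Run steps in order, stopping at the first failed checkpoint."""
    for name, action, checkpoint in steps:
        state = action(state)
        if not checkpoint(state):
            raise RuntimeError(f"checkpoint failed after step: {name}")
        state.setdefault("completed", []).append(name)
    return state


# Placeholder actions and checks; a real skill would define its own.
STEPS: list[Step] = [
    ("triage", lambda s: {**s, "path": "quick"},
     lambda s: s.get("path") in {"quick", "standard", "deep"}),
    ("assessment", lambda s: {**s, "issues": []},
     lambda s: "issues" in s),
    ("refinement", lambda s: {**s, "revised": True},
     lambda s: s.get("revised") is True),
    ("verification", lambda s: {**s, "verified": True},
     lambda s: s.get("verified") is True),
]
```

Calling `run_reflection({}, STEPS)` walks all four steps and records each one only after its checkpoint passes, so a failure is attributed to a specific step rather than discovered at the end.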

Description

0%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This description is critically weak across all dimensions. It uses abstract, buzzword-heavy language ('self-refinement framework,' 'complexity triage') without specifying concrete actions, natural trigger terms, or explicit guidance on when Claude should select this skill. It also contains a typo ('previus') which further undermines quality.

Suggestions

Replace abstract jargon with concrete actions, e.g., 'Reviews and improves Claude's previous response by identifying errors, simplifying complex reasoning, and verifying factual claims.'

Add an explicit 'Use when...' clause with natural trigger terms, e.g., 'Use when the user asks Claude to review, double-check, improve, or refine a previous answer.'

Differentiate from other skills by specifying the unique mechanism or scope, e.g., 'Applies a structured multi-pass review: first triaging response complexity, then checking accuracy, then refining clarity.'
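The first two suggestions can be checked mechanically before publishing a description. The sketch below is a rough illustration only; it is not how Tessl scores descriptions. `check_description` and `TRIGGER_TERMS` are hypothetical, and the term list is simply drawn from the phrases quoted in this review.

```python
import re

# Phrases the review says users actually use; the list is illustrative.
TRIGGER_TERMS = ("review", "double-check", "improve", "refine", "check your work")


def check_description(description: str) -> dict:
    """Report whether a description has a 'Use when' clause and natural trigger terms."""
    text = description.lower()
    found = [
        term for term in TRIGGER_TERMS
        # Whole-word match, so 'improvement' does not count as 'improve'.
        if re.search(rf"\b{re.escape(term)}\b", text)
    ]
    return {
        "has_use_when": re.search(r"\buse when\b", text) is not None,
        "trigger_terms": found,
    }
```

Run against the current description, this reports no 'Use when' clause and no trigger terms; run against the wording proposed in the suggestions, it finds both.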

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | The description uses vague, abstract language like 'reflect on previous response and output' and 'self-refinement framework for iterative improvement.' No concrete actions are listed; there's no indication of what specific operations this skill performs. | 1 / 3 |
| Completeness | The description weakly addresses 'what' (reflect on previous response) but provides no 'when' clause or explicit trigger guidance. There is no 'Use when...' or equivalent, and the 'what' itself is too vague to be useful. | 1 / 3 |
| Trigger Term Quality | The description relies on abstract jargon like 'self-refinement framework,' 'complexity triage,' and 'verification,' which are not natural terms a user would say. Users would more likely say 'review your answer,' 'improve your response,' or 'check your work.' | 1 / 3 |
| Distinctiveness Conflict Risk | The description is extremely generic: 'reflect on previous response' could apply to virtually any skill that involves reviewing or iterating on output. It would easily conflict with editing, debugging, proofreading, or any review-oriented skill. | 1 / 3 |
| Total | | 4 / 12 |

Passed

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| skill_md_line_count | SKILL.md is long (651 lines); consider splitting into references/ and linking | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 9 / 11 |

Passed
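The two warnings above can be approximated locally before submitting a skill. The following is a minimal sketch under stated assumptions: the review does not give the line-count threshold or the set of recognised frontmatter keys, so `max_lines` and `allowed_keys` are caller-supplied guesses, and `validate_skill` is a hypothetical helper rather than part of the Tessl CLI.

```python
def validate_skill(text: str, allowed_keys: set[str], max_lines: int = 500) -> list[str]:
    """Return warnings for an over-long SKILL.md and unknown top-level frontmatter keys."""
    warnings = []
    lines = [ln.rstrip() for ln in text.splitlines()]

    # max_lines is an assumed threshold; the review only says 651 lines is "long".
    if len(lines) > max_lines:
        warnings.append(f"skill_md_line_count: {len(lines)} lines")

    # Frontmatter is taken to be the block between the first two '---' lines.
    if lines and lines[0] == "---":
        try:
            end = lines.index("---", 1)
        except ValueError:
            end = 0  # no closing delimiter: treat as no frontmatter
        keys = {
            ln.split(":", 1)[0].strip()
            for ln in lines[1:end]
            # Only top-level 'key: value' lines; skip nested entries and comments.
            if ":" in ln and not ln.startswith((" ", "\t", "#"))
        }
        unknown = sorted(keys - allowed_keys)
        if unknown:
            warnings.append("frontmatter_unknown_keys: " + ", ".join(unknown))

    return warnings
```

This is a line-based scan rather than a real YAML parse, which keeps it dependency-free but means multi-line values or unusual quoting could be misread.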

Repository
NeoLabHQ/context-engineering-kit
Reviewed
