
improve-loop

Systematically review and improve every shell feature and builtin command. Iterates through each feature/command, runs code-review, fixes issues, and re-reviews until clean.

60

1.56x
Quality: 43%. Does it follow best practices?

Impact: 91% (1.56x). Average score across 3 eval scenarios.

Security by Snyk: Advisory. Suggest reviewing before use.

Optimize this skill with Tessl

npx tessl skill review --optimize ./.claude/skills/improve-loop/SKILL.md

Quality

Content

55%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill is extremely thorough and actionable with an exemplary workflow structure including gating, validation loops, and parallel execution patterns. However, it is severely over-long and monolithic — the review checklist dimensions, PR comment templates, and detailed sub-instructions should be split into referenced files to respect token budget. The content explains many things Claude already knows (git commands, gofmt usage, how to post PR comments) which inflates the token cost significantly.

Suggestions

Extract the review dimensions (A through K) into a separate REVIEW_CHECKLIST.md file and reference it from the main skill — this alone would cut ~40% of the content

Extract PR comment templates and output format specifications into a TEMPLATES.md file referenced from the relevant steps

Remove explanations of basic tools Claude already knows (git add/commit/push, gofmt, gh pr comment syntax) and replace with just the specific commands needed

Consolidate repeated gate-check and completion-check boilerplate into a single 'Protocol' section at the top, rather than restating it in every step
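
Before deciding what to extract, it can help to measure how many lines each section of SKILL.md actually occupies. The following is a minimal sketch, assuming the skill uses markdown `#` headings. The function name `section_line_counts` and the sample text are hypothetical and are not part of the skill or of Tessl's tooling.

```python
import re
from typing import Dict

FENCE = "`" * 3  # fenced-code delimiter, built this way to avoid a literal fence here
HEADING = re.compile(r"^(#{1,6})\s+(.+)$")


def section_line_counts(markdown_text: str) -> Dict[str, int]:
    """Count lines under each markdown heading (the heading line included).

    Lines before the first heading are counted under "(preamble)".
    Lines that look like headings inside fenced code blocks are not
    treated as headings. Sections with identical titles are merged.
    """
    counts: Dict[str, int] = {}
    current = "(preamble)"
    in_fence = False
    for line in markdown_text.splitlines():
        match = None
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        elif not in_fence:
            match = HEADING.match(line)
        if match:
            current = match.group(2).strip()
        counts[current] = counts.get(current, 0) + 1
    return counts


# Hypothetical sample; in practice, read the real file with
# pathlib.Path("SKILL.md").read_text().
sample = "\n".join([
    "# improve-loop",
    "intro",
    "## Review dimensions",
    "A",
    "B",
    "C",
    "## Templates",
    "T",
])
for title, n in sorted(section_line_counts(sample).items(), key=lambda kv: -kv[1]):
    print(f"{n:4d}  {title}")
```

Sorting by line count puts the largest sections first, which are the natural candidates for moving into referenced files such as REVIEW_CHECKLIST.md or TEMPLATES.md.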

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | Extremely verbose at ~500+ lines. Massive amounts of detail that Claude could infer or derive, e.g., explaining what gofmt does, how to run git commands, how to post PR comments. The review dimensions (A through K) are exhaustive checklists that could be in a separate reference file. Significant repetition of gate checks, completion checks, and logging instructions throughout. | 1 / 3 |
| Actionability | Highly actionable with specific executable bash commands, exact tool call patterns (Agent, TaskCreate, TaskList), concrete commit message formats, PR comment templates, and detailed review dimensions with specific code patterns to check. Every step has copy-paste ready commands. | 3 / 3 |
| Workflow Clarity | Exceptionally clear multi-step workflow with explicit gating checks between steps, a visual flow diagram, decision tables for loop continuation, validation checkpoints (run tests after every fix, re-run on failure up to 3 times), and explicit completion checks for every step. The batch loop with sub-steps 2A-2G is well-sequenced with clear entry/exit conditions. | 3 / 3 |
| Progressive Disclosure | Monolithic wall of text with no references to external files for detailed content. The review dimensions (A-K) alone are ~150 lines that should be in a separate REVIEW_CHECKLIST.md. The PR comment templates, output formats, and pentest checks could all be split into referenced files. Everything is inlined into a single massive document with no bundle files to support it. | 1 / 3 |
| Total | | 8 / 12 (Passed) |

Description

32%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description conveys a reasonable sense of what the skill does — iteratively reviewing and fixing shell builtins — but lacks explicit trigger guidance ('Use when...'), concrete action details, and natural user-facing keywords. It would benefit from specifying when Claude should select this skill and including more specific terms users might use.

Suggestions

Add an explicit 'Use when...' clause, e.g., 'Use when the user asks to audit, review, or improve shell builtins, bash/zsh built-in commands, or shell feature implementations.'

Include more natural trigger terms and variations such as 'bash', 'zsh', 'shell script', 'builtin audit', 'lint shell code'.

Make the actions more concrete — specify what 'fixes issues' means (e.g., 'fixes correctness bugs, error handling, POSIX compliance, edge cases').
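
The suggestions above can be illustrated by comparing the current description with a revised one. This is a minimal sketch: `check_description` is a hypothetical helper that does naive substring matching and is not how Tessl scores descriptions. The revised text is assembled only from the example wording in the suggestions.

```python
from typing import Dict, List


def check_description(description: str, trigger_terms: List[str]) -> Dict[str, object]:
    """Report whether a description has a 'Use when' clause and which terms it lacks."""
    lowered = description.lower()
    return {
        "has_use_when": "use when" in lowered,
        "missing_terms": [t for t in trigger_terms if t.lower() not in lowered],
    }


current = (
    "Systematically review and improve every shell feature and builtin command. "
    "Iterates through each feature/command, runs code-review, fixes issues, "
    "and re-reviews until clean."
)

# Revised wording: concrete fix types plus an explicit trigger clause.
revised = current.replace(
    "fixes issues",
    "fixes correctness bugs, error handling, POSIX compliance, and edge cases",
) + (
    " Use when the user asks to audit, review, or improve shell builtins, "
    "bash/zsh built-in commands, or shell feature implementations."
)

terms = ["bash", "zsh", "audit"]  # a subset of the suggested trigger terms
print(check_description(current, terms))
print(check_description(revised, terms))
```

The check only confirms that the clause and terms are present. Whether the revised description actually improves discovery would need to be confirmed by re-running the review.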

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Names the domain (shell features and builtin commands) and describes a process (iterate, run code-review, fix issues, re-review), but the actions are somewhat generic: 'fixes issues' and 'runs code-review' lack concrete detail about what kinds of issues or what the review entails. | 2 / 3 |
| Completeness | Describes what it does (review and improve shell features/builtins) but has no explicit 'Use when...' clause or trigger guidance. Per the rubric, a missing 'Use when' clause caps completeness at 2, and the 'when' is not even implied clearly, so this scores at 1. | 1 / 3 |
| Trigger Term Quality | Includes some relevant terms like 'shell', 'builtin command', 'code-review', but misses common user-facing variations (e.g., 'bash', 'zsh', 'shell script', 'lint', 'audit'). The phrase 'systematically review' is somewhat natural but not a typical user trigger. | 2 / 3 |
| Distinctiveness Conflict Risk | The focus on shell builtins and iterative code-review is somewhat distinctive, but 'code-review' and 'fixes issues' are broad enough to overlap with general code review or linting skills. The niche of shell builtins provides some differentiation. | 2 / 3 |
| Total | | 7 / 12 (Passed) |

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| skill_md_line_count | SKILL.md is long (552 lines); consider splitting into references/ and linking | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 9 / 11 (Passed) |
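
The two warnings above can be approximated locally before re-running the review. This is a rough sketch under stated assumptions: the 500-line limit and the set of allowed frontmatter keys are illustrative placeholders, not values taken from Tessl's validator, and `validate_skill` is a hypothetical name. The report does not say which key was flagged, so the `owner` key in the sample is invented for the example.

```python
import re
from typing import List

# Assumptions for illustration only; the real validator's limit and
# allowed key set are not stated in the report.
MAX_LINES = 500
ALLOWED_KEYS = {"name", "description", "license", "metadata"}

KEY_LINE = re.compile(r"^([A-Za-z0-9_-]+):")


def validate_skill(text: str) -> List[str]:
    """Return warnings for an over-long file or unknown top-level frontmatter keys."""
    warnings: List[str] = []
    lines = text.splitlines()
    if len(lines) > MAX_LINES:
        warnings.append(f"skill_md_line_count: {len(lines)} lines")
    # Frontmatter is the block between a leading '---' line and the next '---'.
    if lines and lines[0].strip() == "---":
        try:
            end = lines.index("---", 1)
        except ValueError:
            end = len(lines)  # unterminated frontmatter: scan to end of file
        keys = set()
        for line in lines[1:end]:
            match = KEY_LINE.match(line)  # indented (nested) keys do not match
            if match:
                keys.add(match.group(1))
        unknown = sorted(keys - ALLOWED_KEYS)
        if unknown:
            warnings.append(f"frontmatter_unknown_keys: {', '.join(unknown)}")
    return warnings


# Hypothetical input: valid keys plus one invented unknown key, and a long body.
sample = "---\nname: improve-loop\ndescription: example\nowner: someone\n---\n" + "line\n" * 600
for warning in validate_skill(sample):
    print(warning)
```

Only top-level keys are inspected, which matches the report's advice to move unknown keys under `metadata` rather than delete them.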

Repository: DataDog/rshell (Reviewed)
