
improve-loop

Systematically review and improve every shell feature and builtin command. Iterates through each feature/command, runs code-review, fixes issues, and re-reviews until clean.
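The review-fix-re-review cycle the description refers to can be sketched in shell. Everything below is an illustrative placeholder, not the skill's actual interface: the target list is hypothetical, and `review_target` stands in for the real code-review step (here it reports an issue on the first pass and passes on re-review).

```shell
# Illustrative sketch of the improve-loop flow; target names and the
# review step are placeholders, not the skill's real API.
targets="echo cd export"

review_target() {
  # Stand-in for "run code-review": flag an issue on the first pass,
  # report clean on re-review. A marker file tracks the first pass.
  marker=".reviewed_$1"
  if [ -f "$marker" ]; then
    return 0            # clean on re-review
  fi
  touch "$marker"
  return 1              # issues found on first review
}

for t in $targets; do
  until review_target "$t"; do
    echo "issues found in $t, applying fixes..."
  done
  echo "$t: clean"
done
rm -f .reviewed_*
```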

Quality: 52% (Does it follow best practices?)

Impact: 91%, 1.56x (average score across 3 eval scenarios)

Security (by Snyk): Advisory. Suggest reviewing before use.

Optimize this skill with Tessl

npx tessl skill review --optimize ./.claude/skills/improve-loop/SKILL.md

Quality

Discovery

50%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description communicates a clear domain (shell features and builtins) and a defined workflow (iterative review and fix), but lacks explicit trigger guidance ('Use when...') and natural user-facing keywords. It would benefit from more specific actions and common terminology users would employ when requesting this kind of work.

Suggestions

Add an explicit 'Use when...' clause, e.g., 'Use when the user wants to audit, review, or improve shell builtins, bash/zsh features, or shell script quality.'

Include natural trigger terms users would say, such as 'bash', 'zsh', 'shell script', 'audit builtins', 'lint shell code', or specific shell names.

List more concrete actions beyond the generic review loop, e.g., 'checks for POSIX compliance, identifies deprecated syntax, improves error handling in shell builtins.'

Specificity: 2 / 3
Names the domain (shell features and builtin commands) and describes a process (iterate, run code-review, fix issues, re-review), but the actions are somewhat generic and process-oriented rather than listing specific concrete capabilities.

Completeness: 2 / 3
Describes what it does (review and improve shell features/builtins via iterative code review) but lacks an explicit 'Use when...' clause or equivalent trigger guidance, which caps this at 2 per the rubric guidelines.

Trigger Term Quality: 2 / 3
Includes relevant terms like 'shell', 'builtin command', 'code-review', and 'review', but misses common user variations like 'bash', 'zsh', 'shell script', 'lint', 'audit', or specific shell names that users would naturally mention.

Distinctiveness / Conflict Risk: 2 / 3
The focus on shell features and builtin commands provides some specificity, but 'code-review' and 'fixes issues' are broad enough to overlap with general code review or linting skills. The iterative review process adds some distinction but not enough to be clearly unique.

Total: 8 / 12 (Passed)

Implementation

55%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill is extremely thorough and actionable with an exemplary workflow structure including gate checks, parallel execution, feedback loops, and decision tables. However, it is severely over-long and monolithic — the entire content is crammed into a single file with no progressive disclosure, making it a massive token burden. The review criteria sections (A-K) alone constitute a reference manual that should be extracted into separate files.

Suggestions

Extract the review dimensions (A through K) into a separate REVIEW_CRITERIA.md file and reference it from the main skill, reducing the SKILL.md body by ~60%.

Extract the output format templates (findings format, PR comment templates, final summary format) into a TEMPLATES.md file to reduce inline verbosity.

Remove redundant reminders that appear multiple times (e.g., 'fix implementation not tests' appears in Steps 2C, 2D, 3B, and Important Rules) — state each rule once and reference it.

Trim explanatory text that Claude can infer, such as 'The randomized order ensures that each run of the improve loop covers targets in a different sequence, avoiding systematic bias toward alphabetically early targets' — the command `sort -R` is self-explanatory.
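The randomization that suggestion refers to is a one-liner: `sort -R` shuffles its input lines, so piping the target list through it visits targets in a different order on each run (`-R` is in GNU coreutils; some BSD sorts also provide it, or `shuf` can be used instead). The target names below are hypothetical.

```shell
# Shuffle the review targets so each run covers them in a different order.
printf '%s\n' alias cd echo export | sort -R
```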

Conciseness: 1 / 3
Extremely verbose at ~500+ lines, with massive amounts of detail that Claude could infer or that is repeated. The review dimensions alone (A through K) are exhaustive to the point of being a reference manual rather than a concise skill. Many sections restate the same rules (e.g., 'fix implementation not tests' appears multiple times), and the step-by-step protocol with all its sub-steps, gate checks, and repeated completion checks adds significant token overhead.

Actionability: 3 / 3
Highly actionable with concrete bash commands, exact tool invocations (gh pr comment, go test, gofmt), specific file paths, precise output formats, and copy-paste-ready code blocks throughout. The review dimensions provide specific checks with exact function names and patterns to look for.

Workflow Clarity: 3 / 3
Exceptionally clear multi-step workflow with explicit gating checks (TaskList verification before each step), numbered sub-steps, decision tables for branching logic, feedback loops (fix → test → retry up to 3 times), and a clear visual flow diagram. Validation checkpoints are explicit at every stage, including test runs after fixes and a full-sweep re-review.

Progressive Disclosure: 1 / 3
A monolithic wall of text with no references to external files for detailed content. The review dimensions (A-K) alone could be a separate REVIEW_CRITERIA.md file; the output format templates, the agent launch instructions, and the pentest checks could all be split into referenced files. Everything is inlined into one massive document with no bundle files to support it.

Total: 8 / 12 (Passed)
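The bounded feedback loop praised under Workflow Clarity (fix → test → retry up to 3 times) can be sketched as follows. `run_tests` is a stand-in for the real check (e.g. `go test ./...`), and the fix step is simulated; neither reflects the skill's actual commands.

```shell
# Bounded fix/test/retry loop: give up after 3 attempts.
run_tests() {
  # Stand-in for the real test command; here, "passes" once two
  # simulated fixes have been applied.
  [ "${fixes:-0}" -ge 2 ]
}

fixes=0
attempt=1
while [ "$attempt" -le 3 ]; do
  if run_tests; then
    echo "tests pass after $fixes fix(es)"
    break
  fi
  fixes=$((fixes + 1))
  echo "attempt $attempt failed; applying fix $fixes"
  attempt=$((attempt + 1))
done
```

Capping the retries keeps a stubborn failure from stalling the whole sweep; the skill can record the target as unresolved and move on.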

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 Passed

Validation for skill structure

skill_md_line_count: Warning
SKILL.md is long (552 lines); consider splitting into references/ and linking.

frontmatter_unknown_keys: Warning
Unknown frontmatter key(s) found; consider removing or moving to metadata.

Total: 9 / 11 (Passed)

Repository: DataDog/rshell (Reviewed)

