implement-posix-command

Implement a new POSIX command as a builtin in the safe shell interpreter

49

Quality

55%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Advisory

Suggest reviewing before use

Optimize this skill with Tessl

npx tessl skill review --optimize ./.claude/skills/implement-posix-command/SKILL.md

Quality

Discovery

32%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description identifies a narrow technical task but lacks a 'Use when...' clause, which significantly hurts completeness. It provides moderate specificity by naming the domain (POSIX builtins in a safe shell interpreter) but doesn't enumerate concrete sub-actions or include enough natural trigger terms for reliable skill selection.

Suggestions

Add an explicit 'Use when...' clause, e.g., 'Use when the user asks to add, implement, or create a new builtin command in the safe shell interpreter.'

List more concrete actions such as 'register the command, implement argument parsing, add help text, write tests' to improve specificity.

Include natural trigger term variations like 'add a shell command', 'new builtin', 'shell built-in', and the specific interpreter name to improve matching.
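
The three suggestions above can be checked mechanically. The following is a minimal Go sketch, not Tessl's actual scoring logic: the function names `hasUseWhen` and `missingTriggerTerms` are hypothetical, and treating the suggested phrasings as a fixed list is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// triggerTerms holds the phrasings suggested by the review. Treating them as
// a fixed list is an assumption of this sketch, not part of any real rubric.
var triggerTerms = []string{"add a shell command", "new builtin", "shell built-in"}

// hasUseWhen reports whether the description contains an explicit
// "Use when" clause (matched case-insensitively).
func hasUseWhen(description string) bool {
	return strings.Contains(strings.ToLower(description), "use when")
}

// missingTriggerTerms returns the suggested terms that do not appear
// in the description (matched case-insensitively).
func missingTriggerTerms(description string) []string {
	lower := strings.ToLower(description)
	var missing []string
	for _, term := range triggerTerms {
		if !strings.Contains(lower, term) {
			missing = append(missing, term)
		}
	}
	return missing
}

func main() {
	// The skill's current description, as quoted at the top of this report.
	current := "Implement a new POSIX command as a builtin in the safe shell interpreter"
	fmt.Println("has 'Use when' clause:", hasUseWhen(current))
	fmt.Println("missing trigger terms:", missingTriggerTerms(current))
}
```

A substring check like this only catches the literal wording; it says nothing about whether the clause is actually useful to an agent selecting the skill.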

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | It names a specific domain (POSIX command, builtin, safe shell interpreter) and one action (implement), but doesn't list multiple concrete actions or elaborate on what implementing entails (e.g., parsing arguments, registering the command, writing tests). | 2 / 3 |
| Completeness | It describes what (implement a new POSIX command as a builtin) but has no explicit 'Use when...' clause or equivalent trigger guidance, and the 'when' is entirely missing. Per the rubric, a missing 'Use when...' clause caps completeness at 2, and since the 'when' is not even implied beyond the action itself, this scores a 1. | 1 / 3 |
| Trigger Term Quality | Includes some relevant technical keywords like 'POSIX command', 'builtin', and 'shell interpreter', but misses natural variations users might say such as 'add a shell command', 'new builtin', 'shell built-in function', or 'safe shell'. The term 'safe shell interpreter' is somewhat niche and may not match how users phrase requests. | 2 / 3 |
| Distinctiveness Conflict Risk | The mention of 'safe shell interpreter' and 'POSIX command as a builtin' is fairly specific and narrows the domain, but without naming the specific interpreter project or codebase, it could overlap with general shell scripting or interpreter development skills. | 2 / 3 |
| Total | | 7 / 12 (Passed) |

Implementation

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a highly actionable and well-structured workflow skill for implementing shell builtins, with excellent gate-checking patterns and validation steps throughout. Its main weakness is verbosity: at 500+ lines it pushes the limits of token efficiency. Some content could be externalized into reference files (pentest checklists, GNU compat tables, test helper boilerplate), and some explanations cover concepts Claude already understands (import cycles, why test subdirectories work). The security preamble and execution protocol are well-designed safety measures.

Suggestions

Extract the pentest exercise checklist (Step 8), GNU equivalence test table (Step 4), and fuzz test patterns (Step 9) into separate reference files to reduce the main skill's token footprint by ~40%.

Remove explanations Claude already knows, such as why test-only subdirectories don't create import cycles, what `defer` does, and how Go build tags work. These add ~30 lines of unnecessary context.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill is extremely thorough and detailed, but it's also very long (~500+ lines) with significant repetition. Template patterns like the test helper code are repeated in multiple steps, and some explanations (e.g., why tests go in subdirectories, import cycle avoidance) explain things Claude already knows. However, much of the content is genuinely necessary for this complex multi-step workflow. | 2 / 3 |
| Actionability | The skill provides fully executable code snippets throughout: Go test helpers, bash commands for downloading resources, YAML schema examples, exact function signatures, registry patterns, and specific file paths. Every step has concrete, copy-paste-ready guidance with specific commands to run and exact file locations to create. | 3 / 3 |
| Workflow Clarity | The 10-step workflow is exceptionally well-sequenced with explicit gate checks (TaskList verification), parallel step coordination, and clear convergence points. Each step has validation checkpoints (e.g., 'verify tests build and all fail', 'run tests after all fixes', 're-run tests'). The feedback loops in Steps 6-8 are explicit: fix → re-test → verify. The second-pass review in Step 7 is a particularly strong validation pattern. | 3 / 3 |
| Progressive Disclosure | The skill references external files well (RULES.md, existing builtins like cat.go, SHELL_FEATURES.md) and organizes content into numbered steps. However, the skill itself is monolithic: all 10 steps with full detail are in a single file. Steps 7-9 alone could each be separate reference documents. The pentest scenarios table and GNU equivalence test table add significant length that could be externalized. | 2 / 3 |
| Total | | 10 / 12 (Passed) |

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| skill_md_line_count | SKILL.md is long (611 lines); consider splitting into references/ and linking | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 9 / 11 (Passed) |

Repository
DataDog/rshell
Reviewed
