
implement-posix-command

Implement a new POSIX command as a builtin in the safe shell interpreter

49

Quality

55%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Advisory

Suggest reviewing before use

Optimize this skill with Tessl

npx tessl skill review --optimize ./.claude/skills/implement-posix-command/SKILL.md

Quality

Content

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a highly actionable and well-structured skill with an excellent workflow that includes explicit gating, parallel execution, validation checkpoints, and multi-pass review. Its main weakness is length: at more than 500 lines (the validation check counts 611), it strains token efficiency, and some sections (pentest categories, fuzz corpus guidance, GNU compat tables) could be extracted into referenced sub-files. The security-first framing and prompt injection warnings are a notable strength.

Suggestions

Extract the pentest exercise categories (Step 8), fuzz test patterns (Step 9), and GNU equivalence test table (Step 4) into separate referenced markdown files to reduce the main skill's token footprint by ~40%.

Consolidate the repeated test helper code (runScript, runScriptCtx, cmdRun) into a single referenced file rather than duplicating the full implementation in both Step 4 and Step 8 sections.

Dimension | Reasoning | Score

Conciseness

The skill is extremely thorough and detailed, but it is also very long (500+ lines) with significant repetition. Template patterns such as the test helper code are repeated verbatim, and some sections (e.g., GNU equivalence test table, pentest categories) are exhaustive to the point of verbosity. However, most content is genuinely instructive rather than explaining things Claude already knows; it is more 'comprehensive' than 'padded'.

2 / 3

Actionability

The skill provides fully executable Go code snippets, exact bash commands, specific file paths, concrete YAML schema examples, and precise function signatures. Every step has copy-paste-ready code and specific instructions — from the test helper functions to the registry entry format to the exact `go test` commands to run.

3 / 3

Workflow Clarity

The workflow is exceptionally well-structured with 10 explicit steps, clear gate checks between steps, parallel execution rules, and explicit validation checkpoints (TaskList verification before each step, test runs in Step 6, two-pass review in Step 7). The execution protocol at the top with the dependency graph (Step 1 → Step 2 → Steps 3+4+5 parallel → Step 6 → 7 → 8 → 9 → 10) is exemplary. Feedback loops are present (fix → re-validate → re-run tests).

3 / 3

Progressive Disclosure

The skill is a monolithic document with no references to supporting files for detailed content. The pentest scenarios, GNU equivalence test table, fuzz test patterns, and code review checklist could all be split into separate reference files. The document references external files like RULES.md and existing builtins appropriately, but the skill body itself is a wall of text that would benefit from being split into sub-documents.

2 / 3

Total: 10 / 12 (Passed)

Description

32%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description identifies a narrow technical task but is too terse and lacks a 'Use when...' clause, making it difficult for Claude to know when to select this skill. It would benefit from listing concrete sub-actions and explicit trigger conditions to improve both completeness and discoverability.

Suggestions

Add an explicit 'Use when...' clause, e.g., 'Use when the user asks to add, implement, or register a new builtin command in the safe shell interpreter.'

List more specific concrete actions such as 'parse command arguments, register the command in the builtin table, implement execution logic, and add tests.'

Include natural trigger term variations like 'add a shell command', 'new builtin', 'shell interpreter command', or the specific project name if applicable.
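
The three suggestions above could be combined into a single revised description. The following is only a sketch of what the SKILL.md frontmatter might look like: the `name` and `description` keys are assumed to be the ones the skill already uses, the listed sub-actions and trigger phrases are taken from the suggestions rather than from the actual skill body, and any other frontmatter keys in the real file are omitted.

```yaml
---
# Hypothetical revised frontmatter; wording is illustrative, not the maintainers' text.
name: implement-posix-command
description: >-
  Implement a new POSIX command as a builtin in the safe shell interpreter:
  parse command arguments, register the command in the builtin table,
  implement execution logic, and add tests. Use when the user asks to add,
  implement, or register a new builtin command in the safe shell
  interpreter, for example "add a shell command" or "new builtin".
---
```

A folded scalar (`>-`) keeps the description on one logical line while staying readable in the file. Whether this wording actually raises the Description score would need to be confirmed by re-running the review.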

Dimension | Reasoning | Score

Specificity

It names a specific domain (POSIX command, builtin, safe shell interpreter) and one action (implement), but doesn't list multiple concrete actions or elaborate on what implementing entails (e.g., parsing arguments, registering the command, writing tests).

2 / 3

Completeness

It describes what (implement a new POSIX command as a builtin) but completely lacks a 'Use when...' clause or any explicit trigger guidance for when Claude should select this skill. Per the rubric, a missing 'Use when...' clause caps completeness at 2, and the 'what' itself is also thin, so this scores a 1.

1 / 3

Trigger Term Quality

Includes some relevant technical keywords like 'POSIX command', 'builtin', and 'shell interpreter', but misses natural user variations such as 'add a command', 'shell builtin', 'new shell command', or 'safe shell'. The term 'safe shell interpreter' is somewhat niche and may not match how users phrase requests.

2 / 3

Distinctiveness Conflict Risk

The mention of 'safe shell interpreter' and 'POSIX command as a builtin' is fairly specific and narrows the domain, but without more context about which shell interpreter or project, it could overlap with general shell scripting or interpreter development skills.

2 / 3

Total: 7 / 12 (Passed)

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 Passed

Validation for skill structure

Criteria | Description | Result

skill_md_line_count

SKILL.md is long (611 lines); consider splitting into references/ and linking

Warning

frontmatter_unknown_keys

Unknown frontmatter key(s) found; consider removing or moving to metadata

Warning

Total: 9 / 11 (Passed)

Repository: DataDog/rshell
Reviewed
