
giuseppe-trisciuoglio/developer-kit

Comprehensive developer toolkit providing reusable skills for Java/Spring Boot, TypeScript/NestJS/React/Next.js, Python, PHP, AWS CloudFormation, AI/RAG, DevOps, and more.

89

Quality

89%

Does it follow best practices?

Impact

Pending

No eval scenarios have been run

Security (by Snyk)

Risky

Do not use without reviewing


Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that clearly defines its scope around prompt engineering workflows, lists specific concrete capabilities, and includes a comprehensive 'Use when...' clause with natural trigger terms. It uses proper third-person voice throughout and is concise without being vague. The description would allow Claude to confidently select this skill from a large pool when users need prompt engineering assistance.

Dimension / Reasoning / Score

Specificity

Lists multiple specific concrete actions: 'write, debug, and optimize prompts', 'few-shot example selection', 'chain-of-thought structuring', 'system prompt design', and 'template composition'. These are clearly defined capabilities.

3 / 3

Completeness

Clearly answers both 'what' (workflows to write, debug, and optimize prompts with specific sub-capabilities) and 'when' (explicit 'Use when...' clause listing multiple trigger scenarios). Both halves are well-developed.

3 / 3

Trigger Term Quality

Excellent coverage of natural terms users would say: 'write or improve a prompt', 'few-shot examples', 'chain-of-thought', 'system prompts', 'prompt templates', 'get better results from an LLM'. These are all phrases users naturally use when seeking prompt engineering help.

3 / 3

Distinctiveness Conflict Risk

The skill occupies a clear niche around prompt engineering for LLMs. Terms like 'few-shot examples', 'chain-of-thought', 'system prompt design', and 'prompt templates' are highly specific to this domain and unlikely to conflict with other skills.

3 / 3

Total

12 / 12

Passed

Implementation

42%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The skill has good structural organization and progressive disclosure, with clear references to supporting files. However, it is significantly too verbose for Claude, explaining well-known concepts at length and using abstract placeholder templates instead of concrete, executable examples. The workflows provide reasonable sequencing but lack the specificity and concrete validation tooling needed for full actionability.

Suggestions

Cut explanatory text that describes concepts Claude already knows (e.g., what few-shot learning is, what CoT reasoning is, what system prompts do) — keep only the novel decision rules and concrete patterns.

Replace placeholder-heavy templates (e.g., `{break_down_the_problem}`, `{detailed_reasoning}`) with fully worked, copy-paste-ready examples for at least one real domain.
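For instance, a fully worked version of a chain-of-thought template might look like the sketch below. The sentiment-classification domain and wording here are illustrative assumptions, not taken from the skill itself:

```python
# Hypothetical example: a chain-of-thought prompt with the reasoning
# fully written out, instead of placeholders like {detailed_reasoning}.
COT_PROMPT = """Classify the sentiment of the review as positive or negative.

Review: "The battery died after two days and support never replied."
Reasoning: The reviewer reports a product failure (battery died) and a
service failure (no reply from support). Both observations are complaints,
so the overall sentiment is negative.
Sentiment: negative

Review: "{review}"
Reasoning:"""

def build_prompt(review: str) -> str:
    """Fill the single remaining slot with a real input."""
    return COT_PROMPT.format(review=review)

print(build_prompt("Setup took five minutes and it just works."))
```

A template like this is copy-paste-ready: only one slot remains open, and the worked example shows exactly what shape of reasoning the model is expected to produce.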

Add concrete validation commands or scripts to the workflows — e.g., a specific Python snippet or shell command to measure accuracy across test cases, rather than just saying 'measure performance'.
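A minimal harness along these lines could look like the following sketch, where `call_llm` and the test cases are hypothetical stand-ins for whatever client and data the skill's user actually has:

```python
# Hypothetical harness: measure prompt accuracy over labeled test cases,
# so "measure performance" becomes a concrete, runnable check.
def call_llm(prompt: str) -> str:
    # Stub for illustration; replace with a real API call.
    return "positive"

TEST_CASES = [
    ("I love this product.", "positive"),
    ("Terrible, broke in a week.", "negative"),
    ("Exceeded every expectation.", "positive"),
]

def accuracy(prompt_template: str) -> float:
    """Fraction of test cases where the model's answer matches the label."""
    correct = sum(
        call_llm(prompt_template.format(text=text)).strip() == label
        for text, label in TEST_CASES
    )
    return correct / len(TEST_CASES)

score = accuracy("Classify the sentiment of: {text}\nAnswer:")
print(f"accuracy: {score:.0%}")
```

Even a small script like this gives the workflow an objective checkpoint to revert against, rather than an unmeasured instruction.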

Remove the 'Integration with Other Skills' section entirely — it adds no actionable guidance and wastes tokens on cross-references Claude can discover on its own.

Dimension / Reasoning / Score

Conciseness

The skill is excessively verbose for its target audience (Claude). It explains concepts Claude already knows well (what few-shot learning is, what chain-of-thought reasoning is, what system prompts are). Many sections are padded with generic advice ('Measure baseline performance before optimization attempts', 'Implement single-variable changes for accurate attribution') that Claude inherently understands. The 'Integration with Other Skills' section and repeated references to the same files add bloat. The template examples use placeholder variables rather than being truly actionable, making them filler.

1 / 3

Actionability

The sentiment classification few-shot example is concrete and usable, and the workflows provide step-by-step processes. However, most templates use abstract placeholders (e.g., `{break_down_the_problem}`, `{detailed_reasoning}`) that are essentially pseudocode rather than executable guidance. The optimization workflow describes what to do conceptually but doesn't provide concrete commands, scripts, or measurement tools. The CoT template and system prompt framework are structural outlines rather than copy-paste-ready artifacts.

2 / 3

Workflow Clarity

The three workflows are clearly sequenced with numbered steps, and Workflow 2 includes a feedback loop (revert if accuracy < baseline, return to step 2 if < 90%). However, validation checkpoints are somewhat vague: 'test with at least 3 inputs' does not specify how to measure results or which tools to use. The quality gates section adds concrete thresholds (>90% accuracy, <5% variance), but these aren't integrated into the workflow steps as explicit checkpoints. Concrete validation mechanisms are also missing for the 'Scale Prompt Systems' workflow.

2 / 3
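As a sketch of how those thresholds could be expressed as an explicit workflow checkpoint (the function name and sample values are hypothetical; the 90% and 5% figures come from the skill's quality gates section):

```python
from statistics import pstdev

# Hypothetical checkpoint implementing the quality gates cited above:
# mean accuracy above 90% and run-to-run spread under 5 points.
def passes_quality_gates(run_accuracies: list[float]) -> bool:
    mean = sum(run_accuracies) / len(run_accuracies)
    spread = pstdev(run_accuracies)  # variability across repeated runs
    return mean > 0.90 and spread < 0.05

print(passes_quality_gates([0.94, 0.93, 0.95]))  # accurate and stable
print(passes_quality_gates([0.95, 0.80, 0.99]))  # too much variance
```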

Progressive Disclosure

The skill effectively uses progressive disclosure by keeping the main workflow in SKILL.md and pointing to five clearly-labeled reference files for deeper pattern details. References are one level deep, well-signaled throughout the document (inline mentions plus a dedicated Resources section), and the instruction to 'load the targeted reference files only for the pattern you are applying' is explicit. Content is appropriately split between overview and detailed references.

3 / 3

Total

8 / 12

Passed

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 10 / 11 Passed

Validation for skill structure

Criteria / Description / Result

allowed_tools_field

'allowed-tools' contains unusual tool name(s)

Warning

Total

10 / 11

Passed

