pufferlib

High-performance reinforcement learning framework optimized for speed and scale. Use when you need fast parallel training, vectorized environments, multi-agent systems, or integration with game environments (Atari, Procgen, NetHack). Achieves 2-10x speedups over standard implementations. For quick prototyping or standard algorithm implementations with extensive documentation, use stable-baselines3 instead.

Overall score: 78

Quality: 71% (Does it follow best practices?)

Impact: 87% (1.50x), average score across 3 eval scenarios

Security (by Snyk): Passed, no known issues

Optimize this skill with Tessl:
`npx tessl skill review --optimize ./scientific-skills/pufferlib/SKILL.md`

Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that clearly communicates specific capabilities, includes natural trigger terms, explicitly states when to use it (and when not to), and distinguishes itself from related alternatives. The only minor note is the use of second person 'you need' in the trigger clause, but the description primarily uses third person voice for capability statements. The inclusion of a negative trigger ('use stable-baselines3 instead') is a strong differentiator.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific concrete capabilities: fast parallel training, vectorized environments, multi-agent systems, integration with game environments (Atari, Procgen, NetHack), and quantifies performance (2-10x speedups). | 3 / 3 |
| Completeness | Clearly answers both 'what' (high-performance RL framework with parallel training, vectorized environments, multi-agent systems, game environment integration) and 'when' (explicit 'Use when you need...' clause plus a 'use X instead' negative trigger for disambiguation). | 3 / 3 |
| Trigger Term Quality | Includes strong natural keywords users would say: 'reinforcement learning', 'parallel training', 'vectorized environments', 'multi-agent', 'Atari', 'Procgen', 'NetHack', 'game environments', and even mentions the alternative 'stable-baselines3' for disambiguation. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive by specifying a performance-focused RL niche with named game environments and explicit contrast against stable-baselines3, making it very unlikely to conflict with other skills. | 3 / 3 |
| Total | | 12 / 12 |

Passed

Implementation

42%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The skill has excellent progressive disclosure structure with clear references to detailed guides, but is significantly too verbose—it could be cut by 40-50% without losing actionable content. Code examples are present but several are incomplete or reference undefined variables, reducing their copy-paste readiness. The workflow sections lack validation checkpoints that would be important for debugging RL training issues.
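A validation checkpoint of the kind the review asks for, a steps-per-second (SPS) sanity check over the first training iterations, can be sketched generically. This is a library-agnostic illustration, not pufferlib API: `step_fn` and the `min_sps` threshold are hypothetical stand-ins that a skill author would replace with the real training step and a workload-specific floor.

```python
import time


def check_sps(step_fn, warmup_iters=10, steps_per_iter=1, min_sps=1000.0):
    """Measure steps-per-second over the first few iterations and warn
    if throughput is below an expected floor (a common symptom of
    vectorization not actually being active)."""
    start = time.perf_counter()
    for _ in range(warmup_iters):
        step_fn()
    elapsed = max(time.perf_counter() - start, 1e-9)
    sps = warmup_iters * steps_per_iter / elapsed
    if sps < min_sps:
        print(f"WARNING: only {sps:.0f} SPS, expected >= {min_sps:.0f}")
    return sps


# Usage with a no-op step; min_sps here is an assumed, workload-specific
# threshold, not a pufferlib default.
sps = check_sps(lambda: None, warmup_iters=100, min_sps=1.0)
```

Wiring a check like this into the first ten iterations of a workflow gives the "confirm vectorization is working" checkpoint a concrete, copy-pasteable form.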

Suggestions

Cut the 'When to Use This Skill' section (redundant with the description), the 'Tips for Success' section (generic advice Claude knows), and the 'Common Use Cases' section (duplicates earlier examples) to reduce token count by ~40%.

Make code examples fully executable: define `my_policy`, `num_iterations`, and helper methods in the environment example, or use concrete values so examples can be copy-pasted.

Add validation checkpoints to workflows, e.g., 'Run `python -c "import pufferlib; env = pufferlib.make(...); print(env.observation_space)"` to verify environment setup before training' and 'Check SPS output in first 10 iterations to confirm vectorization is working'.

Consolidate the 'Resources' section descriptions—the bullet-point summaries of each reference file repeat information already provided in the 'For complete X, read references/Y.md' links throughout the document.
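The environment-verification checkpoint suggested above can be sketched as a small, library-agnostic helper. The `pufferlib.make(...)` call mentioned in the suggestion is taken from the review's own example command; the helper itself assumes only a Gym-style interface (`observation_space`, `reset()`), and `DummyEnv` is a hypothetical stand-in used so the sketch runs on its own.

```python
def verify_env(make_env):
    """Smoke-test an environment factory before training.

    Checks that the environment exposes an observation space and that
    reset() returns an initial observation, so misconfiguration is
    caught before any compute is spent on training.
    """
    env = make_env()
    space = getattr(env, "observation_space", None)
    if space is None:
        raise RuntimeError("environment has no observation_space")
    obs = env.reset()
    # Newer Gym-style APIs return (obs, info); unwrap if so.
    if isinstance(obs, tuple):
        obs = obs[0]
    print(f"observation_space: {space}")
    return obs


# Usage with a minimal stand-in environment; in a real skill this would
# be a factory such as `lambda: pufferlib.make(...)` (name assumed from
# the review's example command).
class DummyEnv:
    observation_space = "Box(0, 1, (4,))"  # placeholder description

    def reset(self):
        return [0.0, 0.0, 0.0, 0.0], {}


obs = verify_env(DummyEnv)
```

A one-line `verify_env(...)` call at the top of each workflow would give the skill the explicit "verify before vectorizing" checkpoint the review finds missing.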

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill is extremely verbose at over 350 lines. It includes a 'When to Use This Skill' section that restates the description, a 'Tips for Success' section with 10 generic tips Claude already knows (e.g., 'start simple', 'profile early'), 'Common Use Cases' that largely duplicate earlier examples, and extensive resource listings that repeat what's already described in the progressive disclosure sections. The 'Overview' paragraph also restates information Claude can infer from the code examples. | 1 / 3 |
| Actionability | The skill provides code examples that appear concrete (training loop, environment creation, policy structure, integration), but several are incomplete or potentially not executable as-is: the training loop references `my_policy` and `num_iterations` without definition, the environment `step()` calls undefined helper methods, and the `PuffeRL` import and API may not match actual library usage. The CLI examples are more actionable but lack verification steps. | 2 / 3 |
| Workflow Clarity | The 'Quick Start Workflow' section provides numbered steps for four different workflows, which is helpful. However, none include validation checkpoints or feedback loops: there is no 'verify your environment works before vectorizing' step with a concrete command, no error-recovery guidance, and the workflows read more like checklists of suggestions than validated sequences with explicit verification points. | 2 / 3 |
| Progressive Disclosure | The skill excels at progressive disclosure, with a clear overview in the main file and well-signaled one-level-deep references to specific reference files (training.md, environments.md, vectorization.md, policies.md, integration.md) and template scripts. Each reference is clearly described with bullet points of what it contains, making navigation easy. | 3 / 3 |
| Total | | 8 / 12 |

Passed

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 10 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| metadata_version | 'metadata.version' is missing | Warning |
| Total | | 10 / 11 |

Passed

Repository: K-Dense-AI/claude-scientific-skills (Reviewed)
