High-performance reinforcement learning framework optimized for speed and scale. Use when you need fast parallel training, vectorized environments, multi-agent systems, or integration with game environments (Atari, Procgen, NetHack). Achieves 2-10x speedups over standard implementations. For quick prototyping or standard algorithm implementations with extensive documentation, use stable-baselines3 instead.
Score: 75

- Quality: 67% — Does it follow best practices?
- Impact: 87% — 1.50x average score across 3 eval scenarios
- Passed — No known issues
Optimize this skill with Tessl:

`npx tessl skill review --optimize ./scientific-skills/pufferlib/SKILL.md`

Quality
Discovery
100% — Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that clearly communicates specific capabilities, includes natural trigger terms, explicitly states both what the skill does and when to use it, and even provides guidance on when to use an alternative. The inclusion of concrete examples (Atari, Procgen, NetHack), quantified performance claims (2-10x speedups), and a disambiguation clause against stable-baselines3 make this a strong, well-crafted description.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific, concrete capabilities: fast parallel training, vectorized environments, multi-agent systems, integration with game environments (Atari, Procgen, NetHack), and quantifies performance (2-10x speedups). | 3 / 3 |
| Completeness | Clearly answers both 'what' (high-performance RL framework with parallel training, vectorized environments, multi-agent systems, game environment integration) and 'when' (explicit 'Use when you need fast parallel training, vectorized environments, multi-agent systems, or integration with game environments'). Also includes a 'when NOT to use' clause pointing to stable-baselines3. | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms users would say: 'reinforcement learning', 'parallel training', 'vectorized environments', 'multi-agent', 'Atari', 'Procgen', 'NetHack', 'speedups'. Also mentions the alternative 'stable-baselines3', which helps with disambiguation. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive, with a clear niche (high-performance RL) and explicit differentiation from stable-baselines3. The specific game environments (Atari, Procgen, NetHack) and performance focus make it unlikely to conflict with other skills. | 3 / 3 |
| Total | | 12 / 12 — Passed |
Implementation
35% — Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill covers PufferLib comprehensively with reasonable structure and code examples, but suffers significantly from verbosity — repeating information across sections, including generic advice Claude doesn't need, and providing incomplete code examples that fall short of being truly executable. The progressive disclosure structure is conceptually sound with references to external files, but the main file itself contains too much redundant content that undermines the overview-to-detail pattern.
Suggestions
- Cut the 'When to Use This Skill', 'Tips for Success', and 'Common Use Cases' sections entirely — they repeat information already covered in the core capabilities and workflows, and Claude can infer appropriate usage from the skill description.
- Make code examples fully executable: define `obs_dim`, `num_actions`, `num_iterations`, and `my_policy` in examples, or use concrete values. Replace placeholder methods in PufferEnv with minimal working implementations.
- Add validation checkpoints to workflows: e.g., 'Test environment with `env.reset()` and manual `step()` calls before vectorizing' and 'Verify training convergence by checking reward curves after 100k steps'.
- Consolidate the 'Resources' section into the existing reference links within each capability section — the detailed file descriptions are redundant with the 'For complete X, read references/Y.md' patterns already present.
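To illustrate the 'fully executable' suggestion, a minimal framework-agnostic sketch of what a self-contained policy example could look like once `obs_dim` and `num_actions` are given concrete values. All names here are hypothetical stand-ins, not PufferLib's actual API:

```python
import numpy as np

OBS_DIM = 8       # concrete stand-in for the skill's undefined `obs_dim`
NUM_ACTIONS = 4   # concrete stand-in for the undefined `num_actions`

rng = np.random.default_rng(0)

# Minimal linear policy: a weight matrix mapping observations to action logits.
W = rng.normal(scale=0.1, size=(OBS_DIM, NUM_ACTIONS))

def policy(obs_batch):
    """Return action logits for a batch of observations."""
    return obs_batch @ W

obs = rng.normal(size=(32, OBS_DIM))  # dummy batch of 32 observations
logits = policy(obs)
assert logits.shape == (32, NUM_ACTIONS)
actions = logits.argmax(axis=1)       # greedy action selection
assert actions.shape == (32,)
```

The point is not the (trivial) policy itself but that every symbol the example uses is defined in the example, so it runs as pasted.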
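Similarly, the 'test before vectorizing' checkpoint can be sketched as a hand-rolled smoke test. The `reset()`/`step()` signature below follows the Gymnasium convention; the environment itself is a made-up stand-in, not part of the reviewed skill:

```python
import numpy as np

class CountingEnv:
    """Tiny stand-in environment used to illustrate a pre-vectorization smoke test."""
    def __init__(self, obs_dim=4, horizon=10):
        self.obs_dim = obs_dim
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return np.zeros(self.obs_dim, dtype=np.float32), {}

    def step(self, action):
        self.t += 1
        obs = np.full(self.obs_dim, self.t, dtype=np.float32)
        terminated = self.t >= self.horizon
        return obs, 1.0, terminated, False, {}

# Validation checkpoint: exercise reset() and step() by hand before vectorizing.
env = CountingEnv()
obs, info = env.reset()
assert obs.shape == (env.obs_dim,)

steps, done = 0, False
while not done:
    obs, reward, terminated, truncated, info = env.step(0)
    done = terminated or truncated
    steps += 1
assert steps == env.horizon
print("smoke test passed after", steps, "steps")
```

A checkpoint like this catches shape and termination bugs in seconds, before they are multiplied across hundreds of vectorized copies.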
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose at ~350+ lines. It includes extensive 'When to Use This Skill' bullets, 'Tips for Success' with 10 items of generic advice ('Start simple', 'Profile early'), redundant 'Common Use Cases' that repeat earlier examples, a 'Resources' section that re-describes what each reference file contains (already listed inline), and explanatory text Claude doesn't need (e.g., 'PufferLib is a high-performance reinforcement learning library designed for...'). Much of this could be cut by 50%+ without losing actionable content. | 1 / 3 |
| Actionability | Code examples are provided and appear mostly executable, but several are incomplete or uncertain — the Python training loop references `my_policy` and `num_iterations` without definition, the PufferEnv example has placeholder methods (`_get_observation`, `_compute_reward`, `_is_done`) that aren't implemented, and the Policy example uses undefined `obs_dim` and `num_actions`. These are closer to pseudocode than copy-paste ready. | 2 / 3 |
| Workflow Clarity | The 'Quick Start Workflow' section provides numbered steps for four different workflows, which is helpful. However, none include validation checkpoints or feedback loops — there's no 'verify environment works before vectorizing', no 'check training is converging before scaling', and no error recovery guidance. For a framework involving complex multi-step processes (environment creation → vectorization → training), this is a significant gap. | 2 / 3 |
| Progressive Disclosure | The skill references multiple external files (references/training.md, references/environments.md, etc.) and scripts, which is good structure. However, no bundle files are provided, so we can't verify these exist. The main file itself is bloated with content that should be in those reference files (e.g., the full Resources section re-describing each file, the 10 tips, the Common Use Cases section). The inline content doesn't achieve a clean overview-to-detail split. | 2 / 3 |
| Total | | 7 / 12 — Passed |
Validation
90% — Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| metadata_version | 'metadata.version' is missing | Warning |
| Total | | 10 / 11 — Passed |