High-performance reinforcement learning framework optimized for speed and scale. Use when you need fast parallel training, vectorized environments, multi-agent systems, or integration with game environments (Atari, Procgen, NetHack). Achieves 2-10x speedups over standard implementations. For quick prototyping or standard algorithm implementations with extensive documentation, use stable-baselines3 instead.
**Score:** 78
**Does it follow best practices?** 71%
**Impact:** 87%
**Average score across 3 eval scenarios:** 1.50x
**Status:** Passed. No known issues.

To optimize this skill with Tessl:

```
npx tessl skill review --optimize ./scientific-skills/pufferlib/SKILL.md
```

## PufferEnv custom environment implementation
| Criterion | Result 1 | Result 2 |
| --- | --- | --- |
| `buf` parameter | 100% | 100% |
| `super().__init__(buf)` call | 100% | 100% |
| Space via `make_space` or `make_discrete` | 0% | 0% |
| Action space via `make_discrete` | 0% | 0% |
| `reset()` called in `__init__` | 0% | 0% |
| In-place state updates | 58% | 100% |
| Pre-allocated observation buffer | 83% | 100% |
| `PufferEnv` inheritance | 100% | 100% |
| `step()` return signature | 0% | 0% |
| No `obs.copy()` in observation return | 100% | 100% |
## Recurrent policy with LSTM optimization and layer initialization

| Criterion | Result 1 | Result 2 |
| --- | --- | --- |
| `layer_init` import | 100% | 100% |
| `layer_init` on linear layers | 100% | 100% |
| `layer_init` on conv layers | 100% | 100% |
| Actor head `std=0.01` | 100% | 100% |
| Critic head `std=1.0` | 100% | 100% |
| `LSTMCell` for inference | 100% | 100% |
| `LSTM` for batch training | 100% | 100% |
| Pixel normalization | 100% | 100% |
| Separate inference and training methods | 100% | 100% |
| Shared LSTM parameters | 100% | 100% |
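The `layer_init` convention these criteria name (orthogonal weights scaled by `std`, constant bias, with `std=0.01` on the actor head so the initial policy is near-uniform and `std=1.0` on the critic head) can be sketched in plain numpy. The real helper is typically torch-based; this stand-in only demonstrates the scaling convention, and the signature is an assumption.

```python
import numpy as np

def layer_init(shape, std=np.sqrt(2), bias_const=0.0, rng=None):
    """Orthogonal weight init scaled by `std`, constant bias.
    Numpy stand-in for the torch `layer_init` helper named above."""
    rng = np.random.default_rng(rng)
    a = rng.normal(size=shape)
    if shape[0] < shape[1]:
        # QR on the transpose yields orthonormal rows for wide matrices.
        q, r = np.linalg.qr(a.T)
        q = (q * np.sign(np.diag(r))).T
    else:
        q, r = np.linalg.qr(a)
        q = q * np.sign(np.diag(r))
    w = std * q
    b = np.full(shape[0], bias_const)
    return w, b

# Actor head: std=0.01 keeps initial logits tiny (near-uniform policy).
w_actor, b_actor = layer_init((6, 128), std=0.01, rng=0)
# Critic head: std=1.0 keeps the initial value estimate at unit scale.
w_critic, b_critic = layer_init((1, 128), std=1.0, rng=0)
```

The `LSTMCell`-for-inference / `LSTM`-for-training split listed above is a torch-level optimization (a cell for single-step rollouts, the fused module for whole-sequence batches) with both views sharing the same parameters; it is not reproduced in this numpy sketch.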
## PuffeRL training pipeline with logging and hyperparameter configuration

| Criterion | Result 1 | Result 2 |
| --- | --- | --- |
| `PuffeRL` trainer used | 0% | 100% |
| `evaluate()` in loop | 0% | 100% |
| `train()` in loop | 0% | 100% |
| `mean_and_log()` in loop | 0% | 100% |
| `pufferlib.make()` for environment | 50% | 100% |
| pufferlib logger class | 0% | 100% |
| `NoLogger` fallback | 0% | 100% |
| `batch_size` value | 60% | 100% |
| `layer_init` in policy | 0% | 100% |
| `compile` parameter | 0% | 100% |
| `training_config.json` present | 100% | 100% |
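The loop shape these criteria check for, calling `evaluate()`, `train()`, and `mean_and_log()` each iteration until the step budget is spent, can be sketched with a stub trainer. The real class is pufferlib's `PuffeRL` (constructed alongside a `pufferlib.make()` environment); the stub's attributes and method semantics here are assumptions for illustration only.

```python
class Trainer:
    """Stand-in for pufferlib's PuffeRL trainer, showing only the
    evaluate -> train -> mean_and_log control flow."""
    def __init__(self, total_timesteps, batch_size):
        self.total_timesteps = total_timesteps
        self.batch_size = batch_size
        self.global_step = 0
        self.logs = []

    def evaluate(self):
        # Collect one batch of rollouts from the vectorized environment.
        self.global_step += self.batch_size

    def train(self):
        # Run gradient updates on the collected batch.
        pass

    def mean_and_log(self):
        # Aggregate episode statistics and emit them to the logger.
        self.logs.append(self.global_step)

trainer = Trainer(total_timesteps=1024, batch_size=256)
while trainer.global_step < trainer.total_timesteps:
    trainer.evaluate()
    trainer.train()
    trainer.mean_and_log()
```

Hyperparameters such as `batch_size` and the `compile` flag named above would normally come from the `training_config.json` the last criterion checks for, rather than being hard-coded.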
b58ad7e