High-performance reinforcement learning framework optimized for speed and scale. Use when you need fast parallel training, vectorized environments, multi-agent systems, or integration with game environments (Atari, Procgen, NetHack). Achieves 2-10x speedups over standard implementations. For quick prototyping or standard algorithm implementations with extensive documentation, use stable-baselines3 instead.
**Score:** 78
**Does it follow best practices?** 71%
**Impact:** 87%
**Average score across 3 eval scenarios:** 1.50x
**Status:** Passed. No known issues.

To optimize this skill with Tessl:

```
npx tessl skill review --optimize ./scientific-skills/pufferlib/SKILL.md
```

## PufferEnv custom environment implementation
| Criterion | Result 1 | Result 2 |
| --- | --- | --- |
| `buf` parameter | 100% | 100% |
| `super().__init__(buf)` call | 100% | 100% |
| Space via `make_space` or `make_discrete` | 0% | 0% |
| Action space via `make_discrete` | 0% | 0% |
| `reset()` called in `__init__` | 0% | 0% |
| In-place state updates | 58% | 100% |
| Pre-allocated observation buffer | 83% | 100% |
| `PufferEnv` inheritance | 100% | 100% |
| `step()` return signature | 0% | 0% |
| No `obs.copy()` in observation return | 100% | 100% |
## Recurrent policy with LSTM optimization and layer initialization

| Criterion | Result 1 | Result 2 |
| --- | --- | --- |
| `layer_init` import | 100% | 100% |
| `layer_init` on linear layers | 100% | 100% |
| `layer_init` on conv layers | 100% | 100% |
| Actor head `std=0.01` | 100% | 100% |
| Critic head `std=1.0` | 100% | 100% |
| `LSTMCell` for inference | 100% | 100% |
| `LSTM` for batch training | 100% | 100% |
| Pixel normalization | 100% | 100% |
| Separate inference and training methods | 100% | 100% |
| Shared LSTM parameters | 100% | 100% |
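The `layer_init` convention these criteria name (orthogonal weights scaled by `std`, constant bias, with `std=0.01` on the actor head so the initial policy is near-uniform and `std=1.0` on the critic head) can be sketched in plain numpy. The real helper is typically torch-based; this stand-in only demonstrates the scaling convention, and the signature is an assumption.

```python
import numpy as np

def layer_init(shape, std=np.sqrt(2), bias_const=0.0, rng=None):
    """Orthogonal weight init scaled by `std`, constant bias.
    Numpy stand-in for the torch `layer_init` helper named above."""
    rng = np.random.default_rng(rng)
    a = rng.normal(size=shape)
    if shape[0] < shape[1]:
        # QR on the transpose yields orthonormal rows for wide matrices.
        q, r = np.linalg.qr(a.T)
        q = (q * np.sign(np.diag(r))).T
    else:
        q, r = np.linalg.qr(a)
        q = q * np.sign(np.diag(r))
    w = std * q
    b = np.full(shape[0], bias_const)
    return w, b

# Actor head: std=0.01 keeps initial logits tiny (near-uniform policy).
w_actor, b_actor = layer_init((6, 128), std=0.01, rng=0)
# Critic head: std=1.0 keeps the initial value estimate at unit scale.
w_critic, b_critic = layer_init((1, 128), std=1.0, rng=0)
```

The `LSTMCell`-for-inference / `LSTM`-for-training split listed above is a torch-level optimization (a cell for single-step rollouts, the fused module for whole-sequence batches) with both views sharing the same parameters; it is not reproduced in this numpy sketch.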
## PuffeRL training pipeline with logging and hyperparameter configuration

| Criterion | Result 1 | Result 2 |
| --- | --- | --- |
| `PuffeRL` trainer used | 0% | 100% |
| `evaluate()` in loop | 0% | 100% |
| `train()` in loop | 0% | 100% |
| `mean_and_log()` in loop | 0% | 100% |
| `pufferlib.make()` for environment | 50% | 100% |
| pufferlib logger class | 0% | 100% |
| `NoLogger` fallback | 0% | 100% |
| `batch_size` value | 60% | 100% |
| `layer_init` in policy | 0% | 100% |
| `compile` parameter | 0% | 100% |
| `training_config.json` present | 100% | 100% |
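The loop shape these criteria check for, calling `evaluate()`, `train()`, and `mean_and_log()` each iteration until the step budget is spent, can be sketched with a stub trainer. The real class is pufferlib's `PuffeRL` (constructed alongside a `pufferlib.make()` environment); the stub's attributes and method semantics here are assumptions for illustration only.

```python
class Trainer:
    """Stand-in for pufferlib's PuffeRL trainer, showing only the
    evaluate -> train -> mean_and_log control flow."""
    def __init__(self, total_timesteps, batch_size):
        self.total_timesteps = total_timesteps
        self.batch_size = batch_size
        self.global_step = 0
        self.logs = []

    def evaluate(self):
        # Collect one batch of rollouts from the vectorized environment.
        self.global_step += self.batch_size

    def train(self):
        # Run gradient updates on the collected batch.
        pass

    def mean_and_log(self):
        # Aggregate episode statistics and emit them to the logger.
        self.logs.append(self.global_step)

trainer = Trainer(total_timesteps=1024, batch_size=256)
while trainer.global_step < trainer.total_timesteps:
    trainer.evaluate()
    trainer.train()
    trainer.mean_and_log()
```

Hyperparameters such as `batch_size` and the `compile` flag named above would normally come from the `training_config.json` the last criterion checks for, rather than being hard-coded.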
b58ad7e