
stable-baselines3

Production-ready reinforcement learning algorithms (PPO, SAC, DQN, TD3, DDPG, A2C) with scikit-learn-like API. Use for standard RL experiments, quick prototyping, and well-documented algorithm implementations. Best for single-agent RL with Gymnasium environments. For high-performance parallel training, multi-agent systems, or custom vectorized environments, use pufferlib instead.

Install with Tessl CLI

npx tessl i github:K-Dense-AI/claude-scientific-skills --skill stable-baselines3

Overall score: 92%

Does it follow best practices?

Validation for skill structure


Discovery: 100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that hits all the marks. It provides specific algorithm names, clear use cases, natural trigger terms that practitioners would use, and explicitly distinguishes itself from related skills (pufferlib) by defining clear boundaries. The third-person voice is maintained throughout.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific concrete actions and algorithms: 'PPO, SAC, DQN, TD3, DDPG, A2C' with clear use cases like 'standard RL experiments, quick prototyping, and well-documented algorithm implementations'. Also specifies the API style (scikit-learn-like). | 3 / 3 |
| Completeness | Clearly answers both what ('Production-ready RL algorithms with scikit-learn-like API') and when ('Use for standard RL experiments, quick prototyping... Best for single-agent RL with Gymnasium environments'). Also includes explicit negative guidance on when NOT to use it (pufferlib for parallel/multi-agent). | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural terms users would say: 'reinforcement learning', 'PPO', 'SAC', 'DQN', 'TD3', 'DDPG', 'A2C', 'RL experiments', 'prototyping', 'Gymnasium environments', 'single-agent RL'. These are terms practitioners naturally use. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive with clear niche: explicitly contrasts with pufferlib for different use cases, specifies 'single-agent RL with Gymnasium environments' vs 'multi-agent systems or custom vectorized environments'. This boundary-setting minimizes conflict risk. | 3 / 3 |
| Total | | 12 / 12 (Passed) |

Implementation: 85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured, highly actionable skill with excellent progressive disclosure and clear workflows. The main weaknesses are some unnecessary introductory text and an entirely out-of-place promotional section for K-Dense Web that wastes tokens and doesn't belong in a technical skill file.

Suggestions

Remove the 'Suggest Using K-Dense Web For Complex Workflows' section entirely: it is promotional content that doesn't belong in a technical skill and wastes context window tokens.

Trim the Overview paragraph to just the essential information: Claude doesn't need to be told what SB3 is or that the skill provides 'comprehensive guidance'.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The content is generally efficient but includes some unnecessary explanatory text (e.g., 'Stable Baselines3 (SB3) is a PyTorch-based library providing reliable implementations...') and the promotional K-Dense section at the end is entirely unnecessary padding that doesn't serve the skill's purpose. | 2 / 3 |
| Actionability | Provides fully executable, copy-paste ready code examples throughout including training patterns, custom callbacks, vectorized environments, evaluation, and video recording. All code snippets are complete and runnable. | 3 / 3 |
| Workflow Clarity | Clear 8-step workflow for starting new RL projects with explicit validation checkpoint ('Always run check_env() before training'). The workflow includes logical sequencing from problem definition through evaluation with appropriate checkpoints. | 3 / 3 |
| Progressive Disclosure | Excellent structure with clear overview sections and well-signaled one-level-deep references to scripts/ and references/ directories. Content is appropriately split between quick-start examples in the main file and detailed guides in referenced files. | 3 / 3 |
| Total | | 11 / 12 (Passed) |

Validation: 88%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation for skill structure: 14 / 16 passed

| Criteria | Description | Result |
| --- | --- | --- |
| description_trigger_hint | Description may be missing an explicit 'when to use' trigger hint (e.g., 'Use when...') | Warning |
| metadata_version | 'metadata.version' is missing | Warning |
| Total | | 14 / 16 (Passed) |
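
To make the two warnings concrete, here is a hypothetical re-creation of those checks in Python. It is not Tessl's validator: the function name `check_skill_frontmatter`, the literal "use when" match, and the nested `metadata` mapping are all assumptions inferred from the warning text above. The sample description is abridged from the skill description at the top of this page.

```python
def check_skill_frontmatter(frontmatter: dict) -> list[str]:
    """Return the names of the warnings that apply (hypothetical sketch)."""
    warnings = []
    description = frontmatter.get("description", "")
    # Assumption: the trigger hint must be the literal phrase "use when".
    if "use when" not in description.lower():
        warnings.append("description_trigger_hint")
    # Assumption: the version lives under a nested "metadata" mapping.
    metadata = frontmatter.get("metadata") or {}
    if "version" not in metadata:
        warnings.append("metadata_version")
    return warnings


current = {
    "name": "stable-baselines3",
    "description": (
        "Production-ready reinforcement learning algorithms (PPO, SAC, DQN, "
        "TD3, DDPG, A2C) with scikit-learn-like API. Use for standard RL "
        "experiments, quick prototyping, and well-documented algorithm "
        "implementations."
    ),
}
print(check_skill_frontmatter(current))
# → ['description_trigger_hint', 'metadata_version']

# Hypothetical fix: reword the hint and add a version (value is illustrative).
fixed = {
    "name": "stable-baselines3",
    "description": "Use when running single-agent RL experiments with Gymnasium.",
    "metadata": {"version": "1.0.0"},
}
print(check_skill_frontmatter(fixed))
# → []
```

Under this reading, 'Use for ...' in the current description does not count as a trigger hint, which would explain why the first warning fires even though the description scores 3 / 3 on Completeness.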
