Tessl Tile for pypi/rl-zoo3@2.7.0

# HuggingFace Hub Integration

RL Zoo3 integrates with the HuggingFace Hub for sharing and loading models. It supports uploading trained models, downloading pre-trained models, and generating model cards for the RL community.

## Core Imports

```python
from pathlib import Path
from typing import Any, Optional, Union

from stable_baselines3.common.base_class import BaseAlgorithm
from stable_baselines3.common.vec_env import VecEnv

from rl_zoo3.load_from_hub import download_from_hub
from rl_zoo3.push_to_hub import generate_model_card, package_to_hub
```

## Capabilities

### Model Upload and Packaging

Upload trained models to HuggingFace Hub with comprehensive metadata and documentation.

```python { .api }
def package_to_hub(
    model: BaseAlgorithm,
    model_name: str,
    repo_id: str,
    commit_message: str = "Add model",
    tags: Optional[list[str]] = None,
    local_repo_path: Optional[str] = None,
    model_architecture: Optional[str] = None,
    env_id: Optional[str] = None,
    eval_env: Optional[VecEnv] = None,
    n_eval_episodes: int = 10,
    deterministic: bool = True,
    use_auth_token: Optional[Union[bool, str]] = None,
    private: bool = False,
    **kwargs
) -> str:
    """
    Package and upload a trained model to HuggingFace Hub.

    Parameters:
    - model: Trained RL model to upload
    - model_name: Name for the model
    - repo_id: HuggingFace repository ID (e.g., "username/model-name")
    - commit_message: Git commit message for the upload
    - tags: List of tags for model categorization
    - local_repo_path: Local path for temporary repository
    - model_architecture: Architecture description
    - env_id: Environment identifier
    - eval_env: Environment for evaluation before upload
    - n_eval_episodes: Number of evaluation episodes
    - deterministic: Whether to use deterministic actions for evaluation
    - use_auth_token: HuggingFace authentication token
    - private: Whether to create a private repository
    - **kwargs: Additional keyword arguments

    Returns:
    str: URL of the uploaded model repository
    """
```
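
The `eval_env`, `n_eval_episodes`, and `deterministic` parameters imply an evaluation pass before upload. As a minimal sketch of how per-episode returns could be reduced to summary statistics, the snippet below uses only the standard library. The helper name `summarize_episode_rewards` and the returned keys are illustrative assumptions, not part of rl-zoo3, and the reward values are made up.

```python
import statistics


def summarize_episode_rewards(episode_rewards: list[float]) -> dict[str, float]:
    """Reduce per-episode returns to mean, population std, and episode count.

    Hypothetical helper for illustration; not part of rl-zoo3.
    """
    if not episode_rewards:
        raise ValueError("at least one episode reward is required")
    return {
        "mean_reward": statistics.fmean(episode_rewards),
        # pstdev is the population standard deviation (same default as numpy.std)
        "std_reward": statistics.pstdev(episode_rewards),
        "n_eval_episodes": len(episode_rewards),
    }


# Made-up returns from four evaluation episodes
summary = summarize_episode_rewards([200.0, 180.0, 190.0, 210.0])
print(f"{summary['mean_reward']:.1f} +/- {summary['std_reward']:.1f}")
```

Reporting the standard deviation next to the mean makes it easier to judge whether differences between uploaded agents are meaningful for a small number of episodes.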

Usage example:
```python
from rl_zoo3.push_to_hub import package_to_hub
from rl_zoo3 import create_test_env
from stable_baselines3 import PPO

# Train a model
env = create_test_env("CartPole-v1", n_envs=1)
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=20000)

# Create evaluation environment
eval_env = create_test_env("CartPole-v1", n_envs=1)

# Upload to HuggingFace Hub
repo_url = package_to_hub(
    model=model,
    model_name="ppo-cartpole-v1",
    repo_id="your-username/ppo-cartpole-v1",
    commit_message="Upload trained PPO agent for CartPole-v1",
    tags=["ppo", "cartpole", "reinforcement-learning"],
    env_id="CartPole-v1",
    eval_env=eval_env,
    n_eval_episodes=10,
    deterministic=True
)

print(f"Model uploaded to: {repo_url}")
```
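
Repository IDs follow the `"username/model-name"` shape described above, so it can help to build and sanity-check them in one place before calling the upload. The sketch below is an assumption-laden illustration: `build_repo_id`, `default_tags`, and the regular expression are hypothetical, and the Hub may enforce stricter naming rules than this check.

```python
import re

# Assumption: a repo ID is "<namespace>/<name>", each part made of letters,
# digits, '-', '_' or '.'. The Hub may enforce stricter rules than this.
_REPO_ID_PATTERN = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*/[A-Za-z0-9][A-Za-z0-9._-]*$"
)


def build_repo_id(username: str, algo: str, env_id: str) -> str:
    """Build a "<username>/<algo>-<env_id>" repo ID and check its shape."""
    repo_id = f"{username}/{algo.lower()}-{env_id}"
    if not _REPO_ID_PATTERN.match(repo_id):
        raise ValueError(f"unexpected repo ID format: {repo_id!r}")
    return repo_id


def default_tags(algo: str, env_id: str) -> list[str]:
    """Derive a small, de-duplicated tag list from algorithm and environment."""
    tags = [algo.lower(), env_id, "reinforcement-learning"]
    return list(dict.fromkeys(tags))  # keeps order, drops duplicates


print(build_repo_id("your-username", "PPO", "CartPole-v1"))
print(default_tags("PPO", "CartPole-v1"))
```

Environment IDs that contain a slash would produce a second `/` in the repo ID and are rejected by this check, so they would need to be rewritten before use.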

### Model Download and Loading

Download and load pre-trained models from HuggingFace Hub.

```python { .api }
def download_from_hub(
    repo_id: str,
    filename: str,
    force_download: bool = False,
    local_dir: Optional[str] = None,
    **kwargs
) -> str:
    """
    Download a model file from HuggingFace Hub.

    Parameters:
    - repo_id: HuggingFace repository ID
    - filename: Name of the file to download
    - force_download: Whether to force re-download
    - local_dir: Local directory to save the file
    - **kwargs: Additional download arguments

    Returns:
    str: Path to the downloaded file
    """
```

Usage example:
```python
from rl_zoo3.load_from_hub import download_from_hub
from rl_zoo3 import ALGOS, create_test_env

# Download a pre-trained model
model_path = download_from_hub(
    repo_id="sb3/ppo-CartPole-v1",
    filename="ppo-CartPole-v1.zip"
)

# Load the model
model = ALGOS["ppo"].load(model_path)

# Test the model
env = create_test_env("CartPole-v1", n_envs=1)
obs = env.reset()
for _ in range(1000):
    action, _states = model.predict(obs, deterministic=True)
    # The vectorized env resets finished episodes automatically,
    # so no explicit reset is needed when `dones` is set.
    obs, rewards, dones, infos = env.step(action)
```
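
The `force_download` and `local_dir` parameters suggest a check-then-download pattern. The following is a minimal sketch of that pattern, not the library's implementation: `fetch_with_cache` and the `downloader` callable are hypothetical, and a fake downloader stands in for the real network call so the example runs offline.

```python
import tempfile
from pathlib import Path
from typing import Callable


def fetch_with_cache(
    filename: str,
    local_dir: Path,
    downloader: Callable[[str, Path], None],
    force_download: bool = False,
) -> Path:
    """Return the local path for `filename`, downloading only when needed.

    `downloader(filename, target)` is a hypothetical callable that writes
    the file to `target`; in practice it would wrap the real Hub download.
    """
    target = local_dir / filename
    if force_download or not target.exists():
        local_dir.mkdir(parents=True, exist_ok=True)
        downloader(filename, target)
    return target


# Fake downloader that records each call instead of touching the network
calls: list[str] = []


def fake_downloader(filename: str, target: Path) -> None:
    calls.append(filename)
    target.write_bytes(b"placeholder model bytes")


with tempfile.TemporaryDirectory() as tmp:
    cache_dir = Path(tmp) / "models"
    fetch_with_cache("ppo-CartPole-v1.zip", cache_dir, fake_downloader)
    fetch_with_cache("ppo-CartPole-v1.zip", cache_dir, fake_downloader)  # cache hit
    fetch_with_cache(
        "ppo-CartPole-v1.zip", cache_dir, fake_downloader, force_download=True
    )

print(len(calls))
```

Passing the downloader in as an argument keeps the caching logic testable without network access.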

### Model Card Generation

Generate comprehensive model cards with training information, evaluation results, and usage instructions.

```python { .api }
def generate_model_card(
    model: BaseAlgorithm,
    env_id: str,
    model_name: str = "",
    repo_id: str = "",
    eval_results: Optional[dict] = None,
    training_time: Optional[float] = None,
    total_timesteps: Optional[int] = None,
    hyperparams: Optional[dict] = None,
    model_architecture: Optional[str] = None,
    **kwargs
) -> str:
    """
    Generate a model card for a trained RL agent.

    Parameters:
    - model: Trained RL model
    - env_id: Environment identifier
    - model_name: Display name for the model
    - repo_id: Repository identifier
    - eval_results: Dictionary of evaluation results
    - training_time: Total training time in seconds
    - total_timesteps: Total training timesteps
    - hyperparams: Model hyperparameters
    - model_architecture: Description of model architecture
    - **kwargs: Additional metadata

    Returns:
    str: Generated model card in Markdown format
    """
```

```python { .api }
def save_model_card(
    repo_dir: Path,
    generated_model_card: str,
    metadata: dict[str, Any]
) -> None:
    """
    Save a generated model card to a repository directory.

    Parameters:
    - repo_dir: Repository directory path
    - generated_model_card: Generated model card content
    - metadata: Additional metadata for the model card
    """
```

Usage example:
```python
from rl_zoo3.push_to_hub import generate_model_card, save_model_card
from pathlib import Path

# `model` is the trained agent from the upload example above

# Generate model card
model_card = generate_model_card(
    model=model,
    env_id="CartPole-v1",
    model_name="PPO Agent for CartPole",
    repo_id="your-username/ppo-cartpole-v1",
    eval_results={
        "mean_reward": 195.2,
        "std_reward": 12.5,
        "n_eval_episodes": 10
    },
    training_time=300.5,
    total_timesteps=20000,
    hyperparams={
        "learning_rate": 0.0003,
        "n_steps": 2048,
        "batch_size": 64,
        "n_epochs": 10
    },
    model_architecture="MlpPolicy with [64, 64] hidden layers"
)

# Save model card
repo_dir = Path("./model_repo")
repo_dir.mkdir(exist_ok=True)

save_model_card(
    repo_dir=repo_dir,
    generated_model_card=model_card,
    metadata={"framework": "stable-baselines3", "library": "rl-zoo3"}
)

print("Model card saved to README.md")
```
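
On the Hub, a model card is the repository's `README.md`, with metadata in a YAML front-matter block delimited by `---`. To make the relationship between the card text and its metadata concrete, here is a rough standard-library sketch. `render_model_card` is a hypothetical helper rather than the rl-zoo3 implementation, and it does no YAML escaping, so it only suits simple string and list values.

```python
import tempfile
from pathlib import Path


def render_model_card(title: str, metadata: dict[str, object], body: str) -> str:
    """Prefix a Markdown body with a simple YAML front-matter block."""
    lines = ["---"]
    for key, value in metadata.items():
        if isinstance(value, (list, tuple)):
            # Lists become YAML block sequences
            lines.append(f"{key}:")
            lines.extend(f"- {item}" for item in value)
        else:
            lines.append(f"{key}: {value}")
    lines += ["---", "", f"# {title}", "", body.strip(), ""]
    return "\n".join(lines)


card = render_model_card(
    title="PPO Agent for CartPole",
    metadata={"library_name": "stable-baselines3", "tags": ["ppo", "CartPole-v1"]},
    body="Mean reward: 195.2 +/- 12.5 over 10 episodes.",
)

# Write the card as README.md in a throwaway directory and read it back
with tempfile.TemporaryDirectory() as tmp:
    readme = Path(tmp) / "README.md"
    readme.write_text(card, encoding="utf-8")
    saved = readme.read_text(encoding="utf-8")

print(card)
```

For metadata values that contain colons, quotes, or newlines, a real YAML serializer would be the safer choice.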

## Complete Workflow Examples

### End-to-End Model Sharing

```python
from rl_zoo3.exp_manager import ExperimentManager
from rl_zoo3.push_to_hub import package_to_hub
from rl_zoo3 import create_test_env
import argparse

def train_and_share_model():
    """
    Complete workflow: train model, evaluate, and share on Hub.
    """
    # 1. Train the model
    args = argparse.Namespace(
        algo='sac',
        env='Pendulum-v1',
        n_timesteps=50000,
        eval_freq=5000,
        n_eval_episodes=10,
        verbose=1,
        seed=42
    )

    exp_manager = ExperimentManager(
        args=args,
        algo='sac',
        env_id='Pendulum-v1',
        log_folder='./logs',
        n_timesteps=50000,
        eval_freq=5000,
        seed=42
    )

    # Setup and train. setup_experiment() returns a (model, saved_hyperparams)
    # tuple, or None when running hyperparameter optimization.
    results = exp_manager.setup_experiment()
    if results is None:
        raise RuntimeError("setup_experiment() returned no model")
    model, _saved_hyperparams = results
    exp_manager.learn(model)
    exp_manager.save_trained_model(model)

    # 2. Create evaluation environment
    eval_env = create_test_env("Pendulum-v1", n_envs=1)

    # 3. Upload to HuggingFace Hub
    repo_url = package_to_hub(
        model=model,
        model_name="sac-pendulum-v1",
        repo_id="your-username/sac-pendulum-v1",
        commit_message="Upload SAC agent for Pendulum-v1 (50k timesteps)",
        tags=["sac", "pendulum", "continuous-control", "rl-zoo3"],
        env_id="Pendulum-v1",
        eval_env=eval_env,
        n_eval_episodes=20,
        deterministic=True,
        model_architecture="SAC with default MlpPolicy"
    )

    print(f"Model successfully shared at: {repo_url}")
    return repo_url

# Run the complete workflow
train_and_share_model()
```

### Loading and Comparing Hub Models

```python
from rl_zoo3.load_from_hub import download_from_hub
from rl_zoo3 import ALGOS, create_test_env
import numpy as np

def compare_hub_models():
    """
    Download and compare multiple models from HuggingFace Hub.
    """
    # Models to compare
    models_to_test = [
        {"repo_id": "sb3/ppo-CartPole-v1", "filename": "ppo-CartPole-v1.zip", "algo": "ppo"},
        {"repo_id": "sb3/dqn-CartPole-v1", "filename": "dqn-CartPole-v1.zip", "algo": "dqn"},
        {"repo_id": "sb3/a2c-CartPole-v1", "filename": "a2c-CartPole-v1.zip", "algo": "a2c"}
    ]

    # Test environment
    env = create_test_env("CartPole-v1", n_envs=1)

    results = {}

    for model_info in models_to_test:
        print(f"Testing {model_info['algo'].upper()} model...")

        # Download model
        model_path = download_from_hub(
            repo_id=model_info["repo_id"],
            filename=model_info["filename"]
        )

        # Load model
        model = ALGOS[model_info["algo"]].load(model_path)

        # Evaluate model
        episode_rewards = []
        n_eval_episodes = 10

        for episode in range(n_eval_episodes):
            obs = env.reset()
            episode_reward = 0.0
            done = False

            while not done:
                action, _states = model.predict(obs, deterministic=True)
                # The vectorized env returns arrays with one entry per environment
                obs, rewards, dones, infos = env.step(action)
                episode_reward += rewards[0]
                done = bool(dones[0])

            episode_rewards.append(episode_reward)

        # Store results
        results[model_info["algo"]] = {
            "mean_reward": np.mean(episode_rewards),
            "std_reward": np.std(episode_rewards),
            "episodes": episode_rewards
        }

        print(f"{model_info['algo'].upper()}: "
              f"{results[model_info['algo']]['mean_reward']:.1f} ± "
              f"{results[model_info['algo']]['std_reward']:.1f}")

    # Find best model
    best_algo = max(results.keys(), key=lambda k: results[k]["mean_reward"])
    print(f"\nBest model: {best_algo.upper()} "
          f"({results[best_algo]['mean_reward']:.1f} ± "
          f"{results[best_algo]['std_reward']:.1f})")

    return results

# Compare models
comparison_results = compare_hub_models()
```
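
A results dictionary shaped like the one returned by `compare_hub_models()` can be turned into a ranked table for a report or a model card. The sketch below assumes only the `mean_reward` and `std_reward` keys; `results_to_markdown` is a hypothetical helper and the numbers are made up for illustration.

```python
def results_to_markdown(results: dict[str, dict[str, float]]) -> str:
    """Render {algo: {"mean_reward", "std_reward"}} as a Markdown table, best first."""
    ranked = sorted(
        results.items(), key=lambda item: item[1]["mean_reward"], reverse=True
    )
    rows = ["| Algorithm | Mean reward | Std |", "|---|---|---|"]
    for algo, stats in ranked:
        rows.append(
            f"| {algo.upper()} | {stats['mean_reward']:.1f} | {stats['std_reward']:.1f} |"
        )
    return "\n".join(rows)


# Made-up numbers, for illustration only
example_results = {
    "dqn": {"mean_reward": 150.0, "std_reward": 30.0},
    "ppo": {"mean_reward": 200.0, "std_reward": 0.0},
}
table = results_to_markdown(example_results)
print(table)
```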

### Automated Model Sharing Pipeline

```python
from rl_zoo3.exp_manager import ExperimentManager
from rl_zoo3.push_to_hub import package_to_hub
from rl_zoo3 import create_test_env
import argparse
from typing import Optional

class ModelSharingPipeline:
    """
    Automated pipeline for training and sharing models.
    """

    def __init__(self, username: str, auth_token: str):
        self.username = username
        self.auth_token = auth_token

    def train_and_share(
        self,
        algo: str,
        env_id: str,
        n_timesteps: int,
        description: str = "",
        tags: Optional[list[str]] = None
    ):
        """
        Train a model and automatically share it on HuggingFace Hub.
        """
        if tags is None:
            tags = [algo, env_id.lower(), "rl-zoo3"]

        # Setup training
        args = argparse.Namespace(
            algo=algo,
            env=env_id,
            n_timesteps=n_timesteps,
            eval_freq=max(n_timesteps // 10, 1000),
            n_eval_episodes=10,
            verbose=1,
            seed=42
        )

        # Create unique log folder
        log_folder = f"./logs/{algo}_{env_id}_{n_timesteps}"

        exp_manager = ExperimentManager(
            args=args,
            algo=algo,
            env_id=env_id,
            log_folder=log_folder,
            n_timesteps=n_timesteps,
            eval_freq=args.eval_freq
        )

        # Train model. setup_experiment() returns a (model, saved_hyperparams)
        # tuple, or None when running hyperparameter optimization.
        print(f"Training {algo.upper()} on {env_id} for {n_timesteps} timesteps...")
        results = exp_manager.setup_experiment()
        if results is None:
            raise RuntimeError("setup_experiment() returned no model")
        model, _saved_hyperparams = results
        exp_manager.learn(model)
        exp_manager.save_trained_model(model)

        # Create evaluation environment
        eval_env = create_test_env(env_id, n_envs=1)

        # Generate repository name
        repo_name = f"{algo}-{env_id.lower()}-{n_timesteps//1000}k"
        repo_id = f"{self.username}/{repo_name}"

        # Upload to Hub
        print(f"Uploading to HuggingFace Hub: {repo_id}")
        repo_url = package_to_hub(
            model=model,
            model_name=repo_name,
            repo_id=repo_id,
            commit_message=f"Upload {algo.upper()} agent for {env_id} ({n_timesteps} timesteps)",
            tags=tags,
            env_id=env_id,
            eval_env=eval_env,
            n_eval_episodes=20,
            deterministic=True,
            use_auth_token=self.auth_token,
            model_architecture=f"{algo.upper()} with default policy"
        )

        print(f"✅ Model uploaded successfully: {repo_url}")
        return repo_url

    def batch_training(self, configs: list[dict]):
        """
        Train and share multiple models in batch.
        """
        results = []

        for config in configs:
            try:
                result = self.train_and_share(**config)
                results.append({"config": config, "url": result, "status": "success"})
            except Exception as e:
                print(f"❌ Failed to train/share {config}: {e}")
                results.append({"config": config, "error": str(e), "status": "failed"})

        return results

# Example usage
pipeline = ModelSharingPipeline(
    username="your-username",
    auth_token="your-hf-token"
)

# Single model
pipeline.train_and_share(
    algo="ppo",
    env_id="CartPole-v1",
    n_timesteps=25000,
    tags=["ppo", "cartpole", "classic-control", "rl-zoo3"]
)

# Batch training
batch_configs = [
    {"algo": "ppo", "env_id": "CartPole-v1", "n_timesteps": 25000},
    {"algo": "dqn", "env_id": "CartPole-v1", "n_timesteps": 25000},
    {"algo": "sac", "env_id": "Pendulum-v1", "n_timesteps": 50000}
]

batch_results = pipeline.batch_training(batch_configs)
print(f"Batch training completed. {len([r for r in batch_results if r['status'] == 'success'])} successes.")
```
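
The pipeline example passes the token as a string literal, which is easy to leak into version control. One possible alternative, sketched here with the standard library only, reads the token from an environment variable (`HF_TOKEN` is the variable name `huggingface_hub` recognizes) and tallies batch outcomes by the `status` key used in `batch_training`. Both helper names are hypothetical, and the demo uses a separate variable name and placeholder value so it does not touch a real token.

```python
import os
from typing import Optional


def resolve_hf_token(env_var: str = "HF_TOKEN") -> Optional[str]:
    """Return the token stored in `env_var`, or None if it is unset or blank."""
    token = os.environ.get(env_var, "").strip()
    return token or None


def summarize_batch(results: list[dict]) -> dict[str, int]:
    """Count batch outcomes by their "status" value."""
    counts = {"success": 0, "failed": 0}
    for result in results:
        status = result["status"]
        counts[status] = counts.get(status, 0) + 1
    return counts


# Demo with a placeholder value under a non-standard variable name
os.environ["EXAMPLE_HF_TOKEN"] = "  placeholder-token  "
print(resolve_hf_token("EXAMPLE_HF_TOKEN"))

example_batch = [
    {"config": {"algo": "ppo"}, "url": "placeholder-url", "status": "success"},
    {"config": {"algo": "dqn"}, "error": "placeholder error", "status": "failed"},
    {"config": {"algo": "sac"}, "url": "placeholder-url", "status": "success"},
]
print(summarize_batch(example_batch))
```

The resolved value could then be passed as `auth_token` when constructing `ModelSharingPipeline`.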

## Hub Integration Features

The HuggingFace Hub integration provides:

- **Automatic model card generation** with training details, hyperparameters, and evaluation results
- **Model versioning** through the Hub's Git-based repositories
- **Community sharing** that enables model discovery and reuse
- **Evaluation integration** that benchmarks the model before upload
- **Metadata preservation** including environment, algorithm, and training configuration
- **Download caching** for efficient model loading
- **Authentication handling** for private repositories and uploads

This integration makes RL Zoo3 models part of the broader ML community ecosystem and supports reproducible research and model sharing.
