Tessl Tile for pypi/rl-zoo3@2.7.0

# Plotting and Visualization

Plotting tools for training curves, evaluation results, and performance analysis. The module provides functions for producing publication-quality plots from training logs, comparing algorithms, and visualizing learning progress.

## Core Imports

```python
from rl_zoo3.plots.plot_train import plot_train
from rl_zoo3.plots.plot_from_file import plot_from_file, restyle_boxplot
from rl_zoo3.plots.all_plots import all_plots
from rl_zoo3.plots.score_normalization import normalize_score
import numpy as np
```

## Capabilities

### Training Curve Plotting

Plot training progress curves from the monitor logs written during training.

```python { .api }
def plot_train() -> None:
    """
    Plot training curves from monitor logs.

    Command-line interface for plotting training progress, including:
    - Episode rewards over time
    - Episode lengths
    - Success rates (if applicable)
    - Learning curves with rolling window smoothing

    Reads from monitor log files and generates matplotlib plots.
    Supports customization of x-axis (steps/episodes/time), y-axis metrics,
    figure size, fonts, and rolling window size.
    """
```

Usage example:
```bash
# Command line usage
rl_zoo3 plot_train -a ppo -e CartPole-v1 -f ./logs
```

```python
# Or programmatically: plot_train() parses its options from sys.argv
import sys

from rl_zoo3.plots.plot_train import plot_train

# Set command line arguments
sys.argv = [
    'plot_train',
    '-a', 'ppo',
    '-e', 'CartPole-v1',
    '-f', './logs',
    '-w', '10',  # rolling window size, in episodes
]

plot_train()
```
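
Because the plotting entry points read their options from `sys.argv`, assigning to it directly leaves the modified value in place for the rest of the process. One way to contain that, sketched here with a hypothetical helper named `temporary_argv` (not part of RL Zoo3), is a context manager that restores the original arguments afterwards. The `show_args` function is only a stand-in for an argparse-based entry point:

```python
import sys
from contextlib import contextmanager
from typing import Iterator, List, Sequence


@contextmanager
def temporary_argv(argv: Sequence[str]) -> Iterator[None]:
    """Temporarily replace sys.argv, restoring the original on exit.

    Hypothetical helper for illustration; it is not part of RL Zoo3.
    """
    original = sys.argv
    sys.argv = list(argv)
    try:
        yield
    finally:
        # Restore even if the wrapped call raises or calls sys.exit()
        sys.argv = original


def show_args() -> List[str]:
    """Stand-in for an entry point that parses sys.argv, such as plot_train()."""
    return sys.argv[1:]


with temporary_argv(["plot_train", "--algo", "ppo", "--env", "CartPole-v1"]):
    seen = show_args()

print(seen)  # ['--algo', 'ppo', '--env', 'CartPole-v1']
```

In real use, the call inside the `with` block would be `plot_train()` rather than the stand-in.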

### File-Based Plotting

Create plots from saved log files and evaluation results.

```python { .api }
def plot_from_file() -> None:
    """
    Plot results from saved evaluation files.

    Command-line interface for creating plots from evaluation results stored in:
    - Numpy archive files (.npz)
    - Pickle files with evaluation data
    - Post-processed experimental results

    Supports advanced statistical visualization including:
    - Box plots for performance distributions
    - Learning curves with confidence intervals
    - Algorithm comparison plots
    - Publication-quality figures with customizable styling
    """
```

Usage example:
```bash
# Command line usage (results.pkl is a placeholder for a results file saved by all_plots)
rl_zoo3 plot_from_file -i ./eval_logs/results.pkl -o ./plots/results
```

```python
# Programmatic usage: plot_from_file() parses its options from sys.argv
import sys

from rl_zoo3.plots.plot_from_file import plot_from_file

sys.argv = [
    'plot_from_file',
    '-i', './eval_logs/results.pkl',  # placeholder input file
    '-o', './plots/results',          # output filename
    '--format', 'png'
]

plot_from_file()
```
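
The "learning curves with confidence intervals" mentioned above can be illustrated independently of RL Zoo3. The following is a minimal sketch, assuming evaluation returns are stored as an array of shape `(n_seeds, n_evaluations)`. The function name `mean_and_ci` and the normal-approximation interval are assumptions made for illustration, not necessarily the method `plot_from_file` uses:

```python
from typing import Tuple

import numpy as np


def mean_and_ci(
    returns: np.ndarray, z: float = 1.96
) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
    """Return (mean, lower, upper) per evaluation point across seeds.

    Uses a normal-approximation interval: mean +/- z * standard error.
    """
    n_seeds = returns.shape[0]
    mean = returns.mean(axis=0)
    # ddof=1 gives the sample standard deviation across seeds
    std_error = returns.std(axis=0, ddof=1) / np.sqrt(n_seeds)
    return mean, mean - z * std_error, mean + z * std_error


# Synthetic data: 5 seeds, 4 evaluation points (illustrative values only)
rng = np.random.default_rng(0)
returns = 100.0 + 10.0 * rng.standard_normal((5, 4))

mean, lower, upper = mean_and_ci(returns)
print(mean.shape, lower.shape, upper.shape)  # (4,) (4,) (4,)
```

The three arrays can be passed to matplotlib's `ax.plot` and `ax.fill_between` to draw the curve and its shaded band.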

### Comprehensive Plotting

Generate all available plots for a complete analysis.

```python { .api }
def all_plots() -> None:
    """
    Generate comprehensive analysis plots from experimental results.

    Command-line interface that creates:
    - Algorithm comparison plots across environments
    - Statistical performance summaries
    - Learning curves with confidence intervals
    - Experiment matrices and correlation analysis
    - Publication-ready figures and tables

    Processes experimental results from multiple algorithms and environments
    to create a complete analysis suite for research papers and reports.
    """
```

Usage example:
```bash
# Generate all plots and save the post-processed results
rl_zoo3 all_plots -a ppo -e CartPole-v1 -f ./logs -o ./plots/ppo_results
```

```python
# Programmatic usage: all_plots() parses its options from sys.argv
import sys

from rl_zoo3.plots.all_plots import all_plots

sys.argv = [
    'all_plots',
    '-a', 'ppo',
    '-e', 'CartPole-v1',
    '-f', './logs',
    '-o', './analysis_plots/ppo_results',  # output filename for the results
]

all_plots()
```
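
To make the idea of a "statistical performance summary" concrete, here is a small sketch that reduces per-run final returns to a mean and sample standard deviation for each algorithm and environment pair. The data, the `summarize` helper, and the table layout are all made up for illustration and do not reflect the output format of `all_plots`:

```python
import statistics
from typing import Dict, List, Tuple

# Synthetic final returns per (algorithm, environment); illustrative values only
results: Dict[Tuple[str, str], List[float]] = {
    ("ppo", "CartPole-v1"): [480.0, 500.0, 490.0],
    ("a2c", "CartPole-v1"): [400.0, 450.0, 350.0],
}


def summarize(
    runs: Dict[Tuple[str, str], List[float]]
) -> Dict[Tuple[str, str], Tuple[float, float]]:
    """Return {(algo, env): (mean, sample standard deviation)}."""
    return {
        key: (statistics.mean(values), statistics.stdev(values))
        for key, values in runs.items()
    }


summary = summarize(results)
for (algo, env), (mean, std) in sorted(summary.items()):
    print(f"{algo:>4} | {env} | {mean:.1f} +/- {std:.1f}")
```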

### Score Normalization

Normalize performance scores across different environments for fair comparison.

```python { .api }
def normalize_score(score: np.ndarray, env_id: str) -> np.ndarray:
    """
    Normalize scores for cross-environment comparison.

    Parameters:
    - score: Array of raw scores/rewards
    - env_id: Environment identifier for normalization reference

    Returns:
    np.ndarray: Normalized scores (typically 0-100 scale)

    Uses environment-specific reference scores to normalize performance,
    enabling fair comparison across different environments with varying
    reward scales and difficulty levels.
    """
```

```python { .api }
class ReferenceScore(NamedTuple):
    """
    Reference score data structure for normalization.

    Attributes:
    - env_id: Environment identifier
    - min_score: Minimum reference score (random policy)
    - max_score: Maximum reference score (expert/optimal policy)
    """
    env_id: str
    min_score: float
    max_score: float
```
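
Min-max normalization against a reference range is a common way to build such a score. The sketch below assumes that form and reuses the field names listed above. The reference values, the helper name `min_max_normalize`, and the scaling to 0-100 are illustrative assumptions, not values or code taken from RL Zoo3:

```python
from typing import NamedTuple

import numpy as np


class ReferenceScore(NamedTuple):
    env_id: str
    min_score: float  # e.g. return of a random policy
    max_score: float  # e.g. return of an expert policy


# Hypothetical reference values, for illustration only
REFERENCES = {
    ref.env_id: ref
    for ref in [
        ReferenceScore("CartPole-v1", 20.0, 500.0),
        ReferenceScore("Pendulum-v1", -1500.0, -150.0),
    ]
}


def min_max_normalize(score: np.ndarray, env_id: str) -> np.ndarray:
    """Map raw returns to a 0-100 scale using the environment's reference range."""
    ref = REFERENCES[env_id]  # raises KeyError for unknown environments
    return 100.0 * (score - ref.min_score) / (ref.max_score - ref.min_score)


normalized = min_max_normalize(np.array([20.0, 260.0, 500.0]), "CartPole-v1")
print(normalized.tolist())  # [0.0, 50.0, 100.0]
```

A score equal to `min_score` maps to 0 and one equal to `max_score` maps to 100; returns outside the reference range fall outside that interval rather than being clipped.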

Usage example:
```python
import numpy as np
from rl_zoo3.plots.score_normalization import normalize_score

# Raw scores from different environments
cartpole_scores = np.array([180, 200, 195, 210, 175])
pendulum_scores = np.array([-150, -120, -130, -110, -140])

# Normalize for comparison
cartpole_normalized = normalize_score(cartpole_scores, "CartPole-v1")
pendulum_normalized = normalize_score(pendulum_scores, "Pendulum-v1")

print("CartPole normalized:", cartpole_normalized)
print("Pendulum normalized:", pendulum_normalized)

# Now scores are comparable across environments
average_performance = (cartpole_normalized.mean() + pendulum_normalized.mean()) / 2
print(f"Average normalized performance: {average_performance:.2f}")
```

### Utility Functions

Helper functions for plot styling and data processing.

```python { .api }
def restyle_boxplot(
    artist_dict: dict,
    color: str,
    gray: str = "#222222",
    linewidth: int = 1,
    fliersize: int = 5
) -> None:
    """
    Restyle boxplot appearance for publication quality.

    Parameters:
    - artist_dict: Dictionary of boxplot artists from matplotlib
    - color: Primary color for the boxplot
    - gray: Color for secondary elements (lines, whiskers, etc.)
    - linewidth: Width of plot lines
    - fliersize: Size of outlier markers

    Modifies boxplot styling in-place for consistent, professional appearance
    across all plots generated by RL Zoo3.
    """
```

## Advanced Plotting Examples

### Multi-Algorithm Comparison

```python
import matplotlib.pyplot as plt
import numpy as np
from rl_zoo3.plots.score_normalization import normalize_score

# Load results from multiple algorithms
# (continuous-control algorithms only: DQN requires a discrete action space)
algorithms = ['ppo', 'sac', 'td3', 'a2c']
env_id = "HalfCheetah-v4"

# Simulate loading results (replace with actual data loading)
results = {
    'ppo': np.random.normal(3000, 500, 10),
    'sac': np.random.normal(3500, 400, 10),
    'td3': np.random.normal(3200, 600, 10),
    'a2c': np.random.normal(2800, 700, 10)
}

# Normalize scores for fair comparison
normalized_results = {}
for algo, scores in results.items():
    normalized_results[algo] = normalize_score(scores, env_id)

# Create comparison plot
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))

# Raw scores
ax1.boxplot([results[algo] for algo in algorithms], labels=algorithms)
ax1.set_title("Raw Scores")
ax1.set_ylabel("Episode Return")

# Normalized scores
ax2.boxplot([normalized_results[algo] for algo in algorithms], labels=algorithms)
ax2.set_title("Normalized Scores")
ax2.set_ylabel("Normalized Performance (0-100)")

plt.tight_layout()
plt.savefig("algorithm_comparison.png", dpi=300, bbox_inches='tight')
plt.show()
```

### Training Progress Analysis

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pathlib import Path

def analyze_training_progress(log_dir: str, env_id: str, algo: str):
    """
    Analyze and plot training progress from log files.
    """
    # Where the logs would be read from; unused below because the data is simulated
    log_path = Path(log_dir) / algo / env_id

    # Load training data (example structure)
    # In practice, you'd load from actual log files
    timesteps = np.arange(0, 100000, 1000)
    # Simulated learning: noisy baseline plus logarithmic improvement
    episode_rewards = np.random.normal(150, 30, len(timesteps)) + \
                     50 * np.log10(timesteps + 1)

    # Add extra noise so the curve looks more like a real training run
    episode_rewards += np.random.normal(0, 10, len(timesteps))

    # Create comprehensive training plot
    fig, axes = plt.subplots(2, 2, figsize=(15, 10))

    # Learning curve
    axes[0, 0].plot(timesteps, episode_rewards, alpha=0.7, label='Episode Reward')

    # Add smoothed curve (the first window - 1 values are NaN and are not drawn)
    window = 10
    smoothed = pd.Series(episode_rewards).rolling(window).mean()
    axes[0, 0].plot(timesteps, smoothed, color='red', linewidth=2, label=f'Smoothed ({window})')

    axes[0, 0].set_xlabel('Timesteps')
    axes[0, 0].set_ylabel('Episode Reward')
    axes[0, 0].set_title(f'{algo.upper()} Learning Curve - {env_id}')
    axes[0, 0].legend()
    axes[0, 0].grid(True, alpha=0.3)

    # Reward distribution over time
    # Split into early, middle, late training
    early = episode_rewards[:len(episode_rewards)//3]
    middle = episode_rewards[len(episode_rewards)//3:2*len(episode_rewards)//3]
    late = episode_rewards[2*len(episode_rewards)//3:]

    axes[0, 1].boxplot([early, middle, late], labels=['Early', 'Middle', 'Late'])
    axes[0, 1].set_title('Reward Distribution by Training Phase')
    axes[0, 1].set_ylabel('Episode Reward')

    # Improvement rate: gradient of the smoothed curve (NaN warm-up values dropped)
    improvement = np.gradient(smoothed.dropna())
    axes[1, 0].plot(timesteps[window-1:], improvement, alpha=0.7)
    axes[1, 0].axhline(y=0, color='r', linestyle='--', alpha=0.5)
    axes[1, 0].set_xlabel('Timesteps')
    axes[1, 0].set_ylabel('Improvement Rate')
    axes[1, 0].set_title('Improvement Rate Over Time')
    axes[1, 0].grid(True, alpha=0.3)

    # Final performance histogram
    final_episodes = episode_rewards[-20:]  # Last 20 logged points
    axes[1, 1].hist(final_episodes, bins=10, alpha=0.7, edgecolor='black')
    axes[1, 1].axvline(final_episodes.mean(), color='red', linestyle='--',
                      label=f'Mean: {final_episodes.mean():.1f}')
    axes[1, 1].set_xlabel('Episode Reward')
    axes[1, 1].set_ylabel('Frequency')
    axes[1, 1].set_title('Final Performance Distribution')
    axes[1, 1].legend()

    plt.tight_layout()
    plt.savefig(f"{algo}_{env_id}_analysis.png", dpi=300, bbox_inches='tight')
    plt.show()

# Use the analysis function
analyze_training_progress("./logs", "CartPole-v1", "ppo")
```

### Hyperparameter Sensitivity Analysis

```python
import matplotlib.pyplot as plt
import numpy as np

def plot_hyperparameter_sensitivity():
    """
    Plot how performance varies with different hyperparameters.
    """
    # Example: PPO learning rate vs clip range sensitivity
    learning_rates = [1e-4, 3e-4, 1e-3, 3e-3]
    clip_ranges = [0.1, 0.2, 0.3, 0.4]

    # Simulate performance data (replace with actual results)
    performance_matrix = np.random.normal(180, 20, (len(learning_rates), len(clip_ranges)))

    # Add a synthetic pattern: moderate values score best
    for i, lr in enumerate(learning_rates):
        for j, clip in enumerate(clip_ranges):
            lr_penalty = abs(np.log10(lr) + 3.5) * 10  # Penalty for extreme LR
            clip_penalty = abs(clip - 0.2) * 50  # Penalty for extreme clip range
            performance_matrix[i, j] -= (lr_penalty + clip_penalty)

    # Create heatmap (rows: learning rates, columns: clip ranges)
    fig, ax = plt.subplots(figsize=(10, 8))

    im = ax.imshow(performance_matrix, cmap='viridis', aspect='auto')

    # Set ticks and labels
    ax.set_xticks(range(len(clip_ranges)))
    ax.set_yticks(range(len(learning_rates)))
    ax.set_xticklabels([f"{cr:.1f}" for cr in clip_ranges])
    ax.set_yticklabels([f"{lr:.0e}" for lr in learning_rates])

    ax.set_xlabel('Clip Range')
    ax.set_ylabel('Learning Rate')
    ax.set_title('PPO Hyperparameter Sensitivity\n(CartPole-v1 Performance)')

    # Add colorbar
    cbar = plt.colorbar(im, ax=ax)
    cbar.set_label('Average Episode Reward')

    # Annotate each cell with its value
    for i in range(len(learning_rates)):
        for j in range(len(clip_ranges)):
            ax.text(j, i, f'{performance_matrix[i, j]:.0f}',
                    ha="center", va="center", color="white", fontweight='bold')

    plt.tight_layout()
    plt.savefig("hyperparameter_sensitivity.png", dpi=300, bbox_inches='tight')
    plt.show()

plot_hyperparameter_sensitivity()
```

## Integration with Command Line Tools

All plotting functions are available through the RL Zoo3 command line interface:

```bash
# Plot training curves (-w sets the rolling window size, in episodes)
rl_zoo3 plot_train -a ppo -e CartPole-v1 -f ./logs -w 10

# Plot from a results file saved by all_plots (placeholder filename)
rl_zoo3 plot_from_file -i ./eval_results/results.pkl -o ./plots/results

# Generate all plots and save the post-processed results
rl_zoo3 all_plots -a ppo -e CartPole-v1 -f ./logs -o ./analysis/ppo_results

# With additional options
rl_zoo3 plot_train \
    -a ppo \
    -e CartPole-v1 \
    -f ./logs \
    -x steps \
    -y reward \
    -w 50
```

The plotting tools read the standard log formats produced by the RL Zoo3 training workflow and cover training curves, evaluation results, and cross-algorithm comparison.
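
As an illustration of working with those logs directly, the sketch below parses a Stable-Baselines3-style monitor file and applies rolling-window smoothing. It assumes the usual `monitor.csv` layout: a `#`-prefixed JSON header line followed by `r`, `l`, `t` columns for episode reward, length, and elapsed time. The sample data and the helper names `load_monitor` and `rolling_mean` are made up for the example and are not RL Zoo3 functions:

```python
import csv
import io
from typing import Tuple

import numpy as np

# Made-up monitor log contents, for illustration only
SAMPLE_MONITOR = """#{"t_start": 0.0, "env_id": "CartPole-v1"}
r,l,t
10.0,10,0.1
20.0,20,0.3
30.0,30,0.6
40.0,40,1.0
50.0,50,1.5
"""


def load_monitor(text: str) -> Tuple[np.ndarray, np.ndarray]:
    """Return (cumulative timesteps, episode rewards) from monitor CSV text."""
    lines = text.splitlines()
    # Skip the '#'-prefixed JSON header; the next line holds the column names
    reader = csv.DictReader(io.StringIO("\n".join(lines[1:])))
    rows = list(reader)
    rewards = np.array([float(row["r"]) for row in rows])
    lengths = np.array([int(row["l"]) for row in rows])
    return np.cumsum(lengths), rewards


def rolling_mean(values: np.ndarray, window: int) -> np.ndarray:
    """Mean over each full window; the result has len(values) - window + 1 entries."""
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")


timesteps, rewards = load_monitor(SAMPLE_MONITOR)
smoothed = rolling_mean(rewards, window=3)

print(timesteps.tolist())  # [10, 30, 60, 100, 150]
print(smoothed.tolist())   # approximately [20.0, 30.0, 40.0]
```

Plotting `timesteps[window - 1:]` against `smoothed` aligns each average with the last episode in its window.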
