Tessl Tile for pypi/yellowbrick@1.5.0

tessl/pypi-yellowbrick

A suite of visual analysis and diagnostic tools for machine learning.

Workspace: tessl
Visibility: Public
Created: 2 months ago
Last updated: 2 months ago
Describes: pkg:pypi/yellowbrick@1.5.x

To install, run

npx @tessl/cli install tessl/pypi-yellowbrick@1.5.0

# Yellowbrick

Yellowbrick is a machine learning visualization library that extends scikit-learn with publication-quality visualizations for model evaluation, selection, and interpretation. Its visual diagnostic tools, called "Visualizers", combine scikit-learn with matplotlib to support the machine learning workflow from data exploration through model interpretation.

## Package Information

- **Package Name**: yellowbrick
- **Language**: Python
- **Installation**: `pip install yellowbrick`
- **Scikit-learn Integration**: Compatible with scikit-learn 0.20+
- **Dependencies**: matplotlib, scipy, scikit-learn, numpy

## Core Imports

```python
import yellowbrick
```

Direct imports from yellowbrick:

```python
from yellowbrick import ROCAUC, ClassBalance, ClassificationScoreVisualizer
from yellowbrick import anscombe, datasaurus
from yellowbrick import set_aesthetic, set_style, set_palette, color_palette
```

Common pattern for visualizers:

```python
from yellowbrick.classifier import ROCAUC, ConfusionMatrix
from yellowbrick.regressor import ResidualsPlot
from yellowbrick.cluster import KElbow
```

Functional API imports:

```python
from yellowbrick.classifier import roc_auc, confusion_matrix
from yellowbrick.regressor import residuals_plot
```

## Basic Usage

```python
from yellowbrick.classifier import ROCAUC
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

# Generate sample data
X, y = make_classification(n_samples=1000, n_features=20, n_classes=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Create the model; the visualizer fits it in fit() below
model = LogisticRegression()

# Visualize ROC/AUC curves
visualizer = ROCAUC(model, classes=['Class 0', 'Class 1'])
visualizer.fit(X_train, y_train)        # Fit the wrapped model on the training data
visualizer.score(X_test, y_test)        # Evaluate on the test data and draw the curves
visualizer.show()                       # Finalize and render the figure

# Using functional API: fit, score, and show in a single call
from yellowbrick.classifier import roc_auc
roc_auc(model, X_train, y_train, X_test, y_test, classes=['Class 0', 'Class 1'])
```

## Architecture

Yellowbrick follows the scikit-learn API design. Visualizers inherit from `sklearn.base.BaseEstimator`:

- **Base Classes**: `Visualizer`, `ModelVisualizer`, and `ScoreVisualizer` provide the foundation
- **Visualizer Pattern**: Visualizers implement `fit()` and `show()`; score visualizers also implement `score()`
- **Pipeline Integration**: Visualizers can be used in scikit-learn pipelines
- **Dual API**: Both class-based and functional APIs for flexibility
- **Matplotlib Integration**: Built on matplotlib with consistent styling and themes

## Capabilities

### Classification Analysis

Comprehensive visualizers for evaluating classification models including ROC curves, confusion matrices, classification reports, class prediction errors, precision-recall curves, and discrimination thresholds.

```python { .api }
class ROCAUC(ClassificationScoreVisualizer):
    def __init__(self, estimator, ax=None, micro=True, macro=True, per_class=True, binary=False, classes=None, encoder=None, is_fitted="auto", force_model=False, **kwargs): ...
    def fit(self, X, y, **kwargs): ...
    def score(self, X, y, **kwargs): ...

class ConfusionMatrix(ClassificationScoreVisualizer):
    def __init__(self, estimator, ax=None, sample_weight=None, percent=False, classes=None, encoder=None, cmap="YlOrRd", fontsize=None, is_fitted="auto", force_model=False, **kwargs): ...
    def fit(self, X, y, **kwargs): ...
    def score(self, X, y, **kwargs): ...

class ClassificationReport(ClassificationScoreVisualizer):
    def __init__(self, estimator, classes=None, **kwargs): ...
    def fit(self, X, y, **kwargs): ...
    def score(self, X, y, **kwargs): ...

# Functional APIs
def roc_auc(estimator, X_train, y_train, X_test=None, y_test=None, **kwargs): ...
def confusion_matrix(estimator, X_train, y_train, X_test=None, y_test=None, **kwargs): ...
def classification_report(estimator, X_train, y_train, X_test=None, y_test=None, **kwargs): ...
```

[Classification Analysis](./classification.md)

### Regression Analysis

Diagnostic visualizers for regression models including residuals plots, prediction error plots, alpha selection for regularized models, and Cook's distance for influence analysis.

```python { .api }
class ResidualsPlot(RegressionScoreVisualizer):
    def __init__(self, estimator, **kwargs): ...
    def fit(self, X, y, **kwargs): ...
    def score(self, X, y, **kwargs): ...

class PredictionError(RegressionScoreVisualizer):
    def __init__(self, estimator, **kwargs): ...
    def fit(self, X, y, **kwargs): ...
    def score(self, X, y, **kwargs): ...

class AlphaSelection(RegressionScoreVisualizer):
    def __init__(self, estimator, **kwargs): ...
    def fit(self, X, y, **kwargs): ...
    def score(self, X, y, **kwargs): ...

# Functional APIs
def residuals_plot(estimator, X_train, y_train, X_test=None, y_test=None, **kwargs): ...
def prediction_error(estimator, X_train, y_train, X_test=None, y_test=None, **kwargs): ...
```

[Regression Analysis](./regression.md)

### Clustering Analysis

Visualizers for clustering evaluation including elbow method for optimal K selection, silhouette analysis, and intercluster distance mapping.

```python { .api }
class KElbow(ClusteringScoreVisualizer):
    def __init__(self, estimator, k=10, metric='distortion', **kwargs): ...
    def fit(self, X, y=None, **kwargs): ...

class SilhouetteVisualizer(ClusteringScoreVisualizer):
    def __init__(self, estimator, **kwargs): ...
    def fit(self, X, y=None, **kwargs): ...

class InterclusterDistance(ClusteringScoreVisualizer):
    def __init__(self, estimator, **kwargs): ...
    def fit(self, X, y=None, **kwargs): ...

# Functional APIs
def kelbow_visualizer(estimator, X, k=10, **kwargs): ...
def silhouette_visualizer(estimator, X, **kwargs): ...
```

[Clustering Analysis](./clustering.md)

### Feature Analysis

Tools for feature selection, analysis, and visualization including feature ranking, correlation analysis, PCA decomposition, manifold learning, and parallel coordinates.

```python { .api }
class Rank1D(Visualizer):
    def __init__(self, algorithm='shapiro', **kwargs): ...
    def fit(self, X, y=None, **kwargs): ...

class Rank2D(Visualizer):
    def __init__(self, algorithm='pearson', **kwargs): ...
    def fit(self, X, y=None, **kwargs): ...

class PCA(Visualizer):
    def __init__(self, scale=True, proj_features=False, **kwargs): ...
    def fit(self, X, y=None, **kwargs): ...

class ParallelCoordinates(Visualizer):
    def __init__(self, classes=None, **kwargs): ...
    def fit(self, X, y=None, **kwargs): ...

# Functional APIs
def rank1d(X, y=None, algorithm='shapiro', **kwargs): ...
def rank2d(X, y=None, algorithm='pearson', **kwargs): ...
def pca_decomposition(X, y=None, **kwargs): ...
```

[Feature Analysis](./features.md)

### Model Selection

Visualizers for model selection and hyperparameter tuning including learning curves, validation curves, cross-validation scores, and feature importance analysis.

```python { .api }
class LearningCurve(ModelVisualizer):
    def __init__(self, estimator, **kwargs): ...
    def fit(self, X, y, **kwargs): ...

class ValidationCurve(ModelVisualizer):
    def __init__(self, estimator, param_name, param_range, **kwargs): ...
    def fit(self, X, y, **kwargs): ...

class FeatureImportances(ModelVisualizer):
    def __init__(self, estimator, **kwargs): ...
    def fit(self, X, y, **kwargs): ...

class CVScores(ModelVisualizer):
    def __init__(self, estimator, **kwargs): ...
    def fit(self, X, y, **kwargs): ...

# Functional APIs
def learning_curve(estimator, X, y, **kwargs): ...
def validation_curve(estimator, X, y, param_name, param_range, **kwargs): ...
def feature_importances(estimator, X, y, **kwargs): ...
```

[Model Selection](./model-selection.md)

### Text Analysis

Specialized visualizers for text analysis and natural language processing including t-SNE/UMAP embeddings, frequency distributions, part-of-speech analysis, and word correlation plots.

```python { .api }
class TSNEVisualizer(Visualizer):
    def __init__(self, **kwargs): ...
    def fit(self, X, y=None, **kwargs): ...

class FreqDistVisualizer(Visualizer):
    def __init__(self, features, **kwargs): ...
    def fit(self, X, y=None, **kwargs): ...

class DispersionPlot(Visualizer):
    def __init__(self, search_terms, **kwargs): ...
    def fit(self, X, y=None, **kwargs): ...

# Functional APIs
def tsne(X, y=None, **kwargs): ...
def freqdist(features, X, y=None, **kwargs): ...
def dispersion(search_terms, corpus, y=None, **kwargs): ...
```

[Text Analysis](./text.md)

### Data Loading and Utilities

Built-in datasets for learning and testing, plus utility functions for data management and visualization styling.

```python { .api }
# Dataset loaders
def load_concrete(): ...
def load_energy(): ...
def load_credit(): ...
def load_occupancy(): ...
def load_mushroom(): ...
def load_hobbies(): ...
def load_bikeshare(): ...

# Style management
def set_aesthetic(palette="yellowbrick", font="sans-serif", font_scale=1, color_codes=True, rc=None): ...
def set_palette(palette, n_colors=None, color_codes=False): ...
def color_palette(palette=None, n_colors=None): ...

# Demo functions
def anscombe(): ...
def datasaurus(): ...
```

[Data Loading and Utilities](./data-utilities.md)

## Types

```python { .api }
from enum import Enum

class TargetType(Enum):
    AUTO = "auto"
    SINGLE = "single"
    DISCRETE = "discrete"
    CONTINUOUS = "continuous"
    UNKNOWN = "unknown"

# Base visualizer classes
class Visualizer:
    def __init__(self, ax=None, fig=None, size=None, color=None, title=None, **kwargs): ...
    def fit(self, X, y=None, **kwargs): ...
    def transform(self, X): ...
    def show(self, outpath=None, **kwargs): ...
    def finalize(self, **kwargs): ...

class ModelVisualizer(Visualizer):
    def __init__(self, estimator, ax=None, fig=None, is_fitted="auto", **kwargs): ...

class ScoreVisualizer(ModelVisualizer):
    def score(self, X, y, **kwargs): ...
```