Tessl Tile for pypi/scikit-learn@1.7.0

# Utilities and Core Functions

This document covers scikit-learn's core utilities: estimator cloning, global configuration, pipelines and composition tools, model inspection, isotonic-regression and neighbors helpers, exception and warning classes, and frozen estimators.

## Core Utilities

### Base Functions

#### clone { .api }
```python
from sklearn.base import clone

clone(
    estimator: BaseEstimator,
    safe: bool = True
) -> BaseEstimator
```
Construct a new unfitted estimator with the same parameters.
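
As a minimal sketch of what `clone` does (the `LogisticRegression` estimator and the toy data are illustrative assumptions, not part of the reference above), the copy keeps the constructor parameters but none of the fitted state:

```python
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression

# Toy data, chosen only for illustration.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]

fitted = LogisticRegression(C=0.5).fit(X, y)
fresh = clone(fitted)

# Constructor parameters are copied to the new object...
same_params = fresh.get_params() == fitted.get_params()
# ...but learned attributes such as coef_ are not.
has_coef = hasattr(fresh, "coef_")
print(same_params, has_coef)
```

This is why cross-validation helpers can refit the same estimator specification on each fold without state leaking between folds.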

### Configuration Functions

#### get_config { .api }
```python
from sklearn import get_config

get_config() -> dict
```
Retrieve the current scikit-learn configuration as a dictionary.

#### set_config { .api }
```python
from sklearn import set_config

set_config(
    assume_finite: bool | None = None,
    working_memory: int | None = None,
    print_changed_only: bool | None = None,
    display: str | None = None,
    pairwise_dist_chunk_size: int | None = None,
    enable_cython_pairwise_dist: bool | None = None,
    array_api_dispatch: bool | None = None,
    transform_output: str | None = None,
    enable_metadata_routing: bool | None = None,
    skip_parameter_validation: bool | None = None
) -> None
```
Set the global scikit-learn configuration. Arguments left as `None` keep their current value.

#### config_context { .api }
```python
from sklearn import config_context

config_context(**new_config) -> ContextManager
```
Context manager that temporarily overrides the global configuration; the previous settings are restored on exit.

### Version Information

#### show_versions { .api }
```python
from sklearn import show_versions

show_versions() -> None
```
Print system and dependency version information.

#### __version__ { .api }
```python
import sklearn
sklearn.__version__  # e.g. "1.7.0"
```
Version string of the installed scikit-learn package.

## Pipeline

### Pipeline Classes

#### Pipeline { .api }
```python
from sklearn.pipeline import Pipeline

Pipeline(
    steps: list[tuple[str, BaseEstimator]],
    memory: str | object | None = None,
    verbose: bool = False
)
```
Pipeline of transforms with a final estimator.

#### FeatureUnion { .api }
```python
from sklearn.pipeline import FeatureUnion

FeatureUnion(
    transformer_list: list[tuple[str, BaseTransformer]],
    n_jobs: int | None = None,
    transformer_weights: dict | None = None,
    verbose: bool = False,
    verbose_feature_names_out: bool = True
)
```
Concatenates results of multiple transformer objects.

### Pipeline Functions

#### make_pipeline { .api }
```python
from sklearn.pipeline import make_pipeline

make_pipeline(
    *steps: BaseEstimator,
    memory: str | object | None = None,
    verbose: bool = False
) -> Pipeline
```
Construct a Pipeline from the given estimators.

#### make_union { .api }
```python
from sklearn.pipeline import make_union

make_union(
    *transformers: BaseTransformer,
    n_jobs: int | None = None,
    verbose: bool = False
) -> FeatureUnion
```
Construct a FeatureUnion from the given transformers.

## Compose

### Column Transformer

#### ColumnTransformer { .api }
```python
from sklearn.compose import ColumnTransformer

ColumnTransformer(
    transformers: list[tuple[str, BaseTransformer, ArrayLike | str | Callable]],
    remainder: str | BaseTransformer = "drop",
    sparse_threshold: float = 0.3,
    n_jobs: int | None = None,
    transformer_weights: dict | None = None,
    verbose: bool = False,
    verbose_feature_names_out: bool = True,
    force_int_remainder_cols: bool = True
)
```
Applies transformers to columns of an array or pandas DataFrame.

#### TransformedTargetRegressor { .api }
```python
from sklearn.compose import TransformedTargetRegressor

TransformedTargetRegressor(
    regressor: BaseRegressor | None = None,
    transformer: BaseTransformer | None = None,
    func: Callable | None = None,
    inverse_func: Callable | None = None,
    check_inverse: bool = True
)
```
Meta-estimator to regress on a transformed target.
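
A short sketch of how `TransformedTargetRegressor` can be used, assuming a synthetic target that is exponential in `X` (the data and the choice of `LinearRegression` are illustrative assumptions). The regressor is fit on `log(y)`, and predictions are mapped back to the original scale with `exp`:

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression

# Synthetic target that is exponential in X, so log(y) is linear in X.
X = np.arange(1, 11, dtype=float).reshape(-1, 1)
y = np.exp(0.3 * X.ravel())

# The inner regressor sees log(y); predict() applies exp to its output.
model = TransformedTargetRegressor(
    regressor=LinearRegression(),
    func=np.log,
    inverse_func=np.exp,
)
model.fit(X, y)
y_pred = model.predict(X)

# The fitted inner regressor is exposed as regressor_.
slope = float(np.ravel(model.regressor_.coef_)[0])
```

Passing `func` and `inverse_func` is an alternative to passing a `transformer` object; the two options are not meant to be combined.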

### Compose Functions

#### make_column_transformer { .api }
```python
from sklearn.compose import make_column_transformer

make_column_transformer(
    *transformers: tuple[BaseTransformer, ArrayLike | str | Callable],
    remainder: str | BaseTransformer = "drop",
    sparse_threshold: float = 0.3,
    n_jobs: int | None = None,
    verbose: bool = False,
    verbose_feature_names_out: bool = True,
    force_int_remainder_cols: bool = True
) -> ColumnTransformer
```
Construct a ColumnTransformer from the given transformers.

#### make_column_selector { .api }
```python
from sklearn.compose import make_column_selector

make_column_selector(
    pattern: str | None = None,
    dtype_include: type | str | list | None = None,
    dtype_exclude: type | str | list | None = None
) -> Callable
```
Create a callable to select columns to be used with ColumnTransformer.

## Inspection

### Partial Dependence

#### partial_dependence { .api }
```python
from sklearn.inspection import partial_dependence

partial_dependence(
    estimator: BaseEstimator,
    X: ArrayLike,
    features: int | str | ArrayLike | list,
    sample_weight: ArrayLike | None = None,
    categorical_features: ArrayLike | None = None,
    feature_names: ArrayLike | None = None,
    response_method: str = "auto",
    percentiles: tuple[float, float] = (0.05, 0.95),
    grid_resolution: int = 100,
    method: str = "auto",
    kind: str = "average"
) -> Bunch
```
Compute the partial dependence of `features`: the estimator's response for each grid value of those features, averaged over the remaining features. The result is a dictionary-like `Bunch`. Plotting options such as `subsample`, `n_jobs`, and `verbose` belong to `PartialDependenceDisplay.from_estimator`, not to this function.

#### permutation_importance { .api }
```python
from sklearn.inspection import permutation_importance

permutation_importance(
    estimator: BaseEstimator,
    X: ArrayLike,
    y: ArrayLike,
    scoring: str | Callable | list | tuple | dict | None = None,
    n_repeats: int = 5,
    n_jobs: int | None = None,
    random_state: int | RandomState | None = None,
    sample_weight: ArrayLike | None = None,
    max_samples: int | float = 1.0
) -> Bunch
```
Permutation importance for feature evaluation. The result is a dictionary-like `Bunch` with `importances_mean`, `importances_std`, and `importances`; when several scorers are passed, a dict of such objects keyed by scorer name is returned.

### Display Classes

#### PartialDependenceDisplay { .api }
```python
from sklearn.inspection import PartialDependenceDisplay

PartialDependenceDisplay(
    pd_results: list[dict],
    features: list,
    feature_names: ArrayLike | None = None,
    target_idx: int | None = None,
    deciles: dict | None = None
)
```
Partial Dependence Plot (PDP).

#### DecisionBoundaryDisplay { .api }
```python
from sklearn.inspection import DecisionBoundaryDisplay

DecisionBoundaryDisplay(
    xx0: ArrayLike,
    xx1: ArrayLike,
    response: ArrayLike
)
```
Visualization of decision boundaries of a classifier.

## Isotonic Regression Utilities

### Isotonic Functions

#### check_increasing { .api }
```python
from sklearn.isotonic import check_increasing

check_increasing(
    x: ArrayLike,
    y: ArrayLike
) -> bool
```
Determine whether y is monotonically correlated with x. Returns `True` when the trend is increasing and `False` when it is decreasing, based on a Spearman correlation test.

#### isotonic_regression { .api }
```python
from sklearn.isotonic import isotonic_regression

isotonic_regression(
    y: ArrayLike,
    sample_weight: ArrayLike | None = None,
    y_min: float | None = None,
    y_max: float | None = None,
    increasing: bool = True
) -> ArrayLike
```
Solve the isotonic regression model.
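
The two isotonic helpers can be combined as in the sketch below. The data is made up for illustration: ten increasing values with one adjacent pair swapped.

```python
import numpy as np
from sklearn.isotonic import check_increasing, isotonic_regression

x = np.arange(10, dtype=float)
# Increasing sequence with a single out-of-order pair (5, 4).
y = np.array([1, 2, 3, 5, 4, 6, 7, 8, 9, 10], dtype=float)

# The rank correlation between x and y is positive, so the trend is increasing.
increasing = bool(check_increasing(x, y))

# The monotone fit replaces the violating pair by its mean, 4.5.
y_fit = isotonic_regression(y, increasing=increasing)
print(increasing, y_fit)
```

Every other value already respects the ordering, so only the swapped pair changes in `y_fit`.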

## Neighbors Utilities

### Neighbor Functions

#### kneighbors_graph { .api }
```python
from sklearn.neighbors import kneighbors_graph

kneighbors_graph(
    X: ArrayLike,
    n_neighbors: int,
    mode: str = "connectivity",
    metric: str | Callable = "minkowski",
    p: int = 2,
    metric_params: dict | None = None,
    include_self: bool | str = False,
    n_jobs: int | None = None
) -> ArrayLike
```
Compute the (weighted) graph of k-Neighbors for points in X. The result is a sparse CSR matrix of shape `(n_samples, n_samples)`.
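
A small sketch of the graph that `kneighbors_graph` produces, using four hand-picked points on a line (the data is an illustrative assumption):

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

# Four points on a line.
X = np.array([[0.0], [1.0], [3.0], [7.0]])

# Entry (i, j) is 1 when j is one of the 2 nearest neighbors of i.
# include_self=False keeps each point out of its own neighbor list.
graph = kneighbors_graph(X, n_neighbors=2, mode="connectivity", include_self=False)
dense = graph.toarray()
print(dense)
```

With `mode="distance"` the stored entries are the distances to those neighbors instead of ones. The graph is generally not symmetric: the point at 7 lists 3 and 1 as neighbors, but neither lists 7.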

#### radius_neighbors_graph { .api }
```python
from sklearn.neighbors import radius_neighbors_graph

radius_neighbors_graph(
    X: ArrayLike,
    radius: float,
    mode: str = "connectivity",
    metric: str | Callable = "minkowski",
    p: int = 2,
    metric_params: dict | None = None,
    include_self: bool | str = False,
    n_jobs: int | None = None
) -> ArrayLike
```
Compute the (weighted) graph of neighbors within `radius` for points in X.

#### sort_graph_by_row_values { .api }
```python
from sklearn.neighbors import sort_graph_by_row_values

sort_graph_by_row_values(
    graph: ArrayLike,
    copy: bool = False,
    warn_when_not_sorted: bool = True
) -> ArrayLike
```
Sort a sparse graph such that each row has its data sorted by increasing value.

### Neighbor Data Structures

#### BallTree { .api }
```python
from sklearn.neighbors import BallTree

BallTree(
    X: ArrayLike,
    leaf_size: int = 40,
    metric: str | DistanceMetric = "minkowski",
    **kwargs
)
```
BallTree for fast generalized N-point problems.

#### KDTree { .api }
```python
from sklearn.neighbors import KDTree

KDTree(
    X: ArrayLike,
    leaf_size: int = 40,
    metric: str = "minkowski",
    **kwargs
)
```
KDTree for fast generalized N-point problems.
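
Both trees share the same query interface. The following sketch, with a handful of made-up 2-D points, builds each tree once and asks for the two nearest neighbors of a query point:

```python
import numpy as np
from sklearn.neighbors import BallTree, KDTree

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]])

kd = KDTree(X, leaf_size=2)
ball = BallTree(X, leaf_size=2, metric="euclidean")

# query returns (distances, indices), each of shape (n_queries, k),
# ordered from nearest to farthest.
kd_dist, kd_ind = kd.query([[0.1, 0.0]], k=2)
ball_dist, ball_ind = ball.query([[0.1, 0.0]], k=2)
print(kd_ind, kd_dist)
```

The trees are built at construction time and are not estimators with a `fit` method; `leaf_size` affects speed and memory, not the query results.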

#### KernelDensity { .api }
```python
from sklearn.neighbors import KernelDensity

KernelDensity(
    bandwidth: float | str = 1.0,
    algorithm: str = "auto",
    kernel: str = "gaussian",
    metric: str = "euclidean",
    atol: float = 0,
    rtol: float = 0,
    breadth_first: bool = True,
    leaf_size: int = 40,
    metric_params: dict | None = None
)
```
Kernel Density Estimation.
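
A minimal sketch of density estimation with `KernelDensity`, assuming samples drawn from a standard normal distribution (the sample size and bandwidth are arbitrary illustrative choices):

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.RandomState(0)
X = rng.normal(loc=0.0, scale=1.0, size=(200, 1))

kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(X)

# score_samples returns the log-density; exponentiate to get the density.
grid = np.array([[0.0], [4.0]])
density = np.exp(kde.score_samples(grid))
print(density)
```

The estimated density near the mode at 0 should be much larger than far out in the tail at 4.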

#### NearestNeighbors { .api }
```python
from sklearn.neighbors import NearestNeighbors

NearestNeighbors(
    n_neighbors: int = 5,
    radius: float = 1.0,
    algorithm: str = "auto",
    leaf_size: int = 30,
    metric: str | Callable = "minkowski",
    p: int = 2,
    metric_params: dict | None = None,
    n_jobs: int | None = None
)
```
Unsupervised learner for implementing neighbor searches.
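
The sketch below shows the two query styles of `NearestNeighbors`, a fixed number of neighbors and all neighbors within a radius, on a few made-up points:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]])
nn = NearestNeighbors(n_neighbors=2).fit(X)

# k nearest neighbors of a new query point, nearest first.
dist, ind = nn.kneighbors([[0.9, 0.1]])

# All neighbors within a radius. The result holds one array per query,
# because the number of neighbors can differ between queries.
r_dist, r_ind = nn.radius_neighbors([[0.9, 0.1]], radius=1.0)
within_radius = sorted(r_ind[0].tolist())
print(ind, within_radius)
```

By default `radius_neighbors` does not sort its results by distance, so the sketch sorts the indices before using them.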

#### KNeighborsTransformer { .api }
```python
from sklearn.neighbors import KNeighborsTransformer

KNeighborsTransformer(
    mode: str = "distance",
    n_neighbors: int = 5,
    algorithm: str = "auto",
    leaf_size: int = 30,
    metric: str | Callable = "minkowski",
    p: int = 2,
    metric_params: dict | None = None,
    n_jobs: int | None = None
)
```
Transform X into a (weighted) graph of k nearest neighbors.
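
One common use of `KNeighborsTransformer` is to precompute the neighbors graph once and feed it to an estimator that accepts `metric="precomputed"`. The sketch below assumes the iris dataset and a `KNeighborsClassifier` as the consumer; both are illustrative choices:

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier, KNeighborsTransformer
from sklearn.pipeline import make_pipeline

X, y = load_iris(return_X_y=True)

# The transformer emits a sparse distance graph; the classifier consumes it
# through metric="precomputed" instead of searching for neighbors itself.
model = make_pipeline(
    KNeighborsTransformer(n_neighbors=10, mode="distance"),
    KNeighborsClassifier(n_neighbors=5, metric="precomputed"),
)
model.fit(X, y)
accuracy = model.score(X, y)
print(round(accuracy, 3))
```

The transformer's `n_neighbors` should be at least as large as the classifier's, so the graph contains every neighbor the classifier will ask for.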

#### RadiusNeighborsTransformer { .api }
```python
from sklearn.neighbors import RadiusNeighborsTransformer

RadiusNeighborsTransformer(
    mode: str = "distance",
    radius: float = 1.0,
    algorithm: str = "auto",
    leaf_size: int = 30,
    metric: str | Callable = "minkowski",
    p: int = 2,
    metric_params: dict | None = None,
    n_jobs: int | None = None
)
```
Transform X into a (weighted) graph of neighbors nearer than a radius.

#### NeighborhoodComponentsAnalysis { .api }
```python
from sklearn.neighbors import NeighborhoodComponentsAnalysis

NeighborhoodComponentsAnalysis(
    n_components: int | None = None,
    init: str | ArrayLike = "auto",
    warm_start: bool = False,
    max_iter: int = 50,
    tol: float = 1e-05,
    callback: Callable | None = None,
    verbose: int = 0,
    random_state: int | RandomState | None = None
)
```
Neighborhood Components Analysis.

### Neighbor Constants

#### VALID_METRICS { .api }
```python
from sklearn.neighbors import VALID_METRICS

# Dictionary mapping algorithm names to valid metrics
VALID_METRICS: dict[str, list[str]]
```
Valid metrics for neighbor algorithms.

#### VALID_METRICS_SPARSE { .api }
```python
from sklearn.neighbors import VALID_METRICS_SPARSE

# Dictionary mapping algorithm names to valid metrics for sparse matrices
VALID_METRICS_SPARSE: dict[str, list[str]]
```
Valid metrics for neighbor algorithms with sparse matrices.
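
These dictionaries can be used to check whether a metric is supported before constructing an estimator. A small sketch, using `"euclidean"` and `"cosine"` as example metric names:

```python
from sklearn.neighbors import VALID_METRICS, VALID_METRICS_SPARSE

# Keys are algorithm names; values list the metric names each one accepts.
algorithms = sorted(VALID_METRICS)

euclidean_kd = "euclidean" in VALID_METRICS["kd_tree"]
# Cosine distance is listed for brute-force search but not for the KD-tree.
cosine_brute = "cosine" in VALID_METRICS["brute"]
cosine_kd = "cosine" in VALID_METRICS["kd_tree"]

sparse_algorithms = sorted(VALID_METRICS_SPARSE)
print(algorithms, euclidean_kd, cosine_brute, cosine_kd)
```

A check like this can turn a late failure inside `fit` into an early, explicit validation step.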

## Exception Classes

#### NotFittedError { .api }
```python
from sklearn.exceptions import NotFittedError

class NotFittedError(ValueError, AttributeError):
    """Exception class to raise if estimator is used before fitting."""
    pass
```
Exception class to raise if estimator is used before fitting.
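
A brief sketch of how `NotFittedError` surfaces in practice, assuming an unfitted `LinearRegression` (an illustrative choice of estimator):

```python
from sklearn.exceptions import NotFittedError
from sklearn.linear_model import LinearRegression

model = LinearRegression()

# Calling predict before fit raises NotFittedError.
try:
    model.predict([[1.0]])
    raised = False
except NotFittedError:
    raised = True

# The class inherits from both ValueError and AttributeError, so code that
# catches either of those also catches NotFittedError.
is_value_error = issubclass(NotFittedError, ValueError)
is_attribute_error = issubclass(NotFittedError, AttributeError)
print(raised, is_value_error, is_attribute_error)
```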

#### ConvergenceWarning { .api }
```python
from sklearn.exceptions import ConvergenceWarning

class ConvergenceWarning(UserWarning):
    """Custom warning to capture convergence problems."""
    pass
```
Custom warning to capture convergence problems.
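
Because `ConvergenceWarning` is a regular warning category, it can be captured or filtered with the standard `warnings` module. The sketch below provokes one deliberately by giving a solver too few iterations (the estimator and dataset are illustrative assumptions):

```python
import warnings

from sklearn.datasets import load_iris
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# max_iter=1 is deliberately too small, so the solver stops before converging.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    LogisticRegression(max_iter=1).fit(X, y)

convergence_warnings = [
    w for w in caught if issubclass(w.category, ConvergenceWarning)
]
print(len(convergence_warnings) > 0)
```

In tests, `warnings.simplefilter("error", ConvergenceWarning)` is one way to turn silent non-convergence into a failure.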

#### DataConversionWarning { .api }
```python
from sklearn.exceptions import DataConversionWarning

class DataConversionWarning(UserWarning):
    """Warning used to notify implicit data conversions happening in the code."""
    pass
```
Warning used to notify implicit data conversions happening in the code.

#### DataDimensionalityWarning { .api }
```python
from sklearn.exceptions import DataDimensionalityWarning

class DataDimensionalityWarning(UserWarning):
    """Custom warning to capture data dimensionality problems."""
    pass
```
Custom warning to capture data dimensionality problems.

#### EfficiencyWarning { .api }
```python
from sklearn.exceptions import EfficiencyWarning

class EfficiencyWarning(UserWarning):
    """Warning used to notify the user of inefficient computation."""
    pass
```
Warning used to notify the user of inefficient computation.

#### EstimatorCheckFailedWarning { .api }
```python
from sklearn.exceptions import EstimatorCheckFailedWarning

class EstimatorCheckFailedWarning(UserWarning):
    """Warning used when an estimator check fails."""
    pass
```
Warning used when an estimator check fails.

#### FitFailedWarning { .api }
```python
from sklearn.exceptions import FitFailedWarning

class FitFailedWarning(RuntimeWarning):
    """Warning class used if there is an error while fitting the estimator."""
    pass
```
Warning class used if there is an error while fitting the estimator.

#### PositiveSpectrumWarning { .api }
```python
from sklearn.exceptions import PositiveSpectrumWarning

class PositiveSpectrumWarning(UserWarning):
    """Warning raised when the eigenvalues of a PSD matrix have issues."""
    pass
```
Warning raised when the eigenvalues of a PSD matrix have issues.

#### SkipTestWarning { .api }
```python
from sklearn.exceptions import SkipTestWarning

class SkipTestWarning(UserWarning):
    """Warning class used to notify the user of a test that was skipped."""
    pass
```
Warning class used to notify the user of a test that was skipped.

#### UndefinedMetricWarning { .api }
```python
from sklearn.exceptions import UndefinedMetricWarning

class UndefinedMetricWarning(UserWarning):
    """Warning used when the metric is invalid."""
    pass
```
Warning used when the metric is invalid.

#### UnsetMetadataPassedError { .api }
```python
from sklearn.exceptions import UnsetMetadataPassedError

class UnsetMetadataPassedError(ValueError):
    """Exception when metadata is passed which is not explicitly requested."""
    pass
```
Exception when metadata is passed which is not explicitly requested.

## Frozen Estimators

#### FrozenEstimator { .api }
```python
from sklearn.frozen import FrozenEstimator

FrozenEstimator(
    estimator: BaseEstimator
)
```
Wrapper around an already fitted estimator that prevents re-fitting: calling `fit` on the wrapper leaves the wrapped estimator unchanged, while its other methods are delegated to it.

## Examples

### Basic Pipeline Example

```python
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris

# Load data
X, y = load_iris(return_X_y=True)

# Method 1: Using Pipeline class
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', LogisticRegression())
])

# Method 2: Using make_pipeline function
pipeline = make_pipeline(
    StandardScaler(),
    LogisticRegression()
)

# Fit and predict
pipeline.fit(X, y)
predictions = pipeline.predict(X)
```

### Column Transformer Example

```python
from sklearn.compose import ColumnTransformer, make_column_transformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
import pandas as pd

# Example with mixed data types
data = pd.DataFrame({
    'age': [25, 30, 35],
    'income': [50000, 60000, 70000],
    'city': ['NYC', 'LA', 'Chicago'],
    'gender': ['M', 'F', 'M']
})

# Method 1: Using ColumnTransformer class
preprocessor = ColumnTransformer([
    ('num', StandardScaler(), ['age', 'income']),
    ('cat', OneHotEncoder(), ['city', 'gender'])
])

# Method 2: Using make_column_transformer function
preprocessor = make_column_transformer(
    (StandardScaler(), ['age', 'income']),
    (OneHotEncoder(), ['city', 'gender'])
)

# Transform data
transformed = preprocessor.fit_transform(data)
```

### Feature Union Example

```python
from sklearn.datasets import load_iris
from sklearn.pipeline import FeatureUnion, make_union
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest

# Load data
X, y = load_iris(return_X_y=True)

# Combine PCA and feature selection
feature_union = FeatureUnion([
    ('pca', PCA(n_components=2)),
    ('select_k_best', SelectKBest(k=2))
])

# Or using make_union
feature_union = make_union(
    PCA(n_components=2),
    SelectKBest(k=2)
)

# Transform features: 2 PCA components + 2 selected features per sample
X_combined = feature_union.fit_transform(X, y)
```

### Configuration Example

```python
from sklearn import set_config, get_config, config_context
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

# Load a regression dataset
X, y = load_diabetes(return_X_y=True)

# Get current config
current_config = get_config()
print(current_config)

# Set global configuration
set_config(display='diagram', print_changed_only=True)

# Use configuration context
with config_context(assume_finite=True):
    # Operations within this block use assume_finite=True
    model = LinearRegression()
    model.fit(X, y)

# Configuration reverts to previous state outside the context
```

### Partial Dependence Example

```python
from sklearn.datasets import load_diabetes
from sklearn.inspection import partial_dependence, PartialDependenceDisplay
from sklearn.ensemble import RandomForestRegressor
import matplotlib.pyplot as plt

# Load a regression dataset and train model
X, y = load_diabetes(return_X_y=True)
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X, y)

# Compute the joint (two-way) partial dependence of features 0 and 1
pd_result = partial_dependence(
    model, X, features=[0, 1],
    grid_resolution=20
)
print(pd_result["average"].shape)  # one grid axis per feature

# Create display: one one-way plot per feature.
# from_estimator already draws the figure, so no extra plot() call is needed.
display = PartialDependenceDisplay.from_estimator(
    model, X, features=[0, 1]
)
plt.show()
```

### Permutation Importance Example

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

# Fit a model to inspect
X, y = load_diabetes(return_X_y=True)
model = RandomForestRegressor(n_estimators=100, random_state=42).fit(X, y)

# Calculate permutation importance
result = permutation_importance(
    model, X, y, n_repeats=10, random_state=42
)

# Get importance scores
importance_scores = result.importances_mean
importance_std = result.importances_std

# Print results
for i, (score, std) in enumerate(zip(importance_scores, importance_std)):
    print(f"Feature {i}: {score:.3f} +/- {std:.3f}")
```
