Tessl Tile for pypi/scikit-learn@1.7.0

# Unsupervised Learning

This document covers the unsupervised learning APIs in scikit-learn: clustering, dimensionality reduction (matrix decomposition and manifold learning), and mixture models.

## Clustering

### Core Clustering Algorithms

#### KMeans { .api }
```python
from sklearn.cluster import KMeans

KMeans(
    n_clusters: int = 8,
    init: str | ArrayLike | Callable = "k-means++",
    n_init: int | str = "auto",
    max_iter: int = 300,
    tol: float = 0.0001,
    verbose: int = 0,
    random_state: int | RandomState | None = None,
    copy_x: bool = True,
    algorithm: str = "lloyd"
)
```
K-Means clustering.

#### MiniBatchKMeans { .api }
```python
from sklearn.cluster import MiniBatchKMeans

MiniBatchKMeans(
    n_clusters: int = 8,
    init: str | ArrayLike | Callable = "k-means++",
    max_iter: int = 100,
    batch_size: int = 1024,
    verbose: int = 0,
    compute_labels: bool = True,
    random_state: int | RandomState | None = None,
    tol: float = 0.0,
    max_no_improvement: int = 10,
    init_size: int | None = None,
    n_init: int | str = "auto",
    reassignment_ratio: float = 0.01
)
```
Mini-Batch K-Means clustering.

#### BisectingKMeans { .api }
```python
from sklearn.cluster import BisectingKMeans

BisectingKMeans(
    n_clusters: int = 8,
    init: str | Callable = "random",
    n_init: int = 1,
    random_state: int | RandomState | None = None,
    max_iter: int = 300,
    verbose: int = 0,
    tol: float = 0.0001,
    copy_x: bool = True,
    algorithm: str = "lloyd",
    bisecting_strategy: str = "biggest_inertia"
)
```
Bisecting K-Means clustering.

#### DBSCAN { .api }
```python
from sklearn.cluster import DBSCAN

DBSCAN(
    eps: float = 0.5,
    min_samples: int = 5,
    metric: str | Callable = "euclidean",
    metric_params: dict | None = None,
    algorithm: str = "auto",
    leaf_size: int = 30,
    p: float | None = None,
    n_jobs: int | None = None
)
```
Perform DBSCAN clustering from vector array or distance matrix.
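
The sketch below illustrates how `eps` and `min_samples` interact, using a tiny hand-made dataset (an assumption for illustration): two dense groups and one isolated point. Points that belong to no dense region receive the label `-1`.

```python
import numpy as np
from sklearn.cluster import DBSCAN

X = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],   # dense group A
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [5.1, 5.1],   # dense group B
    [20.0, 20.0],                                     # isolated point
])

# A point is a core sample if at least min_samples points
# (itself included) lie within distance eps of it.
db = DBSCAN(eps=0.5, min_samples=3).fit(X)
labels = db.labels_

print(labels)                   # cluster index per sample, -1 marks noise
print(db.core_sample_indices_)  # indices of the core samples
```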

#### HDBSCAN { .api }
```python
from sklearn.cluster import HDBSCAN

HDBSCAN(
    min_cluster_size: int = 5,
    min_samples: int | None = None,
    cluster_selection_epsilon: float = 0.0,
    max_cluster_size: int | None = None,
    metric: str | Callable = "euclidean",
    metric_params: dict | None = None,
    alpha: float = 1.0,
    algorithm: str = "auto",
    leaf_size: int = 40,
    n_jobs: int | None = None,
    cluster_selection_method: str = "eom",
    allow_single_cluster: bool = False,
    store_centers: str | None = None,
    copy: bool = False
)
```
Perform HDBSCAN clustering from vector array or distance matrix.

#### OPTICS { .api }
```python
from sklearn.cluster import OPTICS

OPTICS(
    min_samples: int = 5,
    max_eps: float = ...,
    metric: str | Callable = "minkowski",
    p: int = 2,
    metric_params: dict | None = None,
    cluster_method: str = "xi",
    eps: float | None = None,
    xi: float = 0.05,
    predecessor_correction: bool = True,
    min_cluster_size: int | float | None = None,
    algorithm: str = "auto",
    leaf_size: int = 30,
    memory: str | object | None = None,
    n_jobs: int | None = None
)
```
Estimate clustering structure from vector array.

#### MeanShift { .api }
```python
from sklearn.cluster import MeanShift

MeanShift(
    bandwidth: float | None = None,
    seeds: ArrayLike | None = None,
    bin_seeding: bool = False,
    min_bin_freq: int = 1,
    cluster_all: bool = True,
    n_jobs: int | None = None,
    max_iter: int = 300
)
```
Mean shift clustering using a flat kernel.

#### AgglomerativeClustering { .api }
```python
from sklearn.cluster import AgglomerativeClustering

AgglomerativeClustering(
    n_clusters: int | None = 2,
    metric: str | Callable = "euclidean",
    memory: str | object | None = None,
    connectivity: ArrayLike | Callable | None = None,
    compute_full_tree: bool | str = "auto",
    linkage: str = "ward",
    distance_threshold: float | None = None,
    compute_distances: bool = False
)
```
Agglomerative Clustering.
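
`n_clusters` and `distance_threshold` are alternatives: when a threshold is given, `n_clusters` must be `None` and the number of clusters is read back from the fitted estimator. A small sketch on made-up one-dimensional data (the data and the threshold value are illustrative assumptions):

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Two tight groups on a line, far apart from each other.
X = np.array([[0.0], [0.2], [0.4], [10.0], [10.2], [10.4]])

# Stop merging once the average inter-cluster distance exceeds 2.0.
agg = AgglomerativeClustering(
    n_clusters=None,
    distance_threshold=2.0,
    linkage="average",
)
labels = agg.fit_predict(X)

print(agg.n_clusters_)  # number of clusters implied by the threshold
print(labels)
```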

#### FeatureAgglomeration { .api }
```python
from sklearn.cluster import FeatureAgglomeration

FeatureAgglomeration(
    n_clusters: int | None = 2,
    metric: str | Callable = "euclidean",
    memory: str | object | None = None,
    connectivity: ArrayLike | Callable | None = None,
    compute_full_tree: bool | str = "auto",
    linkage: str = "ward",
    pooling_func: Callable = ...,
    distance_threshold: float | None = None,
    compute_distances: bool = False
)
```
Agglomerate features.

#### Birch { .api }
```python
from sklearn.cluster import Birch

Birch(
    n_clusters: int | None = 3,
    threshold: float = 0.5,
    branching_factor: int = 50,
    compute_labels: bool = True,
    copy: bool | str = "deprecated"
)
```
Implements the BIRCH clustering algorithm.

#### AffinityPropagation { .api }
```python
from sklearn.cluster import AffinityPropagation

AffinityPropagation(
    damping: float = 0.5,
    max_iter: int = 200,
    convergence_iter: int = 15,
    copy: bool = True,
    preference: ArrayLike | float | None = None,
    affinity: str = "euclidean",
    verbose: bool = False,
    random_state: int | RandomState | None = None
)
```
Perform Affinity Propagation Clustering of data.

#### SpectralClustering { .api }
```python
from sklearn.cluster import SpectralClustering

SpectralClustering(
    n_clusters: int = 8,
    eigen_solver: str | None = None,
    n_components: int | None = None,
    random_state: int | RandomState | None = None,
    n_init: int = 10,
    gamma: float = 1.0,
    affinity: str | Callable = "rbf",
    n_neighbors: int = 10,
    eigen_tol: float | str = "auto",
    assign_labels: str = "kmeans",
    degree: float = 3,
    coef0: float = 1,
    kernel_params: dict | None = None,
    n_jobs: int | None = None,
    verbose: bool = False
)
```
Apply clustering to a projection of the normalized Laplacian.

#### SpectralBiclustering { .api }
```python
from sklearn.cluster import SpectralBiclustering

SpectralBiclustering(
    n_clusters: int | tuple = 3,
    method: str = "bistochastic",
    n_components: int = 6,
    n_best: int = 3,
    svd_method: str = "randomized",
    n_svd_vecs: int | None = None,
    mini_batch: bool = False,
    init: str | ArrayLike = "k-means++",
    n_init: int = 10,
    random_state: int | RandomState | None = None
)
```
Spectral biclustering (Kluger, 2003).

#### SpectralCoclustering { .api }
```python
from sklearn.cluster import SpectralCoclustering

SpectralCoclustering(
    n_clusters: int = 3,
    svd_method: str = "randomized",
    n_svd_vecs: int | None = None,
    mini_batch: bool = False,
    init: str | ArrayLike = "k-means++",
    n_init: int = 10,
    random_state: int | RandomState | None = None
)
```
Spectral Co-Clustering algorithm (Dhillon, 2001).

### Clustering Functions

#### k_means { .api }
```python
from sklearn.cluster import k_means

k_means(
    X: ArrayLike,
    n_clusters: int,
    sample_weight: ArrayLike | None = None,
    init: str | ArrayLike | Callable = "k-means++",
    n_init: int | str = "auto",
    max_iter: int = 300,
    verbose: bool = False,
    tol: float = 0.0001,
    random_state: int | RandomState | None = None,
    copy_x: bool = True,
    algorithm: str = "lloyd",
    return_n_iter: bool = False
) -> tuple[ArrayLike, ArrayLike, float, int] | tuple[ArrayLike, ArrayLike, float]
```
K-means clustering algorithm.

#### kmeans_plusplus { .api }
```python
from sklearn.cluster import kmeans_plusplus

kmeans_plusplus(
    X: ArrayLike,
    n_clusters: int,
    sample_weight: ArrayLike | None = None,
    x_squared_norms: ArrayLike | None = None,
    random_state: int | RandomState | None = None,
    n_local_trials: int | None = None
) -> tuple[ArrayLike, ArrayLike]
```
Init n_clusters seeds according to k-means++.
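
The two returned arrays are the chosen seed points and their row indices in `X`. A short sketch on random data (the data is an illustrative assumption):

```python
import numpy as np
from sklearn.cluster import kmeans_plusplus

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))

# centers: the selected seed points; indices: their positions in X.
centers, indices = kmeans_plusplus(X, n_clusters=3, random_state=0)

print(centers.shape)  # (n_clusters, n_features)
print(indices)        # row indices into X
```

Because the seeds are drawn from the data itself, `X[indices]` reproduces `centers`.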

#### dbscan { .api }
```python
from sklearn.cluster import dbscan

dbscan(
    X: ArrayLike,
    eps: float = 0.5,
    min_samples: int = 5,
    metric: str | Callable = "euclidean",
    metric_params: dict | None = None,
    algorithm: str = "auto",
    leaf_size: int = 30,
    p: float | None = None,
    sample_weight: ArrayLike | None = None,
    n_jobs: int | None = None
) -> tuple[ArrayLike, ArrayLike]
```
Perform DBSCAN clustering from vector array or distance matrix.

#### affinity_propagation { .api }
```python
from sklearn.cluster import affinity_propagation

affinity_propagation(
    S: ArrayLike,
    preference: ArrayLike | float | None = None,
    convergence_iter: int = 15,
    max_iter: int = 200,
    damping: float = 0.5,
    copy: bool = True,
    verbose: bool = False,
    return_n_iter: bool = False,
    random_state: int | RandomState | None = None
) -> tuple[ArrayLike, ArrayLike, int] | tuple[ArrayLike, ArrayLike]
```
Perform Affinity Propagation Clustering of data.

#### spectral_clustering { .api }
```python
from sklearn.cluster import spectral_clustering

spectral_clustering(
    affinity: ArrayLike,
    n_clusters: int = 8,
    n_components: int | None = None,
    eigen_solver: str | None = None,
    random_state: int | RandomState | None = None,
    n_init: int = 10,
    eigen_tol: float | str = "auto",
    assign_labels: str = "kmeans",
    verbose: bool = False
) -> ArrayLike
```
Apply clustering to a projection of the normalized Laplacian.

#### mean_shift { .api }
```python
from sklearn.cluster import mean_shift

mean_shift(
    X: ArrayLike,
    bandwidth: float | None = None,
    seeds: ArrayLike | None = None,
    bin_seeding: bool = False,
    min_bin_freq: int = 1,
    cluster_all: bool = True,
    max_iter: int = 300,
    n_jobs: int | None = None
) -> tuple[ArrayLike, ArrayLike]
```
Perform mean shift clustering of data using a flat kernel.

#### estimate_bandwidth { .api }
```python
from sklearn.cluster import estimate_bandwidth

estimate_bandwidth(
    X: ArrayLike,
    quantile: float = 0.3,
    n_samples: int | None = None,
    random_state: int | RandomState | None = None,
    n_jobs: int | None = None
) -> float
```
Estimate the bandwidth to use with the mean-shift algorithm.
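
A common pattern is to feed the estimated bandwidth straight into `MeanShift`. The sketch below assumes synthetic two-blob data; the number of clusters found depends on the data and the `quantile`, so it is printed rather than asserted.

```python
import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth

rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.3, size=(60, 2)),
    rng.normal(loc=6.0, scale=0.3, size=(60, 2)),
])

# Derive a kernel bandwidth from the pairwise-distance structure of X.
bw = estimate_bandwidth(X, quantile=0.3, random_state=0)

ms = MeanShift(bandwidth=bw).fit(X)
print(bw)
print(len(ms.cluster_centers_))  # number of modes found
```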

#### ward_tree { .api }
```python
from sklearn.cluster import ward_tree

ward_tree(
    X: ArrayLike,
    connectivity: ArrayLike | None = None,
    n_clusters: int | None = None,
    return_distance: bool = False
) -> tuple[ArrayLike, int, int, ArrayLike, ArrayLike] | tuple[ArrayLike, int, int, ArrayLike]
```
Ward clustering based on a Feature matrix.

#### linkage_tree { .api }
```python
from sklearn.cluster import linkage_tree

linkage_tree(
    X: ArrayLike,
    connectivity: ArrayLike | None = None,
    n_clusters: int | None = None,
    linkage: str = "complete",
    affinity: str = "euclidean",
    return_distance: bool = False
) -> tuple[ArrayLike, int, int, ArrayLike, ArrayLike] | tuple[ArrayLike, int, int, ArrayLike]
```
Linkage agglomerative clustering based on a Feature matrix.

#### get_bin_seeds { .api }
```python
from sklearn.cluster import get_bin_seeds

get_bin_seeds(
    X: ArrayLike,
    bin_size: float,
    min_bin_freq: int = 1
) -> ArrayLike
```
Find seeds for mean_shift.

#### cluster_optics_dbscan { .api }
```python
from sklearn.cluster import cluster_optics_dbscan

cluster_optics_dbscan(
    reachability: ArrayLike,
    core_distances: ArrayLike,
    ordering: ArrayLike,
    eps: float
) -> ArrayLike
```
Performs DBSCAN extraction for an arbitrary epsilon.
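
The inputs to this function are the fitted attributes of an `OPTICS` estimator, so one OPTICS run can be re-labelled for several `eps` values without refitting. A minimal sketch on hand-made data (the data and the `eps` value are illustrative assumptions):

```python
import numpy as np
from sklearn.cluster import OPTICS, cluster_optics_dbscan

X = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],   # dense group A
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [5.1, 5.1],   # dense group B
    [20.0, 20.0],                                     # isolated point
])

# Fit once to obtain the reachability structure.
opt = OPTICS(min_samples=3).fit(X)

# Extract DBSCAN-style labels for a chosen eps; -1 marks noise.
labels = cluster_optics_dbscan(
    reachability=opt.reachability_,
    core_distances=opt.core_distances_,
    ordering=opt.ordering_,
    eps=0.5,
)
print(labels)
```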

#### cluster_optics_xi { .api }
```python
from sklearn.cluster import cluster_optics_xi

cluster_optics_xi(
    reachability: ArrayLike,
    predecessor: ArrayLike,
    ordering: ArrayLike,
    min_samples: int,
    min_cluster_size: int | float | None = None,
    xi: float = 0.05,
    predecessor_correction: bool = True
) -> tuple[ArrayLike, ArrayLike]
```
Automatically extract clusters according to the Xi-steep method.

#### compute_optics_graph { .api }
```python
from sklearn.cluster import compute_optics_graph

compute_optics_graph(
    X: ArrayLike,
    min_samples: int,
    max_eps: float,
    metric: str | Callable,
    p: int,
    metric_params: dict | None,
    algorithm: str,
    leaf_size: int,
    n_jobs: int | None
) -> tuple[ArrayLike, ArrayLike, ArrayLike, ArrayLike]
```
Compute the OPTICS reachability graph. Returns the cluster ordering, core distances, reachability distances, and predecessors.

## Dimensionality Reduction

### Principal Component Analysis

#### PCA { .api }
```python
from sklearn.decomposition import PCA

PCA(
    n_components: int | float | str | None = None,
    copy: bool = True,
    whiten: bool = False,
    svd_solver: str = "auto",
    tol: float = 0.0,
    iterated_power: int | str = "auto",
    n_oversamples: int = 10,
    power_iteration_normalizer: str = "auto",
    random_state: int | RandomState | None = None
)
```
Principal component analysis (PCA).
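
`n_components` accepts an integer count or a float in (0, 1) interpreted as the fraction of variance to retain. The sketch below uses the float form on synthetic data that varies almost entirely along one direction (the data is an illustrative assumption):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
t = rng.normal(size=(200, 1))
# Three features driven by one latent variable, plus small noise.
X = np.hstack([t, 2 * t, -t]) + 0.01 * rng.normal(size=(200, 3))

# Keep as many components as needed to explain 95% of the variance.
pca = PCA(n_components=0.95)
Z = pca.fit_transform(X)

print(pca.n_components_)              # number of components actually kept
print(pca.explained_variance_ratio_)  # variance fraction per kept component

# Map the reduced data back to the original feature space.
X_back = pca.inverse_transform(Z)
```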

#### IncrementalPCA { .api }
```python
from sklearn.decomposition import IncrementalPCA

IncrementalPCA(
    n_components: int | None = None,
    whiten: bool = False,
    copy: bool = True,
    batch_size: int | None = None
)
```
Incremental principal components analysis (IPCA).
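
When the data does not fit in memory, the estimator can be updated batch by batch with `partial_fit`. A minimal sketch, assuming an in-memory array split into chunks to stand in for a real data stream:

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

ipca = IncrementalPCA(n_components=2)

# Each batch must contain at least n_components samples.
for batch in np.array_split(X, 5):
    ipca.partial_fit(batch)

Z = ipca.transform(X)
print(Z.shape)
print(ipca.n_samples_seen_)  # total samples processed so far
```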

#### KernelPCA { .api }
```python
from sklearn.decomposition import KernelPCA

KernelPCA(
    n_components: int | None = None,
    kernel: str | Callable = "linear",
    gamma: float | None = None,
    degree: int = 3,
    coef0: float = 1,
    kernel_params: dict | None = None,
    alpha: float = 1.0,
    fit_inverse_transform: bool = False,
    eigen_solver: str = "auto",
    tol: float = 0,
    max_iter: int | None = None,
    iterated_power: int | str = "auto",
    remove_zero_eig: bool = False,
    random_state: int | RandomState | None = None,
    copy_X: bool = True,
    n_jobs: int | None = None
)
```
Kernel Principal component analysis (KPCA).

#### SparsePCA { .api }
```python
from sklearn.decomposition import SparsePCA

SparsePCA(
    n_components: int | None = None,
    alpha: float = 1,
    ridge_alpha: float = 0.01,
    max_iter: int = 1000,
    tol: float = 1e-08,
    method: str = "lars",
    n_jobs: int | None = None,
    U_init: ArrayLike | None = None,
    V_init: ArrayLike | None = None,
    verbose: bool | int = False,
    random_state: int | RandomState | None = None
)
```
Sparse Principal Components Analysis (SparsePCA).

#### MiniBatchSparsePCA { .api }
```python
from sklearn.decomposition import MiniBatchSparsePCA

MiniBatchSparsePCA(
    n_components: int | None = None,
    alpha: float = 1,
    ridge_alpha: float = 0.01,
    max_iter: int = 1000,
    callback: Callable | None = None,
    batch_size: int = 3,
    verbose: bool | int = False,
    shuffle: bool = True,
    n_jobs: int | None = None,
    method: str = "lars",
    random_state: int | RandomState | None = None,
    tol: float = 0.001,
    max_no_improvement: int | None = 10
)
```
Mini-batch Sparse Principal Components Analysis.

#### TruncatedSVD { .api }
```python
from sklearn.decomposition import TruncatedSVD

TruncatedSVD(
    n_components: int = 2,
    algorithm: str = "randomized",
    n_iter: int = 5,
    n_oversamples: int = 10,
    power_iteration_normalizer: str = "auto",
    random_state: int | RandomState | None = None,
    tol: float = 0.0
)
```
Dimensionality reduction using truncated SVD (aka LSA).
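
Unlike `PCA`, this estimator does not center the data, which lets it work directly on sparse matrices. A small sketch with a random sparse matrix (the matrix is an illustrative stand-in for, say, a term-document matrix):

```python
from scipy.sparse import random as sparse_random
from sklearn.decomposition import TruncatedSVD

# 100 x 50 sparse matrix with roughly 10% non-zero entries.
X = sparse_random(100, 50, density=0.1, format="csr", random_state=0)

svd = TruncatedSVD(n_components=5, random_state=0)
Z = svd.fit_transform(X)  # dense array of shape (n_samples, n_components)

print(Z.shape)
print(svd.explained_variance_ratio_.sum())
```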

### Independent Component Analysis

#### FastICA { .api }
```python
from sklearn.decomposition import FastICA

FastICA(
    n_components: int | None = None,
    algorithm: str = "parallel",
    whiten: str | bool = "unit-variance",
    fun: str | Callable = "logcosh",
    fun_args: dict | None = None,
    max_iter: int = 200,
    tol: float = 0.0001,
    w_init: ArrayLike | None = None,
    whiten_solver: str = "svd",
    random_state: int | RandomState | None = None
)
```
FastICA: a fast algorithm for Independent Component Analysis.

### Factor Analysis

#### FactorAnalysis { .api }
```python
from sklearn.decomposition import FactorAnalysis

FactorAnalysis(
    n_components: int | None = None,
    tol: float = 0.01,
    copy: bool = True,
    max_iter: int = 1000,
    noise_variance_init: ArrayLike | None = None,
    svd_method: str = "randomized",
    iterated_power: int = 3,
    rotation: str | None = None,
    random_state: int | RandomState | None = 0
)
```
Factor Analysis (FA).

### Dictionary Learning

#### DictionaryLearning { .api }
```python
from sklearn.decomposition import DictionaryLearning

DictionaryLearning(
    n_components: int | None = None,
    alpha: float = 1,
    max_iter: int = 1000,
    tol: float = 1e-08,
    fit_algorithm: str = "lars",
    transform_algorithm: str = "omp",
    transform_n_nonzero_coefs: int | None = None,
    transform_alpha: float | None = None,
    n_jobs: int | None = None,
    code_init: ArrayLike | None = None,
    dict_init: ArrayLike | None = None,
    callback: Callable | None = None,
    verbose: bool = False,
    split_sign: bool = False,
    random_state: int | RandomState | None = None,
    positive_code: bool = False,
    positive_dict: bool = False,
    transform_max_iter: int = 1000
)
```
Dictionary learning.

#### MiniBatchDictionaryLearning { .api }
```python
from sklearn.decomposition import MiniBatchDictionaryLearning

MiniBatchDictionaryLearning(
    n_components: int | None = None,
    alpha: float = 1,
    max_iter: int = 1000,
    fit_algorithm: str = "lars",
    n_jobs: int | None = None,
    batch_size: int = 256,
    shuffle: bool = True,
    dict_init: ArrayLike | None = None,
    transform_algorithm: str = "omp",
    transform_n_nonzero_coefs: int | None = None,
    transform_alpha: float | None = None,
    verbose: bool = False,
    split_sign: bool = False,
    random_state: int | RandomState | None = None,
    positive_code: bool = False,
    positive_dict: bool = False,
    transform_max_iter: int = 1000,
    callback: Callable | None = None,
    tol: float = 0.001,
    max_no_improvement: int = 10
)
```
Mini-batch dictionary learning.

#### SparseCoder { .api }
```python
from sklearn.decomposition import SparseCoder

SparseCoder(
    dictionary: ArrayLike,
    transform_algorithm: str = "omp",
    transform_n_nonzero_coefs: int | None = None,
    transform_alpha: float | None = None,
    split_sign: bool = False,
    n_jobs: int | None = None,
    positive_code: bool = False,
    transform_max_iter: int = 1000
)
```
Sparse coding.
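
`SparseCoder` encodes samples against a fixed, precomputed dictionary, so `transform` can be called without fitting. The sketch below uses a deliberately trivial identity dictionary (an assumption chosen so the result is easy to check by eye) and asks for one non-zero coefficient per sample:

```python
import numpy as np
from sklearn.decomposition import SparseCoder

# Each row of the dictionary is one unit-norm atom.
D = np.eye(4)

coder = SparseCoder(
    dictionary=D,
    transform_algorithm="omp",
    transform_n_nonzero_coefs=1,
)

X = np.array([
    [0.0, 3.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, -2.0],
])

# One row of coefficients per sample, one column per atom.
code = coder.transform(X)
print(code)
```

With an identity dictionary, each sample here is already a multiple of a single atom, so the code matches the input.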

### Non-negative Matrix Factorization

#### NMF { .api }
```python
from sklearn.decomposition import NMF

NMF(
    n_components: int | str | None = "auto",
    init: str | ArrayLike | None = None,
    solver: str = "cd",
    beta_loss: float | str = "frobenius",
    tol: float = 0.0001,
    max_iter: int = 200,
    random_state: int | RandomState | None = None,
    alpha_W: float = 0.0,
    alpha_H: float | str = "same",
    l1_ratio: float = 0.0,
    verbose: int = 0,
    shuffle: bool = False
)
```
Non-negative Matrix Factorization (NMF).
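
NMF approximates a non-negative matrix `X` as the product `W @ H` of two non-negative factors. A minimal sketch on random non-negative data (the data, `init` choice, and iteration budget are illustrative assumptions):

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
X = rng.random((20, 6))  # entries in [0, 1), so X is non-negative

nmf = NMF(n_components=3, init="nndsvda", max_iter=500, random_state=0)
W = nmf.fit_transform(X)  # shape (n_samples, n_components)
H = nmf.components_       # shape (n_components, n_features)

print(W.shape, H.shape)
print(nmf.reconstruction_err_)  # Frobenius norm of X - W @ H
```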

#### MiniBatchNMF { .api }
```python
from sklearn.decomposition import MiniBatchNMF

MiniBatchNMF(
    n_components: int | str | None = "auto",
    init: str | ArrayLike | None = None,
    batch_size: int = 1024,
    beta_loss: float | str = "frobenius",
    tol: float = 0.0001,
    max_no_improvement: int = 10,
    max_iter: int = 200,
    alpha_W: float = 0.0,
    alpha_H: float | str = "same",
    l1_ratio: float = 0.0,
    forget_factor: float = 0.7,
    fresh_restarts: bool = False,
    fresh_restarts_max_iter: int = 30,
    transform_max_iter: int | None = None,
    random_state: int | RandomState | None = None,
    verbose: int = 0
)
```
Mini-Batch Non-Negative Matrix Factorization (NMF).

### Latent Dirichlet Allocation

#### LatentDirichletAllocation { .api }
```python
from sklearn.decomposition import LatentDirichletAllocation

LatentDirichletAllocation(
    n_components: int = 10,
    doc_topic_prior: float | None = None,
    topic_word_prior: float | None = None,
    learning_method: str = "batch",
    learning_decay: float = 0.7,
    learning_offset: float = 10.0,
    max_iter: int = 10,
    batch_size: int = 128,
    evaluate_every: int = 0,
    total_samples: int = 1000000.0,
    perp_tol: float = 0.1,
    mean_change_tol: float = 0.001,
    max_doc_update_iter: int = 100,
    n_jobs: int | None = None,
    verbose: int = 0,
    random_state: int | RandomState | None = None
)
```
Latent Dirichlet Allocation with online variational Bayes algorithm.

### Decomposition Functions

#### randomized_svd { .api }
```python
from sklearn.decomposition import randomized_svd

randomized_svd(
    M: ArrayLike,
    n_components: int,
    n_oversamples: int = 10,
    n_iter: int | str = "auto",
    power_iteration_normalizer: str = "auto",
    transpose: bool | str = "auto",
    flip_sign: bool = True,
    random_state: int | RandomState | None = None,
    svd_lapack_driver: str = "gesdd"
) -> tuple[ArrayLike, ArrayLike, ArrayLike]
```
Compute a truncated randomized SVD.
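
The three return values are the left singular vectors, the singular values, and the right singular vectors (transposed). The sketch below builds an exactly rank-3 matrix (an illustrative assumption) so that a rank-3 truncated factorization reconstructs it up to floating-point error:

```python
import numpy as np
from sklearn.decomposition import randomized_svd

rng = np.random.default_rng(0)
# Product of a 50x3 and a 3x30 matrix: rank 3 by construction.
A = rng.normal(size=(50, 3)) @ rng.normal(size=(3, 30))

U, s, Vt = randomized_svd(A, n_components=3, random_state=0)

# Scale the columns of U by the singular values, then project back.
A_approx = (U * s) @ Vt
print(U.shape, s.shape, Vt.shape)
print(np.abs(A - A_approx).max())
```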

#### fastica { .api }
```python
from sklearn.decomposition import fastica

fastica(
    X: ArrayLike,
    n_components: int | None = None,
    algorithm: str = "parallel",
    whiten: str | bool = "unit-variance",
    fun: str | Callable = "logcosh",
    fun_args: dict | None = None,
    max_iter: int = 200,
    tol: float = 0.0001,
    w_init: ArrayLike | None = None,
    whiten_solver: str = "svd",
    random_state: int | RandomState | None = None,
    return_X_mean: bool = False,
    compute_sources: bool = True,
    return_n_iter: bool = False
) -> tuple[ArrayLike, ArrayLike, ArrayLike] | tuple[ArrayLike, ArrayLike, ArrayLike, int] | tuple[ArrayLike, ArrayLike, ArrayLike, ArrayLike] | tuple[ArrayLike, ArrayLike, ArrayLike, ArrayLike, int]
```
Perform Fast Independent Component Analysis.

#### dict_learning { .api }
```python
from sklearn.decomposition import dict_learning

dict_learning(
    X: ArrayLike,
    n_components: int,
    alpha: float,
    max_iter: int = 100,
    tol: float = 1e-08,
    method: str = "lars",
    n_jobs: int | None = None,
    dict_init: ArrayLike | None = None,
    code_init: ArrayLike | None = None,
    callback: Callable | None = None,
    verbose: bool = False,
    random_state: int | RandomState | None = None,
    return_n_iter: bool = False,
    positive_dict: bool = False,
    positive_code: bool = False,
    method_max_iter: int = 1000
) -> tuple[ArrayLike, ArrayLike, ArrayLike] | tuple[ArrayLike, ArrayLike, ArrayLike, int]
```
Solve a dictionary learning matrix factorization problem.

#### dict_learning_online { .api }
```python
from sklearn.decomposition import dict_learning_online

dict_learning_online(
    X: ArrayLike,
    n_components: int = 2,
    alpha: float = 1,
    max_iter: int = 100,
    return_code: bool = True,
    dict_init: ArrayLike | None = None,
    callback: Callable | None = None,
    batch_size: int = 256,
    verbose: bool = False,
    shuffle: bool = True,
    n_jobs: int | None = None,
    method: str = "lars",
    random_state: int | RandomState | None = None,
    positive_dict: bool = False,
    positive_code: bool = False,
    method_max_iter: int = 1000,
    tol: float = 0.001,
    max_no_improvement: int = 10
) -> tuple[ArrayLike, ArrayLike] | ArrayLike
```
Solve a dictionary learning matrix factorization problem online.

#### sparse_encode { .api }
```python
from sklearn.decomposition import sparse_encode

sparse_encode(
    X: ArrayLike,
    dictionary: ArrayLike,
    gram: ArrayLike | None = None,
    cov: ArrayLike | None = None,
    algorithm: str = "lasso_lars",
    n_nonzero_coefs: int | None = None,
    alpha: float | None = None,
    copy_cov: bool = True,
    init: ArrayLike | None = None,
    max_iter: int = 1000,
    n_jobs: int | None = None,
    check_input: bool = True,
    verbose: int = 0,
    positive: bool = False
) -> ArrayLike
```
Sparse coding.

#### non_negative_factorization { .api }
```python
from sklearn.decomposition import non_negative_factorization

non_negative_factorization(
    X: ArrayLike,
    W: ArrayLike | None = None,
    H: ArrayLike | None = None,
    n_components: int | str | None = "auto",
    init: str | ArrayLike | None = None,
    update_H: bool = True,
    solver: str = "cd",
    beta_loss: float | str = "frobenius",
    tol: float = 0.0001,
    max_iter: int = 200,
    alpha_W: float = 0.0,
    alpha_H: float | str = "same",
    l1_ratio: float = 0.0,
    random_state: int | RandomState | None = None,
    verbose: int = 0,
    shuffle: bool = False
) -> tuple[ArrayLike, ArrayLike, int]
```
Compute Non-negative Matrix Factorization (NMF).

## Manifold Learning

#### Isomap { .api }
```python
from sklearn.manifold import Isomap

Isomap(
    n_neighbors: int = 5,
    radius: float | None = None,
    n_components: int = 2,
    eigen_solver: str = "auto",
    tol: float = 0,
    max_iter: int | None = None,
    path_method: str = "auto",
    neighbors_algorithm: str = "auto",
    n_jobs: int | None = None,
    metric: str | Callable = "minkowski",
    p: int = 2,
    metric_params: dict | None = None
)
```
Isomap Embedding.

#### LocallyLinearEmbedding { .api }
```python
from sklearn.manifold import LocallyLinearEmbedding

LocallyLinearEmbedding(
    n_neighbors: int = 5,
    n_components: int = 2,
    reg: float = 0.001,
    eigen_solver: str = "auto",
    tol: float = 1e-06,
    max_iter: int = 100,
    method: str = "standard",
    hessian_tol: float = 0.0001,
    modified_tol: float = 1e-12,
    neighbors_algorithm: str = "auto",
    random_state: int | RandomState | None = None,
    n_jobs: int | None = None
)
```
Locally Linear Embedding.

#### MDS { .api }
```python
from sklearn.manifold import MDS

MDS(
    n_components: int = 2,
    metric: bool = True,
    n_init: int = 4,
    max_iter: int = 300,
    verbose: int = 0,
    eps: float = 0.001,
    n_jobs: int | None = None,
    random_state: int | RandomState | None = None,
    dissimilarity: str = "euclidean",
    normalized_stress: str | bool = "auto"
)
```
Multidimensional scaling.

#### SpectralEmbedding { .api }
```python
from sklearn.manifold import SpectralEmbedding

SpectralEmbedding(
    n_components: int = 2,
    affinity: str | Callable = "nearest_neighbors",
    gamma: float | None = None,
    random_state: int | RandomState | None = None,
    eigen_solver: str | None = None,
    n_neighbors: int | None = None,
    n_jobs: int | None = None
)
```
Spectral embedding for non-linear dimensionality reduction.

#### TSNE { .api }
```python
from sklearn.manifold import TSNE

TSNE(
    n_components: int = 2,
    perplexity: float = 30.0,
    early_exaggeration: float = 12.0,
    learning_rate: float | str = "auto",
    max_iter: int = 1000,
    n_iter_without_progress: int = 300,
    min_grad_norm: float = 1e-07,
    metric: str | Callable = "euclidean",
    metric_params: dict | None = None,
    init: str | ArrayLike = "pca",
    verbose: int = 0,
    random_state: int | RandomState | None = None,
    method: str = "barnes_hut",
    angle: float = 0.5,
    n_jobs: int | None = None
)
```
t-distributed Stochastic Neighbor Embedding.

### Manifold Learning Functions

#### locally_linear_embedding { .api }
```python
from sklearn.manifold import locally_linear_embedding

locally_linear_embedding(
    X: ArrayLike,
    n_neighbors: int,
    n_components: int,
    reg: float = 0.001,
    eigen_solver: str = "auto",
    tol: float = 1e-06,
    max_iter: int = 100,
    method: str = "standard",
    hessian_tol: float = 0.0001,
    modified_tol: float = 1e-12,
    random_state: int | RandomState | None = None,
    n_jobs: int | None = None
) -> tuple[ArrayLike, float]
```
Perform a Locally Linear Embedding analysis on the data.

#### spectral_embedding { .api }
```python
from sklearn.manifold import spectral_embedding

spectral_embedding(
    adjacency: ArrayLike,
    n_components: int = 8,
    eigen_solver: str | None = None,
    random_state: int | RandomState | None = None,
    eigen_tol: float | str = "auto",
    norm_laplacian: bool = True,
    drop_first: bool = True
) -> ArrayLike
```
Project the sample on the first eigenvectors of the graph Laplacian.

#### smacof { .api }
```python
from sklearn.manifold import smacof

smacof(
    dissimilarities: ArrayLike,
    metric: bool = True,
    n_components: int = 2,
    init: ArrayLike | None = None,
    n_init: int = 8,
    n_jobs: int | None = None,
    max_iter: int = 300,
    verbose: int = 0,
    eps: float = 0.001,
    random_state: int | RandomState | None = None,
    return_n_iter: bool = False,
    normalized_stress: str | bool = "auto"
) -> tuple[ArrayLike, float, int] | tuple[ArrayLike, float]
```
Compute multidimensional scaling using the SMACOF algorithm.
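
The return type depends on `return_n_iter`, which the sketch below illustrates. The random input points and the use of `pairwise_distances` to build the dissimilarity matrix are assumptions for the example.

```python
import numpy as np
from sklearn.manifold import smacof
from sklearn.metrics import pairwise_distances

# Assumed toy data: 30 random points in 5 dimensions.
rng = np.random.RandomState(0)
X = rng.rand(30, 5)

# smacof expects a symmetric matrix of pairwise dissimilarities.
dissimilarities = pairwise_distances(X)

# Default call: returns the embedding and the final stress.
embedding, stress = smacof(
    dissimilarities, n_components=2, n_init=4, random_state=0
)

# With return_n_iter=True a third value, the iteration count of the best run, is returned.
embedding_b, stress_b, n_iter = smacof(
    dissimilarities, n_components=2, n_init=4, random_state=0, return_n_iter=True
)

print(embedding.shape)  # (30, 2)
print(stress, n_iter)
```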

#### trustworthiness { .api }
```python
from sklearn.manifold import trustworthiness

trustworthiness(
    X: ArrayLike,
    X_embedded: ArrayLike,
    n_neighbors: int = 5,
    metric: str | Callable = "euclidean"
) -> float
```
Indicate to what extent the local structure is retained.
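
To make the score concrete, this sketch compares a PCA projection with a random 2-D layout of the same samples. Both embeddings and the blob data are assumptions chosen for illustration; the score lies in [0, 1], with higher values meaning local neighborhoods are better preserved.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.manifold import trustworthiness

# Assumed toy data: 100 points in 10 dimensions drawn from 3 Gaussian blobs.
X, _ = make_blobs(n_samples=100, n_features=10, centers=3, random_state=0)

# Embedding 1: a 2-D PCA projection, which keeps the blob structure.
X_pca = PCA(n_components=2, random_state=0).fit_transform(X)

# Embedding 2: random 2-D coordinates unrelated to X.
X_random = np.random.RandomState(0).normal(size=(100, 2))

t_pca = trustworthiness(X, X_pca, n_neighbors=5)
t_random = trustworthiness(X, X_random, n_neighbors=5)

print(t_pca, t_random)  # the PCA score should be clearly higher
```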

## Mixture Models

#### GaussianMixture { .api }
```python
from sklearn.mixture import GaussianMixture

GaussianMixture(
    n_components: int = 1,
    covariance_type: str = "full",
    tol: float = 0.001,
    reg_covar: float = 1e-06,
    max_iter: int = 100,
    n_init: int = 1,
    init_params: str = "kmeans",
    weights_init: ArrayLike | None = None,
    means_init: ArrayLike | None = None,
    precisions_init: ArrayLike | None = None,
    random_state: int | RandomState | None = None,
    warm_start: bool = False,
    verbose: int = 0,
    verbose_interval: int = 10
)
```
Gaussian Mixture Model.
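
A short sketch of fitting and querying the model follows, assuming two well-separated synthetic Gaussian clusters (the data is made up for the example).

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Assumed toy data: two 2-D Gaussian clusters centered at (-4, -4) and (4, 4).
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(loc=-4.0, scale=1.0, size=(150, 2)),
    rng.normal(loc=4.0, scale=1.0, size=(150, 2)),
])

gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0).fit(X)

labels = gmm.predict(X)       # hard cluster assignment per sample
proba = gmm.predict_proba(X)  # soft assignment; each row sums to 1

print(gmm.converged_)
print(np.round(gmm.means_, 1))  # estimated component means
print(gmm.bic(X))               # lower BIC indicates a better fit/complexity trade-off
```

Comparing `bic` across several `n_components` values is a common way to choose the number of components.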

#### BayesianGaussianMixture { .api }
```python
from sklearn.mixture import BayesianGaussianMixture

BayesianGaussianMixture(
    n_components: int = 1,
    covariance_type: str = "full",
    tol: float = 0.001,
    reg_covar: float = 1e-06,
    max_iter: int = 100,
    n_init: int = 1,
    init_params: str = "kmeans",
    weight_concentration_prior_type: str = "dirichlet_process",
    weight_concentration_prior: float | None = None,
    mean_precision_prior: float | None = None,
    mean_prior: ArrayLike | None = None,
    degrees_of_freedom_prior: float | None = None,
    covariance_prior: float | ArrayLike | None = None,
    random_state: int | RandomState | None = None,
    warm_start: bool = False,
    verbose: int = 0,
    verbose_interval: int = 10
)
```
Variational Bayesian estimation of a Gaussian mixture.
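
In this model `n_components` acts as an upper bound, and the fitted `weights_` show how much mass each component actually receives. The sketch below illustrates this under assumed synthetic two-cluster data; the bound of 6 and the raised `max_iter` are arbitrary choices.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

# Assumed toy data: two 2-D Gaussian clusters.
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(loc=-4.0, scale=1.0, size=(150, 2)),
    rng.normal(loc=4.0, scale=1.0, size=(150, 2)),
])

# Deliberately allow more components than the data needs.
bgm = BayesianGaussianMixture(
    n_components=6,
    weight_concentration_prior_type="dirichlet_process",
    max_iter=500,
    random_state=0,
).fit(X)

labels = bgm.predict(X)
print(np.round(bgm.weights_, 3))  # surplus components typically get small weights
```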

## Covariance Estimation

#### EmpiricalCovariance { .api }
```python
from sklearn.covariance import EmpiricalCovariance

EmpiricalCovariance(
    store_precision: bool = True,
    assume_centered: bool = False
)
```
Maximum likelihood covariance estimator.

#### ShrunkCovariance { .api }
```python
from sklearn.covariance import ShrunkCovariance

ShrunkCovariance(
    store_precision: bool = True,
    assume_centered: bool = False,
    shrinkage: float = 0.1
)
```
Covariance estimator with shrinkage.

#### LedoitWolf { .api }
```python
from sklearn.covariance import LedoitWolf

LedoitWolf(
    store_precision: bool = True,
    assume_centered: bool = False,
    block_size: int = 1000
)
```
Ledoit-Wolf estimator.

#### OAS { .api }
```python
from sklearn.covariance import OAS

OAS(
    store_precision: bool = True,
    assume_centered: bool = False
)
```
Oracle Approximating Shrinkage Estimator.

#### MinCovDet { .api }
```python
from sklearn.covariance import MinCovDet

MinCovDet(
    store_precision: bool = True,
    assume_centered: bool = False,
    support_fraction: float | None = None,
    random_state: int | RandomState | None = None
)
```
Minimum Covariance Determinant (Robust covariance estimation).
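
To show what "robust" means here, the sketch below contaminates assumed synthetic data with shifted rows and compares the location estimated by `MinCovDet` against `EmpiricalCovariance`. The contamination level is an illustrative assumption.

```python
import numpy as np
from sklearn.covariance import EmpiricalCovariance, MinCovDet

# Assumed toy data: standard normal samples, with 10% of the rows shifted far away.
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 2))
X[:20] += 10.0

emp = EmpiricalCovariance().fit(X)
mcd = MinCovDet(random_state=0).fit(X)

print(emp.location_)       # pulled toward the shifted rows
print(mcd.location_)       # stays close to the origin
print(mcd.support_.sum())  # number of samples used for the robust estimate
```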

#### GraphicalLasso { .api }
```python
from sklearn.covariance import GraphicalLasso

GraphicalLasso(
    alpha: float = 0.01,
    mode: str = "cd",
    tol: float = 0.0001,
    enet_tol: float = 0.0001,
    max_iter: int = 100,
    verbose: bool = False,
    assume_centered: bool = False
)
```
Sparse inverse covariance estimation with an l1-penalized estimator.
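
A minimal sketch of recovering a sparse precision matrix follows. The chain-structured ground truth and the `alpha` value are assumptions chosen for illustration, not recommended settings.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# Assumed ground truth: a tridiagonal (chain-structured) precision matrix.
n_features = 5
precision_true = np.eye(n_features) + 0.4 * (
    np.eye(n_features, k=1) + np.eye(n_features, k=-1)
)
covariance_true = np.linalg.inv(precision_true)

# Draw samples from the corresponding zero-mean Gaussian.
rng = np.random.RandomState(0)
X = rng.multivariate_normal(np.zeros(n_features), covariance_true, size=500)

# Larger alpha values push more off-diagonal precision entries to zero.
model = GraphicalLasso(alpha=0.05).fit(X)

print(np.round(model.precision_, 2))  # estimated sparse inverse covariance
```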

#### GraphicalLassoCV { .api }
```python
from sklearn.covariance import GraphicalLassoCV

GraphicalLassoCV(
    alphas: int | ArrayLike = 4,
    n_refinements: int = 4,
    cv: int | BaseCrossValidator | Iterable | None = None,
    tol: float = 0.0001,
    enet_tol: float = 0.0001,
    max_iter: int = 100,
    mode: str = "cd",
    n_jobs: int | None = None,
    verbose: bool = False,
    assume_centered: bool = False
)
```
Sparse inverse covariance with cross-validated choice of the l1 penalty.

#### EllipticEnvelope { .api }
```python
from sklearn.covariance import EllipticEnvelope

EllipticEnvelope(
    store_precision: bool = True,
    assume_centered: bool = False,
    support_fraction: float | None = None,
    contamination: float = 0.1,
    random_state: int | RandomState | None = None
)
```
An object for detecting outliers in a Gaussian distributed dataset.
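
As an illustration of the prediction convention (`1` for inliers, `-1` for outliers), the sketch below plants a few far-away points in assumed Gaussian data. The `contamination` value is set to the planted fraction purely for the example.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope

# Assumed toy data: 190 standard normal samples plus 10 planted outliers.
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 2))
X[:10] = rng.uniform(low=6.0, high=8.0, size=(10, 2))

# contamination is the expected proportion of outliers in the data.
detector = EllipticEnvelope(contamination=0.05, random_state=0)
labels = detector.fit_predict(X)

print(labels[:10])            # labels of the planted outliers
print((labels == -1).sum())   # number of samples flagged as outliers
```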

### Covariance Functions

#### empirical_covariance { .api }
```python
from sklearn.covariance import empirical_covariance

empirical_covariance(
    X: ArrayLike,
    assume_centered: bool = False
) -> ArrayLike
```
Compute the Maximum likelihood covariance estimator.

#### shrunk_covariance { .api }
```python
from sklearn.covariance import shrunk_covariance

shrunk_covariance(
    emp_cov: ArrayLike,
    shrinkage: float = 0.1
) -> ArrayLike
```
Calculate a covariance matrix shrunk on the diagonal.
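
The shrinkage is a convex combination of the empirical covariance and a scaled identity, `(1 - shrinkage) * emp_cov + shrinkage * mu * I` with `mu = trace(emp_cov) / n_features`. The sketch below checks this against the function on assumed random data.

```python
import numpy as np
from sklearn.covariance import empirical_covariance, shrunk_covariance

# Assumed toy data: 50 samples with 4 features.
rng = np.random.RandomState(0)
X = rng.normal(size=(50, 4))

emp_cov = empirical_covariance(X)
shrunk = shrunk_covariance(emp_cov, shrinkage=0.1)

# Manual computation of the same convex combination.
n_features = emp_cov.shape[0]
mu = np.trace(emp_cov) / n_features
manual = (1 - 0.1) * emp_cov + 0.1 * mu * np.eye(n_features)

print(np.allclose(shrunk, manual))  # True
```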

#### ledoit_wolf { .api }
```python
from sklearn.covariance import ledoit_wolf

ledoit_wolf(
    X: ArrayLike,
    assume_centered: bool = False,
    block_size: int = 1000
) -> tuple[ArrayLike, float]
```
Estimate covariance with the Ledoit-Wolf estimator.
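
The function form returns the shrunk covariance together with the shrinkage coefficient. The sketch below, on assumed random data with few samples relative to features, also compares it with the `LedoitWolf` estimator class.

```python
import numpy as np
from sklearn.covariance import LedoitWolf, ledoit_wolf

# Assumed toy data: only 40 samples for 10 features, where shrinkage helps most.
rng = np.random.RandomState(0)
X = rng.normal(size=(40, 10))

# Function API: returns (shrunk covariance, shrinkage coefficient).
shrunk_cov, shrinkage = ledoit_wolf(X)

# Estimator API: the same quantities are exposed as fitted attributes.
estimator = LedoitWolf().fit(X)

print(shrinkage)  # coefficient in [0, 1]
print(np.allclose(shrunk_cov, estimator.covariance_))  # True
```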

#### ledoit_wolf_shrinkage { .api }
```python
from sklearn.covariance import ledoit_wolf_shrinkage

ledoit_wolf_shrinkage(
    X: ArrayLike,
    assume_centered: bool = False,
    block_size: int = 1000
) -> float
```
Calculate the Ledoit-Wolf shrinkage coefficient.

#### oas { .api }
```python
from sklearn.covariance import oas

oas(
    X: ArrayLike,
    assume_centered: bool = False
) -> tuple[ArrayLike, float]
```
Estimate covariance with the Oracle Approximating Shrinkage algorithm.

#### fast_mcd { .api }
```python
from sklearn.covariance import fast_mcd

fast_mcd(
    X: ArrayLike,
    support_fraction: float | None = None,
    cov_computation_method: Callable = ...,
    random_state: int | RandomState | None = None
) -> tuple[ArrayLike, ArrayLike, ArrayLike, ArrayLike]
```
Estimates the Minimum Covariance Determinant matrix.

#### graphical_lasso { .api }
```python
from sklearn.covariance import graphical_lasso

graphical_lasso(
    emp_cov: ArrayLike,
    alpha: float,
    mode: str = "cd",
    tol: float = 0.0001,
    enet_tol: float = 0.0001,
    max_iter: int = 100,
    verbose: bool = False,
    return_costs: bool = False,
    eps: float = ...,
    return_n_iter: bool = False
) -> tuple[ArrayLike, ArrayLike] | tuple[ArrayLike, ArrayLike, list] | tuple[ArrayLike, ArrayLike, int] | tuple[ArrayLike, ArrayLike, list, int]
```
L1-penalized covariance estimator.

#### log_likelihood { .api }
```python
from sklearn.covariance import log_likelihood

log_likelihood(
    emp_cov: ArrayLike,
    precision: ArrayLike
) -> float
```
Compute the sample mean of the log-likelihood under a covariance model.

## Cross Decomposition

#### CCA { .api }
```python
from sklearn.cross_decomposition import CCA

CCA(
    n_components: int = 2,
    scale: bool = True,
    max_iter: int = 500,
    tol: float = 1e-06,
    copy: bool = True
)
```
Canonical Correlation Analysis.

#### PLSCanonical { .api }
```python
from sklearn.cross_decomposition import PLSCanonical

PLSCanonical(
    n_components: int = 2,
    scale: bool = True,
    algorithm: str = "nipals",
    max_iter: int = 500,
    tol: float = 1e-06,
    copy: bool = True
)
```
Partial Least Squares transformer and regressor.

#### PLSRegression { .api }
```python
from sklearn.cross_decomposition import PLSRegression

PLSRegression(
    n_components: int = 2,
    scale: bool = True,
    max_iter: int = 500,
    tol: float = 1e-06,
    copy: bool = True
)
```
PLS regression.

#### PLSSVD { .api }
```python
from sklearn.cross_decomposition import PLSSVD

PLSSVD(
    n_components: int = 2,
    scale: bool = True,
    copy: bool = True
)
```
Partial Least Squares SVD.

## Outlier Detection

`LocalOutlierFactor` is provided by the neighbors module; detectors from other modules are listed in the note that closes this section.

#### LocalOutlierFactor { .api }
```python
from sklearn.neighbors import LocalOutlierFactor

LocalOutlierFactor(
    n_neighbors: int = 20,
    algorithm: str = "auto",
    leaf_size: int = 30,
    metric: str | Callable = "minkowski",
    p: int = 2,
    metric_params: dict | None = None,
    contamination: float | str = "auto",
    novelty: bool = False,
    n_jobs: int | None = None
)
```
Unsupervised Outlier Detection using Local Outlier Factor (LOF).
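
A minimal sketch of outlier detection with the default `novelty=False` setting follows, assuming a synthetic cluster with two planted anomalies (the data is made up for the example).

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

# Assumed toy data: one Gaussian cluster plus two isolated points.
rng = np.random.RandomState(0)
inliers = rng.normal(size=(100, 2))
outliers = np.array([[8.0, 8.0], [-8.0, 8.0]])  # hypothetical anomalies
X = np.vstack([inliers, outliers])

lof = LocalOutlierFactor(n_neighbors=20)
labels = lof.fit_predict(X)              # 1 for inliers, -1 for outliers
scores = lof.negative_outlier_factor_    # more negative means more abnormal

print(labels[-2:])  # labels of the two planted points
print(scores[-2:])
```

With `novelty=False`, scoring is only available for the training data through `fit_predict`; setting `novelty=True` enables `predict` on unseen samples instead.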

Note: Additional outlier detection methods are available in:
- `sklearn.ensemble.IsolationForest` - Isolation Forest Algorithm
- `sklearn.svm.OneClassSVM` - One-Class Support Vector Machine
- `sklearn.covariance.EllipticEnvelope` - Outlier detection for Gaussian data