
tessl/pypi-scikit-learn

A comprehensive machine learning library providing supervised and unsupervised learning algorithms with consistent APIs and extensive tools for data preprocessing, model evaluation, and deployment.


docs/unsupervised-learning.md

Unsupervised Learning

This document covers the unsupervised learning APIs in scikit-learn: clustering, dimensionality reduction, manifold learning, mixture models, and covariance estimation.

Clustering

Core Clustering Algorithms

KMeans { .api }

from sklearn.cluster import KMeans

KMeans(
    n_clusters: int = 8,
    init: str | ArrayLike | Callable = "k-means++",
    n_init: int | str = "auto",
    max_iter: int = 300,
    tol: float = 0.0001,
    verbose: int = 0,
    random_state: int | RandomState | None = None,
    copy_x: bool = True,
    algorithm: str = "lloyd"
)

K-Means clustering.
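
A minimal usage sketch with a toy dataset invented for illustration (two well-separated blobs):

```python
import numpy as np
from sklearn.cluster import KMeans

# Two well-separated groups of points (hypothetical data).
X = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
              [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
labels = km.labels_             # cluster index per sample
centers = km.cluster_centers_   # (2, 2) array of centroids
```

After fitting, `predict` assigns new points to the nearest centroid, and `inertia_` holds the within-cluster sum of squared distances.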

MiniBatchKMeans { .api }

from sklearn.cluster import MiniBatchKMeans

MiniBatchKMeans(
    n_clusters: int = 8,
    init: str | ArrayLike | Callable = "k-means++",
    max_iter: int = 100,
    batch_size: int = 1024,
    verbose: int = 0,
    compute_labels: bool = True,
    random_state: int | RandomState | None = None,
    tol: float = 0.0,
    max_no_improvement: int = 10,
    init_size: int | None = None,
    n_init: int | str = 3,
    reassignment_ratio: float = 0.01
)

Mini-Batch K-Means clustering.

BisectingKMeans { .api }

from sklearn.cluster import BisectingKMeans

BisectingKMeans(
    n_clusters: int = 8,
    init: str | Callable = "random",
    n_init: int = 1,
    random_state: int | RandomState | None = None,
    max_iter: int = 300,
    verbose: int = 0,
    tol: float = 0.0001,
    copy_x: bool = True,
    algorithm: str = "lloyd",
    bisecting_strategy: str = "biggest_inertia"
)

Bisecting K-Means clustering.

DBSCAN { .api }

from sklearn.cluster import DBSCAN

DBSCAN(
    eps: float = 0.5,
    min_samples: int = 5,
    metric: str | Callable = "euclidean",
    metric_params: dict | None = None,
    algorithm: str = "auto",
    leaf_size: int = 30,
    p: float | None = None,
    n_jobs: int | None = None
)

Perform DBSCAN clustering from vector array or distance matrix.
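
A minimal sketch showing density-based clustering and noise labeling, on toy data invented for illustration:

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Two dense groups plus one isolated outlier (hypothetical data).
X = np.array([[0.0, 0.0], [0.1, 0.1], [0.2, 0.0],
              [5.0, 5.0], [5.1, 5.1], [5.0, 5.2],
              [10.0, 0.0]])

db = DBSCAN(eps=0.5, min_samples=2).fit(X)
labels = db.labels_   # noise points are labeled -1
```

Unlike KMeans, the number of clusters is not specified up front; it emerges from `eps` and `min_samples`.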

HDBSCAN { .api }

from sklearn.cluster import HDBSCAN

HDBSCAN(
    min_cluster_size: int = 5,
    min_samples: int | None = None,
    cluster_selection_epsilon: float = 0.0,
    max_cluster_size: int | None = None,
    metric: str | Callable = "euclidean",
    metric_params: dict | None = None,
    alpha: float = 1.0,
    algorithm: str = "auto",
    leaf_size: int = 40,
    n_jobs: int | None = None,
    cluster_selection_method: str = "eom",
    allow_single_cluster: bool = False,
    store_centers: str | None = None,
    copy: bool = True
)

Perform HDBSCAN clustering from vector array or distance matrix.

OPTICS { .api }

from sklearn.cluster import OPTICS

OPTICS(
    min_samples: int = 5,
    max_eps: float = ...,
    metric: str | Callable = "minkowski",
    p: int = 2,
    metric_params: dict | None = None,
    cluster_method: str = "xi",
    eps: float | None = None,
    xi: float = 0.05,
    predecessor_correction: bool = True,
    min_cluster_size: int | float | None = None,
    algorithm: str = "auto",
    leaf_size: int = 30,
    memory: str | object | None = None,
    n_jobs: int | None = None
)

Estimate clustering structure from vector array.

MeanShift { .api }

from sklearn.cluster import MeanShift

MeanShift(
    bandwidth: float | None = None,
    seeds: ArrayLike | None = None,
    bin_seeding: bool = False,
    min_bin_freq: int = 1,
    cluster_all: bool = True,
    n_jobs: int | None = None,
    max_iter: int = 300
)

Mean shift clustering using a flat kernel.

AgglomerativeClustering { .api }

from sklearn.cluster import AgglomerativeClustering

AgglomerativeClustering(
    n_clusters: int | None = 2,
    metric: str | Callable | None = None,
    memory: str | object | None = None,
    connectivity: ArrayLike | Callable | None = None,
    compute_full_tree: bool | str = "auto",
    linkage: str = "ward",
    distance_threshold: float | None = None,
    compute_distances: bool = False
)

Agglomerative Clustering.
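
A minimal sketch of hierarchical clustering with Ward linkage, on toy data invented for illustration:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Two obvious pairs of points (hypothetical data).
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])

agg = AgglomerativeClustering(n_clusters=2, linkage="ward").fit(X)
labels = agg.labels_
```

Setting `distance_threshold` (with `n_clusters=None`) instead cuts the dendrogram at a fixed merge distance.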

FeatureAgglomeration { .api }

from sklearn.cluster import FeatureAgglomeration

FeatureAgglomeration(
    n_clusters: int | None = 2,
    metric: str | Callable | None = None,
    memory: str | object | None = None,
    connectivity: ArrayLike | Callable | None = None,
    compute_full_tree: bool | str = "auto",
    linkage: str = "ward",
    pooling_func: Callable = ...,
    distance_threshold: float | None = None,
    compute_distances: bool = False
)

Agglomerate features.

Birch { .api }

from sklearn.cluster import Birch

Birch(
    n_clusters: int | None = 3,
    threshold: float = 0.5,
    branching_factor: int = 50,
    compute_labels: bool = True,
    copy: bool = True
)

Implements the BIRCH clustering algorithm.

AffinityPropagation { .api }

from sklearn.cluster import AffinityPropagation

AffinityPropagation(
    damping: float = 0.5,
    max_iter: int = 200,
    convergence_iter: int = 15,
    copy: bool = True,
    preference: ArrayLike | float | None = None,
    affinity: str = "euclidean",
    verbose: bool = False,
    random_state: int | RandomState | None = None
)

Perform Affinity Propagation Clustering of data.

SpectralClustering { .api }

from sklearn.cluster import SpectralClustering

SpectralClustering(
    n_clusters: int = 8,
    eigen_solver: str | None = None,
    n_components: int | None = None,
    random_state: int | RandomState | None = None,
    n_init: int = 10,
    gamma: float = 1.0,
    affinity: str | Callable = "rbf",
    n_neighbors: int = 10,
    eigen_tol: float | str = "auto",
    assign_labels: str = "kmeans",
    degree: float = 3,
    coef0: float = 1,
    kernel_params: dict | None = None,
    n_jobs: int | None = None,
    verbose: bool = False
)

Apply clustering to a projection of the normalized Laplacian.

SpectralBiclustering { .api }

from sklearn.cluster import SpectralBiclustering

SpectralBiclustering(
    n_clusters: int | tuple = 3,
    method: str = "bistochastic",
    n_components: int = 6,
    n_best: int = 3,
    svd_method: str = "randomized",
    n_svd_vecs: int | None = None,
    mini_batch: bool = False,
    init: str | ArrayLike = "k-means++",
    n_init: int = 10,
    random_state: int | RandomState | None = None
)

Spectral biclustering (Kluger, 2003).

SpectralCoclustering { .api }

from sklearn.cluster import SpectralCoclustering

SpectralCoclustering(
    n_clusters: int = 3,
    svd_method: str = "randomized",
    n_svd_vecs: int | None = None,
    mini_batch: bool = False,
    init: str | ArrayLike = "k-means++",
    n_init: int = 10,
    random_state: int | RandomState | None = None
)

Spectral Co-Clustering algorithm (Dhillon, 2001).

Clustering Functions

k_means { .api }

from sklearn.cluster import k_means

k_means(
    X: ArrayLike,
    n_clusters: int,
    sample_weight: ArrayLike | None = None,
    init: str | ArrayLike | Callable = "k-means++",
    n_init: int | str = 10,
    max_iter: int = 300,
    verbose: bool = False,
    tol: float = 0.0001,
    random_state: int | RandomState | None = None,
    copy_x: bool = True,
    algorithm: str = "lloyd",
    return_n_iter: bool = False
) -> tuple[ArrayLike, ArrayLike, float, int] | tuple[ArrayLike, ArrayLike, float]

K-means clustering algorithm.

kmeans_plusplus { .api }

from sklearn.cluster import kmeans_plusplus

kmeans_plusplus(
    X: ArrayLike,
    n_clusters: int,
    x_squared_norms: ArrayLike | None = None,
    random_state: int | RandomState | None = None,
    n_local_trials: int | None = None
) -> tuple[ArrayLike, ArrayLike]

Init n_clusters seeds according to k-means++.
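
A minimal sketch: the seeds returned are actual rows of `X`, selected to be spread out (random data for illustration):

```python
import numpy as np
from sklearn.cluster import kmeans_plusplus

rng = np.random.RandomState(0)
X = rng.randn(100, 2)

# centers: (4, 2) seed points drawn from X; indices: their row positions
centers, indices = kmeans_plusplus(X, n_clusters=4, random_state=0)
```

This is useful when you want k-means++-style initialization for a custom clustering loop.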

dbscan { .api }

from sklearn.cluster import dbscan

dbscan(
    X: ArrayLike,
    eps: float = 0.5,
    min_samples: int = 5,
    metric: str | Callable = "euclidean",
    metric_params: dict | None = None,
    algorithm: str = "auto",
    leaf_size: int = 30,
    p: float | None = None,
    sample_weight: ArrayLike | None = None,
    n_jobs: int | None = None
) -> tuple[ArrayLike, ArrayLike]

Perform DBSCAN clustering from vector array or distance matrix.

affinity_propagation { .api }

from sklearn.cluster import affinity_propagation

affinity_propagation(
    S: ArrayLike,
    preference: ArrayLike | float | None = None,
    convergence_iter: int = 15,
    max_iter: int = 200,
    damping: float = 0.5,
    copy: bool = True,
    verbose: bool = False,
    return_n_iter: bool = False,
    random_state: int | RandomState | None = None
) -> tuple[ArrayLike, ArrayLike, int] | tuple[ArrayLike, ArrayLike]

Perform Affinity Propagation Clustering of data.

spectral_clustering { .api }

from sklearn.cluster import spectral_clustering

spectral_clustering(
    affinity: ArrayLike,
    n_clusters: int = 8,
    n_components: int | None = None,
    eigen_solver: str | None = None,
    random_state: int | RandomState | None = None,
    n_init: int = 10,
    eigen_tol: float | str = "auto",
    assign_labels: str = "kmeans",
    verbose: bool = False
) -> ArrayLike

Apply clustering to a projection of the normalized Laplacian.

mean_shift { .api }

from sklearn.cluster import mean_shift

mean_shift(
    X: ArrayLike,
    bandwidth: float | None = None,
    seeds: ArrayLike | None = None,
    bin_seeding: bool = False,
    min_bin_freq: int = 1,
    cluster_all: bool = True,
    max_iter: int = 300,
    n_jobs: int | None = None
) -> tuple[ArrayLike, ArrayLike]

Perform mean shift clustering of data using a flat kernel.

estimate_bandwidth { .api }

from sklearn.cluster import estimate_bandwidth

estimate_bandwidth(
    X: ArrayLike,
    quantile: float = 0.3,
    n_samples: int | None = None,
    random_state: int | RandomState | None = None,
    n_jobs: int | None = None
) -> float

Estimate the bandwidth to use with the mean-shift algorithm.

ward_tree { .api }

from sklearn.cluster import ward_tree

ward_tree(
    X: ArrayLike,
    connectivity: ArrayLike | None = None,
    n_clusters: int | None = None,
    return_distance: bool = False
) -> tuple[ArrayLike, int, int, ArrayLike, ArrayLike] | tuple[ArrayLike, int, int, ArrayLike]

Ward clustering based on a Feature matrix.

linkage_tree { .api }

from sklearn.cluster import linkage_tree

linkage_tree(
    X: ArrayLike,
    connectivity: ArrayLike | None = None,
    n_clusters: int | None = None,
    linkage: str = "complete",
    affinity: str = "euclidean",
    return_distance: bool = False
) -> tuple[ArrayLike, int, int, ArrayLike, ArrayLike] | tuple[ArrayLike, int, int, ArrayLike]

Linkage agglomerative clustering based on a Feature matrix.

get_bin_seeds { .api }

from sklearn.cluster import get_bin_seeds

get_bin_seeds(
    X: ArrayLike,
    bin_size: float,
    min_bin_freq: int = 1
) -> ArrayLike

Find seeds for mean_shift.

cluster_optics_dbscan { .api }

from sklearn.cluster import cluster_optics_dbscan

cluster_optics_dbscan(
    reachability: ArrayLike,
    core_distances: ArrayLike,
    ordering: ArrayLike,
    eps: float
) -> ArrayLike

Perform DBSCAN extraction for an arbitrary epsilon.

cluster_optics_xi { .api }

from sklearn.cluster import cluster_optics_xi

cluster_optics_xi(
    reachability: ArrayLike,
    predecessor: ArrayLike,
    ordering: ArrayLike,
    min_samples: int,
    min_cluster_size: int | float | None = None,
    xi: float = 0.05,
    predecessor_correction: bool = True
) -> tuple[ArrayLike, ArrayLike]

Automatically extract clusters according to the Xi-steep method.

compute_optics_graph { .api }

from sklearn.cluster import compute_optics_graph

compute_optics_graph(
    X: ArrayLike,
    min_samples: int,
    max_eps: float,
    metric: str | Callable,
    p: int,
    metric_params: dict | None,
    algorithm: str,
    leaf_size: int,
    n_jobs: int | None
) -> tuple[ArrayLike, ArrayLike, ArrayLike, ArrayLike]

Compute the OPTICS reachability graph.

Dimensionality Reduction

Principal Component Analysis

PCA { .api }

from sklearn.decomposition import PCA

PCA(
    n_components: int | float | str | None = None,
    copy: bool = True,
    whiten: bool = False,
    svd_solver: str = "auto",
    tol: float = 0.0,
    iterated_power: int | str = "auto",
    n_oversamples: int = 10,
    power_iteration_normalizer: str = "auto",
    random_state: int | RandomState | None = None
)

Principal component analysis (PCA).
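
A minimal sketch projecting 2-D data with one dominant direction of variance onto a single component (synthetic data for illustration):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
# Stretch the x-axis so most variance lies along one direction.
X = rng.randn(200, 2) @ np.array([[3.0, 0.0], [0.0, 0.3]])

pca = PCA(n_components=1).fit(X)
X_reduced = pca.transform(X)                 # (200, 1)
ratio = pca.explained_variance_ratio_[0]     # fraction of variance kept
```

`inverse_transform` maps the reduced representation back into the original space, which is handy for reconstruction-error checks.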

IncrementalPCA { .api }

from sklearn.decomposition import IncrementalPCA

IncrementalPCA(
    n_components: int | None = None,
    whiten: bool = False,
    copy: bool = True,
    batch_size: int | None = None
)

Incremental principal components analysis (IPCA).

KernelPCA { .api }

from sklearn.decomposition import KernelPCA

KernelPCA(
    n_components: int | None = None,
    kernel: str | Callable = "linear",
    gamma: float | None = None,
    degree: int = 3,
    coef0: float = 1,
    kernel_params: dict | None = None,
    alpha: float = 1.0,
    fit_inverse_transform: bool = False,
    eigen_solver: str = "auto",
    tol: float = 0,
    max_iter: int | None = None,
    iterated_power: int | str = "auto",
    remove_zero_eig: bool = False,
    random_state: int | RandomState | None = None,
    copy_X: bool = True,
    n_jobs: int | None = None
)

Kernel Principal component analysis (KPCA).

SparsePCA { .api }

from sklearn.decomposition import SparsePCA

SparsePCA(
    n_components: int | None = None,
    alpha: float = 1,
    ridge_alpha: float = 0.01,
    max_iter: int = 1000,
    tol: float = 1e-08,
    method: str = "lars",
    n_jobs: int | None = None,
    U_init: ArrayLike | None = None,
    V_init: ArrayLike | None = None,
    verbose: bool | int = False,
    random_state: int | RandomState | None = None
)

Sparse Principal Components Analysis (SparsePCA).

MiniBatchSparsePCA { .api }

from sklearn.decomposition import MiniBatchSparsePCA

MiniBatchSparsePCA(
    n_components: int | None = None,
    alpha: float = 1,
    ridge_alpha: float = 0.01,
    n_iter: int = 100,
    callback: Callable | None = None,
    batch_size: int = 3,
    verbose: bool | int = False,
    shuffle: bool = True,
    n_jobs: int | None = None,
    method: str = "lars",
    random_state: int | RandomState | None = None
)

Mini-batch Sparse Principal Components Analysis.

TruncatedSVD { .api }

from sklearn.decomposition import TruncatedSVD

TruncatedSVD(
    n_components: int = 2,
    algorithm: str = "randomized",
    n_iter: int = 5,
    n_oversamples: int = 10,
    power_iteration_normalizer: str = "auto",
    random_state: int | RandomState | None = None,
    tol: float = 0.0
)

Dimensionality reduction using truncated SVD (aka LSA).
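
Unlike PCA, TruncatedSVD works directly on sparse matrices without centering, which is why it is the usual choice for LSA on term-document matrices. A minimal sketch on a random sparse matrix (synthetic data for illustration):

```python
from scipy.sparse import random as sparse_random
from sklearn.decomposition import TruncatedSVD

X = sparse_random(100, 50, density=0.05, random_state=0)

svd = TruncatedSVD(n_components=5, random_state=0).fit(X)
X_reduced = svd.transform(X)   # dense (100, 5) array
```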

Independent Component Analysis

FastICA { .api }

from sklearn.decomposition import FastICA

FastICA(
    n_components: int | None = None,
    algorithm: str = "parallel",
    whiten: str | bool = "unit-variance",
    fun: str | Callable = "logcosh",
    fun_args: dict | None = None,
    max_iter: int = 200,
    tol: float = 0.0001,
    w_init: ArrayLike | None = None,
    whiten_solver: str = "svd",
    random_state: int | RandomState | None = None
)

FastICA: a fast algorithm for Independent Component Analysis.
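
A minimal blind source separation sketch: two synthetic signals are mixed linearly, and FastICA recovers them up to sign and ordering (all data invented for illustration):

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.RandomState(0)
t = np.linspace(0, 8, 2000)
s1 = np.sin(2 * t)                      # sinusoidal source
s2 = np.sign(np.sin(3 * t))             # square-wave source
S = np.c_[s1, s2] + 0.02 * rng.randn(2000, 2)

A = np.array([[1.0, 1.0], [0.5, 2.0]])  # mixing matrix
X = S @ A.T                             # observed mixtures

ica = FastICA(n_components=2, whiten="unit-variance", random_state=0)
S_est = ica.fit_transform(X)            # estimated sources
```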

Factor Analysis

FactorAnalysis { .api }

from sklearn.decomposition import FactorAnalysis

FactorAnalysis(
    n_components: int | None = None,
    tol: float = 0.01,
    copy: bool = True,
    max_iter: int = 1000,
    noise_variance_init: ArrayLike | None = None,
    svd_method: str = "randomized",
    iterated_power: int = 3,
    rotation: str | None = None,
    random_state: int | RandomState | None = None
)

Factor Analysis (FA).

Dictionary Learning

DictionaryLearning { .api }

from sklearn.decomposition import DictionaryLearning

DictionaryLearning(
    n_components: int | None = None,
    alpha: float = 1,
    max_iter: int = 1000,
    tol: float = 1e-08,
    fit_algorithm: str = "lars",
    transform_algorithm: str = "omp",
    transform_n_nonzero_coefs: int | None = None,
    transform_alpha: float | None = None,
    n_jobs: int | None = None,
    code_init: ArrayLike | None = None,
    dict_init: ArrayLike | None = None,
    verbose: bool = False,
    split_sign: bool = False,
    random_state: int | RandomState | None = None,
    positive_code: bool = False,
    positive_dict: bool = False,
    transform_max_iter: int = 1000
)

Dictionary learning.

MiniBatchDictionaryLearning { .api }

from sklearn.decomposition import MiniBatchDictionaryLearning

MiniBatchDictionaryLearning(
    n_components: int | None = None,
    alpha: float = 1,
    max_iter: int = 1000,
    fit_algorithm: str = "lars",
    n_jobs: int | None = None,
    batch_size: int = 256,
    shuffle: bool = True,
    dict_init: ArrayLike | None = None,
    transform_algorithm: str = "omp",
    transform_n_nonzero_coefs: int | None = None,
    transform_alpha: float | None = None,
    verbose: bool = False,
    split_sign: bool = False,
    random_state: int | RandomState | None = None,
    positive_code: bool = False,
    positive_dict: bool = False,
    transform_max_iter: int = 1000
)

Mini-batch dictionary learning.

SparseCoder { .api }

from sklearn.decomposition import SparseCoder

SparseCoder(
    dictionary: ArrayLike,
    transform_algorithm: str = "omp",
    transform_n_nonzero_coefs: int | None = None,
    transform_alpha: float | None = None,
    split_sign: bool = False,
    n_jobs: int | None = None,
    positive_code: bool = False,
    transform_max_iter: int = 1000
)

Sparse coding.

Non-negative Matrix Factorization

NMF { .api }

from sklearn.decomposition import NMF

NMF(
    n_components: int | None = None,
    init: str | ArrayLike | None = None,
    solver: str = "cd",
    beta_loss: float | str = "frobenius",
    tol: float = 0.0001,
    max_iter: int = 200,
    random_state: int | RandomState | None = None,
    alpha_W: float = 0.0,
    alpha_H: float | str = "same",
    l1_ratio: float = 0.0,
    verbose: int = 0,
    shuffle: bool = False
)

Non-negative Matrix Factorization (NMF).
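
A minimal sketch factoring a non-negative matrix X into non-negative factors W and H (random data for illustration):

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.RandomState(0)
X = np.abs(rng.randn(20, 10))   # NMF requires non-negative input

nmf = NMF(n_components=3, init="random", random_state=0, max_iter=500)
W = nmf.fit_transform(X)        # (20, 3) sample loadings
H = nmf.components_             # (3, 10) component basis
```

`W @ H` is a low-rank non-negative approximation of `X`; `reconstruction_err_` reports the fit quality under the chosen `beta_loss`.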

MiniBatchNMF { .api }

from sklearn.decomposition import MiniBatchNMF

MiniBatchNMF(
    n_components: int | None = None,
    init: str | ArrayLike | None = None,
    batch_size: int = 1024,
    beta_loss: float | str = "frobenius",
    tol: float = 0.0001,
    max_no_improvement: int = 10,
    max_iter: int = 200,
    alpha_W: float = 0.0,
    alpha_H: float | str = "same",
    l1_ratio: float = 0.0,
    forget_factor: float = 0.7,
    fresh_restarts: bool = False,
    fresh_restarts_max_iter: int = 30,
    transform_max_iter: int | None = None,
    random_state: int | RandomState | None = None,
    verbose: int = 0
)

Mini-Batch Non-Negative Matrix Factorization (NMF).

Latent Dirichlet Allocation

LatentDirichletAllocation { .api }

from sklearn.decomposition import LatentDirichletAllocation

LatentDirichletAllocation(
    n_components: int = 10,
    doc_topic_prior: float | None = None,
    topic_word_prior: float | None = None,
    learning_method: str = "batch",
    learning_decay: float = 0.7,
    learning_offset: float = 10.0,
    max_iter: int = 10,
    batch_size: int = 128,
    evaluate_every: int = -1,
    total_samples: int = 1000000,
    perp_tol: float = 0.1,
    mean_change_tol: float = 0.001,
    max_doc_update_iter: int = 100,
    n_jobs: int | None = None,
    verbose: int = 0,
    random_state: int | RandomState | None = None
)

Latent Dirichlet Allocation with online variational Bayes algorithm.

Decomposition Functions

randomized_svd { .api }

from sklearn.decomposition import randomized_svd

randomized_svd(
    M: ArrayLike,
    n_components: int,
    n_oversamples: int = 10,
    n_iter: int | str = "auto",
    power_iteration_normalizer: str = "auto",
    transpose: bool | str = "auto",
    flip_sign: bool = True,
    random_state: int | RandomState | None = None,
    svd_lapack_driver: str = "gesdd"
) -> tuple[ArrayLike, ArrayLike, ArrayLike]

Compute a truncated randomized SVD.

fastica { .api }

from sklearn.decomposition import fastica

fastica(
    X: ArrayLike,
    n_components: int | None = None,
    algorithm: str = "parallel",
    whiten: str | bool = "unit-variance",
    fun: str | Callable = "logcosh",
    fun_args: dict | None = None,
    max_iter: int = 200,
    tol: float = 0.0001,
    w_init: ArrayLike | None = None,
    whiten_solver: str = "svd",
    random_state: int | RandomState | None = None,
    return_X_mean: bool = False,
    compute_sources: bool = True,
    return_n_iter: bool = False
) -> tuple[ArrayLike, ArrayLike, ArrayLike] | tuple[ArrayLike, ArrayLike, ArrayLike, int] | tuple[ArrayLike, ArrayLike, ArrayLike, ArrayLike] | tuple[ArrayLike, ArrayLike, ArrayLike, ArrayLike, int]

Perform Fast Independent Component Analysis.

dict_learning { .api }

from sklearn.decomposition import dict_learning

dict_learning(
    X: ArrayLike,
    n_components: int,
    alpha: float,
    max_iter: int = 100,
    tol: float = 1e-08,
    method: str = "lars",
    n_jobs: int | None = None,
    dict_init: ArrayLike | None = None,
    code_init: ArrayLike | None = None,
    callback: Callable | None = None,
    verbose: bool = False,
    random_state: int | RandomState | None = None,
    return_n_iter: bool = False,
    positive_dict: bool = False,
    positive_code: bool = False,
    method_max_iter: int = 1000
) -> tuple[ArrayLike, ArrayLike, ArrayLike] | tuple[ArrayLike, ArrayLike, ArrayLike, int]

Solve a dictionary learning matrix factorization problem.

dict_learning_online { .api }

from sklearn.decomposition import dict_learning_online

dict_learning_online(
    X: ArrayLike,
    n_components: int = 2,
    alpha: float = 1,
    max_iter: int = 100,
    return_code: bool = True,
    dict_init: ArrayLike | None = None,
    callback: Callable | None = None,
    batch_size: int = 256,
    verbose: bool = False,
    shuffle: bool = True,
    n_jobs: int | None = None,
    method: str = "lars",
    iter_offset: int = 0,
    random_state: int | RandomState | None = None,
    return_inner_stats: bool = False,
    inner_stats: tuple | None = None,
    return_n_iter: bool = False,
    positive_dict: bool = False,
    positive_code: bool = False,
    method_max_iter: int = 1000
) -> ArrayLike | tuple[ArrayLike, ArrayLike] | tuple[ArrayLike, tuple] | tuple[ArrayLike, ArrayLike, tuple] | tuple[ArrayLike, int] | tuple[ArrayLike, ArrayLike, int] | tuple[ArrayLike, tuple, int] | tuple[ArrayLike, ArrayLike, tuple, int]

Solve a dictionary learning matrix factorization problem online.

sparse_encode { .api }

from sklearn.decomposition import sparse_encode

sparse_encode(
    X: ArrayLike,
    dictionary: ArrayLike,
    gram: ArrayLike | None = None,
    cov: ArrayLike | None = None,
    algorithm: str = "lasso_lars",
    n_nonzero_coefs: int | None = None,
    alpha: float | None = None,
    copy_cov: bool = True,
    init: ArrayLike | None = None,
    max_iter: int = 1000,
    n_jobs: int | None = None,
    check_input: bool = True,
    verbose: int = 0,
    positive: bool = False
) -> ArrayLike

Sparse coding.

non_negative_factorization { .api }

from sklearn.decomposition import non_negative_factorization

non_negative_factorization(
    X: ArrayLike,
    W: ArrayLike | None = None,
    H: ArrayLike | None = None,
    n_components: int | None = None,
    init: str | ArrayLike | None = None,
    update_H: bool = True,
    solver: str = "cd",
    beta_loss: float | str = "frobenius",
    tol: float = 0.0001,
    max_iter: int = 200,
    alpha_W: float = 0.0,
    alpha_H: float | str = "same",
    l1_ratio: float = 0.0,
    regularization: str | None = None,
    random_state: int | RandomState | None = None,
    verbose: int = 0,
    shuffle: bool = False
) -> tuple[ArrayLike, ArrayLike, int]

Compute Non-negative Matrix Factorization (NMF).

Manifold Learning

Isomap { .api }

from sklearn.manifold import Isomap

Isomap(
    n_neighbors: int = 5,
    radius: float | None = None,
    n_components: int = 2,
    eigen_solver: str = "auto",
    tol: float = 0,
    max_iter: int | None = None,
    path_method: str = "auto",
    neighbors_algorithm: str = "auto",
    n_jobs: int | None = None,
    metric: str | Callable = "minkowski",
    p: int = 2,
    metric_params: dict | None = None
)

Isomap Embedding.

LocallyLinearEmbedding { .api }

from sklearn.manifold import LocallyLinearEmbedding

LocallyLinearEmbedding(
    n_neighbors: int = 5,
    n_components: int = 2,
    reg: float = 0.001,
    eigen_solver: str = "auto",
    tol: float = 1e-06,
    max_iter: int = 100,
    method: str = "standard",
    hessian_tol: float = 0.0001,
    modified_tol: float = 1e-12,
    neighbors_algorithm: str = "auto",
    random_state: int | RandomState | None = None,
    n_jobs: int | None = None
)

Locally Linear Embedding.

MDS { .api }

from sklearn.manifold import MDS

MDS(
    n_components: int = 2,
    metric: bool = True,
    n_init: int = 4,
    max_iter: int = 300,
    verbose: int = 0,
    eps: float = 0.001,
    n_jobs: int | None = None,
    random_state: int | RandomState | None = None,
    dissimilarity: str = "euclidean",
    normalized_stress: str | bool = "auto"
)

Multidimensional scaling.

SpectralEmbedding { .api }

from sklearn.manifold import SpectralEmbedding

SpectralEmbedding(
    n_components: int = 2,
    affinity: str | Callable = "nearest_neighbors",
    gamma: float | None = None,
    random_state: int | RandomState | None = None,
    eigen_solver: str | None = None,
    n_neighbors: int | None = None,
    n_jobs: int | None = None
)

Spectral embedding for non-linear dimensionality reduction.

TSNE { .api }

from sklearn.manifold import TSNE

TSNE(
    n_components: int = 2,
    perplexity: float = 30.0,
    early_exaggeration: float = 12.0,
    learning_rate: float | str = "warn",
    n_iter: int = 1000,
    n_iter_without_progress: int = 300,
    min_grad_norm: float = 1e-07,
    metric: str | Callable = "euclidean",
    metric_params: dict | None = None,
    init: str | ArrayLike = "warn",
    verbose: int = 0,
    random_state: int | RandomState | None = None,
    method: str = "barnes_hut",
    angle: float = 0.5,
    n_jobs: int | None = None,
    square_distances: str | bool = "deprecated"
)

t-distributed Stochastic Neighbor Embedding.
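
A minimal sketch embedding two synthetic high-dimensional clusters into 2-D for visualization (data invented for illustration; note `perplexity` must be smaller than the number of samples):

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.RandomState(0)
# Two clusters of 50 points each in 10 dimensions.
X = np.vstack([rng.randn(50, 10), rng.randn(50, 10) + 8.0])

emb = TSNE(n_components=2, perplexity=15, random_state=0).fit_transform(X)
```

t-SNE is a visualization tool, not a general transform: there is no `transform` for new data, and distances in the embedding are not globally meaningful.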

Manifold Learning Functions

locally_linear_embedding { .api }

from sklearn.manifold import locally_linear_embedding

locally_linear_embedding(
    X: ArrayLike,
    n_neighbors: int,
    n_components: int,
    reg: float = 0.001,
    eigen_solver: str = "auto",
    tol: float = 1e-06,
    max_iter: int = 100,
    method: str = "standard",
    hessian_tol: float = 0.0001,
    modified_tol: float = 1e-12,
    random_state: int | RandomState | None = None,
    n_jobs: int | None = None
) -> tuple[ArrayLike, float]

Perform a Locally Linear Embedding analysis on the data.

spectral_embedding { .api }

from sklearn.manifold import spectral_embedding

spectral_embedding(
    adjacency: ArrayLike,
    n_components: int = 8,
    eigen_solver: str | None = None,
    random_state: int | RandomState | None = None,
    eigen_tol: float | str = "auto",
    norm_laplacian: bool = True,
    drop_first: bool = True
) -> ArrayLike

Project the sample on the first eigenvectors of the graph Laplacian.

smacof { .api }

from sklearn.manifold import smacof

smacof(
    dissimilarities: ArrayLike,
    metric: bool = True,
    n_components: int = 2,
    init: ArrayLike | None = None,
    n_init: int = 8,
    n_jobs: int | None = None,
    max_iter: int = 300,
    verbose: int = 0,
    eps: float = 0.001,
    random_state: int | RandomState | None = None,
    return_n_iter: bool = False,
    normalized_stress: str | bool = "auto"
) -> tuple[ArrayLike, float, int] | tuple[ArrayLike, float]

Compute multidimensional scaling using the SMACOF algorithm.

trustworthiness { .api }

from sklearn.manifold import trustworthiness

trustworthiness(
    X: ArrayLike,
    X_embedded: ArrayLike,
    n_neighbors: int = 5,
    metric: str | Callable = "euclidean"
) -> float

Indicate to what extent the local structure is retained.

Mixture Models

GaussianMixture { .api }

from sklearn.mixture import GaussianMixture

GaussianMixture(
    n_components: int = 1,
    covariance_type: str = "full",
    tol: float = 0.001,
    reg_covar: float = 1e-06,
    max_iter: int = 100,
    n_init: int = 1,
    init_params: str = "kmeans",
    weights_init: ArrayLike | None = None,
    means_init: ArrayLike | None = None,
    precisions_init: ArrayLike | None = None,
    random_state: int | RandomState | None = None,
    warm_start: bool = False,
    verbose: int = 0,
    verbose_interval: int = 10
)

Gaussian Mixture Model.
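
A minimal sketch fitting a two-component mixture and reading off both hard and soft assignments (synthetic data for illustration):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.RandomState(0)
# Two Gaussian blobs offset from each other.
X = np.vstack([rng.randn(100, 2), rng.randn(100, 2) + 6.0])

gm = GaussianMixture(n_components=2, random_state=0).fit(X)
labels = gm.predict(X)          # hard assignments
probs = gm.predict_proba(X)     # soft assignments, rows sum to 1
```

`score_samples` gives per-sample log-likelihoods, and `bic`/`aic` support choosing `n_components`.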

BayesianGaussianMixture { .api }

from sklearn.mixture import BayesianGaussianMixture

BayesianGaussianMixture(
    n_components: int = 1,
    covariance_type: str = "full",
    tol: float = 0.001,
    reg_covar: float = 1e-06,
    max_iter: int = 100,
    n_init: int = 1,
    init_params: str = "kmeans",
    weight_concentration_prior_type: str = "dirichlet_process",
    weight_concentration_prior: float | None = None,
    mean_precision_prior: float | None = None,
    mean_prior: ArrayLike | None = None,
    degrees_of_freedom_prior: float | None = None,
    covariance_prior: float | ArrayLike | None = None,
    random_state: int | RandomState | None = None,
    warm_start: bool = False,
    verbose: int = 0,
    verbose_interval: int = 10
)

Variational Bayesian estimation of a Gaussian mixture.

Covariance Estimation

EmpiricalCovariance { .api }

from sklearn.covariance import EmpiricalCovariance

EmpiricalCovariance(
    store_precision: bool = True,
    assume_centered: bool = False
)

Maximum likelihood covariance estimator.

ShrunkCovariance { .api }

from sklearn.covariance import ShrunkCovariance

ShrunkCovariance(
    store_precision: bool = True,
    assume_centered: bool = False,
    shrinkage: float = 0.1
)

Covariance estimator with shrinkage.

LedoitWolf { .api }

from sklearn.covariance import LedoitWolf

LedoitWolf(
    store_precision: bool = True,
    assume_centered: bool = False,
    block_size: int = 1000
)

LedoitWolf Estimator.
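
Shrinkage estimators matter most when samples are scarce relative to features, where the empirical covariance is poorly conditioned. A minimal sketch (random data for illustration):

```python
import numpy as np
from sklearn.covariance import LedoitWolf

rng = np.random.RandomState(0)
X = rng.randn(30, 50)           # fewer samples than features

lw = LedoitWolf().fit(X)
cov = lw.covariance_            # (50, 50) shrunk covariance estimate
shrinkage = lw.shrinkage_       # shrinkage coefficient in [0, 1]
```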

OAS { .api }

from sklearn.covariance import OAS

OAS(
    store_precision: bool = True,
    assume_centered: bool = False
)

Oracle Approximating Shrinkage Estimator.

MinCovDet { .api }

from sklearn.covariance import MinCovDet

MinCovDet(
    store_precision: bool = True,
    assume_centered: bool = False,
    support_fraction: float | None = None,
    random_state: int | RandomState | None = None
)

Minimum Covariance Determinant (Robust covariance estimation).

GraphicalLasso { .api }

from sklearn.covariance import GraphicalLasso

GraphicalLasso(
    alpha: float = 0.01,
    mode: str = "cd",
    tol: float = 0.0001,
    enet_tol: float = 0.0001,
    max_iter: int = 100,
    verbose: bool = False,
    assume_centered: bool = False
)

Sparse inverse covariance estimation with an l1-penalized estimator.

GraphicalLassoCV { .api }

from sklearn.covariance import GraphicalLassoCV

GraphicalLassoCV(
    alphas: int | ArrayLike = 4,
    n_refinements: int = 4,
    cv: int | BaseCrossValidator | Iterable | None = None,
    tol: float = 0.0001,
    enet_tol: float = 0.0001,
    max_iter: int = 100,
    mode: str = "cd",
    n_jobs: int | None = None,
    verbose: bool = False,
    assume_centered: bool = False
)

Sparse inverse covariance w/ cross-validated choice of the l1 penalty.

EllipticEnvelope { .api }

from sklearn.covariance import EllipticEnvelope

EllipticEnvelope(
    store_precision: bool = True,
    assume_centered: bool = False,
    support_fraction: float | None = None,
    contamination: float = 0.1,
    random_state: int | RandomState | None = None
)

An object for detecting outliers in a Gaussian distributed dataset.

Covariance Functions

empirical_covariance { .api }

from sklearn.covariance import empirical_covariance

empirical_covariance(
    X: ArrayLike,
    assume_centered: bool = False
) -> ArrayLike

Compute the maximum likelihood covariance estimator.

shrunk_covariance { .api }

from sklearn.covariance import shrunk_covariance

shrunk_covariance(
    emp_cov: ArrayLike,
    shrinkage: float = 0.1
) -> ArrayLike

Calculate a covariance matrix shrunk on the diagonal.
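A sketch of the function-level API (synthetic data): `shrunk_covariance` blends the input covariance with a scaled identity, `(1 - shrinkage) * cov + shrinkage * mu * I` where `mu` is the average of the diagonal, so the result can be checked directly.

```python
import numpy as np
from sklearn.covariance import empirical_covariance, shrunk_covariance

X = np.random.RandomState(0).normal(size=(30, 4))
emp = empirical_covariance(X)
shrunk = shrunk_covariance(emp, shrinkage=0.2)

# Verify the convex-combination formula by hand
mu = np.trace(emp) / emp.shape[0]
expected = 0.8 * emp + 0.2 * mu * np.eye(4)
print(np.allclose(shrunk, expected))
```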

ledoit_wolf { .api }

from sklearn.covariance import ledoit_wolf

ledoit_wolf(
    X: ArrayLike,
    assume_centered: bool = False,
    block_size: int = 1000
) -> tuple[ArrayLike, float]

Estimate covariance with the Ledoit-Wolf estimator.

ledoit_wolf_shrinkage { .api }

from sklearn.covariance import ledoit_wolf_shrinkage

ledoit_wolf_shrinkage(
    X: ArrayLike,
    assume_centered: bool = False,
    block_size: int = 1000
) -> float

Calculate the Ledoit-Wolf shrinkage coefficient.

oas { .api }

from sklearn.covariance import oas

oas(
    X: ArrayLike,
    assume_centered: bool = False
) -> tuple[ArrayLike, float]

Estimate covariance with the Oracle Approximating Shrinkage algorithm.

fast_mcd { .api }

from sklearn.covariance import fast_mcd

fast_mcd(
    X: ArrayLike,
    support_fraction: float | None = None,
    cov_computation_method: Callable = ...,
    random_state: int | RandomState | None = None
) -> tuple[ArrayLike, ArrayLike, ArrayLike, ArrayLike]

Estimate the Minimum Covariance Determinant matrix.

graphical_lasso { .api }

from sklearn.covariance import graphical_lasso

graphical_lasso(
    emp_cov: ArrayLike,
    alpha: float,
    cov_init: ArrayLike | None = None,
    mode: str = "cd",
    tol: float = 0.0001,
    enet_tol: float = 0.0001,
    max_iter: int = 100,
    verbose: bool = False,
    return_costs: bool = False,
    eps: float = ...,
    return_n_iter: bool = False
) -> tuple[ArrayLike, ArrayLike] | tuple[ArrayLike, ArrayLike, list] | tuple[ArrayLike, ArrayLike, int] | tuple[ArrayLike, ArrayLike, list, int]

L1-penalized covariance estimator.

log_likelihood { .api }

from sklearn.covariance import log_likelihood

log_likelihood(
    emp_cov: ArrayLike,
    precision: ArrayLike
) -> float

Compute the sample mean of the log_likelihood under a covariance model.

Cross Decomposition

CCA { .api }

from sklearn.cross_decomposition import CCA

CCA(
    n_components: int = 2,
    scale: bool = True,
    max_iter: int = 500,
    tol: float = 1e-06,
    copy: bool = True
)

Canonical Correlation Analysis.

PLSCanonical { .api }

from sklearn.cross_decomposition import PLSCanonical

PLSCanonical(
    n_components: int = 2,
    scale: bool = True,
    algorithm: str = "nipals",
    max_iter: int = 500,
    tol: float = 1e-06,
    copy: bool = True
)

Partial Least Squares transformer and regressor.

PLSRegression { .api }

from sklearn.cross_decomposition import PLSRegression

PLSRegression(
    n_components: int = 2,
    scale: bool = True,
    max_iter: int = 500,
    tol: float = 1e-06,
    copy: bool = True
)

PLS regression.

PLSSVD { .api }

from sklearn.cross_decomposition import PLSSVD

PLSSVD(
    n_components: int = 2,
    scale: bool = True,
    copy: bool = True
)

Partial Least Squares SVD.

Outlier Detection

Outlier detection algorithms are spread across several modules; Local Outlier Factor lives in sklearn.neighbors:

LocalOutlierFactor { .api }

from sklearn.neighbors import LocalOutlierFactor

LocalOutlierFactor(
    n_neighbors: int = 20,
    algorithm: str = "auto",
    leaf_size: int = 30,
    metric: str | Callable = "minkowski",
    p: int = 2,
    metric_params: dict | None = None,
    contamination: float | str = "auto",
    novelty: bool = False,
    n_jobs: int | None = None
)

Unsupervised Outlier Detection using Local Outlier Factor (LOF).
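A short sketch with one injected outlier (placed at an arbitrary far-away point): in the default non-novelty mode, labels come from `fit_predict`, with -1 marking outliers.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.RandomState(0)
X = np.vstack([rng.normal(size=(50, 2)),
               [[8.0, 8.0]]])  # one obvious outlier

lof = LocalOutlierFactor(n_neighbors=10)
labels = lof.fit_predict(X)  # -1 marks outliers
print(labels[-1])  # the injected point is flagged
```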

Note: Additional outlier detection methods are available in:

  • sklearn.ensemble.IsolationForest - Isolation Forest Algorithm
  • sklearn.svm.OneClassSVM - One-Class Support Vector Machine
  • sklearn.covariance.EllipticEnvelope - Outlier detection for Gaussian data

Install with Tessl CLI

npx tessl i tessl/pypi-scikit-learn
