
tessl/pypi-scikit-learn

A comprehensive machine learning library providing supervised and unsupervised learning algorithms with consistent APIs and extensive tools for data preprocessing, model evaluation, and deployment.


docs/unsupervised-learning.md

Unsupervised Learning

This document covers the unsupervised learning APIs in scikit-learn: clustering, dimensionality reduction, manifold learning, mixture models, and covariance estimation.

Clustering

Core Clustering Algorithms

KMeans { .api }

from sklearn.cluster import KMeans

KMeans(
    n_clusters: int = 8,
    init: str | ArrayLike | Callable = "k-means++",
    n_init: int | str = "auto",
    max_iter: int = 300,
    tol: float = 0.0001,
    verbose: int = 0,
    random_state: int | RandomState | None = None,
    copy_x: bool = True,
    algorithm: str = "lloyd"
)

K-Means clustering.
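
A minimal usage sketch with a toy dataset invented for illustration (two well-separated blobs):

```python
import numpy as np
from sklearn.cluster import KMeans

# Two well-separated groups of points (hypothetical data).
X = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
              [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
labels = km.labels_             # cluster index per sample
centers = km.cluster_centers_   # (2, 2) array of centroids
```

After fitting, `predict` assigns new points to the nearest centroid, and `inertia_` holds the within-cluster sum of squared distances.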

MiniBatchKMeans { .api }

from sklearn.cluster import MiniBatchKMeans

MiniBatchKMeans(
    n_clusters: int = 8,
    init: str | ArrayLike | Callable = "k-means++",
    max_iter: int = 100,
    batch_size: int = 1024,
    verbose: int = 0,
    compute_labels: bool = True,
    random_state: int | RandomState | None = None,
    tol: float = 0.0,
    max_no_improvement: int = 10,
    init_size: int | None = None,
    n_init: int | str = 3,
    reassignment_ratio: float = 0.01
)

Mini-Batch K-Means clustering.

BisectingKMeans { .api }

from sklearn.cluster import BisectingKMeans

BisectingKMeans(
    n_clusters: int = 8,
    init: str | Callable = "random",
    n_init: int = 1,
    random_state: int | RandomState | None = None,
    max_iter: int = 300,
    verbose: int = 0,
    tol: float = 0.0001,
    copy_x: bool = True,
    algorithm: str = "lloyd",
    bisecting_strategy: str = "biggest_inertia"
)

Bisecting K-Means clustering.

DBSCAN { .api }

from sklearn.cluster import DBSCAN

DBSCAN(
    eps: float = 0.5,
    min_samples: int = 5,
    metric: str | Callable = "euclidean",
    metric_params: dict | None = None,
    algorithm: str = "auto",
    leaf_size: int = 30,
    p: float | None = None,
    n_jobs: int | None = None
)

Perform DBSCAN clustering from vector array or distance matrix.
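
A minimal sketch showing density-based clustering and noise labeling, on toy data invented for illustration:

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Two dense groups plus one isolated outlier (hypothetical data).
X = np.array([[0.0, 0.0], [0.1, 0.1], [0.2, 0.0],
              [5.0, 5.0], [5.1, 5.1], [5.0, 5.2],
              [10.0, 0.0]])

db = DBSCAN(eps=0.5, min_samples=2).fit(X)
labels = db.labels_   # noise points are labeled -1
```

Unlike KMeans, the number of clusters is not specified up front; it emerges from `eps` and `min_samples`.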

HDBSCAN { .api }

from sklearn.cluster import HDBSCAN

HDBSCAN(
    min_cluster_size: int = 5,
    min_samples: int | None = None,
    cluster_selection_epsilon: float = 0.0,
    max_cluster_size: int | None = None,
    metric: str | Callable = "euclidean",
    metric_params: dict | None = None,
    alpha: float = 1.0,
    algorithm: str = "auto",
    leaf_size: int = 40,
    n_jobs: int | None = None,
    cluster_selection_method: str = "eom",
    allow_single_cluster: bool = False,
    store_centers: str | None = None,
    copy: bool = True
)

Perform HDBSCAN clustering from vector array or distance matrix.

OPTICS { .api }

from sklearn.cluster import OPTICS

OPTICS(
    min_samples: int = 5,
    max_eps: float = ...,
    metric: str | Callable = "minkowski",
    p: int = 2,
    metric_params: dict | None = None,
    cluster_method: str = "xi",
    eps: float | None = None,
    xi: float = 0.05,
    predecessor_correction: bool = True,
    min_cluster_size: int | float | None = None,
    algorithm: str = "auto",
    leaf_size: int = 30,
    memory: str | object | None = None,
    n_jobs: int | None = None
)

Estimate clustering structure from vector array.

MeanShift { .api }

from sklearn.cluster import MeanShift

MeanShift(
    bandwidth: float | None = None,
    seeds: ArrayLike | None = None,
    bin_seeding: bool = False,
    min_bin_freq: int = 1,
    cluster_all: bool = True,
    n_jobs: int | None = None,
    max_iter: int = 300
)

Mean shift clustering using a flat kernel.

AgglomerativeClustering { .api }

from sklearn.cluster import AgglomerativeClustering

AgglomerativeClustering(
    n_clusters: int | None = 2,
    metric: str | Callable | None = None,
    memory: str | object | None = None,
    connectivity: ArrayLike | Callable | None = None,
    compute_full_tree: bool | str = "auto",
    linkage: str = "ward",
    distance_threshold: float | None = None,
    compute_distances: bool = False
)

Agglomerative Clustering.
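
A minimal sketch of hierarchical clustering with Ward linkage, on toy data invented for illustration:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Two obvious pairs of points (hypothetical data).
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])

agg = AgglomerativeClustering(n_clusters=2, linkage="ward").fit(X)
labels = agg.labels_
```

Setting `distance_threshold` (with `n_clusters=None`) instead cuts the dendrogram at a fixed merge distance.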

FeatureAgglomeration { .api }

from sklearn.cluster import FeatureAgglomeration

FeatureAgglomeration(
    n_clusters: int | None = 2,
    metric: str | Callable | None = None,
    memory: str | object | None = None,
    connectivity: ArrayLike | Callable | None = None,
    compute_full_tree: bool | str = "auto",
    linkage: str = "ward",
    pooling_func: Callable = ...,
    distance_threshold: float | None = None,
    compute_distances: bool = False
)

Agglomerate features.

Birch { .api }

from sklearn.cluster import Birch

Birch(
    n_clusters: int | None = 3,
    threshold: float = 0.5,
    branching_factor: int = 50,
    compute_labels: bool = True,
    copy: bool = True
)

Implements the BIRCH clustering algorithm.

AffinityPropagation { .api }

from sklearn.cluster import AffinityPropagation

AffinityPropagation(
    damping: float = 0.5,
    max_iter: int = 200,
    convergence_iter: int = 15,
    copy: bool = True,
    preference: ArrayLike | float | None = None,
    affinity: str = "euclidean",
    verbose: bool = False,
    random_state: int | RandomState | None = None
)

Perform Affinity Propagation Clustering of data.

SpectralClustering { .api }

from sklearn.cluster import SpectralClustering

SpectralClustering(
    n_clusters: int = 8,
    eigen_solver: str | None = None,
    n_components: int | None = None,
    random_state: int | RandomState | None = None,
    n_init: int = 10,
    gamma: float = 1.0,
    affinity: str | Callable = "rbf",
    n_neighbors: int = 10,
    eigen_tol: float | str = "auto",
    assign_labels: str = "kmeans",
    degree: float = 3,
    coef0: float = 1,
    kernel_params: dict | None = None,
    n_jobs: int | None = None,
    verbose: bool = False
)

Apply clustering to a projection of the normalized Laplacian.

SpectralBiclustering { .api }

from sklearn.cluster import SpectralBiclustering

SpectralBiclustering(
    n_clusters: int | tuple = 3,
    method: str = "bistochastic",
    n_components: int = 6,
    n_best: int = 3,
    svd_method: str = "randomized",
    n_svd_vecs: int | None = None,
    mini_batch: bool = False,
    init: str | ArrayLike = "k-means++",
    n_init: int = 10,
    random_state: int | RandomState | None = None
)

Spectral biclustering (Kluger, 2003).

SpectralCoclustering { .api }

from sklearn.cluster import SpectralCoclustering

SpectralCoclustering(
    n_clusters: int = 3,
    svd_method: str = "randomized",
    n_svd_vecs: int | None = None,
    mini_batch: bool = False,
    init: str | ArrayLike = "k-means++",
    n_init: int = 10,
    random_state: int | RandomState | None = None
)

Spectral Co-Clustering algorithm (Dhillon, 2001).

Clustering Functions

k_means { .api }

from sklearn.cluster import k_means

k_means(
    X: ArrayLike,
    n_clusters: int,
    sample_weight: ArrayLike | None = None,
    init: str | ArrayLike | Callable = "k-means++",
    n_init: int | str = 10,
    max_iter: int = 300,
    verbose: bool = False,
    tol: float = 0.0001,
    random_state: int | RandomState | None = None,
    copy_x: bool = True,
    algorithm: str = "lloyd",
    return_n_iter: bool = False
) -> tuple[ArrayLike, ArrayLike, float, int] | tuple[ArrayLike, ArrayLike, float]

K-means clustering algorithm.

kmeans_plusplus { .api }

from sklearn.cluster import kmeans_plusplus

kmeans_plusplus(
    X: ArrayLike,
    n_clusters: int,
    x_squared_norms: ArrayLike | None = None,
    random_state: int | RandomState | None = None,
    n_local_trials: int | None = None
) -> tuple[ArrayLike, ArrayLike]

Init n_clusters seeds according to k-means++.
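
A minimal sketch: the seeds returned are actual rows of `X`, selected to be spread out (random data for illustration):

```python
import numpy as np
from sklearn.cluster import kmeans_plusplus

rng = np.random.RandomState(0)
X = rng.randn(100, 2)

# centers: (4, 2) seed points drawn from X; indices: their row positions
centers, indices = kmeans_plusplus(X, n_clusters=4, random_state=0)
```

This is useful when you want k-means++-style initialization for a custom clustering loop.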

dbscan { .api }

from sklearn.cluster import dbscan

dbscan(
    X: ArrayLike,
    eps: float = 0.5,
    min_samples: int = 5,
    metric: str | Callable = "euclidean",
    metric_params: dict | None = None,
    algorithm: str = "auto",
    leaf_size: int = 30,
    p: float | None = None,
    sample_weight: ArrayLike | None = None,
    n_jobs: int | None = None
) -> tuple[ArrayLike, ArrayLike]

Perform DBSCAN clustering from vector array or distance matrix.

affinity_propagation { .api }

from sklearn.cluster import affinity_propagation

affinity_propagation(
    S: ArrayLike,
    preference: ArrayLike | float | None = None,
    convergence_iter: int = 15,
    max_iter: int = 200,
    damping: float = 0.5,
    copy: bool = True,
    verbose: bool = False,
    return_n_iter: bool = False,
    random_state: int | RandomState | None = None
) -> tuple[ArrayLike, ArrayLike, int] | tuple[ArrayLike, ArrayLike]

Perform Affinity Propagation Clustering of data.

spectral_clustering { .api }

from sklearn.cluster import spectral_clustering

spectral_clustering(
    affinity: ArrayLike,
    n_clusters: int = 8,
    n_components: int | None = None,
    eigen_solver: str | None = None,
    random_state: int | RandomState | None = None,
    n_init: int = 10,
    eigen_tol: float | str = "auto",
    assign_labels: str = "kmeans",
    verbose: bool = False
) -> ArrayLike

Apply clustering to a projection of the normalized Laplacian.

mean_shift { .api }

from sklearn.cluster import mean_shift

mean_shift(
    X: ArrayLike,
    bandwidth: float | None = None,
    seeds: ArrayLike | None = None,
    bin_seeding: bool = False,
    min_bin_freq: int = 1,
    cluster_all: bool = True,
    max_iter: int = 300,
    n_jobs: int | None = None
) -> tuple[ArrayLike, ArrayLike]

Perform mean shift clustering of data using a flat kernel.

estimate_bandwidth { .api }

from sklearn.cluster import estimate_bandwidth

estimate_bandwidth(
    X: ArrayLike,
    quantile: float = 0.3,
    n_samples: int | None = None,
    random_state: int | RandomState | None = None,
    n_jobs: int | None = None
) -> float

Estimate the bandwidth to use with the mean-shift algorithm.

ward_tree { .api }

from sklearn.cluster import ward_tree

ward_tree(
    X: ArrayLike,
    connectivity: ArrayLike | None = None,
    n_clusters: int | None = None,
    return_distance: bool = False
) -> tuple[ArrayLike, int, int, ArrayLike, ArrayLike] | tuple[ArrayLike, int, int, ArrayLike]

Ward clustering based on a Feature matrix.

linkage_tree { .api }

from sklearn.cluster import linkage_tree

linkage_tree(
    X: ArrayLike,
    connectivity: ArrayLike | None = None,
    n_clusters: int | None = None,
    linkage: str = "complete",
    affinity: str = "euclidean",
    return_distance: bool = False
) -> tuple[ArrayLike, int, int, ArrayLike, ArrayLike] | tuple[ArrayLike, int, int, ArrayLike]

Linkage agglomerative clustering based on a Feature matrix.

get_bin_seeds { .api }

from sklearn.cluster import get_bin_seeds

get_bin_seeds(
    X: ArrayLike,
    bin_size: float,
    min_bin_freq: int = 1
) -> ArrayLike

Find seeds for mean_shift.

cluster_optics_dbscan { .api }

from sklearn.cluster import cluster_optics_dbscan

cluster_optics_dbscan(
    reachability: ArrayLike,
    core_distances: ArrayLike,
    ordering: ArrayLike,
    eps: float
) -> ArrayLike

Perform DBSCAN extraction for an arbitrary epsilon.

cluster_optics_xi { .api }

from sklearn.cluster import cluster_optics_xi

cluster_optics_xi(
    reachability: ArrayLike,
    predecessor: ArrayLike,
    ordering: ArrayLike,
    min_samples: int,
    min_cluster_size: int | float | None = None,
    xi: float = 0.05,
    predecessor_correction: bool = True
) -> tuple[ArrayLike, ArrayLike]

Automatically extract clusters according to the Xi-steep method.

compute_optics_graph { .api }

from sklearn.cluster import compute_optics_graph

compute_optics_graph(
    X: ArrayLike,
    min_samples: int,
    max_eps: float,
    metric: str | Callable,
    p: int,
    metric_params: dict | None,
    algorithm: str,
    leaf_size: int,
    n_jobs: int | None
) -> tuple[ArrayLike, ArrayLike, ArrayLike, ArrayLike]

Compute the OPTICS reachability graph.

Dimensionality Reduction

Principal Component Analysis

PCA { .api }

from sklearn.decomposition import PCA

PCA(
    n_components: int | float | str | None = None,
    copy: bool = True,
    whiten: bool = False,
    svd_solver: str = "auto",
    tol: float = 0.0,
    iterated_power: int | str = "auto",
    n_oversamples: int = 10,
    power_iteration_normalizer: str = "auto",
    random_state: int | RandomState | None = None
)

Principal component analysis (PCA).
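
A minimal sketch projecting 2-D data with one dominant direction of variance onto a single component (synthetic data for illustration):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
# Stretch the x-axis so most variance lies along one direction.
X = rng.randn(200, 2) @ np.array([[3.0, 0.0], [0.0, 0.3]])

pca = PCA(n_components=1).fit(X)
X_reduced = pca.transform(X)                 # (200, 1)
ratio = pca.explained_variance_ratio_[0]     # fraction of variance kept
```

`inverse_transform` maps the reduced representation back into the original space, which is handy for reconstruction-error checks.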

IncrementalPCA { .api }

from sklearn.decomposition import IncrementalPCA

IncrementalPCA(
    n_components: int | None = None,
    whiten: bool = False,
    copy: bool = True,
    batch_size: int | None = None
)

Incremental principal components analysis (IPCA).

KernelPCA { .api }

from sklearn.decomposition import KernelPCA

KernelPCA(
    n_components: int | None = None,
    kernel: str | Callable = "linear",
    gamma: float | None = None,
    degree: int = 3,
    coef0: float = 1,
    kernel_params: dict | None = None,
    alpha: float = 1.0,
    fit_inverse_transform: bool = False,
    eigen_solver: str = "auto",
    tol: float = 0,
    max_iter: int | None = None,
    iterated_power: int | str = "auto",
    remove_zero_eig: bool = False,
    random_state: int | RandomState | None = None,
    copy_X: bool = True,
    n_jobs: int | None = None
)

Kernel Principal component analysis (KPCA).

SparsePCA { .api }

from sklearn.decomposition import SparsePCA

SparsePCA(
    n_components: int | None = None,
    alpha: float = 1,
    ridge_alpha: float = 0.01,
    max_iter: int = 1000,
    tol: float = 1e-08,
    method: str = "lars",
    n_jobs: int | None = None,
    U_init: ArrayLike | None = None,
    V_init: ArrayLike | None = None,
    verbose: bool | int = False,
    random_state: int | RandomState | None = None
)

Sparse Principal Components Analysis (SparsePCA).

MiniBatchSparsePCA { .api }

from sklearn.decomposition import MiniBatchSparsePCA

MiniBatchSparsePCA(
    n_components: int | None = None,
    alpha: float = 1,
    ridge_alpha: float = 0.01,
    n_iter: int = 100,
    callback: Callable | None = None,
    batch_size: int = 3,
    verbose: bool | int = False,
    shuffle: bool = True,
    n_jobs: int | None = None,
    method: str = "lars",
    random_state: int | RandomState | None = None
)

Mini-batch Sparse Principal Components Analysis.

TruncatedSVD { .api }

from sklearn.decomposition import TruncatedSVD

TruncatedSVD(
    n_components: int = 2,
    algorithm: str = "randomized",
    n_iter: int = 5,
    n_oversamples: int = 10,
    power_iteration_normalizer: str = "auto",
    random_state: int | RandomState | None = None,
    tol: float = 0.0
)

Dimensionality reduction using truncated SVD (aka LSA).
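
Unlike PCA, TruncatedSVD works directly on sparse matrices without centering, which is why it is the usual choice for LSA on term-document matrices. A minimal sketch on a random sparse matrix (synthetic data for illustration):

```python
from scipy.sparse import random as sparse_random
from sklearn.decomposition import TruncatedSVD

X = sparse_random(100, 50, density=0.05, random_state=0)

svd = TruncatedSVD(n_components=5, random_state=0).fit(X)
X_reduced = svd.transform(X)   # dense (100, 5) array
```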

Independent Component Analysis

FastICA { .api }

from sklearn.decomposition import FastICA

FastICA(
    n_components: int | None = None,
    algorithm: str = "parallel",
    whiten: str | bool = "unit-variance",
    fun: str | Callable = "logcosh",
    fun_args: dict | None = None,
    max_iter: int = 200,
    tol: float = 0.0001,
    w_init: ArrayLike | None = None,
    whiten_solver: str = "svd",
    random_state: int | RandomState | None = None
)

FastICA: a fast algorithm for Independent Component Analysis.
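
A minimal blind source separation sketch: two synthetic signals are mixed linearly, and FastICA recovers them up to sign and ordering (all data invented for illustration):

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.RandomState(0)
t = np.linspace(0, 8, 2000)
s1 = np.sin(2 * t)                      # sinusoidal source
s2 = np.sign(np.sin(3 * t))             # square-wave source
S = np.c_[s1, s2] + 0.02 * rng.randn(2000, 2)

A = np.array([[1.0, 1.0], [0.5, 2.0]])  # mixing matrix
X = S @ A.T                             # observed mixtures

ica = FastICA(n_components=2, whiten="unit-variance", random_state=0)
S_est = ica.fit_transform(X)            # estimated sources
```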

Factor Analysis

FactorAnalysis { .api }

from sklearn.decomposition import FactorAnalysis

FactorAnalysis(
    n_components: int | None = None,
    tol: float = 0.01,
    copy: bool = True,
    max_iter: int = 1000,
    noise_variance_init: ArrayLike | None = None,
    svd_method: str = "randomized",
    iterated_power: int = 3,
    rotation: str | None = None,
    random_state: int | RandomState | None = None
)

Factor Analysis (FA).

Dictionary Learning

DictionaryLearning { .api }

from sklearn.decomposition import DictionaryLearning

DictionaryLearning(
    n_components: int | None = None,
    alpha: float = 1,
    max_iter: int = 1000,
    tol: float = 1e-08,
    fit_algorithm: str = "lars",
    transform_algorithm: str = "omp",
    transform_n_nonzero_coefs: int | None = None,
    transform_alpha: float | None = None,
    n_jobs: int | None = None,
    code_init: ArrayLike | None = None,
    dict_init: ArrayLike | None = None,
    verbose: bool = False,
    split_sign: bool = False,
    random_state: int | RandomState | None = None,
    positive_code: bool = False,
    positive_dict: bool = False,
    transform_max_iter: int = 1000
)

Dictionary learning.

MiniBatchDictionaryLearning { .api }

from sklearn.decomposition import MiniBatchDictionaryLearning

MiniBatchDictionaryLearning(
    n_components: int | None = None,
    alpha: float = 1,
    max_iter: int = 1000,
    fit_algorithm: str = "lars",
    n_jobs: int | None = None,
    batch_size: int = 256,
    shuffle: bool = True,
    dict_init: ArrayLike | None = None,
    transform_algorithm: str = "omp",
    transform_n_nonzero_coefs: int | None = None,
    transform_alpha: float | None = None,
    verbose: bool = False,
    split_sign: bool = False,
    random_state: int | RandomState | None = None,
    positive_code: bool = False,
    positive_dict: bool = False,
    transform_max_iter: int = 1000
)

Mini-batch dictionary learning.

SparseCoder { .api }

from sklearn.decomposition import SparseCoder

SparseCoder(
    dictionary: ArrayLike,
    transform_algorithm: str = "omp",
    transform_n_nonzero_coefs: int | None = None,
    transform_alpha: float | None = None,
    split_sign: bool = False,
    n_jobs: int | None = None,
    positive_code: bool = False,
    transform_max_iter: int = 1000
)

Sparse coding.

Non-negative Matrix Factorization

NMF { .api }

from sklearn.decomposition import NMF

NMF(
    n_components: int | None = None,
    init: str | ArrayLike | None = None,
    solver: str = "cd",
    beta_loss: float | str = "frobenius",
    tol: float = 0.0001,
    max_iter: int = 200,
    random_state: int | RandomState | None = None,
    alpha_W: float = 0.0,
    alpha_H: float | str = "same",
    l1_ratio: float = 0.0,
    verbose: int = 0,
    shuffle: bool = False
)

Non-negative Matrix Factorization (NMF).
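
A minimal sketch factoring a non-negative matrix X into non-negative factors W and H (random data for illustration):

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.RandomState(0)
X = np.abs(rng.randn(20, 10))   # NMF requires non-negative input

nmf = NMF(n_components=3, init="random", random_state=0, max_iter=500)
W = nmf.fit_transform(X)        # (20, 3) sample loadings
H = nmf.components_             # (3, 10) component basis
```

`W @ H` is a low-rank non-negative approximation of `X`; `reconstruction_err_` reports the fit quality under the chosen `beta_loss`.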

MiniBatchNMF { .api }

from sklearn.decomposition import MiniBatchNMF

MiniBatchNMF(
    n_components: int | None = None,
    init: str | ArrayLike | None = None,
    batch_size: int = 1024,
    beta_loss: float | str = "frobenius",
    tol: float = 0.0001,
    max_no_improvement: int = 10,
    max_iter: int = 200,
    alpha_W: float = 0.0,
    alpha_H: float | str = "same",
    l1_ratio: float = 0.0,
    forget_factor: float = 0.7,
    fresh_restarts: bool = False,
    fresh_restarts_max_iter: int = 30,
    transform_max_iter: int | None = None,
    random_state: int | RandomState | None = None,
    verbose: int = 0
)

Mini-Batch Non-Negative Matrix Factorization (NMF).

Latent Dirichlet Allocation

LatentDirichletAllocation { .api }

from sklearn.decomposition import LatentDirichletAllocation

LatentDirichletAllocation(
    n_components: int = 10,
    doc_topic_prior: float | None = None,
    topic_word_prior: float | None = None,
    learning_method: str = "batch",
    learning_decay: float = 0.7,
    learning_offset: float = 10.0,
    max_iter: int = 10,
    batch_size: int = 128,
    evaluate_every: int = -1,
    total_samples: int = 1000000,
    perp_tol: float = 0.1,
    mean_change_tol: float = 0.001,
    max_doc_update_iter: int = 100,
    n_jobs: int | None = None,
    verbose: int = 0,
    random_state: int | RandomState | None = None
)

Latent Dirichlet Allocation with online variational Bayes algorithm.

Decomposition Functions

randomized_svd { .api }

from sklearn.decomposition import randomized_svd

randomized_svd(
    M: ArrayLike,
    n_components: int,
    n_oversamples: int = 10,
    n_iter: int | str = "auto",
    power_iteration_normalizer: str = "auto",
    transpose: bool | str = "auto",
    flip_sign: bool = True,
    random_state: int | RandomState | None = None,
    svd_lapack_driver: str = "gesdd"
) -> tuple[ArrayLike, ArrayLike, ArrayLike]

Compute a truncated randomized SVD.

fastica { .api }

from sklearn.decomposition import fastica

fastica(
    X: ArrayLike,
    n_components: int | None = None,
    algorithm: str = "parallel",
    whiten: str | bool = "unit-variance",
    fun: str | Callable = "logcosh",
    fun_args: dict | None = None,
    max_iter: int = 200,
    tol: float = 0.0001,
    w_init: ArrayLike | None = None,
    whiten_solver: str = "svd",
    random_state: int | RandomState | None = None,
    return_X_mean: bool = False,
    compute_sources: bool = True,
    return_n_iter: bool = False
) -> tuple[ArrayLike, ArrayLike, ArrayLike] | tuple[ArrayLike, ArrayLike, ArrayLike, int] | tuple[ArrayLike, ArrayLike, ArrayLike, ArrayLike] | tuple[ArrayLike, ArrayLike, ArrayLike, ArrayLike, int]

Perform Fast Independent Component Analysis.

dict_learning { .api }

from sklearn.decomposition import dict_learning

dict_learning(
    X: ArrayLike,
    n_components: int,
    alpha: float,
    max_iter: int = 100,
    tol: float = 1e-08,
    method: str = "lars",
    n_jobs: int | None = None,
    dict_init: ArrayLike | None = None,
    code_init: ArrayLike | None = None,
    callback: Callable | None = None,
    verbose: bool = False,
    random_state: int | RandomState | None = None,
    return_n_iter: bool = False,
    positive_dict: bool = False,
    positive_code: bool = False,
    method_max_iter: int = 1000
) -> tuple[ArrayLike, ArrayLike, ArrayLike] | tuple[ArrayLike, ArrayLike, ArrayLike, int]

Solve a dictionary learning matrix factorization problem.

dict_learning_online { .api }

from sklearn.decomposition import dict_learning_online

dict_learning_online(
    X: ArrayLike,
    n_components: int = 2,
    alpha: float = 1,
    max_iter: int = 100,
    return_code: bool = True,
    dict_init: ArrayLike | None = None,
    callback: Callable | None = None,
    batch_size: int = 256,
    verbose: bool = False,
    shuffle: bool = True,
    n_jobs: int | None = None,
    method: str = "lars",
    iter_offset: int = 0,
    random_state: int | RandomState | None = None,
    return_inner_stats: bool = False,
    inner_stats: tuple | None = None,
    return_n_iter: bool = False,
    positive_dict: bool = False,
    positive_code: bool = False,
    method_max_iter: int = 1000
) -> ArrayLike | tuple[ArrayLike, ArrayLike] | tuple[ArrayLike, tuple] | tuple[ArrayLike, ArrayLike, tuple] | tuple[ArrayLike, int] | tuple[ArrayLike, ArrayLike, int] | tuple[ArrayLike, tuple, int] | tuple[ArrayLike, ArrayLike, tuple, int]

Solve a dictionary learning matrix factorization problem online.

sparse_encode { .api }

from sklearn.decomposition import sparse_encode

sparse_encode(
    X: ArrayLike,
    dictionary: ArrayLike,
    gram: ArrayLike | None = None,
    cov: ArrayLike | None = None,
    algorithm: str = "lasso_lars",
    n_nonzero_coefs: int | None = None,
    alpha: float | None = None,
    copy_cov: bool = True,
    init: ArrayLike | None = None,
    max_iter: int = 1000,
    n_jobs: int | None = None,
    check_input: bool = True,
    verbose: int = 0,
    positive: bool = False
) -> ArrayLike

Sparse coding.

non_negative_factorization { .api }

from sklearn.decomposition import non_negative_factorization

non_negative_factorization(
    X: ArrayLike,
    W: ArrayLike | None = None,
    H: ArrayLike | None = None,
    n_components: int | None = None,
    init: str | ArrayLike | None = None,
    update_H: bool = True,
    solver: str = "cd",
    beta_loss: float | str = "frobenius",
    tol: float = 0.0001,
    max_iter: int = 200,
    alpha_W: float = 0.0,
    alpha_H: float | str = "same",
    l1_ratio: float = 0.0,
    regularization: str | None = None,
    random_state: int | RandomState | None = None,
    verbose: int = 0,
    shuffle: bool = False
) -> tuple[ArrayLike, ArrayLike, int]

Compute Non-negative Matrix Factorization (NMF).

Manifold Learning

Isomap { .api }

from sklearn.manifold import Isomap

Isomap(
    n_neighbors: int = 5,
    radius: float | None = None,
    n_components: int = 2,
    eigen_solver: str = "auto",
    tol: float = 0,
    max_iter: int | None = None,
    path_method: str = "auto",
    neighbors_algorithm: str = "auto",
    n_jobs: int | None = None,
    metric: str | Callable = "minkowski",
    p: int = 2,
    metric_params: dict | None = None
)

Isomap Embedding.

LocallyLinearEmbedding { .api }

from sklearn.manifold import LocallyLinearEmbedding

LocallyLinearEmbedding(
    n_neighbors: int = 5,
    n_components: int = 2,
    reg: float = 0.001,
    eigen_solver: str = "auto",
    tol: float = 1e-06,
    max_iter: int = 100,
    method: str = "standard",
    hessian_tol: float = 0.0001,
    modified_tol: float = 1e-12,
    neighbors_algorithm: str = "auto",
    random_state: int | RandomState | None = None,
    n_jobs: int | None = None
)

Locally Linear Embedding.

MDS { .api }

from sklearn.manifold import MDS

MDS(
    n_components: int = 2,
    metric: bool = True,
    n_init: int = 4,
    max_iter: int = 300,
    verbose: int = 0,
    eps: float = 0.001,
    n_jobs: int | None = None,
    random_state: int | RandomState | None = None,
    dissimilarity: str = "euclidean",
    normalized_stress: str | bool = "auto"
)

Multidimensional scaling.

SpectralEmbedding { .api }

from sklearn.manifold import SpectralEmbedding

SpectralEmbedding(
    n_components: int = 2,
    affinity: str | Callable = "nearest_neighbors",
    gamma: float | None = None,
    random_state: int | RandomState | None = None,
    eigen_solver: str | None = None,
    n_neighbors: int | None = None,
    n_jobs: int | None = None
)

Spectral embedding for non-linear dimensionality reduction.

TSNE { .api }

from sklearn.manifold import TSNE

TSNE(
    n_components: int = 2,
    perplexity: float = 30.0,
    early_exaggeration: float = 12.0,
    learning_rate: float | str = "warn",
    n_iter: int = 1000,
    n_iter_without_progress: int = 300,
    min_grad_norm: float = 1e-07,
    metric: str | Callable = "euclidean",
    metric_params: dict | None = None,
    init: str | ArrayLike = "warn",
    verbose: int = 0,
    random_state: int | RandomState | None = None,
    method: str = "barnes_hut",
    angle: float = 0.5,
    n_jobs: int | None = None,
    square_distances: str | bool = "deprecated"
)

t-distributed Stochastic Neighbor Embedding.
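
A minimal sketch embedding two synthetic high-dimensional clusters into 2-D for visualization (data invented for illustration; note `perplexity` must be smaller than the number of samples):

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.RandomState(0)
# Two clusters of 50 points each in 10 dimensions.
X = np.vstack([rng.randn(50, 10), rng.randn(50, 10) + 8.0])

emb = TSNE(n_components=2, perplexity=15, random_state=0).fit_transform(X)
```

t-SNE is a visualization tool, not a general transform: there is no `transform` for new data, and distances in the embedding are not globally meaningful.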

Manifold Learning Functions

locally_linear_embedding { .api }

from sklearn.manifold import locally_linear_embedding

locally_linear_embedding(
    X: ArrayLike,
    n_neighbors: int,
    n_components: int,
    reg: float = 0.001,
    eigen_solver: str = "auto",
    tol: float = 1e-06,
    max_iter: int = 100,
    method: str = "standard",
    hessian_tol: float = 0.0001,
    modified_tol: float = 1e-12,
    random_state: int | RandomState | None = None,
    n_jobs: int | None = None
) -> tuple[ArrayLike, float]

Perform a Locally Linear Embedding analysis on the data.

spectral_embedding { .api }

from sklearn.manifold import spectral_embedding

spectral_embedding(
    adjacency: ArrayLike,
    n_components: int = 8,
    eigen_solver: str | None = None,
    random_state: int | RandomState | None = None,
    eigen_tol: float | str = "auto",
    norm_laplacian: bool = True,
    drop_first: bool = True
) -> ArrayLike

Project the sample on the first eigenvectors of the graph Laplacian.

smacof { .api }

from sklearn.manifold import smacof

smacof(
    dissimilarities: ArrayLike,
    metric: bool = True,
    n_components: int = 2,
    init: ArrayLike | None = None,
    n_init: int = 8,
    n_jobs: int | None = None,
    max_iter: int = 300,
    verbose: int = 0,
    eps: float = 0.001,
    random_state: int | RandomState | None = None,
    return_n_iter: bool = False,
    normalized_stress: str | bool = "auto"
) -> tuple[ArrayLike, float, int] | tuple[ArrayLike, float]

Compute multidimensional scaling using the SMACOF algorithm.

trustworthiness { .api }

from sklearn.manifold import trustworthiness

trustworthiness(
    X: ArrayLike,
    X_embedded: ArrayLike,
    n_neighbors: int = 5,
    metric: str | Callable = "euclidean"
) -> float

Indicate to what extent the local structure is retained.

Mixture Models

GaussianMixture { .api }

from sklearn.mixture import GaussianMixture

GaussianMixture(
    n_components: int = 1,
    covariance_type: str = "full",
    tol: float = 0.001,
    reg_covar: float = 1e-06,
    max_iter: int = 100,
    n_init: int = 1,
    init_params: str = "kmeans",
    weights_init: ArrayLike | None = None,
    means_init: ArrayLike | None = None,
    precisions_init: ArrayLike | None = None,
    random_state: int | RandomState | None = None,
    warm_start: bool = False,
    verbose: int = 0,
    verbose_interval: int = 10
)

Gaussian Mixture Model.
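
A minimal sketch fitting a two-component mixture and reading off both hard and soft assignments (synthetic data for illustration):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.RandomState(0)
# Two Gaussian blobs offset from each other.
X = np.vstack([rng.randn(100, 2), rng.randn(100, 2) + 6.0])

gm = GaussianMixture(n_components=2, random_state=0).fit(X)
labels = gm.predict(X)          # hard assignments
probs = gm.predict_proba(X)     # soft assignments, rows sum to 1
```

`score_samples` gives per-sample log-likelihoods, and `bic`/`aic` support choosing `n_components`.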

BayesianGaussianMixture { .api }

from sklearn.mixture import BayesianGaussianMixture

BayesianGaussianMixture(
    n_components: int = 1,
    covariance_type: str = "full",
    tol: float = 0.001,
    reg_covar: float = 1e-06,
    max_iter: int = 100,
    n_init: int = 1,
    init_params: str = "kmeans",
    weight_concentration_prior_type: str = "dirichlet_process",
    weight_concentration_prior: float | None = None,
    mean_precision_prior: float | None = None,
    mean_prior: ArrayLike | None = None,
    degrees_of_freedom_prior: float | None = None,
    covariance_prior: float | ArrayLike | None = None,
    random_state: int | RandomState | None = None,
    warm_start: bool = False,
    verbose: int = 0,
    verbose_interval: int = 10
)

Variational Bayesian estimation of a Gaussian mixture.

Covariance Estimation

EmpiricalCovariance { .api }

from sklearn.covariance import EmpiricalCovariance

EmpiricalCovariance(
    store_precision: bool = True,
    assume_centered: bool = False
)

Maximum likelihood covariance estimator.

ShrunkCovariance { .api }

from sklearn.covariance import ShrunkCovariance

ShrunkCovariance(
    store_precision: bool = True,
    assume_centered: bool = False,
    shrinkage: float = 0.1
)

Covariance estimator with shrinkage.

LedoitWolf { .api }

from sklearn.covariance import LedoitWolf

LedoitWolf(
    store_precision: bool = True,
    assume_centered: bool = False,
    block_size: int = 1000
)

LedoitWolf Estimator.
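
Shrinkage estimators matter most when samples are scarce relative to features, where the empirical covariance is poorly conditioned. A minimal sketch (random data for illustration):

```python
import numpy as np
from sklearn.covariance import LedoitWolf

rng = np.random.RandomState(0)
X = rng.randn(30, 50)           # fewer samples than features

lw = LedoitWolf().fit(X)
cov = lw.covariance_            # (50, 50) shrunk covariance estimate
shrinkage = lw.shrinkage_       # shrinkage coefficient in [0, 1]
```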

OAS { .api }

from sklearn.covariance import OAS

OAS(
    store_precision: bool = True,
    assume_centered: bool = False
)

Oracle Approximating Shrinkage Estimator.

MinCovDet { .api }

from sklearn.covariance import MinCovDet

MinCovDet(
    store_precision: bool = True,
    assume_centered: bool = False,
    support_fraction: float | None = None,
    random_state: int | RandomState | None = None
)

Minimum Covariance Determinant (Robust covariance estimation).

GraphicalLasso { .api }

from sklearn.covariance import GraphicalLasso

GraphicalLasso(
    alpha: float = 0.01,
    mode: str = "cd",
    tol: float = 0.0001,
    enet_tol: float = 0.0001,
    max_iter: int = 100,
    verbose: bool = False,
    assume_centered: bool = False
)

Sparse inverse covariance estimation with an l1-penalized estimator.

GraphicalLassoCV { .api }

from sklearn.covariance import GraphicalLassoCV

GraphicalLassoCV(
    alphas: int | ArrayLike = 4,
    n_refinements: int = 4,
    cv: int | BaseCrossValidator | Iterable | None = None,
    tol: float = 0.0001,
    enet_tol: float = 0.0001,
    max_iter: int = 100,
    mode: str = "cd",
    n_jobs: int | None = None,
    verbose: bool = False,
    assume_centered: bool = False
)

Sparse inverse covariance w/ cross-validated choice of the l1 penalty.

EllipticEnvelope { .api }

from sklearn.covariance import EllipticEnvelope

EllipticEnvelope(
    store_precision: bool = True,
    assume_centered: bool = False,
    support_fraction: float | None = None,
    contamination: float = 0.1,
    random_state: int | RandomState | None = None
)

An object for detecting outliers in a Gaussian distributed dataset.

Covariance Functions

empirical_covariance { .api }

from sklearn.covariance import empirical_covariance

empirical_covariance(
    X: ArrayLike,
    assume_centered: bool = False
) -> ArrayLike

Compute the maximum likelihood covariance estimator.

shrunk_covariance { .api }

from sklearn.covariance import shrunk_covariance

shrunk_covariance(
    emp_cov: ArrayLike,
    shrinkage: float = 0.1
) -> ArrayLike

Calculate a covariance matrix shrunk on the diagonal.
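A sketch of the function-level API (synthetic data): `shrunk_covariance` blends the input covariance with a scaled identity, `(1 - shrinkage) * cov + shrinkage * mu * I` where `mu` is the average of the diagonal, so the result can be checked directly.

```python
import numpy as np
from sklearn.covariance import empirical_covariance, shrunk_covariance

X = np.random.RandomState(0).normal(size=(30, 4))
emp = empirical_covariance(X)
shrunk = shrunk_covariance(emp, shrinkage=0.2)

# Verify the convex-combination formula by hand
mu = np.trace(emp) / emp.shape[0]
expected = 0.8 * emp + 0.2 * mu * np.eye(4)
print(np.allclose(shrunk, expected))
```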

ledoit_wolf { .api }

from sklearn.covariance import ledoit_wolf

ledoit_wolf(
    X: ArrayLike,
    assume_centered: bool = False,
    block_size: int = 1000
) -> tuple[ArrayLike, float]

Estimate covariance with the Ledoit-Wolf estimator.

ledoit_wolf_shrinkage { .api }

from sklearn.covariance import ledoit_wolf_shrinkage

ledoit_wolf_shrinkage(
    X: ArrayLike,
    assume_centered: bool = False,
    block_size: int = 1000
) -> float

Calculate the Ledoit-Wolf shrinkage coefficient.

oas { .api }

from sklearn.covariance import oas

oas(
    X: ArrayLike,
    assume_centered: bool = False
) -> tuple[ArrayLike, float]

Estimate covariance with the Oracle Approximating Shrinkage algorithm.

fast_mcd { .api }

from sklearn.covariance import fast_mcd

fast_mcd(
    X: ArrayLike,
    support_fraction: float | None = None,
    cov_computation_method: Callable = ...,
    random_state: int | RandomState | None = None
) -> tuple[ArrayLike, ArrayLike, ArrayLike, ArrayLike]

Estimate the Minimum Covariance Determinant matrix.

graphical_lasso { .api }

from sklearn.covariance import graphical_lasso

graphical_lasso(
    emp_cov: ArrayLike,
    alpha: float,
    cov_init: ArrayLike | None = None,
    mode: str = "cd",
    tol: float = 0.0001,
    enet_tol: float = 0.0001,
    max_iter: int = 100,
    verbose: bool = False,
    return_costs: bool = False,
    eps: float = ...,
    return_n_iter: bool = False
) -> tuple[ArrayLike, ArrayLike] | tuple[ArrayLike, ArrayLike, list] | tuple[ArrayLike, ArrayLike, int] | tuple[ArrayLike, ArrayLike, list, int]

L1-penalized covariance estimator.

log_likelihood { .api }

from sklearn.covariance import log_likelihood

log_likelihood(
    emp_cov: ArrayLike,
    precision: ArrayLike
) -> float

Compute the sample mean of the log_likelihood under a covariance model.

Cross Decomposition

CCA { .api }

from sklearn.cross_decomposition import CCA

CCA(
    n_components: int = 2,
    scale: bool = True,
    max_iter: int = 500,
    tol: float = 1e-06,
    copy: bool = True
)

Canonical Correlation Analysis.

PLSCanonical { .api }

from sklearn.cross_decomposition import PLSCanonical

PLSCanonical(
    n_components: int = 2,
    scale: bool = True,
    algorithm: str = "nipals",
    max_iter: int = 500,
    tol: float = 1e-06,
    copy: bool = True
)

Partial Least Squares transformer and regressor.

PLSRegression { .api }

from sklearn.cross_decomposition import PLSRegression

PLSRegression(
    n_components: int = 2,
    scale: bool = True,
    max_iter: int = 500,
    tol: float = 1e-06,
    copy: bool = True
)

PLS regression.

PLSSVD { .api }

from sklearn.cross_decomposition import PLSSVD

PLSSVD(
    n_components: int = 2,
    scale: bool = True,
    copy: bool = True
)

Partial Least Squares SVD.

Outlier Detection

Outlier detection algorithms are spread across several modules; Local Outlier Factor lives in sklearn.neighbors:

LocalOutlierFactor { .api }

from sklearn.neighbors import LocalOutlierFactor

LocalOutlierFactor(
    n_neighbors: int = 20,
    algorithm: str = "auto",
    leaf_size: int = 30,
    metric: str | Callable = "minkowski",
    p: int = 2,
    metric_params: dict | None = None,
    contamination: float | str = "auto",
    novelty: bool = False,
    n_jobs: int | None = None
)

Unsupervised Outlier Detection using Local Outlier Factor (LOF).
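A short sketch with one injected outlier (placed at an arbitrary far-away point): in the default non-novelty mode, labels come from `fit_predict`, with -1 marking outliers.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.RandomState(0)
X = np.vstack([rng.normal(size=(50, 2)),
               [[8.0, 8.0]]])  # one obvious outlier

lof = LocalOutlierFactor(n_neighbors=10)
labels = lof.fit_predict(X)  # -1 marks outliers
print(labels[-1])  # the injected point is flagged
```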

Note: Additional outlier detection methods are available in:

  • sklearn.ensemble.IsolationForest - Isolation Forest Algorithm
  • sklearn.svm.OneClassSVM - One-Class Support Vector Machine
  • sklearn.covariance.EllipticEnvelope - Outlier detection for Gaussian data

Install with Tessl CLI

npx tessl i tessl/pypi-scikit-learn
