Workspace: tessl
Visibility: Public
Describes: pkg:pypi/scikit-learn@1.7.x

tessl/pypi-scikit-learn

tessl install tessl/pypi-scikit-learn@1.7.0

A comprehensive machine learning library providing supervised and unsupervised learning algorithms with consistent APIs and extensive tools for data preprocessing, model evaluation, and deployment.

Agent Success: 87% (agent success rate when using this tile)
Improvement: 0.99x (agent success rate when using this tile compared to baseline)
Baseline: 88% (agent success rate without this tile)

evals/scenario-3/task.md

Feature Selection Utility

Utilities for ranking and selecting informative features for binary classification using filter scoring, model importances, and recursive elimination.

Capabilities

Filter ranks informative columns

  • With feature matrix [[5, 1, 0, 0], [4, 1, 0, 0], [5, 0, 0, 1], [4, 0, 0, 1], [0, 5, 1, 0], [0, 4, 1, 0], [0, 5, 0, 1], [0, 4, 0, 1]], labels [1, 1, 1, 1, 0, 0, 0, 0], feature names ["signal_primary", "signal_secondary", "noise_a", "noise_b"], and k=2, the returned names are ["signal_primary", "signal_secondary"] in that order. @test
  • Requesting k=10 with the same inputs returns all four feature names sorted by relevance without raising an error. @test
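The filter behaviour described above could be sketched as follows. This is a minimal illustration, not the reference solution: the name `rank_by_filter_sketch` is hypothetical, and treating every column as discrete (`discrete_features=True`), rounding the scores, and breaking ties by original column order are assumptions made here so that the ranking is deterministic on small integer data.

```python
from typing import List, Sequence

import numpy as np
from sklearn.feature_selection import mutual_info_classif


def rank_by_filter_sketch(
    X: np.ndarray, y: Sequence[int], feature_names: Sequence[str], k: int
) -> List[str]:
    """Hypothetical sketch: rank features by mutual information with the labels."""
    X = np.asarray(X)
    # Assumption: columns hold small integer values, so score them as discrete.
    scores = mutual_info_classif(
        X, np.asarray(y), discrete_features=True, random_state=0
    )
    # Round away floating-point noise so exact ties fall back to column order.
    scores = np.round(scores, 10)
    # A stable sort on the negated scores gives descending order with stable ties.
    order = np.argsort(-scores, kind="stable")
    # Cap k at the number of available features instead of raising.
    k = min(k, X.shape[1])
    return [feature_names[i] for i in order[:k]]


if __name__ == "__main__":
    X = [[5, 1, 0, 0], [4, 1, 0, 0], [5, 0, 0, 1], [4, 0, 0, 1],
         [0, 5, 1, 0], [0, 4, 1, 0], [0, 5, 0, 1], [0, 4, 0, 1]]
    y = [1, 1, 1, 1, 0, 0, 0, 0]
    names = ["signal_primary", "signal_secondary", "noise_a", "noise_b"]
    print(rank_by_filter_sketch(X, y, names, k=2))
```

In this example both signal columns determine the label exactly, so their scores tie and the order between them comes from the tie-breaking rule rather than from the scores themselves.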

Model-based pruning respects importance threshold

  • Given feature matrix [[9, 2, 0, 0], [8, 2, 0, 0], [10, 2, 0, 0], [0, 1, 9, 2], [0, 1, 8, 2], [0, 1, 10, 2]], labels [1, 1, 1, 0, 0, 0], feature names ["strong_left", "weak_helper", "strong_right", "filler"], and top_fraction=0.5, the selector keeps exactly ["strong_left", "strong_right"] (order by learned importance, highest first). @test
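One way to approach the importance threshold is sketched below. It assumes a logistic regression whose absolute coefficients stand in for "importance"; that model choice, the function name `select_by_importance_sketch`, and the `ValueError` on an out-of-range fraction are all assumptions for illustration. A tree-based model would also expose importances, but in this example every column separates the classes perfectly, so tree importances would not reliably single out the two strong columns.

```python
import math
from typing import List, Sequence

import numpy as np
from sklearn.linear_model import LogisticRegression


def select_by_importance_sketch(
    X: np.ndarray, y: Sequence[int], feature_names: Sequence[str], top_fraction: float
) -> List[str]:
    """Hypothetical sketch: keep the top fraction of features by |coefficient|."""
    if not 0.0 < top_fraction <= 1.0:
        raise ValueError("top_fraction must lie in (0, 1]")
    X = np.asarray(X, dtype=float)
    # Assumption: unscaled inputs, so larger-magnitude columns earn larger weights.
    model = LogisticRegression(max_iter=1000).fit(X, np.asarray(y))
    # For binary problems coef_ has shape (1, n_features); flatten it.
    importance = np.abs(model.coef_).ravel()
    # Round the count up and never keep fewer than one feature.
    n_keep = max(1, math.ceil(top_fraction * X.shape[1]))
    order = np.argsort(-importance, kind="stable")
    return [feature_names[i] for i in order[:n_keep]]


if __name__ == "__main__":
    X = [[9, 2, 0, 0], [8, 2, 0, 0], [10, 2, 0, 0],
         [0, 1, 9, 2], [0, 1, 8, 2], [0, 1, 10, 2]]
    y = [1, 1, 1, 0, 0, 0]
    names = ["strong_left", "weak_helper", "strong_right", "filler"]
    print(select_by_importance_sketch(X, y, names, top_fraction=0.5))
```

Which of the two strong columns comes first depends on the fitted weights, so the order between them is best checked empirically rather than assumed.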

Recursive elimination keeps strongest pair

  • Using feature matrix [[2, 0, 7], [2, 0, 6], [0, 3, 0], [0, 3, 1]], labels [1, 1, 0, 0], feature names ["left", "right", "noise"], and keep=2, recursive elimination returns ["left", "right"] ordered from strongest to weakest retained feature. @test
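Recursive elimination might be sketched with scikit-learn's `RFE` as shown below. The estimator (logistic regression), the standardization step, and the name `rfe_select_sketch` are assumptions made for this illustration. Standardizing matters here because the elimination order is driven by coefficient magnitude, which otherwise favours columns with large raw values. Since `RFE` gives every retained feature the same rank, the sketch orders the survivors by the absolute coefficients of the final fitted estimator.

```python
from typing import List, Sequence

import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler


def rfe_select_sketch(
    X: np.ndarray, y: Sequence[int], feature_names: Sequence[str], keep: int
) -> List[str]:
    """Hypothetical sketch: recursive elimination, survivors ordered by |coefficient|."""
    # Assumption: put columns on a common scale before comparing coefficients.
    X = StandardScaler().fit_transform(np.asarray(X, dtype=float))
    # Keep the requested number when available, otherwise every feature.
    n_keep = max(1, min(keep, X.shape[1]))
    rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=n_keep, step=1)
    rfe.fit(X, np.asarray(y))
    # Column indices of the features that survived elimination.
    kept = np.flatnonzero(rfe.support_)
    # rfe.estimator_ is refit on the survivors only, so coef_ aligns with `kept`.
    strength = np.round(np.abs(rfe.estimator_.coef_).ravel(), 10)
    order = np.argsort(-strength, kind="stable")
    return [feature_names[i] for i in kept[order]]


if __name__ == "__main__":
    X = [[2, 0, 7], [2, 0, 6], [0, 3, 0], [0, 3, 1]]
    y = [1, 1, 0, 0]
    names = ["left", "right", "noise"]
    print(rfe_select_sketch(X, y, names, keep=2))
```

The retained set depends on both the estimator and the scaling, so a different backing model may eliminate a different column on the same data.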

Implementation

@generates

API

from typing import Sequence, List
import numpy as np

def rank_by_filter(X: np.ndarray, y: Sequence[int], feature_names: Sequence[str], k: int) -> List[str]:
    """Return the top-k feature names ranked by a univariate mutual-information filter in descending order; cap k at the available feature count."""

def select_by_importance(X: np.ndarray, y: Sequence[int], feature_names: Sequence[str], top_fraction: float) -> List[str]:
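    """Fit a supervised model that exposes feature importances or weights, then keep the highest-importance fraction of feature names, ordered by importance (highest first); top_fraction must lie in (0, 1], and the number of kept features is rounded up so that at least one feature is always returned."""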

def rfe_select(X: np.ndarray, y: Sequence[int], feature_names: Sequence[str], keep: int) -> List[str]:
    """Perform recursive feature elimination backed by a supervised estimator and return the retained feature names ordered best-to-worst, keeping the requested number when available."""

Dependencies { .dependencies }

scikit-learn { .dependency }

Provides feature scoring filters, model-based feature selectors, mutual-information metrics, and recursive elimination utilities for supervised data.