tessl/pypi-scikit-learn

A comprehensive machine learning library providing supervised and unsupervised learning algorithms with consistent APIs and extensive tools for data preprocessing, model evaluation, and deployment.



Feature Selection Utility

Utilities for ranking and selecting informative features for binary classification using filter scoring, model importances, and recursive elimination.

Capabilities

Filter ranks informative columns

  • With feature matrix [[5, 1, 0, 0], [4, 1, 0, 0], [5, 0, 0, 1], [4, 0, 0, 1], [0, 5, 1, 0], [0, 4, 1, 0], [0, 5, 0, 1], [0, 4, 0, 1]], labels [1, 1, 1, 1, 0, 0, 0, 0], feature names ["signal_primary", "signal_secondary", "noise_a", "noise_b"], and k=2, the returned names are ["signal_primary", "signal_secondary"] in that order. @test
  • Requesting k=10 with the same inputs returns all four feature names sorted by relevance without raising an error. @test
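The filter behaviour above can be sketched as follows. This is one possible approach, not the reference implementation. It assumes every column holds discrete values, so scikit-learn's `mutual_info_classif` is called with `discrete_features=True` to get a deterministic estimate. Rounding the scores and sorting stably are illustrative choices, so tied features keep their original column order.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif


def rank_by_filter(X, y, feature_names, k):
    """Return up to k feature names, highest mutual information first."""
    X = np.asarray(X)
    y = np.asarray(y)
    # Assumption: all columns are discrete, so the exact contingency-table
    # estimate is used rather than the randomized k-NN estimator.
    scores = mutual_info_classif(X, y, discrete_features=True)
    # Round so floating-point noise cannot reorder tied features, then sort
    # stably so ties keep their original column order.
    order = np.argsort(-np.round(scores, 8), kind="stable")
    k = max(0, min(k, X.shape[1]))  # cap k at the available feature count
    return [feature_names[i] for i in order[:k]]
```

On the example data, both signal columns determine the label completely and therefore tie on mutual information. The stable sort is what puts `signal_primary` ahead of `signal_secondary`.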

Model-based pruning respects importance threshold

  • Given feature matrix [[9, 2, 0, 0], [8, 2, 0, 0], [10, 2, 0, 0], [0, 1, 9, 2], [0, 1, 8, 2], [0, 1, 10, 2]], labels [1, 1, 1, 0, 0, 0], feature names ["strong_left", "weak_helper", "strong_right", "filler"], and top_fraction=0.5, the selector keeps exactly ["strong_left", "strong_right"] (order by learned importance, highest first). @test
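A minimal sketch of model-based pruning, assuming a default `LogisticRegression` whose absolute coefficients stand in for feature importance. The estimator choice is an assumption; the spec only requires a supervised model that exposes importances or weights. Tree-based importances would be a poor fit for this particular example, because every column separates the classes perfectly on its own.

```python
import math

import numpy as np
from sklearn.linear_model import LogisticRegression


def select_by_importance(X, y, feature_names, top_fraction):
    """Keep the top fraction of features by absolute model weight."""
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    X = np.asarray(X, dtype=float)
    # Assumption: a linear model on the raw (unscaled) features.
    model = LogisticRegression().fit(X, np.asarray(y))
    importance = np.abs(model.coef_[0])
    # Round up so at least one feature is always retained.
    n_keep = max(1, math.ceil(top_fraction * X.shape[1]))
    order = np.argsort(-importance, kind="stable")
    return [feature_names[i] for i in order[:n_keep]]
```

With the example data, the two large-magnitude columns receive the largest weights. Their relative order depends on the fitted coefficients and may vary with the estimator or solver.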

Recursive elimination keeps strongest pair

  • Using feature matrix [[2, 0, 7], [2, 0, 6], [0, 3, 0], [0, 3, 1]], labels [1, 1, 0, 0], feature names ["left", "right", "noise"], and keep=2, recursive elimination returns ["left", "right"] ordered from strongest to weakest retained feature. @test
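Recursive elimination can be sketched with scikit-learn's `RFE` wrapped around a `LogisticRegression`. Standardizing the columns first is an assumption of this sketch: without scaling, the large values in the `noise` column would give it the largest coefficient and it would survive elimination. Retained features are ordered by the absolute coefficients of the final fitted estimator, with ties kept in original column order.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler


def rfe_select(X, y, feature_names, keep):
    """Return the names retained by RFE, strongest first."""
    X = np.asarray(X, dtype=float)
    keep = max(1, min(keep, X.shape[1]))  # cap at the feature count
    # Assumption: standardize so coefficient size is comparable across columns.
    X_scaled = StandardScaler().fit_transform(X)
    rfe = RFE(LogisticRegression(), n_features_to_select=keep, step=1)
    rfe.fit(X_scaled, np.asarray(y))
    kept = np.flatnonzero(rfe.support_)  # original column indices retained
    # estimator_ is refit on the retained columns only.
    strength = np.abs(rfe.estimator_.coef_[0])
    order = np.argsort(-np.round(strength, 6), kind="stable")
    return [feature_names[kept[i]] for i in order]
```

After standardization, `left` and `right` are exact mirror images of each other, so they receive equal-magnitude weights and the tie-break decides their order.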

Implementation

@generates

API

from typing import Sequence, List
import numpy as np

def rank_by_filter(X: np.ndarray, y: Sequence[int], feature_names: Sequence[str], k: int) -> List[str]:
    """Return the top-k feature names ranked by a univariate mutual-information filter in descending order; cap k at the available feature count."""

def select_by_importance(X: np.ndarray, y: Sequence[int], feature_names: Sequence[str], top_fraction: float) -> List[str]:
    """Fit a supervised model that exposes feature importances or weights, then keep the highest-importance fraction of feature names, ordered highest first; top_fraction must lie in (0, 1], and the number kept is rounded up so that at least one feature is retained."""

def rfe_select(X: np.ndarray, y: Sequence[int], feature_names: Sequence[str], keep: int) -> List[str]:
    """Perform recursive feature elimination backed by a supervised estimator and return the retained feature names ordered best-to-worst; if keep exceeds the number of available features, all features are retained."""

Dependencies { .dependencies }

scikit-learn { .dependency }

Provides feature scoring filters, model-based feature selectors, mutual-information metrics, and recursive elimination utilities for supervised data.

Install with Tessl CLI

npx tessl i tessl/pypi-scikit-learn