
tessl/pypi-scikit-learn-intelex

Intel Extension for Scikit-learn providing hardware-accelerated implementations of scikit-learn algorithms optimized for Intel CPUs and GPUs.

Linear Models

Accelerated linear regression, logistic regression, and regularized models with Intel optimization. These implementations provide significant performance improvements for large datasets through optimized matrix operations and Intel hardware acceleration.
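Besides importing estimators from `sklearnex.linear_model` directly, the same acceleration can be enabled for existing scikit-learn code via the extension's patching mechanism. A minimal sketch (requires scikit-learn-intelex to be installed; `patch_sklearn` must run before the estimators are imported):

```python
from sklearnex import patch_sklearn, unpatch_sklearn

# Replace supported scikit-learn estimators with Intel-accelerated
# versions. Call this before importing the estimators themselves.
patch_sklearn()

import numpy as np
from sklearn.linear_model import LinearRegression  # now the accelerated class

X = np.random.rand(200, 5)
y = X @ np.array([1.0, -2.0, 0.5, 3.0, 0.0]) + 0.1

model = LinearRegression().fit(X, y)
print(model.coef_.round(2))

# Restore the stock scikit-learn implementations when done.
unpatch_sklearn()
```

This keeps application code unchanged: only the patch call is added, and unsupported estimators transparently fall back to stock scikit-learn.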

Capabilities

Linear Regression

Intel-accelerated ordinary least squares linear regression with optimized matrix decomposition.

class LinearRegression:
    """
    Ordinary least squares Linear Regression with Intel optimization.
    
    Provides significant speedups through optimized BLAS operations
    and Intel Math Kernel Library (MKL) integration.
    """
    
    def __init__(
        self,
        fit_intercept=True,
        copy_X=True,
        n_jobs=None,
        positive=False
    ):
        """
        Initialize Linear Regression.
        
        Parameters:
            fit_intercept (bool): Whether to calculate intercept
            copy_X (bool): Whether to copy X or overwrite
            n_jobs (int): Number of jobs for computation
            positive (bool): Whether to force positive coefficients
        """
    
    def fit(self, X, y, sample_weight=None):
        """
        Fit linear model.
        
        Parameters:
            X (array-like): Training data of shape (n_samples, n_features)
            y (array-like): Target values
            sample_weight (array-like): Individual sample weights
            
        Returns:
            self: Fitted estimator
        """
    
    def predict(self, X):
        """
        Predict using the linear model.
        
        Parameters:
            X (array-like): Input data
            
        Returns:
            array: Predicted values
        """
    
    def score(self, X, y, sample_weight=None):
        """
        Return coefficient of determination R² of prediction.
        
        Parameters:
            X (array-like): Test samples
            y (array-like): True values
            sample_weight (array-like): Sample weights
            
        Returns:
            float: R² score
        """
    
    # Attributes available after fitting
    coef_: ...         # Estimated coefficients
    intercept_: ...    # Independent term
    n_features_in_: ...  # Number of features during fit
    feature_names_in_: ... # Feature names during fit

Logistic Regression

Intel-optimized logistic regression for classification with accelerated solver algorithms.

class LogisticRegression:
    """
    Logistic Regression classifier with Intel optimization.
    
    Uses optimized solvers and Intel MKL for faster convergence
    and improved performance on large datasets.
    """
    
    def __init__(
        self,
        penalty='l2',
        dual=False,
        tol=1e-4,
        C=1.0,
        fit_intercept=True,
        intercept_scaling=1,
        class_weight=None,
        random_state=None,
        solver='lbfgs',
        max_iter=100,
        multi_class='auto',
        verbose=0,
        warm_start=False,
        n_jobs=None,
        l1_ratio=None
    ):
        """
        Initialize Logistic Regression.
        
        Parameters:
            penalty (str): Regularization penalty ('l1', 'l2', 'elasticnet', 'none')
            dual (bool): Dual or primal formulation
            tol (float): Tolerance for stopping criteria
            C (float): Inverse of regularization strength
            fit_intercept (bool): Whether to fit intercept
            intercept_scaling (float): Scaling for intercept
            class_weight (dict or 'balanced'): Weights associated with classes
            random_state (int): Random state for reproducibility
            solver (str): Algorithm for optimization
            max_iter (int): Maximum iterations
            multi_class (str): Multi-class strategy
            verbose (int): Verbosity level
            warm_start (bool): Whether to reuse previous solution
            n_jobs (int): Number of parallel jobs
            l1_ratio (float): ElasticNet mixing parameter (only used when penalty='elasticnet')
        """
    
    def fit(self, X, y, sample_weight=None):
        """
        Fit the logistic regression model.
        
        Parameters:
            X (array-like): Training data
            y (array-like): Target values
            sample_weight (array-like): Sample weights
            
        Returns:
            self: Fitted estimator
        """
    
    def predict(self, X):
        """
        Predict class labels.
        
        Parameters:
            X (array-like): Input data
            
        Returns:
            array: Predicted class labels
        """
    
    def predict_proba(self, X):
        """
        Predict class probabilities.
        
        Parameters:
            X (array-like): Input data
            
        Returns:
            array: Class probabilities
        """
    
    def predict_log_proba(self, X):
        """
        Predict logarithm of class probabilities.
        
        Parameters:
            X (array-like): Input data
            
        Returns:
            array: Log probabilities
        """
    
    def decision_function(self, X):
        """
        Predict confidence scores.
        
        Parameters:
            X (array-like): Input data
            
        Returns:
            array: Confidence scores
        """
    
    # Attributes available after fitting
    coef_: ...           # Coefficients
    intercept_: ...      # Intercept
    classes_: ...        # Class labels
    n_iter_: ...         # Number of iterations
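Note that the penalty and solver interact: 'l1' and 'elasticnet' penalties are only supported by a subset of solvers (such as 'saga'), while the default 'lbfgs' handles only 'l2' or no penalty. A short sketch of L1-regularized logistic regression producing sparse coefficients (the dataset sizes and `C` value here are illustrative):

```python
import numpy as np
from sklearnex.linear_model import LogisticRegression
from sklearn.datasets import make_classification

# Mostly uninformative features, so L1 regularization has zeros to find
X, y = make_classification(n_samples=500, n_features=30,
                           n_informative=5, random_state=0)

# The L1 penalty requires a solver that supports it, e.g. 'saga'
clf = LogisticRegression(penalty='l1', solver='saga', C=0.1,
                         max_iter=5000, random_state=0)
clf.fit(X, y)

# Many coefficients are driven exactly to zero by the L1 penalty
n_zero = np.sum(clf.coef_ == 0)
print(f"Zeroed coefficients: {n_zero} of {clf.coef_.size}")
```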

Ridge Regression

L2-regularized linear regression with Intel-optimized solvers.
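For reference, the objective minimized (matching scikit-learn's formulation, which this accelerated implementation mirrors) is the least-squares loss plus a squared L2 penalty on the coefficients:

```latex
\min_{w} \; \lVert X w - y \rVert_2^2 + \alpha \lVert w \rVert_2^2
```

Larger `alpha` shrinks coefficients toward zero but, unlike L1 penalties, does not set them exactly to zero.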

class Ridge:
    """
    Ridge regression with Intel optimization.
    
    Linear least squares with L2 regularization, using optimized
    solvers for improved performance on large datasets.
    """
    
    def __init__(
        self,
        alpha=1.0,
        fit_intercept=True,
        copy_X=True,
        max_iter=None,
        tol=1e-4,
        solver='auto',
        positive=False,
        random_state=None
    ):
        """
        Initialize Ridge regression.
        
        Parameters:
            alpha (float): Regularization strength
            fit_intercept (bool): Whether to fit intercept
            copy_X (bool): Whether to copy X
            max_iter (int): Maximum iterations
            tol (float): Tolerance for convergence
            solver (str): Solver algorithm
            positive (bool): Force positive coefficients
            random_state (int): Random state
        """
    
    def fit(self, X, y, sample_weight=None):
        """Fit Ridge regression model."""
    
    def predict(self, X):
        """Predict using Ridge regression."""
    
    def score(self, X, y, sample_weight=None):
        """Return R² score."""
    
    # Attributes
    coef_: ...
    intercept_: ...

Lasso Regression

L1-regularized linear regression with Intel-optimized coordinate descent.
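For reference, the objective minimized (in scikit-learn's formulation, which this implementation follows) adds an L1 penalty to the scaled least-squares loss:

```latex
\min_{w} \; \frac{1}{2\, n_{\text{samples}}} \lVert X w - y \rVert_2^2 + \alpha \lVert w \rVert_1
```

The L1 term drives some coefficients exactly to zero, so Lasso doubles as a feature-selection method.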

class Lasso:
    """
    Lasso regression with Intel optimization.
    
    Linear regression with L1 regularization using optimized
    coordinate descent algorithm.
    """
    
    def __init__(
        self,
        alpha=1.0,
        fit_intercept=True,
        precompute=False,
        copy_X=True,
        max_iter=1000,
        tol=1e-4,
        warm_start=False,
        positive=False,
        random_state=None,
        selection='cyclic'
    ):
        """
        Initialize Lasso regression.
        
        Parameters:
            alpha (float): Regularization strength
            fit_intercept (bool): Whether to fit intercept
            precompute (bool): Whether to use precomputed Gram matrix
            copy_X (bool): Whether to copy X
            max_iter (int): Maximum iterations
            tol (float): Tolerance for convergence
            warm_start (bool): Reuse previous solution
            positive (bool): Force positive coefficients
            random_state (int): Random state
            selection (str): Coordinate update order, 'cyclic' or 'random'
        """
    
    def fit(self, X, y, sample_weight=None, check_input=True):
        """Fit Lasso regression model."""
    
    def predict(self, X):
        """Predict using Lasso regression."""
    
    # Attributes
    coef_: ...
    intercept_: ...
    n_iter_: ...

Elastic Net Regression

Combined L1 and L2 regularization with Intel optimization.
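For reference, the objective minimized (in scikit-learn's formulation, with $\rho$ denoting `l1_ratio`) blends the Lasso and Ridge penalties:

```latex
\min_{w} \; \frac{1}{2\, n_{\text{samples}}} \lVert X w - y \rVert_2^2
+ \alpha \rho \lVert w \rVert_1
+ \frac{\alpha (1 - \rho)}{2} \lVert w \rVert_2^2
```

Setting `l1_ratio=1` recovers the Lasso objective, while `l1_ratio=0` gives a pure L2 penalty (Ridge-like, up to the differing scaling of the squared-error term).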

class ElasticNet:
    """
    Elastic Net regression with Intel optimization.
    
    Linear regression with combined L1 and L2 regularization,
    using optimized coordinate descent solver.
    """
    
    def __init__(
        self,
        alpha=1.0,
        l1_ratio=0.5,
        fit_intercept=True,
        precompute=False,
        max_iter=1000,
        copy_X=True,
        tol=1e-4,
        warm_start=False,
        positive=False,
        random_state=None,
        selection='cyclic'
    ):
        """
        Initialize Elastic Net regression.
        
        Parameters:
            alpha (float): Regularization strength
            l1_ratio (float): Mixing parameter in [0, 1]; 0 gives a pure L2 penalty, 1 a pure L1 penalty
            fit_intercept (bool): Whether to fit intercept
            precompute (bool): Whether to use precomputed Gram matrix
            max_iter (int): Maximum iterations
            copy_X (bool): Whether to copy X
            tol (float): Tolerance for convergence
            warm_start (bool): Reuse previous solution
            positive (bool): Force positive coefficients
            random_state (int): Random state
            selection (str): Coordinate update order, 'cyclic' or 'random'
        """
    
    def fit(self, X, y, sample_weight=None, check_input=True):
        """Fit Elastic Net regression model."""
    
    def predict(self, X):
        """Predict using Elastic Net regression."""
    
    # Attributes
    coef_: ...
    intercept_: ...
    n_iter_: ...

Incremental Linear Regression

Memory-efficient linear regression for streaming data with Intel optimization.

class IncrementalLinearRegression:
    """
    Incremental Linear Regression with Intel optimization.
    
    Allows fitting linear regression incrementally on mini-batches
    of data, useful for large datasets that don't fit in memory.
    """
    
    def __init__(self, fit_intercept=True, copy_X=True):
        """
        Initialize Incremental Linear Regression.
        
        Parameters:
            fit_intercept (bool): Whether to fit intercept
            copy_X (bool): Whether to copy input data
        """
    
    def partial_fit(self, X, y, sample_weight=None):
        """
        Incrementally fit linear regression.
        
        Parameters:
            X (array-like): Training data batch
            y (array-like): Target values batch
            sample_weight (array-like): Sample weights
            
        Returns:
            self: Fitted estimator
        """
    
    def fit(self, X, y, sample_weight=None):
        """
        Fit linear regression from scratch, discarding any state
        accumulated by previous partial_fit calls.
        
        Parameters:
            X (array-like): Training data
            y (array-like): Target values
            sample_weight (array-like): Sample weights
            
        Returns:
            self: Fitted estimator
        """
    
    def predict(self, X):
        """Predict using the linear model."""
    
    def score(self, X, y, sample_weight=None):
        """Return R² score."""
    
    # Attributes
    coef_: ...
    intercept_: ...

Usage Examples

Basic Linear Regression

import numpy as np
from sklearnex.linear_model import LinearRegression
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

# Generate sample data
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and fit model
lr = LinearRegression()
lr.fit(X_train, y_train)

# Make predictions
y_pred = lr.predict(X_test)
r2_score = lr.score(X_test, y_test)

print(f"R² Score: {r2_score:.3f}")
print(f"Coefficients shape: {lr.coef_.shape}")
print(f"Intercept: {lr.intercept_:.3f}")

Logistic Regression Classification

import numpy as np
from sklearnex.linear_model import LogisticRegression
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Generate classification data
X, y = make_classification(n_samples=1000, n_features=20, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and fit logistic regression
lr = LogisticRegression(random_state=42, max_iter=1000)
lr.fit(X_train, y_train)

# Predictions and probabilities
y_pred = lr.predict(X_test)
y_proba = lr.predict_proba(X_test)
accuracy = lr.score(X_test, y_test)

print(f"Accuracy: {accuracy:.3f}")
print(f"Classes: {lr.classes_}")
print(f"Coefficients shape: {lr.coef_.shape}")
print(f"Probabilities shape: {y_proba.shape}")

Regularized Regression Comparison

import numpy as np
from sklearnex.linear_model import Ridge, Lasso, ElasticNet
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

# Generate data with some noise
X, y = make_regression(n_samples=1000, n_features=50, noise=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Compare different regularization methods
models = {
    'Ridge': Ridge(alpha=1.0),
    'Lasso': Lasso(alpha=1.0, max_iter=2000),
    'ElasticNet': ElasticNet(alpha=1.0, l1_ratio=0.5, max_iter=2000)
}

for name, model in models.items():
    model.fit(X_train, y_train)
    score = model.score(X_test, y_test)
    n_nonzero = np.sum(np.abs(model.coef_) > 1e-6)
    
    print(f"{name:12s} - R²: {score:.3f}, Non-zero coefs: {n_nonzero}")

Incremental Learning

import numpy as np
from sklearnex.linear_model import IncrementalLinearRegression
from sklearn.datasets import make_regression

# Generate large dataset
X, y = make_regression(n_samples=10000, n_features=20, noise=0.1, random_state=42)

# Create incremental model
inc_lr = IncrementalLinearRegression()

# Fit in batches
batch_size = 1000
n_batches = len(X) // batch_size

for i in range(n_batches):
    start_idx = i * batch_size
    end_idx = (i + 1) * batch_size
    
    X_batch = X[start_idx:end_idx]
    y_batch = y[start_idx:end_idx]
    
    inc_lr.partial_fit(X_batch, y_batch)
    
    # Evaluate on current batch
    score = inc_lr.score(X_batch, y_batch)
    print(f"Batch {i+1}/{n_batches} - R²: {score:.3f}")

# Final evaluation on full dataset
final_score = inc_lr.score(X, y)
print(f"Final R² score: {final_score:.3f}")

Performance Notes

  • Linear regression shows significant speedups on datasets with >1000 samples
  • Logistic regression benefits most from Intel optimization with large feature spaces
  • Regularized models (Ridge, Lasso, ElasticNet) have optimized coordinate descent
  • Incremental learning maintains performance while reducing memory usage
  • All models maintain numerical stability equivalent to scikit-learn
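To check whether the accelerated path was actually taken for a given call, the extension's verbose mode can be enabled through the standard `logging` module (a sketch; the `'sklearnex'` logger name and the exact message wording come from the extension's verbose mode and may vary between versions):

```python
import logging

# Enable sklearnex verbose mode: each estimator call logs whether it
# ran the accelerated (oneDAL) path or fell back to stock scikit-learn.
logging.basicConfig()
logging.getLogger('sklearnex').setLevel(logging.INFO)

import numpy as np
from sklearnex.linear_model import LinearRegression

X = np.random.rand(5000, 10)
y = X.sum(axis=1)

# The emitted log line indicates which backend handled the fit
model = LinearRegression().fit(X, y)
```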

Install with Tessl CLI

npx tessl i tessl/pypi-scikit-learn-intelex
