tessl/pypi-scikit-learn-intelex

Intel Extension for Scikit-learn providing hardware-accelerated implementations of scikit-learn algorithms optimized for Intel CPUs and GPUs.

Support Vector Machines

High-performance Support Vector Machine implementations with Intel hardware acceleration. Optimized kernel computations and solver routines provide significant speedups for classification and regression tasks.

Capabilities

Support Vector Classifier (SVC)

Intel-accelerated Support Vector Classifier with optimized kernel computations and SMO solver.

class SVC:
    """
    Support Vector Classifier with Intel optimization.
    
    Provides significant speedup over standard scikit-learn implementation
    through optimized kernel computations and accelerated SMO solver.
    """
    
    def __init__(
        self,
        C=1.0,
        kernel='rbf',
        degree=3,
        gamma='scale',
        coef0=0.0,
        shrinking=True,
        probability=False,
        tol=1e-3,
        cache_size=200,
        class_weight=None,
        verbose=False,
        max_iter=-1,
        decision_function_shape='ovr',
        break_ties=False,
        random_state=None
    ):
        """
        Initialize Support Vector Classifier.
        
        Parameters:
            C (float): Regularization parameter
            kernel (str): Kernel type ('linear', 'poly', 'rbf', 'sigmoid', 'precomputed')
            degree (int): Degree for poly kernel
            gamma (str or float): Kernel coefficient ('scale', 'auto', or float)
            coef0 (float): Independent term for poly/sigmoid kernels
            shrinking (bool): Whether to use shrinking heuristic
            probability (bool): Whether to enable probability estimates
            tol (float): Tolerance for stopping criterion
            cache_size (float): Size of kernel cache (MB)
            class_weight (dict or str): Class weights ('balanced' or dict)
            verbose (bool): Enable verbose output
            max_iter (int): Hard limit on iterations (-1 for no limit)
            decision_function_shape (str): Shape of decision function ('ovo', 'ovr')
            break_ties (bool): Break ties according to confidence values
            random_state (int): Random state for reproducibility
        """
    
    def fit(self, X, y, sample_weight=None):
        """
        Fit the SVM model according to training data.
        
        Parameters:
            X (array-like): Training vectors of shape (n_samples, n_features)
            y (array-like): Target values of shape (n_samples,)
            sample_weight (array-like): Per-sample weights
            
        Returns:
            self: Fitted estimator
        """
    
    def predict(self, X):
        """
        Perform classification on samples in X.
        
        Parameters:
            X (array-like): Samples of shape (n_samples, n_features)
            
        Returns:
            array: Class labels for samples
        """
    
    def predict_proba(self, X):
        """
        Compute probabilities of possible outcomes for samples in X.
        
        Only available when probability=True.
        
        Parameters:
            X (array-like): Samples
            
        Returns:
            array: Probability estimates of shape (n_samples, n_classes)
        """
    
    def predict_log_proba(self, X):
        """
        Compute log probabilities of possible outcomes for samples in X.
        
        Only available when probability=True.
        
        Parameters:
            X (array-like): Samples
            
        Returns:
            array: Log probability estimates
        """
    
    def decision_function(self, X):
        """
        Evaluate decision function for samples in X.
        
        Parameters:
            X (array-like): Samples
            
        Returns:
            array: Decision function values
        """
    
    def score(self, X, y, sample_weight=None):
        """
        Return mean accuracy on given test data and labels.
        
        Parameters:
            X (array-like): Test samples
            y (array-like): True labels
            sample_weight (array-like): Sample weights
            
        Returns:
            float: Mean accuracy score
        """
    
    # Attributes available after fitting
    support_: ...           # Indices of support vectors
    support_vectors_: ...   # Support vectors
    n_support_: ...         # Number of support vectors for each class
    dual_coef_: ...         # Coefficients of support vectors in decision function
    coef_: ...              # Weights assigned to features (linear kernel only)
    intercept_: ...         # Constants in decision function
    classes_: ...           # Class labels
    gamma_: ...             # Current gamma value
    class_weight_: ...      # Weights assigned to classes
    shape_fit_: ...         # Array dimensions of training vector X
    n_features_in_: ...     # Number of features seen during fit
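
The fitted attributes above are the pieces the decision function is built from. As a pure-Python illustration (not the library's internal code), a binary RBF decision value can be reconstructed from `support_vectors_`, `dual_coef_`, and `intercept_`:

```python
import math

def rbf_kernel(x, z, gamma):
    """K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)

def decision_value(x, support_vectors, dual_coef, intercept, gamma):
    """f(x) = sum_i dual_coef[i] * K(sv[i], x) + intercept (binary case)."""
    return sum(a * rbf_kernel(sv, x, gamma)
               for a, sv in zip(dual_coef, support_vectors)) + intercept

# Toy values standing in for a fitted model's attributes
svs = [[0.0, 0.0], [1.0, 1.0]]
coefs = [1.0, -1.0]
b = 0.5
print(decision_value([0.0, 0.0], svs, coefs, b, gamma=1.0))
```

The sign of this value determines the predicted class, which is why prediction cost grows with the number of support vectors.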

Support Vector Regressor (SVR)

Intel-accelerated Support Vector Regressor for continuous target prediction.

class SVR:
    """
    Support Vector Regressor with Intel optimization.
    
    Efficient regression using support vector machines with optimized
    kernel computations and accelerated solver algorithms.
    """
    
    def __init__(
        self,
        kernel='rbf',
        degree=3,
        gamma='scale',
        coef0=0.0,
        tol=1e-3,
        C=1.0,
        epsilon=0.1,
        shrinking=True,
        cache_size=200,
        verbose=False,
        max_iter=-1
    ):
        """
        Initialize Support Vector Regressor.
        
        Parameters:
            kernel (str): Kernel type ('linear', 'poly', 'rbf', 'sigmoid', 'precomputed')
            degree (int): Degree for poly kernel
            gamma (str or float): Kernel coefficient ('scale', 'auto', or float)
            coef0 (float): Independent term for poly/sigmoid kernels
            tol (float): Tolerance for stopping criterion
            C (float): Regularization parameter
            epsilon (float): Epsilon parameter in epsilon-SVR model
            shrinking (bool): Whether to use shrinking heuristic
            cache_size (float): Size of kernel cache (MB)
            verbose (bool): Enable verbose output
            max_iter (int): Hard limit on iterations (-1 for no limit)
        """
    
    def fit(self, X, y, sample_weight=None):
        """
        Fit the SVM model according to training data.
        
        Parameters:
            X (array-like): Training vectors of shape (n_samples, n_features)
            y (array-like): Target values of shape (n_samples,)
            sample_weight (array-like): Per-sample weights
            
        Returns:
            self: Fitted estimator
        """
    
    def predict(self, X):
        """
        Perform regression on samples in X.
        
        Parameters:
            X (array-like): Samples of shape (n_samples, n_features)
            
        Returns:
            array: Predicted values for samples
        """
    
    def score(self, X, y, sample_weight=None):
        """
        Return the coefficient of determination R^2 of the prediction.
        
        Parameters:
            X (array-like): Test samples
            y (array-like): True values
            sample_weight (array-like): Sample weights
            
        Returns:
            float: R^2 score
        """
    
    # Attributes available after fitting
    support_: ...           # Indices of support vectors
    support_vectors_: ...   # Support vectors
    dual_coef_: ...         # Coefficients of support vectors in decision function
    coef_: ...              # Weights assigned to features (linear kernel only)
    intercept_: ...         # Constants in decision function
    gamma_: ...             # Current gamma value
    shape_fit_: ...         # Array dimensions of training vector X
    n_features_in_: ...     # Number of features seen during fit
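
The epsilon parameter defines a tube around the regression function inside which residuals cost nothing; only points outside the tube become support vectors. A pure-Python sketch of the epsilon-insensitive loss (an illustration, not library code):

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """max(0, |y - f(x)| - epsilon): residuals inside the epsilon tube cost nothing."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

# A residual of 0.05 sits inside the default 0.1 tube; 0.3 does not.
print(epsilon_insensitive_loss(1.0, 1.05))
print(epsilon_insensitive_loss(1.0, 1.3))
```

Widening epsilon therefore tends to shrink the support set at the cost of a coarser fit.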

Nu Support Vector Classifier (NuSVC)

Intel-accelerated Nu Support Vector Classifier using nu-SVM formulation.

class NuSVC:
    """
    Nu Support Vector Classifier with Intel optimization.
    
    Similar to SVC but uses nu parameter instead of C for controlling
    the number of support vectors and training errors.
    """
    
    def __init__(
        self,
        nu=0.5,
        kernel='rbf',
        degree=3,
        gamma='scale',
        coef0=0.0,
        shrinking=True,
        probability=False,
        tol=1e-3,
        cache_size=200,
        class_weight=None,
        verbose=False,
        max_iter=-1,
        decision_function_shape='ovr',
        break_ties=False,
        random_state=None
    ):
        """
        Initialize Nu Support Vector Classifier.
        
        Parameters:
            nu (float): Upper bound on fraction of margin errors (0 < nu <= 1)
            kernel (str): Kernel type ('linear', 'poly', 'rbf', 'sigmoid', 'precomputed')
            degree (int): Degree for poly kernel
            gamma (str or float): Kernel coefficient ('scale', 'auto', or float)
            coef0 (float): Independent term for poly/sigmoid kernels
            shrinking (bool): Whether to use shrinking heuristic
            probability (bool): Whether to enable probability estimates
            tol (float): Tolerance for stopping criterion
            cache_size (float): Size of kernel cache (MB)
            class_weight (dict or str): Class weights ('balanced' or dict)
            verbose (bool): Enable verbose output
            max_iter (int): Hard limit on iterations (-1 for no limit)
            decision_function_shape (str): Shape of decision function ('ovo', 'ovr')
            break_ties (bool): Break ties according to confidence values
            random_state (int): Random state for reproducibility
        """
    
    def fit(self, X, y, sample_weight=None):
        """
        Fit the Nu-SVM model according to training data.
        
        Parameters:
            X (array-like): Training vectors of shape (n_samples, n_features)
            y (array-like): Target values of shape (n_samples,)
            sample_weight (array-like): Per-sample weights
            
        Returns:
            self: Fitted estimator
        """
    
    def predict(self, X):
        """
        Perform classification on samples in X.
        
        Parameters:
            X (array-like): Samples of shape (n_samples, n_features)
            
        Returns:
            array: Class labels for samples
        """
    
    def predict_proba(self, X):
        """
        Compute probabilities of possible outcomes for samples in X.
        
        Only available when probability=True.
        
        Parameters:
            X (array-like): Samples
            
        Returns:
            array: Probability estimates of shape (n_samples, n_classes)
        """
    
    def predict_log_proba(self, X):
        """
        Compute log probabilities of possible outcomes for samples in X.
        
        Only available when probability=True.
        
        Parameters:
            X (array-like): Samples
            
        Returns:
            array: Log probability estimates
        """
    
    def decision_function(self, X):
        """
        Evaluate decision function for samples in X.
        
        Parameters:
            X (array-like): Samples
            
        Returns:
            array: Decision function values
        """
    
    def score(self, X, y, sample_weight=None):
        """
        Return mean accuracy on given test data and labels.
        
        Parameters:
            X (array-like): Test samples
            y (array-like): True labels
            sample_weight (array-like): Sample weights
            
        Returns:
            float: Mean accuracy score
        """
    
    # Attributes available after fitting
    support_: ...           # Indices of support vectors
    support_vectors_: ...   # Support vectors
    n_support_: ...         # Number of support vectors for each class
    dual_coef_: ...         # Coefficients of support vectors in decision function
    coef_: ...              # Weights assigned to features (linear kernel only)
    intercept_: ...         # Constants in decision function
    classes_: ...           # Class labels
    gamma_: ...             # Current gamma value
    class_weight_: ...      # Weights assigned to classes
    shape_fit_: ...         # Array dimensions of training vector X
    n_features_in_: ...     # Number of features seen during fit
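
The default gamma='scale' follows scikit-learn's convention gamma = 1 / (n_features * Var(X)), where the variance is taken over every entry of the training matrix. A small pure-Python check of that formula (an illustration, not library code):

```python
def gamma_scale(X):
    """gamma = 1 / (n_features * Var(X)), variance over all entries of X."""
    values = [v for row in X for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    n_features = len(X[0])
    return 1.0 / (n_features * var)

X = [[0.0, 2.0], [2.0, 0.0]]  # variance of {0, 2, 2, 0} is 1.0
print(gamma_scale(X))
```

Because the formula divides by the data variance, 'scale' keeps the RBF kernel width sensible whether or not features were standardized first.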

Nu Support Vector Regressor (NuSVR)

Intel-accelerated Nu Support Vector Regressor using nu-SVM formulation.

class NuSVR:
    """
    Nu Support Vector Regressor with Intel optimization.
    
    Similar to SVR but uses nu parameter to control the number of
    support vectors in the regression model.
    """
    
    def __init__(
        self,
        nu=0.5,
        C=1.0,
        kernel='rbf',
        degree=3,
        gamma='scale',
        coef0=0.0,
        shrinking=True,
        tol=1e-3,
        cache_size=200,
        verbose=False,
        max_iter=-1
    ):
        """
        Initialize Nu Support Vector Regressor.
        
        Parameters:
            nu (float): Upper bound on fraction of training errors (0 < nu <= 1)
            C (float): Regularization parameter
            kernel (str): Kernel type ('linear', 'poly', 'rbf', 'sigmoid', 'precomputed')
            degree (int): Degree for poly kernel
            gamma (str or float): Kernel coefficient ('scale', 'auto', or float)
            coef0 (float): Independent term for poly/sigmoid kernels
            shrinking (bool): Whether to use shrinking heuristic
            tol (float): Tolerance for stopping criterion
            cache_size (float): Size of kernel cache (MB)
            verbose (bool): Enable verbose output
            max_iter (int): Hard limit on iterations (-1 for no limit)
        """
    
    def fit(self, X, y, sample_weight=None):
        """
        Fit the Nu-SVM model according to training data.
        
        Parameters:
            X (array-like): Training vectors of shape (n_samples, n_features)
            y (array-like): Target values of shape (n_samples,)
            sample_weight (array-like): Per-sample weights
            
        Returns:
            self: Fitted estimator
        """
    
    def predict(self, X):
        """
        Perform regression on samples in X.
        
        Parameters:
            X (array-like): Samples of shape (n_samples, n_features)
            
        Returns:
            array: Predicted values for samples
        """
    
    def score(self, X, y, sample_weight=None):
        """
        Return the coefficient of determination R^2 of the prediction.
        
        Parameters:
            X (array-like): Test samples
            y (array-like): True values
            sample_weight (array-like): Sample weights
            
        Returns:
            float: R^2 score
        """
    
    # Attributes available after fitting
    support_: ...           # Indices of support vectors
    support_vectors_: ...   # Support vectors
    dual_coef_: ...         # Coefficients of support vectors in decision function
    coef_: ...              # Weights assigned to features (linear kernel only)
    intercept_: ...         # Constants in decision function
    gamma_: ...             # Current gamma value
    shape_fit_: ...         # Array dimensions of training vector X
    n_features_in_: ...     # Number of features seen during fit
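
The nu parameter has a concrete combinatorial meaning: for n training samples, nu * n upper-bounds the number of margin errors and lower-bounds the number of support vectors. A tiny pure-Python helper (an illustration of the property, not library code) makes the bounds explicit:

```python
import math

def nu_bounds(nu, n_samples):
    """For nu-SVM: at most nu*n margin errors, at least nu*n support vectors."""
    if not 0.0 < nu <= 1.0:
        raise ValueError("nu must be in (0, 1]")
    return {
        "max_margin_errors": math.floor(nu * n_samples),
        "min_support_vectors": math.ceil(nu * n_samples),
    }

# With nu=0.5 and 640 training samples, expect at least 320 support vectors.
print(nu_bounds(0.5, 640))
```

This is why larger nu values in the loops below produce larger support-vector fractions.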

Usage Examples

Support Vector Classification

import numpy as np
from sklearnex.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Generate sample data
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                          n_redundant=10, n_classes=3, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Scale features for better SVM performance
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Create and fit SVC
svc = SVC(kernel='rbf', C=1.0, gamma='scale', probability=True, random_state=42)
svc.fit(X_train_scaled, y_train)

# Make predictions
y_pred = svc.predict(X_test_scaled)
y_proba = svc.predict_proba(X_test_scaled)
decision_scores = svc.decision_function(X_test_scaled)

print(f"Accuracy: {svc.score(X_test_scaled, y_test):.3f}")
print(f"Classes: {svc.classes_}")
print(f"Number of support vectors: {svc.n_support_}")
print(f"Total support vectors: {len(svc.support_)}")
print(f"Decision function shape: {decision_scores.shape}")

# Linear SVM example
svc_linear = SVC(kernel='linear', C=1.0)
svc_linear.fit(X_train_scaled, y_train)

print(f"Linear SVM accuracy: {svc_linear.score(X_test_scaled, y_test):.3f}")
print(f"Linear coefficients shape: {svc_linear.coef_.shape}")
print(f"Intercept: {svc_linear.intercept_}")

Support Vector Regression

import numpy as np
from sklearnex.svm import SVR
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error, r2_score

# Generate sample data
X, y = make_regression(n_samples=1000, n_features=10, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Scale features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Create and fit SVR
svr = SVR(kernel='rbf', C=100, gamma=0.1, epsilon=0.1)
svr.fit(X_train_scaled, y_train)

# Make predictions
y_pred = svr.predict(X_test_scaled)

print(f"R² Score: {svr.score(X_test_scaled, y_test):.3f}")
print(f"MSE: {mean_squared_error(y_test, y_pred):.3f}")
print(f"Number of support vectors: {len(svr.support_)}")

# Linear SVR example
svr_linear = SVR(kernel='linear', C=1.0, epsilon=0.2)
svr_linear.fit(X_train_scaled, y_train)
y_pred_linear = svr_linear.predict(X_test_scaled)

print(f"Linear SVR R² Score: {svr_linear.score(X_test_scaled, y_test):.3f}")
print(f"Linear coefficients shape: {svr_linear.coef_.shape}")

# Polynomial SVR example
svr_poly = SVR(kernel='poly', degree=3, C=1.0, epsilon=0.1)
svr_poly.fit(X_train_scaled, y_train)
y_pred_poly = svr_poly.predict(X_test_scaled)

print(f"Polynomial SVR R² Score: {svr_poly.score(X_test_scaled, y_test):.3f}")

Nu Support Vector Machines

import numpy as np
from sklearnex.svm import NuSVC, NuSVR
from sklearn.datasets import make_classification, make_regression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Nu-SVC Example
X_class, y_class = make_classification(n_samples=800, n_features=15, n_classes=2, random_state=42)
X_train_c, X_test_c, y_train_c, y_test_c = train_test_split(X_class, y_class, test_size=0.2, random_state=42)

scaler_c = StandardScaler()
X_train_c_scaled = scaler_c.fit_transform(X_train_c)
X_test_c_scaled = scaler_c.transform(X_test_c)

# Nu-SVC with different nu values
for nu in [0.1, 0.3, 0.5, 0.7]:
    nu_svc = NuSVC(nu=nu, kernel='rbf', probability=True, random_state=42)
    nu_svc.fit(X_train_c_scaled, y_train_c)
    
    accuracy = nu_svc.score(X_test_c_scaled, y_test_c)
    n_sv = len(nu_svc.support_)
    sv_fraction = n_sv / len(X_train_c_scaled)
    
    print(f"Nu={nu}: Accuracy={accuracy:.3f}, Support Vectors={n_sv} ({sv_fraction:.1%})")

# Nu-SVR Example  
X_reg, y_reg = make_regression(n_samples=800, n_features=10, noise=0.1, random_state=42)
X_train_r, X_test_r, y_train_r, y_test_r = train_test_split(X_reg, y_reg, test_size=0.2, random_state=42)

scaler_r = StandardScaler()
X_train_r_scaled = scaler_r.fit_transform(X_train_r)
X_test_r_scaled = scaler_r.transform(X_test_r)

# Nu-SVR with different nu values
for nu in [0.1, 0.3, 0.5, 0.7]:
    nu_svr = NuSVR(nu=nu, kernel='rbf', C=100)
    nu_svr.fit(X_train_r_scaled, y_train_r)
    
    r2 = nu_svr.score(X_test_r_scaled, y_test_r)
    n_sv = len(nu_svr.support_)
    sv_fraction = n_sv / len(X_train_r_scaled)
    
    print(f"Nu={nu}: R²={r2:.3f}, Support Vectors={n_sv} ({sv_fraction:.1%})")

Kernel Comparison and Parameter Tuning

import numpy as np
from sklearnex.svm import SVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler

# Generate sample data
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
                          n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Compare different kernels
kernels = ['linear', 'poly', 'rbf', 'sigmoid']
results = {}

for kernel in kernels:
    if kernel == 'poly':
        svc = SVC(kernel=kernel, degree=3, C=1.0, random_state=42)
    else:
        svc = SVC(kernel=kernel, C=1.0, random_state=42)
    
    svc.fit(X_train_scaled, y_train)
    accuracy = svc.score(X_test_scaled, y_test)
    n_sv = len(svc.support_)
    
    results[kernel] = {'accuracy': accuracy, 'support_vectors': n_sv}
    print(f"{kernel.capitalize()} kernel: Accuracy={accuracy:.3f}, SVs={n_sv}")

# Grid search for best parameters
param_grid = {
    'C': [0.1, 1, 10, 100],
    'gamma': ['scale', 'auto', 0.001, 0.01, 0.1, 1],
    'kernel': ['rbf', 'poly']
}

svc_grid = SVC(random_state=42)
grid_search = GridSearchCV(svc_grid, param_grid, cv=5, scoring='accuracy', n_jobs=-1)
grid_search.fit(X_train_scaled, y_train)

print(f"\nBest parameters: {grid_search.best_params_}")
print(f"Best cross-validation score: {grid_search.best_score_:.3f}")
print(f"Test accuracy: {grid_search.score(X_test_scaled, y_test):.3f}")

# Analyze best model
best_svc = grid_search.best_estimator_
print(f"Best model support vectors: {len(best_svc.support_)}")
print(f"Support vector ratio: {len(best_svc.support_) / len(X_train_scaled):.1%}")

Performance Comparison

import time
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Generate large dataset
X, y = make_classification(n_samples=20000, n_features=20, n_informative=15,
                          n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Intel-optimized version
from sklearnex.svm import SVC as IntelSVC

start_time = time.time()
intel_svc = IntelSVC(kernel='rbf', C=1.0, gamma='scale', random_state=42)
intel_svc.fit(X_train_scaled, y_train)
intel_pred = intel_svc.predict(X_test_scaled)
intel_time = time.time() - start_time

print(f"Intel SVC time: {intel_time:.2f} seconds")
print(f"Intel SVC accuracy: {intel_svc.score(X_test_scaled, y_test):.3f}")
print(f"Intel support vectors: {len(intel_svc.support_)}")

# Standard scikit-learn version (for comparison)
from sklearn.svm import SVC as StandardSVC

start_time = time.time()
standard_svc = StandardSVC(kernel='rbf', C=1.0, gamma='scale', random_state=42)
standard_svc.fit(X_train_scaled, y_train)
standard_pred = standard_svc.predict(X_test_scaled)
standard_time = time.time() - start_time

print(f"Standard SVC time: {standard_time:.2f} seconds")
print(f"Standard SVC accuracy: {standard_svc.score(X_test_scaled, y_test):.3f}")
print(f"Standard support vectors: {len(standard_svc.support_)}")
print(f"Speedup: {standard_time / intel_time:.1f}x")

# Verify results are similar (may have slight differences due to optimization)
print(f"Accuracy difference: {abs(intel_svc.score(X_test_scaled, y_test) - standard_svc.score(X_test_scaled, y_test)):.4f}")

Performance Notes

  • SVM algorithms show significant speedups on datasets with >5000 samples
  • RBF and polynomial kernels benefit most from Intel optimization
  • Training time improvements are most noticeable with complex kernels
  • Prediction performance gains increase with the number of support vectors
  • Memory usage is comparable to standard scikit-learn versions
  • Results maintain high compatibility with scikit-learn implementations

Install with Tessl CLI

npx tessl i tessl/pypi-scikit-learn-intelex
