
tessl/pypi-skl2onnx

Convert scikit-learn models to ONNX format for cross-platform inference and deployment


Registration and Extensibility

skl2onnx lets you register custom converters, shape calculators, and parsers so that model types it does not cover, including estimators from third-party libraries, can be converted to ONNX. Registered functions plug into the same conversion pipeline as the built-in ones, so an existing converter can also be overridden for a specific model class.

Capabilities

Converter Registration

Register custom conversion functions for new model types or override existing converters.

def update_registered_converter(model, alias, shape_fct, convert_fct,
                               overwrite=True, parser=None, options=None):
    """
    Register or update a converter for a model type.
    
    Parameters:
    - model: class, model class to register the converter for
    - alias: str, alias name under which the converter is registered
    - shape_fct: function, shape calculator that sets the output types and shapes
    - convert_fct: function, conversion function that generates ONNX operators
    - overwrite: bool, whether to overwrite an existing converter (default True)
    - parser: function, custom parser function for the model (optional)
    - options: dict, options the converter accepts, mapping each option name
      to the list of its allowed values (optional)
    """

Parser Registration

Register custom parsing functions that declare which operator represents a model and which output variables it produces.

def update_registered_parser(model, parser_fct):
    """
    Register or update a parser for a model type.
    
    Parameters:
    - model: class, model class to register parser for
    - parser_fct: function, parser with the signature
      (scope, model, inputs, custom_parsers=None) that returns the model's outputs
    """

Model Discovery

Discover supported models and their aliases in the conversion system.

def supported_converters(from_sklearn=False):
    """
    Get list of all supported model converters.
    
    Parameters:
    - from_sklearn: bool, if True return only the converters registered under a 'Sklearn'-prefixed alias, with the prefix removed
    
    Returns:
    - list: Supported model names/aliases
    """

def get_model_alias(model_type):
    """
    Get the alias name for a model type.
    
    Parameters:
    - model_type: class, model class to get alias for
    
    Returns:
    - str: Alias name for the model type
    
    Raises:
    - RuntimeError: If model type is not registered
    """

Supported Models by Category

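The library ships converters for most major scikit-learn model categories. The exact set depends on the installed skl2onnx and scikit-learn versions, so treat the lists below as a guide and call supported_converters() to see what your installation registers.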

Classifiers (60+ Models)

Linear Classifiers

  • LogisticRegression - Logistic regression with various solvers and regularization
  • SGDClassifier - Stochastic gradient descent classifier
  • LinearSVC - Linear support vector classifier
  • Perceptron - Simple perceptron classifier
  • PassiveAggressiveClassifier - Passive-aggressive learning classifier
  • RidgeClassifier - Ridge regression classifier
  • RidgeClassifierCV - Ridge classifier with cross-validation

Tree-Based Classifiers

  • DecisionTreeClassifier - Decision tree classifier
  • RandomForestClassifier - Random forest ensemble classifier
  • ExtraTreesClassifier - Extremely randomized trees classifier
  • GradientBoostingClassifier - Gradient boosting classifier
  • HistGradientBoostingClassifier - Histogram-based gradient boosting

Ensemble Classifiers

  • AdaBoostClassifier - AdaBoost ensemble classifier
  • BaggingClassifier - Bootstrap aggregating classifier
  • VotingClassifier - Soft and hard voting classifier
  • StackingClassifier - Stacking ensemble classifier

Neural Network

  • MLPClassifier - Multi-layer perceptron classifier

Naive Bayes

  • GaussianNB - Gaussian naive Bayes
  • MultinomialNB - Multinomial naive Bayes
  • BernoulliNB - Bernoulli naive Bayes
  • CategoricalNB - Categorical naive Bayes
  • ComplementNB - Complement naive Bayes

Support Vector Machines

  • SVC - C-support vector classifier
  • NuSVC - Nu-support vector classifier

Neighbor-Based

  • KNeighborsClassifier - K-nearest neighbors classifier
  • RadiusNeighborsClassifier - Radius-based neighbors classifier

Meta-Classifiers

  • OneVsRestClassifier - One-vs-rest multiclass strategy
  • OneVsOneClassifier - One-vs-one multiclass strategy
  • CalibratedClassifierCV - Probability calibration with cross-validation
  • OutputCodeClassifier - Error-correcting output code classifier

Regressors (40+ Models)

Linear Regressors

  • LinearRegression - Ordinary least squares regression
  • Ridge - Ridge regression with L2 regularization
  • Lasso - Lasso regression with L1 regularization
  • ElasticNet - Elastic net regression combining L1 and L2
  • Lars - Least angle regression
  • LassoLars - Lasso regression using LARS algorithm
  • OrthogonalMatchingPursuit - Orthogonal matching pursuit
  • BayesianRidge - Bayesian ridge regression
  • ARDRegression - Automatic relevance determination regression
  • SGDRegressor - Stochastic gradient descent regressor
  • PassiveAggressiveRegressor - Passive-aggressive regressor
  • HuberRegressor - Huber robust regression
  • TheilSenRegressor - Theil-Sen robust regression
  • RANSACRegressor - RANSAC robust regression

Tree-Based Regressors

  • DecisionTreeRegressor - Decision tree regressor
  • RandomForestRegressor - Random forest ensemble regressor
  • ExtraTreesRegressor - Extremely randomized trees regressor
  • GradientBoostingRegressor - Gradient boosting regressor
  • HistGradientBoostingRegressor - Histogram-based gradient boosting

Ensemble Regressors

  • AdaBoostRegressor - AdaBoost ensemble regressor
  • BaggingRegressor - Bootstrap aggregating regressor
  • VotingRegressor - Averaging regressor
  • StackingRegressor - Stacking ensemble regressor

Neural Network

  • MLPRegressor - Multi-layer perceptron regressor

Support Vector Machines

  • SVR - Epsilon-support vector regression
  • LinearSVR - Linear support vector regression
  • NuSVR - Nu-support vector regression

Gaussian Processes

  • GaussianProcessRegressor - Gaussian process regression

Specialized Regressors

  • PoissonRegressor - Poisson regression for count data
  • GammaRegressor - Gamma regression for positive continuous targets
  • TweedieRegressor - Tweedie regression for insurance/risk modeling
  • QuantileRegressor - Quantile regression

Preprocessing and Transformers (30+ Models)

Scaling and Normalization

  • StandardScaler - Standardization (zero mean, unit variance)
  • MinMaxScaler - Min-max normalization to [0,1] range
  • RobustScaler - Robust scaling using median and IQR
  • MaxAbsScaler - Scale by maximum absolute value
  • Normalizer - L1, L2, or max normalization
  • QuantileTransformer - Quantile-based scaling
  • PowerTransformer - Power transformations (Box-Cox, Yeo-Johnson)

Encoding

  • OneHotEncoder - One-hot encoding for categorical features
  • OrdinalEncoder - Ordinal encoding for categorical features
  • LabelEncoder - Label encoding for target variables
  • LabelBinarizer - One-vs-all binary encoding of class labels
  • TargetEncoder - Target-based encoding for categorical features

Feature Engineering

  • PolynomialFeatures - Generate polynomial and interaction features
  • FeatureHasher - Hash-based feature vectorization
  • DictVectorizer - Convert dict objects to feature vectors

Text Processing

  • CountVectorizer - Convert text to token count vectors
  • TfidfVectorizer - Convert text to TF-IDF vectors
  • TfidfTransformer - Apply TF-IDF transformation
  • HashingVectorizer - Hash-based text vectorization

Imputation

  • SimpleImputer - Simple imputation strategies (mean, median, mode)
  • KNNImputer - K-nearest neighbors imputation
  • IterativeImputer - Iterative multivariate imputation

Decomposition

  • PCA - Principal component analysis
  • TruncatedSVD - Truncated singular value decomposition
  • KernelPCA - Kernel principal component analysis
  • IncrementalPCA - Incremental principal component analysis
  • FactorAnalysis - Factor analysis
  • FastICA - Independent component analysis
  • NMF - Non-negative matrix factorization
  • LatentDirichletAllocation - Latent Dirichlet allocation

Feature Selection

  • SelectKBest - Select k best features by score
  • SelectPercentile - Select top percentile of features
  • SelectFpr - Select by false positive rate
  • SelectFdr - Select by false discovery rate
  • SelectFwe - Select by family-wise error rate
  • RFE - Recursive feature elimination
  • RFECV - RFE with cross-validation
  • VarianceThreshold - Remove low-variance features
  • GenericUnivariateSelect - Configurable univariate feature selection

Discretization

  • KBinsDiscretizer - K-bins discretization
  • Binarizer - Binary thresholding

Pipelines and Composition

  • Pipeline - Sequential transformer and estimator pipeline
  • FeatureUnion - Concatenate results of multiple transformers
  • ColumnTransformer - Apply transformers to specific columns

Clustering and Outlier Detection

Clustering

  • KMeans - K-means clustering
  • MiniBatchKMeans - Mini-batch K-means clustering

Outlier Detection

  • IsolationForest - Isolation forest for outlier detection
  • LocalOutlierFactor - Local outlier factor
  • OneClassSVM - One-class support vector machine

Mixture Models

  • GaussianMixture - Gaussian mixture model
  • BayesianGaussianMixture - Bayesian Gaussian mixture model

Usage Examples

Registering a Custom Converter

from skl2onnx import update_registered_converter
from skl2onnx.common.data_types import FloatTensorType

# Define custom model
class CustomModel:
    def __init__(self):
        self.coef_ = None
        self.intercept_ = None
    
    def fit(self, X, y):
        # Custom fitting logic
        pass
    
    def predict(self, X):
        # Custom prediction logic
        pass

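# Define shape calculator
def custom_shape_calculator(operator):
    """Set the output type and shape for the custom model."""
    # Shapes live on the variable's type; the result is assigned in place,
    # not returned.
    input_shape = operator.inputs[0].type.shape
    operator.outputs[0].type = FloatTensorType(input_shape)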

# Define converter function
def custom_converter(scope, operator, container):
    """Convert custom model to ONNX operators."""
    # Implementation of ONNX operator generation
    pass

# Register the converter
update_registered_converter(
    CustomModel,
    alias='CustomModel',
    shape_fct=custom_shape_calculator,
    convert_fct=custom_converter
)

Registering a Custom Parser

from skl2onnx import update_registered_parser

def custom_parser(scope, model, inputs, custom_parsers=None):
    """Parse custom model and create operator."""
    # Extract model information and create operator
    pass

# Register the parser
update_registered_parser(CustomModel, custom_parser)

Discovering Supported Models

from skl2onnx import supported_converters, get_model_alias
from sklearn.ensemble import RandomForestClassifier

# Get all supported converters
all_converters = supported_converters()
print(f"Total supported converters: {len(all_converters)}")

# Get sklearn model names without prefix
sklearn_models = supported_converters(from_sklearn=True)
print(f"Supported sklearn models: {len(sklearn_models)}")

# Get alias for specific model
alias = get_model_alias(RandomForestClassifier)
print(f"RandomForestClassifier alias: {alias}")

Custom Converter with Options

def advanced_custom_converter(scope, operator, container):
    """Advanced converter with options support."""
    # Read the options for this model; the dict supplies the default values
    options = container.get_options(
        operator.raw_operator, {'custom_param': 'default_value'})
    custom_param = options['custom_param']
    
    # Generate ONNX operators based on options
    pass

# Register with the allowed values of each option
update_registered_converter(
    CustomModel,
    alias='AdvancedCustomModel',
    shape_fct=custom_shape_calculator,
    convert_fct=advanced_custom_converter,
    options={'custom_param': ['default_value', 'optimized_value']}
)

Extension Guidelines

Converter Function Requirements

  1. Function signature: (scope, operator, container)
  2. Generate ONNX operators using container methods
  3. Handle all model parameters and configurations
  4. Support different data types and shapes
  5. Include proper error handling for edge cases

Shape Calculator Requirements

  1. Function signature: (operator)
  2. Set the type and shape of each output in place (operator.outputs[i].type = ...); nothing is returned
  3. Infer shapes based on input shapes and model properties
  4. Handle dynamic dimensions appropriately
  5. Consider all possible output formats

Parser Function Requirements

  1. Function signature: (scope, model, inputs, custom_parsers=None)
  2. Declare the operator that represents the model (scope.declare_local_operator) and return the list of its output variables
  3. Extract relevant model attributes for conversion
  4. Handle nested models in pipelines/ensembles
  5. Support custom parsing options

Best Practices

  • Test thoroughly with various input shapes and data types
  • Handle edge cases like empty inputs or extreme values
  • Follow existing conventions for naming and structure
  • Document custom options and their effects
  • Provide usage examples for complex custom converters

Install with Tessl CLI

npx tessl i tessl/pypi-skl2onnx
