Convert scikit-learn models to ONNX format for cross-platform inference and deployment
---
Primary functions for converting scikit-learn models to ONNX format. These functions form the foundation of the skl2onnx conversion system, providing both comprehensive control over the conversion process and simplified interfaces for common use cases.
The primary conversion engine that transforms scikit-learn models to ONNX format with comprehensive control over all aspects of the conversion process.
```python
def convert_sklearn(model, name=None, initial_types=None, doc_string="",
                    target_opset=None, custom_conversion_functions=None,
                    custom_shape_calculators=None, custom_parsers=None,
                    options=None, intermediate=False, white_op=None,
                    black_op=None, final_types=None, dtype=None,
                    naming=None, model_optim=True, verbose=0):
    """
    Convert a scikit-learn model to ONNX format.

    Parameters:
    - model: scikit-learn model or pipeline to convert
    - name: str, name for the ONNX model (optional)
    - initial_types: list of (name, type) tuples specifying input types
    - doc_string: str, documentation string for the model (default "")
    - target_opset: int, target ONNX opset version (defaults to latest tested)
    - custom_conversion_functions: dict, custom converter functions
    - custom_shape_calculators: dict, custom shape calculation functions
    - custom_parsers: dict, custom parser functions
    - options: dict, conversion options for specific operators
    - intermediate: bool, return intermediate topology if True
    - white_op: list, whitelist of allowed operators
    - black_op: list, blacklist of forbidden operators
    - final_types: list, expected output types for validation
    - dtype: numpy dtype, default data type for inference
    - naming: str, naming convention for variables ('new' or 'old')
    - model_optim: bool, enable model optimization (default True)
    - verbose: int, verbosity level (0=silent, 1=info, 2=debug)

    Returns:
    - ModelProto: ONNX model if intermediate=False
    - tuple: (ModelProto, Topology) if intermediate=True
    """
```

High-level conversion function that automatically infers types from sample data, providing a simpler interface for common conversion scenarios.
```python
def to_onnx(model, X=None, name=None, initial_types=None, target_opset=None,
            options=None, white_op=None, black_op=None, final_types=None,
            dtype=None, naming=None, model_optim=True, verbose=0):
    """
    Convert scikit-learn model to ONNX with automatic type inference.

    Parameters:
    - model: scikit-learn model or pipeline to convert
    - X: array-like, sample input data for type inference (optional if initial_types provided)
    - name: str, name for the ONNX model (optional)
    - initial_types: list of (name, type) tuples specifying input types (optional if X provided)
    - target_opset: int, target ONNX opset version (defaults to latest tested)
    - options: dict, conversion options for specific operators
    - white_op: list, whitelist of allowed operators
    - black_op: list, blacklist of forbidden operators
    - final_types: list, expected output types for validation
    - dtype: numpy dtype, default data type for inference
    - naming: str, naming convention for variables ('new' or 'old')
    - model_optim: bool, enable model optimization (default True)
    - verbose: int, verbosity level (0=silent, 1=info, 2=debug)

    Returns:
    - ModelProto: ONNX model
    """
```

Combines a scikit-learn model class with ONNX operator capabilities, creating an enhanced model that can directly use ONNX operators.
```python
def wrap_as_onnx_mixin(model, target_opset=None):
    """
    Enhance scikit-learn model with ONNX operator capabilities.

    Parameters:
    - model: scikit-learn model instance
    - target_opset: int, target ONNX opset version (optional)

    Returns:
    - Enhanced model object with OnnxOperatorMixin capabilities
    """
```

```python
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
from skl2onnx import convert_sklearn, to_onnx
from skl2onnx.common.data_types import FloatTensorType
import numpy as np

# Create and train model
X, y = make_classification(n_samples=100, n_features=4, random_state=42)
model = LogisticRegression()
model.fit(X, y)

# Method 1: Automatic type inference
onnx_model = to_onnx(model, X)

# Method 2: Explicit type specification
initial_type = [('float_input', FloatTensorType([None, 4]))]
onnx_model = convert_sklearn(model, initial_types=initial_type)
```
```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier

# Create pipeline
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', RandomForestClassifier(n_estimators=10))
])
pipeline.fit(X, y)

# Convert the whole pipeline at once
onnx_model = to_onnx(pipeline, X, name="sklearn_pipeline")
```
```python
# Conversion with custom options, keyed by model class
options = {
    RandomForestClassifier: {'zipmap': False},  # Don't use ZipMap for output
    StandardScaler: {'div': 'div_cast'}         # Use specific division operator
}
onnx_model = convert_sklearn(pipeline,
                             initial_types=initial_type,
                             options=options,
                             target_opset=18,
                             verbose=1)
```
```python
# Conversion with operator filtering and validation
onnx_model = convert_sklearn(
    model,
    initial_types=initial_type,
    white_op=['MatMul', 'Add', 'Relu'],  # Only allow these operators
    final_types=[('probabilities', FloatTensorType([None, 2]))],  # Validate output
    dtype=np.float32,   # Force float32 precision
    naming='new',       # Use new variable naming convention
    model_optim=True,   # Enable model optimization
    verbose=2           # Debug level logging
)
```

```python
from skl2onnx import wrap_as_onnx_mixin

# Enhance model with ONNX capabilities
enhanced_model = wrap_as_onnx_mixin(model, target_opset=18)
# Now the model has additional ONNX-related methods and
# can use ONNX operators directly
```

The options parameter allows fine-tuning of operator-specific behavior:
- 'zipmap': bool - Use ZipMap for probability output (default True)
- 'nocl': bool - Don't include class labels in output
- 'output_class_labels': bool - Include predicted class labels
- 'separators': list - Custom separators for text tokenization
- 'regex': str - Custom regex pattern for text processing
- 'div': str - Division operator variant ('div', 'div_cast')
- 'cast': bool - Enable automatic type casting

Common conversion errors and their meanings:
- MissingConverter - No converter registered for the model type
- MissingShapeCalculator - Shape inference failed for an operator
- TypeError - Incompatible data types in conversion
- ValueError - Invalid parameter values or model configuration
- RuntimeError - Conversion process failed due to unsupported operations
Install with Tessl CLI

```shell
npx tessl i tessl/pypi-skl2onnx
```