# Core Conversion Functions

The primary functions for converting scikit-learn models to ONNX format. They form the foundation of the skl2onnx conversion system: `convert_sklearn` gives full control over the conversion process, while `to_onnx` offers a simpler interface for common use cases.

## Capabilities

### Main Conversion Function

`convert_sklearn` is the primary conversion engine. It transforms a scikit-learn model into an ONNX model and exposes every aspect of the conversion process as a parameter.

```python { .api }
def convert_sklearn(model, name=None, initial_types=None, doc_string="",
                    target_opset=None, custom_conversion_functions=None,
                    custom_shape_calculators=None, custom_parsers=None,
                    options=None, intermediate=False, white_op=None,
                    black_op=None, final_types=None, dtype=None,
                    naming=None, model_optim=True, verbose=0):
    """
    Convert a scikit-learn model to ONNX format.

    Parameters:
    - model: scikit-learn model or pipeline to convert
    - name: str, name for the ONNX model (optional)
    - initial_types: list of (name, type) tuples specifying input types
    - doc_string: str, documentation string for the model (default "")
    - target_opset: int, target ONNX opset version (defaults to latest tested)
    - custom_conversion_functions: dict, custom converter functions
    - custom_shape_calculators: dict, custom shape calculation functions
    - custom_parsers: dict, custom parser functions
    - options: dict, converter options keyed by model class or id(model)
    - intermediate: bool, also return the intermediate topology if True
    - white_op: collection of ONNX operator names allowed during conversion
    - black_op: collection of ONNX operator names forbidden during conversion
    - final_types: list of (name, type) tuples overriding output names and types (optional)
    - dtype: numpy dtype, default data type for inference
    - naming: str prefix or callable controlling how intermediate results are named
    - model_optim: bool, enable model optimization (default True)
    - verbose: int, verbosity level (0=silent, 1=info, 2=debug)

    Returns:
    - ModelProto: ONNX model if intermediate=False
    - tuple: (ModelProto, Topology) if intermediate=True
    """
```

### Simplified Conversion with Type Inference

`to_onnx` is a higher-level function that infers input types from sample data, giving a simpler interface for common conversion scenarios.

```python { .api }
def to_onnx(model, X=None, name=None, initial_types=None, target_opset=None,
            options=None, white_op=None, black_op=None, final_types=None,
            dtype=None, naming=None, model_optim=True, verbose=0):
    """
    Convert scikit-learn model to ONNX with automatic type inference.

    Parameters:
    - model: scikit-learn model or pipeline to convert
    - X: array-like, sample input data for type inference (optional if initial_types provided)
    - name: str, name for the ONNX model (optional)
    - initial_types: list of (name, type) tuples specifying input types (optional if X provided)
    - target_opset: int, target ONNX opset version (defaults to latest tested)
    - options: dict, converter options keyed by model class or id(model)
    - white_op: collection of ONNX operator names allowed during conversion
    - black_op: collection of ONNX operator names forbidden during conversion
    - final_types: list of (name, type) tuples overriding output names and types (optional)
    - dtype: numpy dtype, default data type for inference
    - naming: str prefix or callable controlling how intermediate results are named
    - model_optim: bool, enable model optimization (default True)
    - verbose: int, verbosity level (0=silent, 1=info, 2=debug)

    Returns:
    - ModelProto: ONNX model
    """
```

### ONNX Mixin Enhancement

`wrap_as_onnx_mixin` combines a scikit-learn model with the `OnnxOperatorMixin` class, producing an enhanced object that can be used directly with ONNX operators.

```python { .api }
def wrap_as_onnx_mixin(model, target_opset=None):
    """
    Enhance scikit-learn model with ONNX operator capabilities.

    Parameters:
    - model: scikit-learn model instance
    - target_opset: int, target ONNX opset version (optional)

    Returns:
    - Enhanced model object with OnnxOperatorMixin capabilities
    """
```

## Usage Examples

### Basic Model Conversion

```python
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
from skl2onnx import convert_sklearn, to_onnx
from skl2onnx.common.data_types import FloatTensorType
import numpy as np

# Create and train model
X, y = make_classification(n_samples=100, n_features=4, random_state=42)
model = LogisticRegression()
model.fit(X, y)

# Method 1: Automatic type inference from sample data
# (cast to float32 so the ONNX input is a float tensor, matching Method 2)
onnx_model = to_onnx(model, X.astype(np.float32))

# Method 2: Explicit type specification
initial_type = [('float_input', FloatTensorType([None, 4]))]
onnx_model = convert_sklearn(model, initial_types=initial_type)
```

### Pipeline Conversion

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier

# Create pipeline (X, y and np come from the basic example above)
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', RandomForestClassifier(n_estimators=10))
])
pipeline.fit(X, y)

# Convert pipeline
onnx_model = to_onnx(pipeline, X.astype(np.float32), name="sklearn_pipeline")

# to_onnx takes no doc_string argument, so set it on the resulting ModelProto
onnx_model.doc_string = "Random Forest with StandardScaler preprocessing"
```

### Custom Options

```python
# Conversion with custom options
# (pipeline, its estimator classes and initial_type come from the examples above)
# Options are keyed by model class or id(model), not by class-name strings
options = {
    RandomForestClassifier: {'zipmap': False},  # Don't use ZipMap for output
    StandardScaler: {'div': 'div_cast'}         # Use specific division operator
}

onnx_model = convert_sklearn(pipeline,
                             initial_types=initial_type,
                             options=options,
                             target_opset=18,
                             verbose=1)
```

### Advanced Conversion Control

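```python
from skl2onnx.common.data_types import Int64TensorType

# Conversion with output naming and other controls
# (model, initial_type and np come from the basic example above)
onnx_model = convert_sklearn(
    model,
    initial_types=initial_type,
    # white_op={'MatMul', 'Add', 'Relu'},  # Uncomment to allow only these operators;
    #                                      # conversion raises if the model needs others
    options={id(model): {'zipmap': False}},  # Return probabilities as a tensor
    final_types=[('label', Int64TensorType([None])),
                 ('probabilities', FloatTensorType([None, 2]))],  # One entry per output
    dtype=np.float32,   # Force float32 precision
    naming='new',       # Prefix used when naming intermediate results
    model_optim=True,   # Enable model optimization
    verbose=2           # Debug level logging
)
```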

### Mixin Enhancement

```python
from skl2onnx import wrap_as_onnx_mixin

# Enhance model with ONNX capabilities (model comes from the basic example above)
enhanced_model = wrap_as_onnx_mixin(model, target_opset=18)

# enhanced_model still behaves like the fitted estimator and
# additionally exposes the OnnxOperatorMixin methods
```

## Common Conversion Options

The `options` parameter fine-tunes operator-specific behavior. It is a dictionary keyed by model class or `id(model)`, whose values map option names to settings:

### Classifier Options

- `'zipmap'`: bool - Use ZipMap for probability output (default True)
- `'nocl'`: bool - Don't include class labels in output
- `'output_class_labels'`: bool - Include predicted class labels

### Text Processing Options

- `'separators'`: list - Custom separators for text tokenization
- `'regex'`: str - Custom regex pattern for text processing

### Numerical Options

- `'div'`: str - Division operator variant ('std', 'div', 'div_cast')
- `'cast'`: bool - Enable automatic type casting

## Error Handling

Common conversion errors and their meanings:

- **`MissingConverter`** - No converter is registered for the model type
- **`MissingShapeCalculator`** - No shape calculator is registered for an operator, so its output shape cannot be inferred
- **`TypeError`** - Incompatible data types in conversion
- **`ValueError`** - Invalid parameter values or model configuration
- **`RuntimeError`** - Conversion failed, for example because of unsupported operations