# Registration and Extensibility

System for registering custom converters, parsers, and operators to extend skl2onnx support for new model types and third-party libraries. The registration system enables complete customization of the conversion process while maintaining the library's modular architecture.

## Capabilities

### Converter Registration

Register custom conversion functions for new model types or override existing converters.

```python { .api }
def update_registered_converter(model, alias, shape_fct, convert_fct,
                                overwrite=True, parser=None, options=None):
    """
    Register or update a converter for a model type.

    Parameters:
    - model: class or str, model class to register converter for
    - alias: str, unique alias under which the converter is registered
    - shape_fct: function, shape calculator that sets the output types and shapes
    - convert_fct: function, conversion function that generates ONNX operators
    - overwrite: bool, if False, raise an exception when a converter already exists (default True)
    - parser: function, custom parser function for the model (optional)
    - options: dict, options accepted by this converter, mapping each option name to its allowed values (optional)
    """
```

### Parser Registration

Register custom parsing functions that extract conversion-relevant information from models.

```python { .api }
def update_registered_parser(model, parser_fct):
    """
    Register or update a parser for a model type.
    An existing parser for the same model class is replaced.

    Parameters:
    - model: class, model class to register parser for
    - parser_fct: function, parser function that declares the model's operator and outputs
    """
```

### Model Discovery

Discover supported models and their aliases in the conversion system.

```python { .api }
def supported_converters(from_sklearn=False):
    """
    Get list of all supported model converters.

    Parameters:
    - from_sklearn: bool, if True return sklearn model names without 'Sklearn' prefix

    Returns:
    - list: Supported model names/aliases
    """

def get_model_alias(model_type):
    """
    Get the alias name for a model type.

    Parameters:
    - model_type: class, model class to get alias for

    Returns:
    - str: Alias name for the model type

    Raises:
    - KeyError: If model type is not registered
    """
```

## Supported Models by Category

The library provides extensive support across all major scikit-learn model categories:

### Classifiers (60+ Models)

#### Linear Classifiers
- **LogisticRegression** - Logistic regression with various solvers and regularization
- **SGDClassifier** - Stochastic gradient descent classifier
- **LinearSVC** - Linear support vector classifier
- **Perceptron** - Simple perceptron classifier
- **PassiveAggressiveClassifier** - Passive-aggressive learning classifier
- **RidgeClassifier** - Ridge regression classifier
- **RidgeClassifierCV** - Ridge classifier with cross-validation

#### Tree-Based Classifiers
- **DecisionTreeClassifier** - Decision tree classifier
- **RandomForestClassifier** - Random forest ensemble classifier
- **ExtraTreesClassifier** - Extremely randomized trees classifier
- **GradientBoostingClassifier** - Gradient boosting classifier
- **HistGradientBoostingClassifier** - Histogram-based gradient boosting

#### Ensemble Classifiers
- **AdaBoostClassifier** - AdaBoost ensemble classifier
- **BaggingClassifier** - Bootstrap aggregating classifier
- **VotingClassifier** - Soft and hard voting classifier
- **StackingClassifier** - Stacking ensemble classifier

#### Neural Network
- **MLPClassifier** - Multi-layer perceptron classifier

#### Naive Bayes
- **GaussianNB** - Gaussian naive Bayes
- **MultinomialNB** - Multinomial naive Bayes
- **BernoulliNB** - Bernoulli naive Bayes
- **CategoricalNB** - Categorical naive Bayes
- **ComplementNB** - Complement naive Bayes

#### Support Vector Machines
- **SVC** - C-support vector classifier
- **NuSVC** - Nu-support vector classifier

#### Neighbor-Based
- **KNeighborsClassifier** - K-nearest neighbors classifier
- **RadiusNeighborsClassifier** - Radius-based neighbors classifier

#### Meta-Classifiers
- **OneVsRestClassifier** - One-vs-rest multiclass strategy
- **OneVsOneClassifier** - One-vs-one multiclass strategy
- **CalibratedClassifierCV** - Probability calibration with cross-validation
- **OutputCodeClassifier** - Error-correcting output code classifier
126
127
### Regressors (40+ Models)
128
129
#### Linear Regressors
130
- **LinearRegression** - Ordinary least squares regression
131
- **Ridge** - Ridge regression with L2 regularization
132
- **Lasso** - Lasso regression with L1 regularization
133
- **ElasticNet** - Elastic net regression combining L1 and L2
134
- **Lars** - Least angle regression
135
- **LassoLars** - Lasso regression using LARS algorithm
136
- **OrthogonalMatchingPursuit** - Orthogonal matching pursuit
137
- **BayesianRidge** - Bayesian ridge regression
138
- **ARDRegression** - Automatic relevance determination regression
139
- **SGDRegressor** - Stochastic gradient descent regressor
140
- **PassiveAggressiveRegressor** - Passive-aggressive regressor
141
- **HuberRegressor** - Huber robust regression
142
- **TheilSenRegressor** - Theil-Sen robust regression
143
- **RANSACRegressor** - RANSAC robust regression
144
145
#### Tree-Based Regressors
146
- **DecisionTreeRegressor** - Decision tree regressor
147
- **RandomForestRegressor** - Random forest ensemble regressor
148
- **ExtraTreesRegressor** - Extremely randomized trees regressor
149
- **GradientBoostingRegressor** - Gradient boosting regressor
150
- **HistGradientBoostingRegressor** - Histogram-based gradient boosting
151
152
#### Ensemble Regressors
153
- **AdaBoostRegressor** - AdaBoost ensemble regressor
154
- **BaggingRegressor** - Bootstrap aggregating regressor
155
- **VotingRegressor** - Averaging regressor
156
- **StackingRegressor** - Stacking ensemble regressor
157
158
#### Neural Network
159
- **MLPRegressor** - Multi-layer perceptron regressor
160
161
#### Support Vector Machines
162
- **SVR** - Epsilon-support vector regression
163
- **LinearSVR** - Linear support vector regression
164
- **NuSVR** - Nu-support vector regression
165
166
#### Gaussian Processes
167
- **GaussianProcessRegressor** - Gaussian process regression
168
169
#### Specialized Regressors
170
- **PoissonRegressor** - Poisson regression for count data
171
- **GammaRegressor** - Gamma regression for positive continuous targets
172
- **TweedieRegressor** - Tweedie regression for insurance/risk modeling
173
- **QuantileRegressor** - Quantile regression
174
175
### Preprocessing and Transformers (30+ Models)
176
177
#### Scaling and Normalization
178
- **StandardScaler** - Standardization (zero mean, unit variance)
179
- **MinMaxScaler** - Min-max normalization to [0,1] range
180
- **RobustScaler** - Robust scaling using median and IQR
181
- **MaxAbsScaler** - Scale by maximum absolute value
182
- **Normalizer** - L1, L2, or max normalization
183
- **QuantileTransformer** - Quantile-based scaling
184
- **PowerTransformer** - Power transformations (Box-Cox, Yeo-Johnson)
185
186
#### Encoding
187
- **OneHotEncoder** - One-hot encoding for categorical features
188
- **OrdinalEncoder** - Ordinal encoding for categorical features
189
- **LabelEncoder** - Label encoding for target variables
190
- **LabelBinarizer** - Binary encoding for multilabel targets
191
- **TargetEncoder** - Target-based encoding for categorical features
192
193
#### Feature Engineering
194
- **PolynomialFeatures** - Generate polynomial and interaction features
195
- **FeatureHasher** - Hash-based feature vectorization
196
- **DictVectorizer** - Convert dict objects to feature vectors
197
198
#### Text Processing
199
- **CountVectorizer** - Convert text to token count vectors
200
- **TfidfVectorizer** - Convert text to TF-IDF vectors
201
- **TfidfTransformer** - Apply TF-IDF transformation
202
- **HashingVectorizer** - Hash-based text vectorization
203
204
#### Imputation
205
- **SimpleImputer** - Simple imputation strategies (mean, median, mode)
206
- **KNNImputer** - K-nearest neighbors imputation
207
- **IterativeImputer** - Iterative multivariate imputation
208
209
#### Decomposition
210
- **PCA** - Principal component analysis
211
- **TruncatedSVD** - Truncated singular value decomposition
212
- **KernelPCA** - Kernel principal component analysis
213
- **IncrementalPCA** - Incremental principal component analysis
214
- **FactorAnalysis** - Factor analysis
215
- **FastICA** - Independent component analysis
216
- **NMF** - Non-negative matrix factorization
217
- **LatentDirichletAllocation** - Latent Dirichlet allocation
218
219
#### Feature Selection
220
- **SelectKBest** - Select k best features by score
221
- **SelectPercentile** - Select top percentile of features
222
- **SelectFpr** - Select by false positive rate
223
- **SelectFdr** - Select by false discovery rate
224
- **SelectFwe** - Select by family-wise error rate
225
- **RFE** - Recursive feature elimination
226
- **RFECV** - RFE with cross-validation
227
- **VarianceThreshold** - Remove low-variance features
228
- **GenericUnivariateSelect** - Configurable univariate feature selection
229
230
#### Discretization
231
- **KBinsDiscretizer** - K-bins discretization
232
- **Binarizer** - Binary thresholding
233
234
#### Pipelines and Composition
235
- **Pipeline** - Sequential transformer and estimator pipeline
236
- **FeatureUnion** - Concatenate results of multiple transformers
237
- **ColumnTransformer** - Apply transformers to specific columns
238
239
### Clustering and Outlier Detection
240
241
#### Clustering
242
- **KMeans** - K-means clustering
243
- **MiniBatchKMeans** - Mini-batch K-means clustering
244
245
#### Outlier Detection
246
- **IsolationForest** - Isolation forest for outlier detection
247
- **LocalOutlierFactor** - Local outlier factor
248
- **OneClassSVM** - One-class support vector machine
249
250
#### Mixture Models
251
- **GaussianMixture** - Gaussian mixture model
252
- **BayesianGaussianMixture** - Bayesian Gaussian mixture model

## Usage Examples

### Registering a Custom Converter

```python
from skl2onnx import update_registered_converter
from skl2onnx.common.data_types import FloatTensorType

# Define custom model
class CustomModel:
    def __init__(self):
        self.coef_ = None
        self.intercept_ = None

    def fit(self, X, y):
        # Custom fitting logic
        pass

    def predict(self, X):
        # Custom prediction logic
        pass

# Define shape calculator
def custom_shape_calculator(operator):
    """Calculate output shape for custom model."""
    # The shape lives on the variable's type; the result is assigned
    # to the output variable in place rather than returned
    input_shape = operator.inputs[0].type.shape
    operator.outputs[0].type = FloatTensorType(input_shape)

# Define converter function
def custom_converter(scope, operator, container):
    """Convert custom model to ONNX operators."""
    # Implementation of ONNX operator generation
    pass

# Register the converter
update_registered_converter(
    CustomModel,
    alias='CustomModel',
    shape_fct=custom_shape_calculator,
    convert_fct=custom_converter
)
```
296
297
### Registering a Custom Parser
298
299
```python
300
from skl2onnx import update_registered_parser
301
302
def custom_parser(scope, model, inputs, custom_parsers=None):
303
"""Parse custom model and create operator."""
304
# Extract model information and create operator
305
pass
306
307
# Register the parser
308
update_registered_parser(CustomModel, custom_parser)
309
```
310
311
### Discovering Supported Models
312
313
```python
314
from skl2onnx import supported_converters, get_model_alias
315
from sklearn.ensemble import RandomForestClassifier
316
317
# Get all supported converters
318
all_converters = supported_converters()
319
print(f"Total supported converters: {len(all_converters)}")
320
321
# Get sklearn model names without prefix
322
sklearn_models = supported_converters(from_sklearn=True)
323
print(f"Supported sklearn models: {len(sklearn_models)}")
324
325
# Get alias for specific model
326
alias = get_model_alias(RandomForestClassifier)
327
print(f"RandomForestClassifier alias: {alias}")
328
```

### Custom Converter with Options

```python
def advanced_custom_converter(scope, operator, container):
    """Advanced converter with options support."""
    # Options are read from the container, not from the model itself;
    # the dict supplies the defaults used when the caller sets nothing
    options = container.get_options(
        operator.raw_operator, dict(custom_param='default_value'))
    custom_param = options['custom_param']

    # Generate ONNX operators based on options
    pass

# Register the option names together with their allowed values
update_registered_converter(
    CustomModel,
    alias='AdvancedCustomModel',
    shape_fct=custom_shape_calculator,
    convert_fct=advanced_custom_converter,
    options={'custom_param': ['default_value', 'optimized_value']}
)
```

## Extension Guidelines

### Converter Function Requirements
1. **Function signature**: `(scope, operator, container)`
2. **Generate ONNX operators** using container methods
3. **Handle all model parameters** and configurations
4. **Support different data types** and shapes
5. **Include proper error handling** for edge cases
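
To make the first two requirements concrete, the sketch below emits `MatMul` and `Add` for a hypothetical linear model and exercises the converter against small recording stubs, so it runs without a full conversion. `StubScope` and `RecordingContainer` are illustrative stand-ins that mimic only the methods used here (`get_unique_variable_name`, `get_unique_operator_name`, `add_initializer`, `add_node`); in a real conversion skl2onnx supplies the scope and container.

```python
from types import SimpleNamespace

ONNX_FLOAT = 1  # numeric value of onnx.TensorProto.FLOAT


def linear_converter(scope, operator, container):
    """Emit MatMul + Add for a hypothetical model computing X @ coef_ + intercept_."""
    model = operator.raw_operator
    coef_name = scope.get_unique_variable_name("coef")
    bias_name = scope.get_unique_variable_name("intercept")
    product_name = scope.get_unique_variable_name("product")

    # Store the fitted parameters as graph constants.
    container.add_initializer(
        coef_name, ONNX_FLOAT, [len(model.coef_), 1], list(model.coef_))
    container.add_initializer(bias_name, ONNX_FLOAT, [1], [model.intercept_])

    # product = X @ coef, then output = product + intercept
    container.add_node(
        "MatMul", [operator.inputs[0].full_name, coef_name], product_name,
        name=scope.get_unique_operator_name("MatMul"),
    )
    container.add_node(
        "Add", [product_name, bias_name], operator.outputs[0].full_name,
        name=scope.get_unique_operator_name("Add"),
    )


class StubScope:
    """Hypothetical stand-in that hands out unique names."""

    def __init__(self):
        self._counter = 0

    def get_unique_variable_name(self, seed):
        self._counter += 1
        return f"{seed}_{self._counter}"

    get_unique_operator_name = get_unique_variable_name


class RecordingContainer:
    """Hypothetical stand-in that records what the converter emits."""

    def __init__(self):
        self.initializers = []
        self.nodes = []

    def add_initializer(self, name, onnx_type, shape, content):
        self.initializers.append((name, onnx_type, shape, content))

    def add_node(self, op_type, inputs, outputs, name=None, **attrs):
        self.nodes.append((op_type, inputs, outputs))


operator = SimpleNamespace(
    raw_operator=SimpleNamespace(coef_=[0.5, -1.0], intercept_=2.0),
    inputs=[SimpleNamespace(full_name="X")],
    outputs=[SimpleNamespace(full_name="Y")],
)
container = RecordingContainer()
linear_converter(StubScope(), operator, container)
print([op_type for op_type, _, _ in container.nodes])  # ['MatMul', 'Add']
```

Recording stubs like these are one way to unit-test the emitted node sequence in isolation; they do not replace checking the converted model's numerical output.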
360
361
### Shape Calculator Requirements
362
1. **Function signature**: `(operator)`
363
2. **Return list of tuples** `(name, type)` for outputs
364
3. **Infer shapes** based on input shapes and model properties
365
4. **Handle dynamic dimensions** appropriately
366
5. **Consider all possible output formats**
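
As an illustration of shape inference with a dynamic batch dimension, here is a small sketch for a hypothetical projection model. It assumes the skl2onnx convention of assigning the inferred type to `operator.outputs[0].type`; the `n_components_` attribute and the `StubTensorType` class are made up for the example, the latter standing in for a real tensor type such as `FloatTensorType` so the function can be exercised on its own.

```python
from types import SimpleNamespace


def projection_shape_calculator(operator):
    """Set the output type to [batch, n_components_] for a hypothetical model."""
    model = operator.raw_operator
    input_type = operator.inputs[0].type
    # The first dimension is the batch size; it is None when dynamic.
    batch_dim = input_type.shape[0]
    # Reuse the input tensor class (float, double, ...) for the output.
    operator.outputs[0].type = type(input_type)([batch_dim, model.n_components_])


class StubTensorType:
    """Stand-in for an skl2onnx tensor type, used only to exercise the function."""

    def __init__(self, shape):
        self.shape = shape


stub_operator = SimpleNamespace(
    raw_operator=SimpleNamespace(n_components_=3),
    inputs=[SimpleNamespace(type=StubTensorType([None, 5]))],
    outputs=[SimpleNamespace(type=None)],
)
projection_shape_calculator(stub_operator)
print(stub_operator.outputs[0].type.shape)  # [None, 3]
```

Building the output type from the input's class, instead of hard-coding a float tensor, lets the same calculator serve both float and double inputs.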

### Parser Function Requirements
1. **Function signature**: `(scope, model, inputs, custom_parsers=None)`
2. **Create operator objects** representing the model
3. **Extract relevant model attributes** for conversion
4. **Handle nested models** in pipelines/ensembles
5. **Support custom parsing options**

### Best Practices
- **Test thoroughly** with various input shapes and data types
- **Handle edge cases** like empty inputs or extreme values
- **Follow existing conventions** for naming and structure
- **Document custom options** and their effects
- **Provide usage examples** for complex custom converters