tessl/pypi-mlxtend

Machine Learning Library Extensions providing essential tools for day-to-day data science tasks

Workspace: tessl
Visibility: Public
Describes: pypipkg:pypi/mlxtend@0.23.x

To install, run

npx @tessl/cli install tessl/pypi-mlxtend@0.23.0

# MLxtend

MLxtend (Machine Learning Extensions) is a Python library of tools for day-to-day data science tasks that extends scikit-learn and other scientific computing libraries. It provides ensemble methods, frequent pattern mining algorithms, feature selection and extraction techniques, model evaluation utilities, and plotting functions for visualizing decision regions and model performance.

## Package Information

- **Package Name**: mlxtend
- **Package Type**: pypi
- **Language**: Python
- **Installation**: `pip install mlxtend`
- **Version**: 0.23.4
- **License**: BSD 3-Clause

## Core Imports

```python
import mlxtend
```

Common import patterns for specific modules:

```python
from mlxtend.classifier import EnsembleVoteClassifier, StackingClassifier
from mlxtend.feature_selection import SequentialFeatureSelector
from mlxtend.plotting import plot_decision_regions, plot_learning_curves
from mlxtend.evaluate import mcnemar, bootstrap_point632_score
from mlxtend.frequent_patterns import apriori, association_rules
```

## Basic Usage

```python
from mlxtend.classifier import EnsembleVoteClassifier
from mlxtend.plotting import plot_decision_regions
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
import matplotlib.pyplot as plt

# Create sample data
X, y = make_classification(n_samples=1000, n_features=2, n_redundant=0,
                           n_informative=2, random_state=42, n_clusters_per_class=1)

# Create ensemble classifier
clf1 = LogisticRegression(random_state=42)
clf2 = RandomForestClassifier(random_state=42)
clf3 = SVC(probability=True, random_state=42)

ensemble = EnsembleVoteClassifier(clfs=[clf1, clf2, clf3], voting='soft')
ensemble.fit(X, y)

# Visualize decision regions
plot_decision_regions(X, y, clf=ensemble, legend=2)
plt.title('Ensemble Classifier Decision Regions')
plt.show()
```

## Architecture

MLxtend is organized into 14 specialized modules, each focusing on specific aspects of machine learning:

- **Classifiers**: Nine advanced classification algorithms including ensemble methods and neural networks
- **Regressors**: Stacking ensemble methods for regression tasks
- **Feature Engineering**: Selection, extraction, and preprocessing tools
- **Evaluation**: Comprehensive model evaluation and statistical testing utilities
- **Visualization**: Specialized plotting functions for ML model analysis
- **Pattern Mining**: Association rule and frequent pattern mining algorithms
- **Utilities**: Mathematical functions, text processing, and file I/O tools

This modular design allows users to import only the functionality they need while maintaining compatibility with the broader Python scientific ecosystem, particularly scikit-learn.

## Capabilities

### Classification Algorithms

Advanced classification methods including ensemble voting, stacking, neural networks, and classic algorithms like perceptron and logistic regression.

```python { .api }
class EnsembleVoteClassifier:
    def __init__(self, clfs, voting='hard', weights=None): ...
    def fit(self, X, y): ...
    def predict(self, X): ...
    def predict_proba(self, X): ...

class StackingClassifier:
    def __init__(self, classifiers, meta_classifier): ...
    def fit(self, X, y): ...
    def predict(self, X): ...

class MultiLayerPerceptron:
    def __init__(self, eta=0.5, epochs=50, hidden_layers=[50]): ...
    def fit(self, X, y): ...
    def predict(self, X): ...
```

[Classification Algorithms](./classification.md)

### Feature Selection and Extraction

Tools for selecting optimal feature subsets and extracting new features through dimensionality reduction techniques.

```python { .api }
class SequentialFeatureSelector:
    def __init__(self, estimator, k_features=1, forward=True, scoring=None): ...
    def fit(self, X, y): ...
    def transform(self, X): ...

class PrincipalComponentAnalysis:
    def __init__(self, n_components=None): ...
    def fit(self, X, y=None): ...
    def transform(self, X): ...

class LinearDiscriminantAnalysis:
    def __init__(self, n_discriminants=None): ...
    def fit(self, X, y): ...
    def transform(self, X): ...
```

[Feature Engineering](./feature-engineering.md)

### Model Evaluation and Testing

Comprehensive model evaluation tools including statistical tests, bootstrap methods, and cross-validation utilities.

```python { .api }
def mcnemar(ary, corrected=True, exact=False):
    """McNemar test for classifier comparison"""

def bootstrap_point632_score(estimator, X, y, n_splits=200, method='.632'):
    """Bootstrap .632 and .632+ error estimation"""

def paired_ttest_5x2cv(estimator1, estimator2, X, y, scoring=None):
    """5x2cv paired t-test for comparing classifiers"""

class BootstrapOutOfBag:
    def __init__(self, n_splits=200, random_seed=None): ...
    def split(self, X, y=None): ...
```

[Model Evaluation](./evaluation.md)

### Visualization Tools

Specialized plotting functions for machine learning model analysis including decision regions, learning curves, and confusion matrices.

```python { .api }
def plot_decision_regions(X, y, clf, feature_index=None, filler_feature_values=None):
    """Plot decision regions for 2D datasets"""

def plot_learning_curves(X_train, y_train, X_test, y_test, clf, scoring='misclassification error'):
    """Plot learning curves"""

def plot_confusion_matrix(conf_mat, hide_spines=False, hide_ticks=False, figsize=None):
    """Plot confusion matrix"""

def plot_sequential_feature_selection(metric_dict, kind='std_dev', color='blue'):
    """Plot sequential feature selection results"""
```

[Visualization Tools](./plotting.md)

### Frequent Pattern Mining

Association rule mining and frequent pattern discovery algorithms for transaction data analysis.

```python { .api }
def apriori(df, min_support=0.5, use_colnames=False, max_len=None):
    """Apriori algorithm for frequent itemset mining"""

def association_rules(df, metric="confidence", min_threshold=0.8):
    """Generate association rules from frequent itemsets"""

def fpgrowth(df, min_support=0.5, use_colnames=False, max_len=None):
    """FP-Growth algorithm for frequent itemset mining"""

def fpmax(df, min_support=0.5, use_colnames=False):
    """FPMax algorithm for maximal frequent itemsets"""
```

[Pattern Mining](./pattern-mining.md)

### Data Preprocessing

Data transformation utilities including scaling, encoding, and array manipulation functions.

```python { .api }
class MeanCenterer:
    def fit(self, X): ...
    def transform(self, X): ...

class TransactionEncoder:
    def fit(self, X): ...
    def transform(self, X): ...

def standardize(array, columns=None, ddof=0):
    """Standardize features by removing mean and scaling to unit variance"""

def minmax_scaling(array, columns, min_val=0, max_val=1):
    """Min-max feature scaling"""
```

[Data Preprocessing](./preprocessing.md)

### Clustering Algorithms

Unsupervised learning algorithms for data clustering and pattern discovery.

```python { .api }
class Kmeans:
    def __init__(self, k, max_iter=10, convergence_tolerance=1e-05): ...
    def fit(self, X): ...
    def predict(self, X): ...
```

[Clustering](./clustering.md)

### Dataset Loading

Utilities for loading common machine learning datasets and generating synthetic data.

```python { .api }
def iris_data():
    """Load the Iris dataset"""

def wine_data():
    """Load the Wine dataset"""

def mnist_data():
    """Load the MNIST dataset"""

def boston_housing_data():
    """Load the Boston Housing dataset"""
```

[Dataset Loading](./datasets.md)

### Regression Algorithms

Linear regression and stacking ensemble methods for improved prediction performance.

```python { .api }
class LinearRegression:
    def __init__(self, eta=0.01, epochs=50): ...
    def fit(self, X, y): ...
    def predict(self, X): ...

class StackingRegressor:
    def __init__(self, regressors, meta_regressor): ...
    def fit(self, X, y): ...
    def predict(self, X): ...
```

[Regression Algorithms](./regression.md)

### Mathematical Utilities

Mathematical functions and utilities commonly used in machine learning computations.

```python { .api }
def num_combinations(n, k, with_replacement=False):
    """Calculate number of combinations"""

def num_permutations(n, k, with_replacement=False):
    """Calculate number of permutations"""

def factorial(n):
    """Calculate factorial"""

def vectorspace_orthonormalization(ary):
    """Orthonormalize vectors using Gram-Schmidt process"""
```

[Mathematical Utilities](./math-utils.md)

### Text Processing

Text processing utilities for natural language processing tasks.

```python { .api }
def generalize_names(name):
    """Generalize person names for consistency"""

def tokenizer_words_and_emoticons(text):
    """Tokenize text including emoticons"""

def tokenizer_emoticons(text):
    """Extract emoticons from text"""
```

[Text Processing](./text-processing.md)

### File I/O Utilities

File system utilities for finding and organizing files.

```python { .api }
def find_files(substring, path, recursive=False, check_ext=None, ignore_invisible=True):
    """Find files matching criteria"""

def find_filegroups(paths, substring='', extensions=None, ignore_invisible=True):
    """Group files by specified criteria"""
```

[File I/O](./file-io.md)

### General Utilities

General-purpose utilities for testing, data validation, and progress reporting.

```python { .api }
class Counter:
    """Progress counter for loop iterations (not collections.Counter)"""
    def __init__(self, stderr=False, start_newline=True, precision=0, name=None): ...
    def update(self): ...

def check_Xy(X, y, y_int=True):
    """Validate input data format"""

def assert_raises(exception_type, message, func, *args, **kwargs):
    """Test utility for verifying exceptions"""
```

[General Utilities](./utilities.md)

## Types

```python { .api }
# Core types used across multiple modules
from typing import Union, Optional, List, Tuple, Dict, Any
from numpy import ndarray
from pandas import DataFrame

# Common type aliases
ArrayLike = Union[ndarray, List, Tuple]
DataFrameLike = Union[DataFrame, ndarray]
ClassifierLike = object  # sklearn-compatible classifier
RegressorLike = object  # sklearn-compatible regressor
```