tessl/pypi-scikit-learn-intelex

Intel Extension for Scikit-learn providing hardware-accelerated implementations of scikit-learn algorithms optimized for Intel CPUs and GPUs.

Workspace: tessl
Visibility: Public
Describes: pypipkg:pypi/scikit-learn-intelex@2024.7.x

To install, run

npx @tessl/cli install tessl/pypi-scikit-learn-intelex@2024.7.0

# Scikit-learn Intel Extension

Intel Extension for Scikit-learn provides hardware-accelerated implementations of scikit-learn algorithms optimized for Intel CPUs and GPUs. It works as a drop-in replacement in existing scikit-learn applications: once patching is enabled, supported algorithms run on Intel-optimized code paths (vector instructions and memory optimizations) while the rest of the application code stays unchanged. Speedups can reach 10-100x, depending on the algorithm, dataset, and hardware.

## Package Information

- **Package Name**: scikit-learn-intelex
- **Language**: Python
- **Installation**: `pip install scikit-learn-intelex`
- **License**: Apache 2.0

## Core Imports

```python
import sklearnex
```

For enabling optimizations globally:

```python
from sklearnex import patch_sklearn
patch_sklearn()
```

Direct imports of optimized algorithms:

```python
from sklearnex.ensemble import RandomForestClassifier
from sklearnex.linear_model import LinearRegression
from sklearnex.cluster import KMeans
```

## Basic Usage

```python
import numpy as np
from sklearnex import patch_sklearn
patch_sklearn()

# After patching, supported sklearn estimators imported below use Intel optimizations
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification

# Generate sample data
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# Use accelerated Random Forest (same API as sklearn)
rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X, y)
predictions = rf.predict(X)

print(f"Accuracy: {rf.score(X, y):.3f}")
```

Alternative approach using direct imports:

```python
import numpy as np
from sklearnex.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# Directly use Intel-optimized implementation
rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X, y)
predictions = rf.predict(X)
```

## Architecture

The package provides three main integration patterns:

- **Global Patching**: Replace sklearn implementations system-wide using `patch_sklearn()`
- **Direct Imports**: Import specific Intel-optimized algorithms directly from sklearnex modules
- **Distributed Computing**: Use SPMD (Single Program Multiple Data) variants for multi-node execution

All implementations maintain full API compatibility with scikit-learn while providing significant performance improvements through Intel hardware acceleration.

## Capabilities

### Patching and Configuration

Core functions for enabling Intel optimizations globally and managing configuration settings. These functions control how scikit-learn algorithms are accelerated.

```python { .api }
def patch_sklearn(): ...
def unpatch_sklearn(): ...
def sklearn_is_patched() -> bool: ...
def get_patch_map() -> dict: ...
def get_patch_names() -> list: ...
def is_patched_instance(estimator) -> bool: ...
def set_config(**params): ...
def get_config() -> dict: ...
def get_hyperparameters() -> dict: ...
```

[Patching and Configuration](./patching-config.md)
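
A minimal sketch of the patch lifecycle, using only functions listed above. The `try/except ImportError` guard is an addition of this sketch so that it also runs where `scikit-learn-intelex` is not installed. The values collected in `status` are not guaranteed and may vary with the installed version.

```python
try:
    from sklearnex import (
        get_patch_names,
        patch_sklearn,
        sklearn_is_patched,
        unpatch_sklearn,
    )
    available = True
except ImportError:
    # Guard added for this sketch: without sklearnex there is nothing to patch.
    available = False

status = {"available": available}

if available:
    # Names of the estimators and functions that can be patched.
    status["patchable"] = sorted(get_patch_names())

    patch_sklearn()  # enable optimizations globally
    status["patched"] = sklearn_is_patched()

    unpatch_sklearn()  # restore the stock implementations
    status["patched_after_unpatch"] = sklearn_is_patched()

print(status["available"], status.get("patched"), status.get("patched_after_unpatch"))
```

Import scikit-learn estimators after calling `patch_sklearn()`, and re-import them after `unpatch_sklearn()`; names bound before either call keep pointing at the implementation that was active when they were imported.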

### Clustering Algorithms

High-performance implementations of clustering algorithms including K-means and DBSCAN with Intel hardware acceleration.

```python { .api }
class KMeans:
    def __init__(self, n_clusters=8, **kwargs): ...
    def fit(self, X, y=None): ...
    def predict(self, X): ...

class DBSCAN:
    def __init__(self, eps=0.5, min_samples=5, **kwargs): ...
    def fit(self, X, y=None): ...
    def fit_predict(self, X, y=None): ...
```

[Clustering](./clustering.md)
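
As an illustration of the two estimators above, the sketch below clusters three well-separated synthetic blobs. The data, the parameter values, and the fallback to stock scikit-learn when `sklearnex` is missing are assumptions made for this example only.

```python
from sklearn.datasets import make_blobs

try:
    from sklearnex.cluster import DBSCAN, KMeans
except ImportError:
    # Fallback added for this sketch: same API from stock scikit-learn.
    from sklearn.cluster import DBSCAN, KMeans

# Three compact, well-separated groups of 2-D points.
X, _ = make_blobs(
    n_samples=600,
    centers=[[-5, -5], [0, 5], [5, -5]],
    cluster_std=0.5,
    random_state=0,
)

# K-means: the number of clusters is given up front.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
kmeans_labels = kmeans.predict(X)

# DBSCAN: clusters are found from density; label -1 marks noise points.
dbscan_labels = DBSCAN(eps=0.8, min_samples=5).fit_predict(X)
n_dbscan_clusters = len(set(dbscan_labels) - {-1})

print(kmeans.cluster_centers_.shape, n_dbscan_clusters)
```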

### Linear Models

Accelerated linear regression, logistic regression, and regularized models with Intel optimization for large datasets.

```python { .api }
class LinearRegression:
    def __init__(self, **kwargs): ...
    def fit(self, X, y): ...
    def predict(self, X): ...

class LogisticRegression:
    def __init__(self, **kwargs): ...
    def fit(self, X, y): ...
    def predict(self, X): ...
    def predict_proba(self, X): ...

class Ridge:
    def __init__(self, alpha=1.0, **kwargs): ...

class Lasso:
    def __init__(self, alpha=1.0, **kwargs): ...

class ElasticNet:
    def __init__(self, alpha=1.0, l1_ratio=0.5, **kwargs): ...

class IncrementalLinearRegression:
    def __init__(self, **kwargs): ...
    def partial_fit(self, X, y): ...
```

[Linear Models](./linear-models.md)
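
A short sketch of `LinearRegression` on noiseless synthetic data, where the fitted coefficients should match the ones used to generate the target. The data and the fallback import are assumptions of this example, not part of the package documentation.

```python
import numpy as np

try:
    from sklearnex.linear_model import LinearRegression
except ImportError:
    # Fallback added for this sketch: same API from stock scikit-learn.
    from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))

# Noiseless linear target: y = 2*x0 - 1*x1 + 0.5*x2 + 4
true_coef = np.array([2.0, -1.0, 0.5])
true_intercept = 4.0
y = X @ true_coef + true_intercept

model = LinearRegression().fit(X, y)
predictions = model.predict(X[:5])

print(np.round(model.coef_, 3), round(float(model.intercept_), 3))
```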

### Ensemble Methods

Intel-accelerated ensemble algorithms including Random Forest and Extra Trees for both classification and regression.

```python { .api }
class RandomForestClassifier:
    def __init__(self, n_estimators=100, **kwargs): ...
    def fit(self, X, y): ...
    def predict(self, X): ...
    def predict_proba(self, X): ...

class RandomForestRegressor:
    def __init__(self, n_estimators=100, **kwargs): ...

class ExtraTreesClassifier:
    def __init__(self, n_estimators=100, **kwargs): ...

class ExtraTreesRegressor:
    def __init__(self, n_estimators=100, **kwargs): ...
```

[Ensemble Methods](./ensemble.md)
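
The classifier is shown under Basic Usage; for the regression side, a minimal sketch with `RandomForestRegressor` might look as follows. The synthetic dataset, the parameter values, and the fallback import are assumptions of this example.

```python
from sklearn.datasets import make_regression

try:
    from sklearnex.ensemble import RandomForestRegressor
except ImportError:
    # Fallback added for this sketch: same API from stock scikit-learn.
    from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=500, n_features=5, noise=5.0, random_state=0)

forest = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# R^2 on the training data; a held-out set would be used for a real evaluation.
train_r2 = forest.score(X, y)
preds = forest.predict(X[:3])

print(f"train R^2: {train_r2:.3f}")
```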

### Dimensionality Reduction

Principal Component Analysis with Intel acceleration for efficient dimensionality reduction on large datasets.

```python { .api }
class PCA:
    def __init__(self, n_components=None, **kwargs): ...
    def fit(self, X, y=None): ...
    def transform(self, X): ...
    def fit_transform(self, X, y=None): ...
```

[Decomposition](./decomposition.md)
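
To make the `fit`/`transform` flow concrete, the sketch below projects 3-D data that varies mostly along a single direction onto two components. The data construction and the fallback import are assumptions of this example.

```python
import numpy as np

try:
    from sklearnex.decomposition import PCA
except ImportError:
    # Fallback added for this sketch: same API from stock scikit-learn.
    from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# One dominant direction (3, 2, 1) plus a small amount of isotropic noise.
t = rng.normal(size=(400, 1))
X = t @ np.array([[3.0, 2.0, 1.0]]) + 0.05 * rng.normal(size=(400, 3))

pca = PCA(n_components=2).fit(X)
X_reduced = pca.transform(X)

# Share of the total variance captured by each retained component.
print(X_reduced.shape, np.round(pca.explained_variance_ratio_, 4))
```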

### Nearest Neighbors

Accelerated k-nearest neighbors algorithms for classification, regression, and unsupervised learning with optimized distance computations.

```python { .api }
class KNeighborsClassifier:
    def __init__(self, n_neighbors=5, **kwargs): ...
    def fit(self, X, y): ...
    def predict(self, X): ...
    def predict_proba(self, X): ...

class KNeighborsRegressor:
    def __init__(self, n_neighbors=5, **kwargs): ...

class NearestNeighbors:
    def __init__(self, n_neighbors=5, **kwargs): ...
    def fit(self, X, y=None): ...
    def kneighbors(self, X=None, n_neighbors=None, return_distance=True): ...

class LocalOutlierFactor:
    def __init__(self, n_neighbors=20, **kwargs): ...
    def fit_predict(self, X): ...
```

[Nearest Neighbors](./neighbors.md)
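
The following sketch shows a supervised and an unsupervised use of the classes above on two synthetic blobs. The data and the fallback import are assumptions of this example.

```python
import numpy as np
from sklearn.datasets import make_blobs

try:
    from sklearnex.neighbors import KNeighborsClassifier, NearestNeighbors
except ImportError:
    # Fallback added for this sketch: same API from stock scikit-learn.
    from sklearn.neighbors import KNeighborsClassifier, NearestNeighbors

X, y = make_blobs(
    n_samples=300, centers=[[-4, 0], [4, 0]], cluster_std=1.0, random_state=0
)

# Supervised: classify by majority vote among the 5 nearest training points.
clf = KNeighborsClassifier(n_neighbors=5).fit(X, y)
train_acc = clf.score(X, y)

# Unsupervised: look up the 3 nearest training points for the first 5 samples.
nn = NearestNeighbors(n_neighbors=3).fit(X)
distances, indices = nn.kneighbors(X[:5])

print(f"accuracy: {train_acc:.3f}", distances.shape, indices.shape)
```

Because the queried points are themselves part of the fitted data, the first neighbor returned for each of them is the point itself, at a distance of (numerically) zero.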

### Support Vector Machines

Intel-optimized Support Vector Machine implementations for classification and regression with accelerated kernel computations.

```python { .api }
class SVC:
    def __init__(self, **kwargs): ...
    def fit(self, X, y): ...
    def predict(self, X): ...

class SVR:
    def __init__(self, **kwargs): ...

class NuSVC:
    def __init__(self, **kwargs): ...

class NuSVR:
    def __init__(self, **kwargs): ...
```

[Support Vector Machines](./svm.md)
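
A brief sketch of `SVC` with an RBF kernel on a dataset that is not linearly separable. The dataset, the `C` and `gamma` values, and the fallback import are illustrative assumptions.

```python
from sklearn.datasets import make_moons

try:
    from sklearnex.svm import SVC
except ImportError:
    # Fallback added for this sketch: same API from stock scikit-learn.
    from sklearn.svm import SVC

# Two interleaving half-circles; a linear boundary cannot separate them well.
X, y = make_moons(n_samples=400, noise=0.1, random_state=0)

clf = SVC(kernel="rbf", C=10.0, gamma=2.0).fit(X, y)
train_acc = clf.score(X, y)
labels = clf.predict(X[:5])

print(f"train accuracy: {train_acc:.3f}")
```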

### Metrics and Model Selection

Performance metrics and data splitting utilities with Intel acceleration for large-scale evaluation.

```python { .api }
def roc_auc_score(y_true, y_score, **kwargs): ...
def pairwise_distances(X, Y=None, metric='euclidean', **kwargs): ...
def train_test_split(*arrays, **options): ...
```

[Metrics and Model Selection](./metrics-model-selection.md)
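
The three functions above can be tried on tiny hand-checkable inputs, as in the sketch below. The module paths `sklearnex.metrics` and `sklearnex.model_selection` mirror the scikit-learn layout; the inputs and the fallback import are assumptions of this example.

```python
import numpy as np

try:
    from sklearnex.metrics import pairwise_distances, roc_auc_score
    from sklearnex.model_selection import train_test_split
except ImportError:
    # Fallback added for this sketch: same API from stock scikit-learn.
    from sklearn.metrics import pairwise_distances, roc_auc_score
    from sklearn.model_selection import train_test_split

# ROC AUC: 3 of the 4 (positive, negative) pairs are ranked correctly.
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])
auc = roc_auc_score(y_true, y_score)

# Euclidean distance between (0, 0) and (3, 4) is a 3-4-5 triangle.
D = pairwise_distances(np.array([[0.0, 0.0], [3.0, 4.0]]))

# Hold out 25% of 8 samples for testing.
X = np.arange(16, dtype=float).reshape(8, 2)
y = np.array([0, 1] * 4)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

print(auc, D[0, 1], X_train.shape, X_test.shape)
```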

### Basic Statistics and Manifold Learning

Statistical computations and manifold learning algorithms with Intel optimization.

```python { .api }
class BasicStatistics:
    def __init__(self, **kwargs): ...
    def fit(self, X, y=None): ...

class IncrementalBasicStatistics:
    def __init__(self, **kwargs): ...
    def partial_fit(self, X, y=None): ...

class IncrementalEmpiricalCovariance:
    def __init__(self, **kwargs): ...
    def fit(self, X, y=None): ...
    def partial_fit(self, X, y=None): ...

class TSNE:
    def __init__(self, n_components=2, **kwargs): ...
    def fit_transform(self, X, y=None): ...
```

[Statistics and Manifold Learning](./stats-manifold.md)

### Model Builder API

Convert external gradient boosting models (XGBoost, LightGBM, CatBoost) to Intel oneDAL format for accelerated inference.

```python { .api }
from daal4py.mb import GBTDAALBaseModel, convert_model

def convert_model(model): ...

class GBTDAALBaseModel:
    def __init__(self): ...
```

[Model Builder API](./daal4py-mb.md)

### Advanced Features

Preview and SPMD (distributed) capabilities for cutting-edge algorithms and multi-node execution.

```python { .api }
# Preview features (requires SKLEARNEX_PREVIEW environment variable)
from sklearnex.preview.covariance import EmpiricalCovariance
from sklearnex.preview.decomposition import IncrementalPCA

# SPMD distributed computing
from sklearnex.spmd.cluster import KMeans as SPMDKMeans
from sklearnex.spmd.linear_model import LinearRegression as SPMDLinearRegression

# Utility functions
from sklearnex.utils import get_namespace, _assert_all_finite
```

[Advanced Features](./advanced.md)

## Environment Variables

- **OFF_ONEDAL_IFACE**: Set to "1" to disable oneDAL interface
- **SKLEARNEX_PREVIEW**: Enable preview features
- **DALROOT**: Path to Intel oneDAL installation
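
These variables can be exported in the shell or set from Python. The sketch below sets `SKLEARNEX_PREVIEW` in-process before the package is imported. The value `"1"` is an illustrative assumption (the list above does not specify one), as is the premise that the variable must be set before `sklearnex` is imported or patching is applied.

```python
import os

# Set the variable before importing sklearnex.
# "1" is an illustrative value chosen for this sketch.
os.environ["SKLEARNEX_PREVIEW"] = "1"

preview_requested = "SKLEARNEX_PREVIEW" in os.environ

try:
    import sklearnex  # imported only after the environment is prepared
    sklearnex_available = True
except ImportError:
    # Guard added for this sketch so it runs without the package installed.
    sklearnex_available = False

print(preview_requested, sklearnex_available)
```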

## Performance Notes

- Speedups of 10-100x are possible on Intel hardware; actual gains depend on the algorithm, dataset size, and hardware
- Optimizations work best with larger datasets (>1000 samples)
- Optimized algorithms keep the same API as their scikit-learn counterparts
- Can be used as drop-in replacements in existing code
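
Since speedups vary by workload, it is worth measuring them on your own data. The sketch below is one possible timing harness, not an official benchmark: the helper `time_fit`, the dataset size, and the choice of `KMeans` are assumptions of this example, and the accelerated timing is only collected when `sklearnex` is installed.

```python
import time

from sklearn.cluster import KMeans as StockKMeans
from sklearn.datasets import make_blobs

try:
    from sklearnex.cluster import KMeans as AcceleratedKMeans
except ImportError:
    # Guard added for this sketch: only the stock timing is reported.
    AcceleratedKMeans = None


def time_fit(estimator_cls, X, repeats=3):
    """Hypothetical helper: best wall-clock fit time over a few repeats."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        estimator_cls(n_clusters=8, n_init=1, random_state=0).fit(X)
        best = min(best, time.perf_counter() - start)
    return best


X, _ = make_blobs(n_samples=20_000, n_features=20, centers=8, random_state=0)

timings = {"sklearn": time_fit(StockKMeans, X)}
if AcceleratedKMeans is not None:
    timings["sklearnex"] = time_fit(AcceleratedKMeans, X)

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```

Taking the best of several repeats reduces the influence of one-off warm-up costs; results on small inputs like this one may not reflect the gains seen on larger datasets.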