Tessl Tile for pypi/scikit-learn@1.7.0

# scikit-learn

scikit-learn is a comprehensive machine learning library for Python that provides simple and efficient tools for predictive data analysis. It features various classification, regression, and clustering algorithms including support vector machines, random forests, gradient boosting, k-means, and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.

## Package Information

**Name**: scikit-learn  
**Language**: Python  
**Installation**: `pip install scikit-learn`  
**Version**: 1.7.1

## Core Imports

```python
import sklearn
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.cluster import KMeans
from sklearn.metrics import accuracy_score, classification_report
```

## Basic Usage

Here's a simple example demonstrating scikit-learn's consistent API for machine learning:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Load dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train model
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

# Make predictions
y_pred = clf.predict(X_test)

# Evaluate
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.3f}")
```

## Architecture

scikit-learn follows several key design principles:

### Estimator Pattern
All learning algorithms share the same core interface; which methods an object exposes depends on its role:
- `fit(X, y)` - Learn from training data (`y` is omitted for unsupervised estimators)
- `predict(X)` - Make predictions on new data (for classifiers, regressors, and some clusterers)
- `transform(X)` - Transform data (for transformers)
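
As a minimal sketch of the interface listed above, the following uses one transformer and one predictor on the iris dataset. The variable names and the choice of `StandardScaler` and `LogisticRegression` (with `max_iter=200`) are illustrative assumptions, not a recommended setup.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Transformer: fit() learns per-column mean and scale, transform() applies them
scaler = StandardScaler().fit(X)
X_scaled = scaler.transform(X)

# Predictor: fit() learns from (X, y), predict() returns class labels
clf = LogisticRegression(max_iter=200).fit(X_scaled, y)
y_pred = clf.predict(X_scaled)

print(X_scaled.shape, y_pred.shape)
```

Because `fit` returns the estimator itself, construction and fitting can be chained as shown.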

### Pipeline Architecture
Chain multiple processing steps into a single estimator:

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('classifier', SVC())
])
```
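
A pipeline like the one above behaves as a single estimator. The sketch below, which assumes the iris dataset and an 80/20 split purely for illustration, fits the whole chain with one call and then adjusts a nested parameter using the `<step name>__<parameter>` convention.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("classifier", SVC()),
])

# fit() scales the training data, then trains the SVC on the scaled output
pipeline.fit(X_train, y_train)

# score() applies the fitted scaler to X_test before computing accuracy
test_accuracy = pipeline.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.3f}")

# Nested parameters are addressed as "<step name>__<parameter>"
pipeline.set_params(classifier__C=10.0)
```

Fitting the scaler inside the pipeline keeps test data out of the scaling statistics, which is the usual reason to prefer this over scaling the full dataset up front.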

### Consistent API Design
- **Estimators**: All learning algorithms (classifiers, regressors, clusterers)
- **Transformers**: Data preprocessing and feature engineering
- **Meta-estimators**: Combine multiple estimators (ensembles, pipelines)
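
The roles listed above can be checked programmatically. This is a small sketch using the `is_classifier` and `is_regressor` helpers from `sklearn.base`; the particular estimators chosen here are illustrative assumptions.

```python
from sklearn.base import is_classifier, is_regressor
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression, Ridge
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import StandardScaler

# A meta-estimator that wraps two classifiers is itself a classifier
voter = VotingClassifier(
    estimators=[("lr", LogisticRegression()), ("nb", GaussianNB())]
)
scaler = StandardScaler()

print(is_classifier(voter))          # True
print(is_regressor(Ridge()))         # True
print(hasattr(scaler, "transform"))  # True: transformers expose transform()
print(hasattr(scaler, "predict"))    # False: StandardScaler is not a predictor
```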

## Core Capabilities

### Supervised Learning
```python
# Classification
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB

# Regression
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR
```
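
As a rough sketch of how two of these estimators are used, the example below trains one classifier and one regressor with the same `fit`/`predict` calls. The datasets, split sizes, and seeds are assumptions chosen only to keep the example small and reproducible.

```python
from sklearn.datasets import load_iris, make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import accuracy_score, r2_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Classification: predict iris species
X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = GaussianNB().fit(X_tr, y_tr)
acc = accuracy_score(y_te, clf.predict(X_te))

# Regression: synthetic linear target with a little noise
Xr, yr = make_regression(n_samples=200, n_features=5, noise=5.0, random_state=0)
Xr_tr, Xr_te, yr_tr, yr_te = train_test_split(Xr, yr, random_state=0)
reg = Ridge(alpha=1.0).fit(Xr_tr, yr_tr)
r2 = r2_score(yr_te, reg.predict(Xr_te))

print(f"accuracy={acc:.3f}, R^2={r2:.3f}")
```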

[Supervised Learning](./supervised-learning.md)

### Unsupervised Learning
```python
# Clustering
from sklearn.cluster import KMeans, DBSCAN, AgglomerativeClustering
from sklearn.mixture import GaussianMixture

# Dimensionality Reduction
from sklearn.decomposition import PCA, FastICA, NMF
from sklearn.manifold import TSNE, Isomap
```
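
The following sketch clusters synthetic data with `KMeans` and projects it to two dimensions with `PCA`. The blob centers, sample count, and seeds are assumptions picked so the clusters are clearly separated; real data is rarely this clean.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score

# Three well-separated 4-dimensional blobs (illustrative centers)
centers = [[0, 0, 0, 0], [10, 10, 10, 10], [-10, -10, -10, -10]]
X, true_labels = make_blobs(
    n_samples=150, centers=centers, cluster_std=1.0, random_state=0
)

# Clustering: no y is passed to fit_predict()
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

# Dimensionality reduction: project 4 features onto 2 principal components
X_2d = PCA(n_components=2).fit_transform(X)

# Cluster ids are arbitrary, so compare with a permutation-invariant score
ari = adjusted_rand_score(true_labels, labels)
print(X_2d.shape, f"ARI={ari:.2f}")
```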

[Unsupervised Learning](./unsupervised-learning.md)

### Data Preprocessing
```python
# Scaling and Normalization
from sklearn.preprocessing import StandardScaler, MinMaxScaler, RobustScaler

# Encoding
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, OrdinalEncoder

# Feature Engineering
from sklearn.preprocessing import PolynomialFeatures
from sklearn.feature_selection import SelectKBest, RFE
```

[Data Preprocessing and Feature Engineering](./preprocessing.md)

### Model Selection and Evaluation
```python
# Cross-Validation
from sklearn.model_selection import cross_val_score, GridSearchCV, RandomizedSearchCV
from sklearn.model_selection import KFold, StratifiedKFold, train_test_split

# Metrics
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from sklearn.metrics import mean_squared_error, r2_score, roc_auc_score
```
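
The sketch below combines cross-validation with a small grid search. The estimator (`SVC`), the candidate values for `C`, and the fold settings are illustrative assumptions rather than tuned choices.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# A fixed seed makes the folds identical across both calls below
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# One accuracy score per fold for the default SVC (C=1.0)
scores = cross_val_score(SVC(), X, y, cv=cv)

# Try several values of C and keep the best by mean cross-validated accuracy
search = GridSearchCV(SVC(), param_grid={"C": [0.1, 1, 10]}, cv=cv)
search.fit(X, y)

print(f"baseline mean accuracy: {scores.mean():.3f}")
print(f"best params: {search.best_params_}, best score: {search.best_score_:.3f}")
```

Since the grid includes the default `C=1`, the best grid score cannot be lower than the baseline mean computed on the same folds.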

[Model Selection and Evaluation](./model-selection.md)

### Built-in Datasets
```python
# Load toy datasets
from sklearn.datasets import load_iris, load_diabetes, load_wine, load_breast_cancer

# Generate synthetic data
from sklearn.datasets import make_classification, make_regression, make_blobs

# Fetch real-world datasets
from sklearn.datasets import fetch_20newsgroups, fetch_california_housing
```
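
A brief sketch of the first two loader families. The `make_classification` arguments are illustrative assumptions. The `fetch_*` functions are not called here because they download data on first use and so need network access.

```python
from sklearn.datasets import load_iris, make_classification

# Toy dataset: returned as a Bunch with data, target, and metadata
iris = load_iris()
print(iris.data.shape)           # (150, 4)
print(list(iris.target_names))   # ['setosa', 'versicolor', 'virginica']

# Synthetic dataset: shape and difficulty are controlled by the arguments
X, y = make_classification(
    n_samples=100,
    n_features=5,
    n_informative=3,
    n_redundant=0,
    random_state=0,
)
print(X.shape, sorted(set(y.tolist())))
```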

[Datasets and Data Generation](./datasets.md)

### Performance Metrics and Visualization
```python
# Classification metrics
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
from sklearn.metrics import ConfusionMatrixDisplay, RocCurveDisplay

# Regression metrics
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.metrics import PredictionErrorDisplay
```
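
To make the metric definitions concrete, here is a small worked sketch on hand-written labels and predictions (the values are made up). The `*Display` classes are left out because plotting them requires matplotlib.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    mean_absolute_error,
    mean_squared_error,
)

# Classification: rows of the matrix are true classes, columns are predictions
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
cm = confusion_matrix(y_true, y_pred)
acc = accuracy_score(y_true, y_pred)
print(cm)   # [[2 1] [1 2]]
print(acc)  # 4 correct out of 6

# Regression: absolute errors are 0.5, 0.5, 0.0, 1.0
y_true_reg = [3.0, -0.5, 2.0, 7.0]
y_pred_reg = [2.5, 0.0, 2.0, 8.0]
mae = mean_absolute_error(y_true_reg, y_pred_reg)  # 0.5
mse = mean_squared_error(y_true_reg, y_pred_reg)   # 0.375
print(mae, mse)
```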

[Metrics and Visualization](./metrics.md)

### Feature Extraction and Text Processing
```python
# Text vectorization
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.feature_extraction.text import HashingVectorizer, TfidfTransformer

# Dictionary and hashing
from sklearn.feature_extraction import DictVectorizer, FeatureHasher

# Image processing
from sklearn.feature_extraction.image import img_to_graph, grid_to_graph
```

[Feature Extraction](./feature-extraction.md)

### Pipelines and Workflow Composition
```python
# Pipeline construction
from sklearn.pipeline import Pipeline, make_pipeline, FeatureUnion

# Column-wise transformations
from sklearn.compose import ColumnTransformer, make_column_transformer
from sklearn.compose import TransformedTargetRegressor
```

[Pipelines and Composition](./pipelines.md)

### Nearest Neighbors Algorithms
```python
# Classification and regression
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor
from sklearn.neighbors import RadiusNeighborsClassifier, RadiusNeighborsRegressor
from sklearn.neighbors import NearestCentroid

# Outlier detection and density estimation
from sklearn.neighbors import LocalOutlierFactor, KernelDensity

# Unsupervised neighbor search
from sklearn.neighbors import NearestNeighbors
```
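
A minimal sketch on one-dimensional toy points (made-up values) showing k-nearest-neighbors classification and a raw neighbor query.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier, NearestNeighbors

X = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# Classification: majority vote among the 3 closest training points
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
predicted = knn.predict([[1.5], [10.5]])
print(predicted)  # [0 1]

# Unsupervised search: distances and indices of the 2 closest points to 0.4
nn = NearestNeighbors(n_neighbors=2).fit(X)
distances, indices = nn.kneighbors([[0.4]])
print(distances, indices)  # distances 0.4 and 0.6, indices 0 and 1
```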

[Nearest Neighbors](./neighbors.md)

### Utilities and Configuration
```python
# Core utilities
from sklearn.base import clone
from sklearn import get_config, set_config, config_context

# Version and system information
import sklearn
print(sklearn.__version__)
sklearn.show_versions()
```
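
The sketch below illustrates two of these utilities: `clone` copies an estimator's hyperparameters without its fitted state, and `config_context` changes a global setting only inside a `with` block. The toy training data is made up, and the final check assumes the default configuration (no `SKLEARN_ASSUME_FINITE` environment override).

```python
from sklearn import config_context, get_config
from sklearn.base import clone
from sklearn.linear_model import LogisticRegression

# clone(): same hyperparameters, but none of the learned attributes
original = LogisticRegression(C=0.5).fit([[0], [1], [2], [3]], [0, 0, 1, 1])
fresh = clone(original)
print(fresh.get_params()["C"])   # 0.5
print(hasattr(original, "coef_"), hasattr(fresh, "coef_"))  # True False

# config_context(): the setting is restored when the block exits
with config_context(assume_finite=True):
    inside = get_config()["assume_finite"]
outside = get_config()["assume_finite"]
print(inside, outside)
```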

[Utilities and Core Functions](./utilities.md)

## Version Information

```python
import sklearn
print(sklearn.__version__)  # e.g. 1.7.1

# Get system information
sklearn.show_versions()
```

## Key Features

- **Consistent API**: All algorithms follow the same interface patterns
- **Comprehensive**: 300+ classes and 150+ functions covering classification, regression, clustering, dimensionality reduction, preprocessing, and model selection
- **Well-tested**: Extensive test suite ensuring reliability
- **Documentation**: Comprehensive user guide and API reference
- **Community**: Large, active community with regular releases
- **Integration**: Works seamlessly with NumPy, SciPy, pandas, and matplotlib
- **Performance**: Optimized implementations with optional parallelization

scikit-learn covers the full machine learning workflow, from data preprocessing to model evaluation, which makes it a go-to library for machine learning in Python.
