# Supervised Algorithms

Supervised metric learning algorithms learn from labeled training data to optimize a distance metric for classification and related tasks. All algorithms inherit from `MahalanobisMixin` and follow the scikit-learn API.

## Capabilities

### Large Margin Nearest Neighbor (LMNN)

Learns a Mahalanobis distance metric for k-NN classification. It tries to keep each example's k nearest same-class neighbors (its target neighbors) close while separating examples from different classes by a large margin.

```python { .api }
class LMNN(MahalanobisMixin, TransformerMixin):
    def __init__(self, init='auto', n_neighbors=3, min_iter=50, max_iter=1000, learn_rate=1e-7,
                 regularization=0.5, convergence_tol=0.001, verbose=False,
                 preprocessor=None, n_components=None, random_state=None):
        """
        Parameters:
        - init: str or array-like, initialization method ('auto', 'pca', 'lda', 'identity', 'random')
        - n_neighbors: int, number of target neighbors per example
        - min_iter: int, minimum number of iterations
        - max_iter: int, maximum number of iterations
        - learn_rate: float, learning rate for the optimization
        - regularization: float, regularization parameter between 0 and 1
        - convergence_tol: float, convergence tolerance
        - verbose: bool, whether to print progress messages
        - preprocessor: array-like or callable, preprocessor for input data
        - n_components: int or None, dimensionality of transformed space
        - random_state: int, random state for reproducibility
        """

    def fit(self, X, y):
        """
        Fit the LMNN metric learner.

        Parameters:
        - X: array-like, shape=(n_samples, n_features), training data
        - y: array-like, shape=(n_samples,), training labels

        Returns:
        - self: returns the instance itself
        """
```

Usage example:

```python
from metric_learn import LMNN
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
lmnn = LMNN(n_neighbors=3, learn_rate=1e-6)
lmnn.fit(X, y)
X_transformed = lmnn.transform(X)
```
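
The pull and push terms described for LMNN can be illustrated with a simplified, pure-Python loss for a fixed linear map `L`. This is a sketch under stated assumptions, not metric-learn's implementation: the helpers `apply`, `sq_dist`, and `lmnn_loss`, the `push_weight` parameter, and the toy data are all hypothetical, and target neighbors are picked by plain Euclidean distance in the input space.

```python
# Simplified LMNN-style loss for a fixed linear map L (illustrative only;
# all names here are hypothetical and not part of metric-learn).

def apply(L, x):
    """Return the matrix-vector product L @ x using plain lists."""
    return [sum(l_ij * x_j for l_ij, x_j in zip(row, x)) for row in L]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((a_i - b_i) ** 2 for a_i, b_i in zip(a, b))

def lmnn_loss(L, X, y, n_neighbors=1, push_weight=0.5):
    """Pull target neighbors close; penalize impostors inside the margin."""
    Z = [apply(L, x) for x in X]  # points in the transformed space
    pull, push = 0.0, 0.0
    for i in range(len(X)):
        # Target neighbors: nearest same-class points in the input space.
        same = [j for j in range(len(X)) if j != i and y[j] == y[i]]
        targets = sorted(same, key=lambda j: sq_dist(X[i], X[j]))[:n_neighbors]
        for j in targets:
            d_ij = sq_dist(Z[i], Z[j])
            pull += d_ij
            for k in range(len(X)):
                if y[k] != y[i]:
                    # Hinge term: active only when a differently labeled
                    # point lies within a unit margin of the target neighbor.
                    push += max(0.0, 1.0 + d_ij - sq_dist(Z[i], Z[k]))
    # Assumed weighting between the two terms (hypothetical knob).
    return (1.0 - push_weight) * pull + push_weight * push

X = [[0.0, 0.0], [0.0, 1.0], [3.0, 0.0], [3.0, 1.0]]
y = [0, 0, 1, 1]
identity = [[1.0, 0.0], [0.0, 1.0]]
squashed = [[1.0, 0.0], [0.0, 0.1]]  # shrinks the within-class direction

loss_identity = lmnn_loss(identity, X, y)
loss_squashed = lmnn_loss(squashed, X, y)
```

On this toy data the map that shrinks the within-class direction yields a lower loss than the identity, which is the kind of improvement the optimizer searches for.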

### Neighborhood Components Analysis (NCA)

Learns a linear transformation to maximize the expected leave-one-out classification accuracy of the stochastic nearest neighbors rule in the transformed space.

```python { .api }
class NCA(MahalanobisMixin, TransformerMixin):
    def __init__(self, init='auto', n_components=None, max_iter=100, tol=None,
                 verbose=False, preprocessor=None, random_state=None):
        """
        Parameters:
        - init: str or array-like, initialization method ('auto', 'pca', 'lda', 'identity', 'random')
        - n_components: int or None, dimensionality of transformed space
        - max_iter: int, maximum number of iterations
        - tol: float or None, convergence tolerance
        - verbose: bool, whether to print progress messages
        - preprocessor: array-like or callable, preprocessor for input data
        - random_state: int, random state for reproducibility
        """

    def fit(self, X, y):
        """
        Fit the NCA metric learner.

        Parameters:
        - X: array-like, shape=(n_samples, n_features), training data
        - y: array-like, shape=(n_samples,), training labels

        Returns:
        - self: returns the instance itself
        """
```
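
The quantity NCA maximizes can be sketched in pure Python: each point picks a neighbor with probability given by a softmax over negative squared distances in the transformed space, and the objective is the average probability of picking a neighbor with the same label. The helper names (`apply`, `sq_dist`, `nca_expected_accuracy`) and the toy data are assumptions made for illustration; this evaluates the objective for a fixed map `L` and does not reproduce the library's optimizer.

```python
import math

def apply(L, x):
    """Return the matrix-vector product L @ x using plain lists."""
    return [sum(l_ij * x_j for l_ij, x_j in zip(row, x)) for row in L]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((a_i - b_i) ** 2 for a_i, b_i in zip(a, b))

def nca_expected_accuracy(L, X, y):
    """Expected leave-one-out accuracy of the stochastic neighbor rule."""
    Z = [apply(L, x) for x in X]
    n = len(X)
    total = 0.0
    for i in range(n):
        # Softmax over negative squared distances, excluding the point itself.
        weights = {j: math.exp(-sq_dist(Z[i], Z[j])) for j in range(n) if j != i}
        norm = sum(weights.values())
        # Probability that point i picks a neighbor carrying its own label.
        total += sum(w for j, w in weights.items() if y[j] == y[i]) / norm
    return total / n

X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y = [0, 0, 1, 1]
identity = [[1.0, 0.0], [0.0, 1.0]]
stretched = [[3.0, 0.0], [0.0, 1.0]]  # stretches the axis that separates the classes

acc_identity = nca_expected_accuracy(identity, X, y)
acc_stretched = nca_expected_accuracy(stretched, X, y)
```

Stretching the axis that separates the two classes raises the expected accuracy above its value under the identity map.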

### Local Fisher Discriminant Analysis (LFDA)

Combines the ideas of Fisher Discriminant Analysis and locality-preserving projection for dimensionality reduction and metric learning. It is particularly effective when classes have multimodal distributions.

```python { .api }
class LFDA(MahalanobisMixin, TransformerMixin):
    def __init__(self, n_components=None, k=None, embedding_type='weighted', preprocessor=None):
        """
        Parameters:
        - n_components: int or None, dimensionality of transformed space
        - k: int or None, number of nearest neighbors for local scaling
        - embedding_type: str, type of embedding ('weighted', 'orthonormalized', 'plain')
        - preprocessor: array-like or callable, preprocessor for input data
        """

    def fit(self, X, y):
        """
        Fit the LFDA metric learner.

        Parameters:
        - X: array-like, shape=(n_samples, n_features), training data
        - y: array-like, shape=(n_samples,), training labels

        Returns:
        - self: returns the instance itself
        """
```

Usage example:

```python
from metric_learn import LFDA
from sklearn.datasets import make_classification

# n_informative is raised from its default of 2: make_classification requires
# n_classes * n_clusters_per_class <= 2 ** n_informative (here 3 * 2 <= 2 ** 5).
X, y = make_classification(n_samples=200, n_features=10, n_informative=5,
                           n_classes=3, random_state=42)
lfda = LFDA(n_components=5, k=7)
lfda.fit(X, y)
X_transformed = lfda.transform(X)
```

## Supervised Variants of Weakly-Supervised Algorithms

Several algorithms have supervised variants that automatically generate constraints from class labels.

### Information Theoretic Metric Learning - Supervised

```python { .api }
class ITML_Supervised(ITML):
    def fit(self, X, y, num_constraints=None):
        """
        Fit ITML using automatically generated constraints from labels.

        Parameters:
        - X: array-like, shape=(n_samples, n_features), training data
        - y: array-like, shape=(n_samples,), training labels
        - num_constraints: int or None, number of constraints to generate

        Returns:
        - self: returns the instance itself
        """
```
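
As a rough illustration of what generating constraints from labels can mean for pair-based learners, the sketch below samples similar pairs (same label, marked `+1`) and dissimilar pairs (different labels, marked `-1`). The function `pairs_from_labels`, its `n_constraints` and `seed` arguments, and the sampling strategy are hypothetical; the library's own generation logic may differ.

```python
import random

def pairs_from_labels(y, n_constraints, seed=0):
    """Sample similar (+1) and dissimilar (-1) index pairs from class labels.

    Assumes y has at least two classes and at least one class with two
    members; otherwise the rejection loop below would never finish.
    """
    rng = random.Random(seed)
    indices = list(range(len(y)))
    pairs, pair_labels = [], []
    for want_similar in (True, False):
        found = 0
        while found < n_constraints:
            i, j = rng.sample(indices, 2)  # two distinct indices
            if (y[i] == y[j]) == want_similar:
                pairs.append((i, j))
                pair_labels.append(1 if want_similar else -1)
                found += 1
    return pairs, pair_labels

y = [0, 0, 0, 1, 1, 1, 2, 2, 2]
pairs, pair_labels = pairs_from_labels(y, n_constraints=5)
```

Each returned pair indexes into the training data, so the labels themselves are no longer needed once the pairs exist.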

### Least Squares Metric Learning - Supervised

```python { .api }
class LSML_Supervised(LSML):
    def fit(self, X, y, num_constraints=None):
        """
        Fit LSML using automatically generated constraints from labels.

        Parameters:
        - X: array-like, shape=(n_samples, n_features), training data
        - y: array-like, shape=(n_samples,), training labels
        - num_constraints: int or None, number of constraints to generate

        Returns:
        - self: returns the instance itself
        """
```

### Sparse Distance Metric Learning - Supervised

```python { .api }
class SDML_Supervised(SDML):
    def fit(self, X, y, num_constraints=None):
        """
        Fit SDML using automatically generated constraints from labels.

        Parameters:
        - X: array-like, shape=(n_samples, n_features), training data
        - y: array-like, shape=(n_samples,), training labels
        - num_constraints: int or None, number of constraints to generate

        Returns:
        - self: returns the instance itself
        """
```

### Relevant Components Analysis - Supervised

```python { .api }
class RCA_Supervised(RCA):
    def fit(self, X, y, num_chunks=100):
        """
        Fit RCA using automatically generated constraints from labels.

        Parameters:
        - X: array-like, shape=(n_samples, n_features), training data
        - y: array-like, shape=(n_samples,), training labels
        - num_chunks: int, number of chunks to generate

        Returns:
        - self: returns the instance itself
        """
```
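
RCA works from chunks, small groups of points known to share a class. A minimal sketch of deriving such chunks from labels is shown below; `chunks_from_labels`, its `chunk_size` and `seed` arguments, and the convention of marking unassigned points with `-1` are assumptions for illustration and may not match how the library builds chunks.

```python
import random

def chunks_from_labels(y, n_chunks, chunk_size=2, seed=0):
    """Assign a chunk id to some points; -1 marks points left out of every chunk."""
    rng = random.Random(seed)
    # Group point indices by class and shuffle within each class.
    by_class = {}
    for idx, label in enumerate(y):
        by_class.setdefault(label, []).append(idx)
    for members in by_class.values():
        rng.shuffle(members)

    chunk_ids = [-1] * len(y)
    chunk = 0
    while chunk < n_chunks:
        # Only classes with enough unused points can supply another chunk.
        available = [c for c in by_class if len(by_class[c]) >= chunk_size]
        if not available:
            break  # fewer chunks than requested can be formed
        label = rng.choice(available)
        for _ in range(chunk_size):
            chunk_ids[by_class[label].pop()] = chunk
        chunk += 1
    return chunk_ids

y = [0, 0, 0, 0, 1, 1, 1, 1]
chunk_ids = chunks_from_labels(y, n_chunks=3)
```

Every chunk contains points from a single class, and points that were not drawn keep the id `-1`.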

### Mahalanobis Metric for Clustering - Supervised

```python { .api }
class MMC_Supervised(MMC):
    def fit(self, X, y, num_constraints=None):
        """
        Fit MMC using automatically generated constraints from labels.

        Parameters:
        - X: array-like, shape=(n_samples, n_features), training data
        - y: array-like, shape=(n_samples,), training labels
        - num_constraints: int or None, number of constraints to generate

        Returns:
        - self: returns the instance itself
        """
```

### Sparse Compositional Metric Learning - Supervised

```python { .api }
class SCML_Supervised(SCML):
    def fit(self, X, y, num_constraints=None):
        """
        Fit SCML using automatically generated constraints from labels.

        Parameters:
        - X: array-like, shape=(n_samples, n_features), training data
        - y: array-like, shape=(n_samples,), training labels
        - num_constraints: int or None, number of constraints to generate

        Returns:
        - self: returns the instance itself
        """
```
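
SCML is formulated over triplets: an anchor, a point that should be closer to it, and a point that should be farther away. The sketch below samples such triplets at random from class labels. The function `triplets_from_labels` and its arguments are hypothetical, and random sampling is a simplification chosen here; an actual implementation may instead select triplets from nearest neighbors.

```python
import random

def triplets_from_labels(y, n_triplets, seed=0):
    """Sample (anchor, positive, negative) index triplets from class labels.

    Assumes at least two classes and at least one class with two members;
    otherwise no valid triplet exists and the loop would never finish.
    """
    rng = random.Random(seed)
    triplets = []
    while len(triplets) < n_triplets:
        a = rng.randrange(len(y))
        positives = [i for i in range(len(y)) if i != a and y[i] == y[a]]
        negatives = [i for i in range(len(y)) if y[i] != y[a]]
        if positives and negatives:
            triplets.append((a, rng.choice(positives), rng.choice(negatives)))
    return triplets

y = [0, 0, 0, 1, 1, 1]
triplets = triplets_from_labels(y, n_triplets=6)
```

In each triplet the second index shares the anchor's label and the third does not.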

## Common Usage Patterns

All supervised algorithms follow similar usage patterns:

```python
from metric_learn import LMNN, NCA, LFDA
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load data
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train a metric learner (NCA() or LFDA() are fitted the same way)
metric_learner = LMNN(n_neighbors=3)
metric_learner.fit(X_train, y_train)

# Option 1: transform the data into the learned space
X_train_transformed = metric_learner.transform(X_train)
X_test_transformed = metric_learner.transform(X_test)

# Option 2: pass the learned metric to a scikit-learn classifier,
# which then works directly on the untransformed data
knn = KNeighborsClassifier(n_neighbors=3, metric=metric_learner.get_metric())
knn.fit(X_train, y_train)
accuracy = knn.score(X_test, y_test)
```
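
Transforming the data and passing the learned metric to a classifier target the same distances, because a Mahalanobis distance with matrix M = LᵀL equals the Euclidean distance after mapping points through L. The pure-Python check below uses an arbitrary 2x2 matrix `L` and hypothetical helpers (`apply`, `gram`, `mahalanobis`, `euclidean`) rather than a fitted metric-learn model.

```python
import math

def apply(L, x):
    """Return the matrix-vector product L @ x using plain lists."""
    return [sum(l_ij * x_j for l_ij, x_j in zip(row, x)) for row in L]

def gram(L):
    """Return M = L^T L."""
    n_rows, n_cols = len(L), len(L[0])
    return [[sum(L[k][i] * L[k][j] for k in range(n_rows)) for j in range(n_cols)]
            for i in range(n_cols)]

def mahalanobis(x, z, M):
    """sqrt((x - z)^T M (x - z)) for a positive semidefinite matrix M."""
    diff = [x_i - z_i for x_i, z_i in zip(x, z)]
    n = len(diff)
    return math.sqrt(sum(diff[i] * M[i][j] * diff[j]
                         for i in range(n) for j in range(n)))

def euclidean(a, b):
    """Plain Euclidean distance between two vectors."""
    return math.sqrt(sum((a_i - b_i) ** 2 for a_i, b_i in zip(a, b)))

L = [[2.0, 0.0], [1.0, 0.5]]  # arbitrary example map, not a learned one
M = gram(L)
x, z = [1.0, 2.0], [3.0, -1.0]

d_metric = mahalanobis(x, z, M)                   # distance under M
d_transformed = euclidean(apply(L, x), apply(L, z))  # distance after mapping by L
```

Both values agree up to floating-point rounding, which is why a k-NN classifier fitted on transformed data with the default Euclidean metric is expected to rank neighbors the same way as one given the learned metric on raw data.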