# Clustering Metrics

Evaluation metrics for unsupervised learning, covering mutual information, Rand indices, and internal validation indices such as Calinski-Harabasz, Davies-Bouldin, and Dunn, for assessing cluster quality and validating clustering algorithms.

## Capabilities

### Information-Based Metrics

Measures based on information theory for evaluating cluster quality.

```python { .api }
class AdjustedMutualInfoScore(Metric):
    def __init__(
        self,
        average: str = "arithmetic",
        **kwargs
    ): ...

class MutualInfoScore(Metric):
    def __init__(
        self,
        **kwargs
    ): ...

class NormalizedMutualInfoScore(Metric):
    def __init__(
        self,
        average: str = "arithmetic",
        **kwargs
    ): ...
```
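As a rough, library-independent illustration of what these scores measure, the sketch below computes mutual information and a normalized variant in plain Python. The function names (`entropy`, `mutual_info`, `normalized_mutual_info`) are hypothetical and not part of torchmetrics, and the way `average` selects the normalizer here is an assumption based on the usual definition of normalized mutual information; the library's own implementation may handle edge cases differently.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a sequence of cluster labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_info(preds, target):
    """Mutual information (in nats) between two labelings of the same points."""
    n = len(preds)
    joint = Counter(zip(preds, target))
    pred_counts, target_counts = Counter(preds), Counter(target)
    # Sum p(i, j) * log(p(i, j) / (p(i) * p(j))), written with raw counts.
    return sum(
        (c / n) * math.log(c * n / (pred_counts[i] * target_counts[j]))
        for (i, j), c in joint.items()
    )


def normalized_mutual_info(preds, target, average="arithmetic"):
    """Mutual information divided by an average of the two label entropies."""
    h_pred, h_target = entropy(preds), entropy(target)
    if average == "arithmetic":
        norm = (h_pred + h_target) / 2
    elif average == "geometric":
        norm = math.sqrt(h_pred * h_target)
    else:
        raise ValueError(f"unsupported average: {average!r}")
    if norm == 0.0:
        # Degenerate case: at least one labeling puts everything in one cluster.
        return 1.0 if h_pred == h_target == 0.0 else 0.0
    return mutual_info(preds, target) / norm


# Same partition with permuted label names: maximal agreement.
same = normalized_mutual_info([0, 0, 1, 1], [1, 1, 0, 0])
# Statistically independent labelings: no shared information.
independent = normalized_mutual_info([0, 0, 1, 1], [0, 1, 0, 1])
print(f"{same:.4f} {independent:.4f}")  # 1.0000 0.0000
```

The example shows that these scores depend only on how points are grouped, not on the integer names given to the clusters.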

### Rand-Based Metrics

Metrics based on the Rand index for measuring clustering similarity.

```python { .api }
class AdjustedRandScore(Metric):
    def __init__(
        self,
        **kwargs
    ): ...

class RandScore(Metric):
    def __init__(
        self,
        **kwargs
    ): ...
```
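To make the pair-counting idea behind the Rand index concrete, here is a minimal plain-Python sketch. The helper names (`pair_counts`, `rand_index`, `adjusted_rand_index`) are hypothetical and not part of torchmetrics; the code follows the textbook definitions and is meant as an illustration rather than a description of the library's implementation.

```python
from collections import Counter
from math import comb


def pair_counts(preds, target):
    """Count point pairs grouped together in both, in preds, and in target."""
    total = comb(len(preds), 2)
    same_both = sum(comb(c, 2) for c in Counter(zip(preds, target)).values())
    same_pred = sum(comb(c, 2) for c in Counter(preds).values())
    same_target = sum(comb(c, 2) for c in Counter(target).values())
    return total, same_both, same_pred, same_target


def rand_index(preds, target):
    """Fraction of point pairs on which the two labelings agree (needs >= 2 points)."""
    total, same_both, same_pred, same_target = pair_counts(preds, target)
    # Pairs separated in both labelings, by inclusion-exclusion.
    apart_both = total - same_pred - same_target + same_both
    return (same_both + apart_both) / total


def adjusted_rand_index(preds, target):
    """Rand index corrected for the agreement expected by chance."""
    total, same_both, same_pred, same_target = pair_counts(preds, target)
    expected = same_pred * same_target / total
    max_index = (same_pred + same_target) / 2
    if max_index == expected:
        # Degenerate case, e.g. both labelings use a single cluster.
        return 1.0
    return (same_both - expected) / (max_index - expected)


preds, target = [0, 0, 1, 1], [0, 1, 0, 1]
print(f"{rand_index(preds, target):.4f}")           # 0.3333
print(f"{adjusted_rand_index(preds, target):.4f}")  # -0.5000
```

The unadjusted index stays positive even for labelings that disagree on every co-clustered pair, while the adjusted version can drop below zero, which is why the adjusted score is usually easier to interpret.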

### Internal Validation Metrics

Metrics that evaluate clustering quality using the original data features.

```python { .api }
class CalinskiHarabaszScore(Metric):
    def __init__(
        self,
        **kwargs
    ): ...

class DaviesBouldinScore(Metric):
    def __init__(
        self,
        **kwargs
    ): ...

class DunnIndex(Metric):
    def __init__(
        self,
        p: float = 2.0,
        **kwargs
    ): ...
```
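Internal metrics need only the feature vectors and the predicted labels. As one illustration, the sketch below computes the Calinski-Harabasz score from its textbook definition (between-cluster dispersion over within-cluster dispersion, each scaled by its degrees of freedom) in plain Python. The function name `calinski_harabasz` and its helpers are hypothetical, and the code is not taken from torchmetrics.

```python
def calinski_harabasz(data, labels):
    """Ratio of between-cluster to within-cluster dispersion; higher is better."""
    clusters = {}
    for point, label in zip(data, labels):
        clusters.setdefault(label, []).append(point)
    n, k = len(data), len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need at least 2 clusters and fewer clusters than points")

    def mean(points):
        # Coordinate-wise mean of a list of equal-length vectors.
        return [sum(col) / len(points) for col in zip(*points)]

    def sq_dist(a, b):
        # Squared Euclidean distance between two vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(data)
    between = sum(len(pts) * sq_dist(mean(pts), overall) for pts in clusters.values())
    within = sum(sq_dist(p, mean(pts)) for pts in clusters.values() for p in pts)
    # Assumes within > 0, i.e. not every cluster consists of identical points.
    return (between / (k - 1)) / (within / (n - k))


data = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
print(calinski_harabasz(data, [0, 0, 1, 1]))  # 50.0 (well-separated clusters)
print(calinski_harabasz(data, [0, 1, 0, 1]))  # about 0.08 (clusters overlap)
```

The same four points score very differently depending on the assignment, which is the property that makes internal metrics useful when no reference labels exist.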

### Additional Clustering Metrics

```python { .api }
class FowlkesMallowsIndex(Metric):
    def __init__(
        self,
        **kwargs
    ): ...

class ClusterAccuracy(Metric):
    def __init__(
        self,
        **kwargs
    ): ...

class HomogeneityScore(Metric):
    def __init__(
        self,
        **kwargs
    ): ...

class CompletenessScore(Metric):
    def __init__(
        self,
        **kwargs
    ): ...

class VMeasureScore(Metric):
    def __init__(
        self,
        beta: float = 1.0,
        **kwargs
    ): ...
```
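Homogeneity, completeness, and the V-measure are closely related, and a small plain-Python sketch may help show how `beta` trades one against the other. The function names below are hypothetical and not part of torchmetrics; the formulas are the standard conditional-entropy definitions, and the argument order `(preds, target)` is chosen to mirror the usage elsewhere in this document.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a sequence of labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def conditional_entropy(a, b):
    """H(a | b): uncertainty left in labeling `a` once labeling `b` is known."""
    n = len(a)
    joint = Counter(zip(a, b))
    b_counts = Counter(b)
    return -sum((c / n) * math.log(c / b_counts[bv]) for (_, bv), c in joint.items())


def homogeneity(preds, target):
    """1.0 when every predicted cluster contains members of a single class."""
    h_target = entropy(target)
    return 1.0 if h_target == 0.0 else 1.0 - conditional_entropy(target, preds) / h_target


def completeness(preds, target):
    """1.0 when all members of a class land in the same predicted cluster."""
    h_pred = entropy(preds)
    return 1.0 if h_pred == 0.0 else 1.0 - conditional_entropy(preds, target) / h_pred


def v_measure(preds, target, beta=1.0):
    """Weighted harmonic mean; beta > 1 weights completeness more heavily."""
    h, c = homogeneity(preds, target), completeness(preds, target)
    denom = beta * h + c
    return 0.0 if denom == 0.0 else (1 + beta) * h * c / denom


# Class 1 is split across two predicted clusters: pure clusters, but incomplete.
preds, target = [0, 0, 1, 2], [0, 0, 1, 1]
print(f"{homogeneity(preds, target):.4f}")   # 1.0000
print(f"{completeness(preds, target):.4f}")  # 0.6667
print(f"{v_measure(preds, target):.4f}")     # 0.8000
```

Over-splitting a class leaves homogeneity untouched but lowers completeness, so the V-measure falls between the two.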

## Usage Examples

```python
import torch
from torchmetrics.clustering import (
    AdjustedRandScore, NormalizedMutualInfoScore,
    CalinskiHarabaszScore, DaviesBouldinScore
)

# Sample clustering results
pred_clusters = torch.randint(0, 3, (100,))
true_clusters = torch.randint(0, 3, (100,))
data = torch.randn(100, 10)  # Original features

# External validation (compares with ground truth)
ari = AdjustedRandScore()
nmi = NormalizedMutualInfoScore()

# Internal validation (uses original data)
ch_score = CalinskiHarabaszScore()
db_score = DaviesBouldinScore()

# Compute external metrics
ari_result = ari(pred_clusters, true_clusters)
nmi_result = nmi(pred_clusters, true_clusters)

# Compute internal metrics
ch_result = ch_score(data, pred_clusters)
db_result = db_score(data, pred_clusters)

print(f"ARI: {ari_result:.4f}")
print(f"NMI: {nmi_result:.4f}")
print(f"Calinski-Harabasz: {ch_result:.4f}")
print(f"Davies-Bouldin: {db_result:.4f}")
```

## Types

```python { .api }
from typing import Literal
from torch import Tensor

ClusterLabels = Tensor  # Integer cluster assignments
FeatureData = Tensor  # Original data features
AverageMethod = Literal["arithmetic", "geometric"]
```