# Metrics Framework

CatBoost provides a comprehensive metrics framework for evaluating model performance across various machine learning tasks. The framework includes built-in metrics for classification, regression, and ranking, with dynamic class generation for metric-specific functionality.
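
Because the classes are generated dynamically, the exact set of metrics depends on the installed CatBoost version. A minimal sketch for discovering them by introspecting the `catboost.metrics` module, assuming only the `BuiltinMetric` base class documented below:

```python
import inspect

from catboost import metrics

# Collect every dynamically generated metric class exported by the module.
metric_names = sorted(
    name
    for name, obj in inspect.getmembers(metrics, inspect.isclass)
    if issubclass(obj, metrics.BuiltinMetric) and obj is not metrics.BuiltinMetric
)

print(f"{len(metric_names)} built-in metrics available")
print(metric_names[:10])
```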

## Capabilities

### Base Metric Infrastructure

Core base class and infrastructure for all CatBoost metrics.

```python { .api }
class BuiltinMetric:
    """
    Base class for all CatBoost built-in metrics.

    Provides common interface and functionality for metric evaluation,
    parameter validation, and configuration management across all
    metric types in CatBoost.
    """

    @staticmethod
    def params_with_defaults():
        """
        Get valid metric parameters with their default values.

        Returns:
            dict: Parameter names mapped to default values and mandatory flags
                - 'default_value': Default parameter value or None
                - 'is_mandatory': Whether parameter is required (bool)
        """

    def __str__(self):
        """
        Get string representation of the metric with parameters.

        Returns:
            str: Metric string representation
        """

    def set_hints(self, **hints):
        """
        Set hints for metric calculation (not validated).

        Parameters:
        - **hints: Arbitrary hint parameters for metric behavior

        Returns:
            self: For method chaining
        """

    def eval(self, label, approx, weight=None, group_id=None,
             group_weight=None, subgroup_id=None, pairs=None,
             thread_count=-1):
        """
        Evaluate metric with raw predictions and labels.

        Parameters:
        - label: True target values (array-like)
        - approx: Model predictions (array-like)
        - weight: Sample weights (array-like, optional)
        - group_id: Group identifiers for ranking (array-like, optional)
        - group_weight: Group weights (array-like, optional)
        - subgroup_id: Subgroup identifiers (array-like, optional)
        - pairs: Pairwise constraints for ranking (array-like or path, optional)
        - thread_count: Number of threads for computation (int)

        Returns:
            float: Metric value
        """

    def is_max_optimal(self):
        """
        Check if higher metric values indicate better performance.

        Returns:
            bool: True if metric should be maximized, False if minimized
        """

    def is_min_optimal(self):
        """
        Check if lower metric values indicate better performance.

        Returns:
            bool: True if metric should be minimized, False if maximized
        """
```
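
Because every built-in metric exposes this same interface, metrics can be handled generically. A short sketch using only the methods documented above; the labels and scores are illustrative:

```python
import numpy as np

from catboost import metrics


def report(metric_list, label, approx):
    """Print each metric's value and optimization direction."""
    for m in metric_list:
        value = m.eval(label, approx)
        goal = "maximize" if m.is_max_optimal() else "minimize"
        print(f"{m}: {value:.4f} (goal: {goal})")


y_true = np.array([0, 1, 1, 0, 1])
y_score = np.array([-1.2, 0.7, 0.4, -0.3, 1.5])  # raw model scores

report([metrics.Logloss(), metrics.AUC()], y_true, y_score)
```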

### Dynamic Metric Classes

CatBoost dynamically generates metric classes based on the underlying C++ implementation. Each metric type has specific variants with different parameter configurations.

```python { .api }
# Classification Metrics (examples of dynamically generated classes)
class Logloss(BuiltinMetric):
    """Logarithmic loss for binary classification."""

class CrossEntropy(BuiltinMetric):
    """Cross-entropy loss for classification tasks."""

class MultiClass(BuiltinMetric):
    """Multi-class logarithmic loss."""

class Accuracy(BuiltinMetric):
    """Classification accuracy metric."""

class Precision(BuiltinMetric):
    """Precision metric for classification."""

class Recall(BuiltinMetric):
    """Recall metric for classification."""

class F1(BuiltinMetric):
    """F1-score metric for classification."""

class AUC(BuiltinMetric):
    """Area Under the ROC Curve metric."""

# Regression Metrics
class RMSE(BuiltinMetric):
    """Root Mean Squared Error for regression."""

class MAE(BuiltinMetric):
    """Mean Absolute Error for regression."""

class MAPE(BuiltinMetric):
    """Mean Absolute Percentage Error for regression."""

class R2(BuiltinMetric):
    """R-squared coefficient of determination."""

class MSLE(BuiltinMetric):
    """Mean Squared Logarithmic Error for regression."""

# Ranking Metrics
class NDCG(BuiltinMetric):
    """Normalized Discounted Cumulative Gain for ranking."""

class DCG(BuiltinMetric):
    """Discounted Cumulative Gain for ranking."""

class MAP(BuiltinMetric):
    """Mean Average Precision for ranking."""

class MRR(BuiltinMetric):
    """Mean Reciprocal Rank for ranking."""

class ERR(BuiltinMetric):
    """Expected Reciprocal Rank for ranking."""
```
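
Metric variants are configured through the parameters reported by `params_with_defaults()`. A sketch, assuming (as the generated classes generally allow) that parameters can be passed as constructor keyword arguments; the `top` parameter shown for NDCG is an example and should be checked against the output for your version:

```python
from catboost import metrics

# Inspect which parameters a generated class accepts before configuring it.
print(metrics.NDCG.params_with_defaults())  # static method, callable on the class

# Illustrative configuration: 'top' limits NDCG to the top-k documents per group.
ndcg_top5 = metrics.NDCG(top=5)
print(ndcg_top5)  # the string form includes the configured parameters
```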

## Metric Usage Examples

### Basic Metric Evaluation

```python
from catboost import metrics
import numpy as np

# Create sample data
y_true = np.array([0, 1, 1, 0, 1])
y_pred = np.array([0.1, 0.8, 0.7, 0.3, 0.9])

# Initialize classification metrics
logloss = metrics.Logloss()
accuracy = metrics.Accuracy()
auc = metrics.AUC()

# Evaluate metrics
logloss_value = logloss.eval(y_true, y_pred)
accuracy_value = accuracy.eval(y_true, y_pred > 0.5)
auc_value = auc.eval(y_true, y_pred)

print(f"LogLoss: {logloss_value:.4f}")
print(f"Accuracy: {accuracy_value:.4f}")
print(f"AUC: {auc_value:.4f}")

# Check optimization direction
print(f"LogLoss should be minimized: {logloss.is_min_optimal()}")
print(f"AUC should be maximized: {auc.is_max_optimal()}")
```

### Regression Metrics

```python
from catboost import metrics
import numpy as np

# Sample regression data
y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_pred = np.array([1.1, 2.2, 2.8, 4.2, 4.8])

# Initialize regression metrics
rmse = metrics.RMSE()
mae = metrics.MAE()
r2 = metrics.R2()

# Evaluate metrics
rmse_value = rmse.eval(y_true, y_pred)
mae_value = mae.eval(y_true, y_pred)
r2_value = r2.eval(y_true, y_pred)

print(f"RMSE: {rmse_value:.4f}")
print(f"MAE: {mae_value:.4f}")
print(f"R²: {r2_value:.4f}")

# Get metric parameters
print(f"RMSE parameters: {rmse.params_with_defaults()}")
```

### Ranking Metrics with Groups

```python
from catboost import metrics
import numpy as np

# Sample ranking data
y_true = np.array([2, 1, 0, 3, 1, 2])  # Relevance scores
y_pred = np.array([0.8, 0.6, 0.3, 0.9, 0.5, 0.7])  # Predictions
group_ids = np.array([0, 0, 0, 1, 1, 1])  # Query groups

# Initialize ranking metrics
ndcg = metrics.NDCG()
dcg = metrics.DCG()
map_metric = metrics.MAP()

# Evaluate with group information
ndcg_value = ndcg.eval(y_true, y_pred, group_id=group_ids)
dcg_value = dcg.eval(y_true, y_pred, group_id=group_ids)
map_value = map_metric.eval(y_true, y_pred, group_id=group_ids)

print(f"NDCG: {ndcg_value:.4f}")
print(f"DCG: {dcg_value:.4f}")
print(f"MAP: {map_value:.4f}")
```

### Weighted Metric Evaluation

```python
from catboost import metrics
import numpy as np

# Data with sample weights
y_true = np.array([0, 1, 1, 0, 1])
y_pred = np.array([0.1, 0.8, 0.7, 0.3, 0.9])
weights = np.array([1.0, 2.0, 1.5, 1.0, 2.5])  # Sample importance

# Initialize metrics
logloss = metrics.Logloss()
precision = metrics.Precision()

# Evaluate with weights
weighted_logloss = logloss.eval(y_true, y_pred, weight=weights)
weighted_precision = precision.eval(y_true, y_pred > 0.5, weight=weights)

print(f"Weighted LogLoss: {weighted_logloss:.4f}")
print(f"Weighted Precision: {weighted_precision:.4f}")
```

### Custom Metric Configuration

```python
from catboost import metrics

# Initialize metrics; parameter availability depends on the metric type
auc_metric = metrics.AUC()
f1_metric = metrics.F1()

# Set hints for metric behavior
auc_metric.set_hints(skip_train=True)
f1_metric.set_hints(use_weights=True)

# Get string representations with parameters
print(f"AUC metric: {auc_metric}")
print(f"F1 metric: {f1_metric}")

# Check available parameters
print(f"AUC parameters: {auc_metric.params_with_defaults()}")
```

### Multi-threaded Evaluation

```python
from catboost import metrics
import numpy as np

# Large dataset simulation
np.random.seed(42)
n_samples = 100000
y_true = np.random.randint(0, 2, n_samples)
y_pred = np.random.random(n_samples)

# Initialize metric
auc = metrics.AUC()

# Evaluate with multiple threads for large datasets
auc_value = auc.eval(y_true, y_pred, thread_count=4)
print(f"AUC (4 threads): {auc_value:.6f}")

# Compare with single-threaded evaluation
auc_single = auc.eval(y_true, y_pred, thread_count=1)
print(f"AUC (1 thread): {auc_single:.6f}")
```

## Integration with CatBoost Models

The metrics framework integrates seamlessly with CatBoost model training and evaluation:

```python
from catboost import CatBoostClassifier, metrics
import numpy as np

# Synthetic train/test split (replace with your own data)
rng = np.random.default_rng(42)
X_train, X_test = rng.random((500, 10)), rng.random((200, 10))
y_train = (X_train[:, 0] > 0.5).astype(int)
y_test = (X_test[:, 0] > 0.5).astype(int)

# Create model with a built-in evaluation metric
model = CatBoostClassifier(
    iterations=100,
    eval_metric='AUC',  # Use built-in metric name
    verbose=False
)

# Train model
model.fit(X_train, y_train, eval_set=(X_test, y_test))

# Manual metric evaluation
auc_metric = metrics.AUC()
predictions = model.predict_proba(X_test)[:, 1]
manual_auc = auc_metric.eval(y_test, predictions)

print(f"Manual AUC calculation: {manual_auc:.6f}")

# Compare with the model's built-in evaluation
model_metrics = model.get_evals_result()
print(f"Model's AUC: {model_metrics['validation']['AUC'][-1]:.6f}")
```

## Available Metric Types

The CatBoost metrics framework provides extensive coverage across machine learning tasks:

- **Classification**: Logloss, CrossEntropy, Accuracy, Precision, Recall, F1, AUC, MultiClass, and variants
- **Regression**: RMSE, MAE, MAPE, R2, MSLE, MedianAbsoluteError, SMAPE, and variants
- **Ranking**: NDCG, DCG, MAP, MRR, ERR, and variants with different parameters
- **Multi-target**: Specialized metrics for multi-output problems

Each metric type may have multiple variants with different default parameters, all dynamically generated from the underlying CatBoost implementation.
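
The same variants can also be requested by name during training, using the `MetricName:param=value` string form for `eval_metric` and `custom_metric`. A sketch for a ranking task; the parameter values are illustrative, and the training data pools are placeholders for your own data:

```python
from catboost import CatBoostRanker

# Select task-appropriate metric variants by name.
model = CatBoostRanker(
    iterations=200,
    loss_function='YetiRank',
    eval_metric='NDCG:top=10',            # ranking metric variant with a parameter
    custom_metric=['MAP:top=10', 'MRR'],  # additional metrics tracked during training
    verbose=False,
)
# model.fit(train_pool, eval_set=valid_pool)  # Pools carrying group_id for ranking
```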