# Probability Scores

Metrics for evaluating probability forecasts, ensemble forecasts, and distributional predictions. The module includes the Brier score for binary events, the Continuous Ranked Probability Score (CRPS) for continuous distributions, and threshold-weighted CRPS variants that focus evaluation on chosen value ranges.

## Capabilities

### Brier Score

The Brier score is a strictly proper scoring rule for binary probability forecasts: the mean squared difference between the forecast probability and the observed outcome.

#### Basic Brier Score

```python { .api }
def brier_score(
    fcst: XarrayLike,
    obs: XarrayLike,
    *,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[xr.DataArray] = None,
    check_args: bool = True,
) -> XarrayLike:
    """
    Calculate Brier Score for probability forecasts.

    Args:
        fcst: Probability forecasts [0, 1]
        obs: Binary observations {0, 1}
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve
        weights: Optional weights
        check_args: Validate input data ranges

    Returns:
        Brier scores

    Formula:
        BS = (1/n) * Σ(forecast_i - observed_i)²

    Notes:
        - Perfect forecast has BS = 0
        - Range: [0, 1]
        - Lower scores indicate better performance
        - Forecasts must be probabilities [0, 1]
        - Observations must be binary {0, 1}
    """
```
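To make the formula concrete, here is a minimal sketch that evaluates it by hand in plain Python, without xarray. The helper name `brier_score_by_hand` is hypothetical and is not part of `scores`; it mirrors only the unweighted mean in the formula and skips the dimension handling and argument checks.

```python
def brier_score_by_hand(forecasts, observations):
    """Mean squared difference between forecast probabilities and binary outcomes."""
    if len(forecasts) != len(observations):
        raise ValueError("forecasts and observations must have the same length")
    squared_errors = [(f - o) ** 2 for f, o in zip(forecasts, observations)]
    return sum(squared_errors) / len(squared_errors)


probabilities = [0.8, 0.6, 0.3, 0.1, 0.9]
outcomes = [1, 1, 0, 0, 1]

# Squared errors are 0.04, 0.16, 0.09, 0.01 and 0.01, so the mean is 0.31 / 5.
bs_value = brier_score_by_hand(probabilities, outcomes)
print(f"{bs_value:.3f}")  # 0.062
```

A forecast that always issues 0.5 would score 0.25 on any binary outcomes, which is a common reference point when reading Brier scores.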

#### Ensemble Brier Score

Brier score calculation for ensemble forecasts with optional fair correction.

```python { .api }
def brier_score_for_ensemble(
    fcst: XarrayLike,
    obs: XarrayLike,
    ensemble_member_dim: str,
    event_thresholds: Union[Real, Sequence[Real]],
    *,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[xr.DataArray] = None,
    fair_correction: bool = True,
    event_threshold_operator: Callable = operator.ge,
    threshold_dim: str = "threshold",
) -> XarrayLike:
    """
    Calculate Brier Score for ensemble forecasts.

    Args:
        fcst: Ensemble forecast data
        obs: Observation data
        ensemble_member_dim: Name of ensemble member dimension
        event_thresholds: Threshold values for binary events
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve
        weights: Optional weights
        fair_correction: Apply fair correction for finite ensemble size
        event_threshold_operator: Comparison operator (ge, le, gt, lt)
        threshold_dim: Name for threshold dimension in output

    Returns:
        Brier scores for each threshold

    Notes:
        - Converts ensemble to probabilities using threshold exceedance
        - Fair correction accounts for finite ensemble size effects
        - Multiple thresholds evaluated simultaneously
    """
```

**Usage Example:**

```python
from scores.probability import brier_score, brier_score_for_ensemble
import xarray as xr
import numpy as np

# Basic Brier score for probability forecasts
prob_forecast = xr.DataArray([0.8, 0.6, 0.3, 0.1, 0.9])
binary_obs = xr.DataArray([1, 1, 0, 0, 1])
bs = brier_score(prob_forecast, binary_obs)

# Ensemble Brier score
ensemble_fcst = xr.DataArray(
    np.random.normal(10, 2, (100, 20)),  # 100 time steps, 20 members
    dims=["time", "member"]
)
obs = xr.DataArray(np.random.normal(10, 2, 100), dims=["time"])
thresholds = [8, 10, 12, 15]

ensemble_bs = brier_score_for_ensemble(
    ensemble_fcst, obs, "member", thresholds
)
```
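The notes for `brier_score_for_ensemble` mention two steps: turning the ensemble into an event probability by threshold exceedance, and a fair correction for finite ensemble size. The sketch below works through both for a single forecast case in plain Python. It assumes the fair Brier score of Ferro (2014), which subtracts `i * (M - i) / (M**2 * (M - 1))` for `i` exceeding members out of `M`; whether `scores` uses exactly this expression internally is an assumption, and the helper name is hypothetical.

```python
import operator


def ensemble_brier_single_case(members, observation, threshold,
                               fair=True, event_operator=operator.ge):
    """Brier score of one ensemble forecast for the event `value >= threshold`."""
    m = len(members)
    if fair and m < 2:
        raise ValueError("the fair correction needs at least two members")
    # Step 1: fraction of members forecasting the event gives the probability.
    n_event = sum(1 for x in members if event_operator(x, threshold))
    probability = n_event / m
    outcome = 1.0 if event_operator(observation, threshold) else 0.0
    score = (probability - outcome) ** 2
    # Step 2 (assumed form): remove the finite-ensemble sampling term.
    if fair:
        score -= n_event * (m - n_event) / (m ** 2 * (m - 1))
    return score


members = [7.5, 9.0, 10.5, 12.0]
raw_bs = ensemble_brier_single_case(members, 11.0, threshold=10.0, fair=False)
fair_bs = ensemble_brier_single_case(members, 11.0, threshold=10.0, fair=True)
print(raw_bs)             # 0.25
print(round(fair_bs, 4))  # 0.1667
```

The correction vanishes when all members agree on the event, and it shrinks as the ensemble grows, so the two variants converge for large ensembles.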

### Continuous Ranked Probability Score (CRPS)

The CRPS generalizes the Brier score to continuous distributions: it integrates the Brier score over all possible thresholds, so it evaluates the full probabilistic forecast rather than a single event.

#### CRPS for CDF Forecasts

```python { .api }
def crps_cdf(
    fcst: xr.DataArray,
    obs: xr.DataArray,
    threshold_dim: str,
    *,
    threshold_weight: Optional[xr.DataArray] = None,
    additional_thresholds: Optional[xr.DataArray] = None,
    fcst_fill_method: str = "linear",
    threshold_weight_fill_method: str = "forward",
    integration_method: str = "exact",
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[xr.DataArray] = None,
) -> xr.DataArray:
    """
    Calculate CRPS for CDF forecasts.

    Args:
        fcst: CDF forecast values [0, 1]
        obs: Observation values
        threshold_dim: Name of threshold dimension in CDF
        threshold_weight: Optional threshold weighting function
        additional_thresholds: Additional evaluation thresholds
        fcst_fill_method: Method for interpolating CDF ("linear", "step")
        threshold_weight_fill_method: Weight interpolation method
        integration_method: Integration approach ("exact", "trapz")
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve
        weights: Optional weights

    Returns:
        CRPS values

    Notes:
        - Evaluates complete probabilistic forecast
        - CDF must be monotonically increasing
        - Threshold dimension contains evaluation points
        - Lower scores indicate better performance
    """
```
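For a CDF forecast `F` and observation `y`, the CRPS is the integral of `(F(x) - 1{x >= y})²` over all thresholds `x`. The sketch below approximates that integral with the trapezoidal rule on a threshold grid, in plain Python. It illustrates the idea behind a `"trapz"`-style integration only; the helper names are hypothetical and this is not the library's implementation.

```python
def interpolate_cdf(thresholds, cdf_values, x):
    """Linearly interpolate the CDF at x between neighbouring thresholds."""
    points = list(zip(thresholds, cdf_values))
    for (x0, f0), (x1, f1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return f0 + (f1 - f0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the threshold grid")


def crps_from_cdf_trapz(thresholds, cdf_values, observation):
    """Trapezoidal estimate of the integral of (F(x) - 1{x >= obs})^2."""
    f_obs = interpolate_cdf(thresholds, cdf_values, observation)
    # Insert the observation into the grid so no segment straddles the jump.
    grid = sorted(set(zip(thresholds, cdf_values)) | {(observation, f_obs)})
    total = 0.0
    for (x0, f0), (x1, f1) in zip(grid, grid[1:]):
        if x1 <= observation:
            g0, g1 = f0 ** 2, f1 ** 2  # below the observation: F(x)^2
        else:
            g0, g1 = (1 - f0) ** 2, (1 - f1) ** 2  # above: (1 - F(x))^2
        total += 0.5 * (g0 + g1) * (x1 - x0)
    return total


thresholds = [0.0, 1.0, 2.0, 3.0, 4.0]
cdf_values = [0.0, 0.25, 0.5, 0.75, 1.0]  # uniform distribution on [0, 4]
crps_value = crps_from_cdf_trapz(thresholds, cdf_values, observation=2.0)
print(crps_value)  # 0.375 (the exact integral for this CDF is 1/3)
```

The gap between 0.375 and 1/3 comes from applying the trapezoidal rule to a quadratic integrand on a coarse grid; a finer threshold grid narrows it.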

#### CRPS for Ensemble Forecasts

```python { .api }
def crps_for_ensemble(
    fcst: xr.DataArray,
    obs: xr.DataArray,
    ensemble_member_dim: str,
    *,
    method: str = "closed_form",
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[xr.DataArray] = None,
) -> xr.DataArray:
    """
    Calculate CRPS for ensemble forecasts.

    Args:
        fcst: Ensemble forecast data
        obs: Observation data
        ensemble_member_dim: Name of ensemble member dimension
        method: Calculation method ("closed_form", "fair")
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve
        weights: Optional weights

    Returns:
        CRPS values

    Formula (closed form):
        CRPS = E|X - Y| - 0.5 * E|X - X'|

    Where:
        - X: forecast distribution
        - Y: observation
        - X': independent copy of X

    Notes:
        - "closed_form": Exact calculation for ensembles
        - "fair": Applies fair correction for finite ensemble size
        - Computational complexity: O(n log n) where n = ensemble size
    """
```
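The closed-form expression can be evaluated directly for a single ensemble by replacing both expectations with averages over the members. The sketch below does this in plain Python with a straightforward O(n²) double loop; the helper name is hypothetical, and the fair variant is assumed to differ only in dividing the pairwise term by `2 * M * (M - 1)` instead of `2 * M²`.

```python
from itertools import product


def crps_ensemble_by_hand(members, observation, fair=False):
    """Evaluate E|X - y| - 0.5 * E|X - X'| for one ensemble forecast."""
    m = len(members)
    if fair and m < 2:
        raise ValueError("the fair estimator needs at least two members")
    # First term: mean absolute error of the members against the observation.
    accuracy = sum(abs(x - observation) for x in members) / m
    # Second term: sum of absolute differences over all ordered member pairs.
    pair_sum = sum(abs(a - b) for a, b in product(members, repeat=2))
    denominator = 2 * m * (m - 1) if fair else 2 * m * m
    return accuracy - pair_sum / denominator


members = [1.0, 2.0, 4.0]
crps_standard = crps_ensemble_by_hand(members, observation=3.0)
crps_fair = crps_ensemble_by_hand(members, observation=3.0, fair=True)
print(round(crps_standard, 4))  # 0.6667
print(round(crps_fair, 4))      # 0.3333
```

The fair estimate is never larger than the standard one, because it subtracts a larger spread term for the same ensemble.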

#### CRPS CDF Brier Decomposition

Decomposes CRPS-CDF into reliability and resolution components.

```python { .api }
def crps_cdf_brier_decomposition(
    fcst: xr.DataArray,
    obs: xr.DataArray,
    threshold_dim: str,
    *,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[xr.DataArray] = None,
) -> xr.Dataset:
    """
    Calculate CRPS-CDF with Brier decomposition.

    Args:
        fcst: CDF forecast values
        obs: Observation values
        threshold_dim: Name of threshold dimension
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve
        weights: Optional weights

    Returns:
        Dataset with CRPS and decomposition components

    Components:
        - crps: Total CRPS score
        - reliability: Reliability component (smaller is better)
        - resolution: Resolution component (larger is better)
        - uncertainty: Uncertainty component (climatological)
    """
```

### Threshold-Weighted CRPS

CRPS variants that focus evaluation on specific value ranges or extremes.

#### Basic Threshold-Weighted CRPS

```python { .api }
def tw_crps_for_ensemble(
    fcst: xr.DataArray,
    obs: xr.DataArray,
    ensemble_member_dim: str,
    threshold_weight: xr.DataArray,
    *,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[xr.DataArray] = None,
) -> xr.DataArray:
    """
    Calculate threshold-weighted CRPS for ensemble forecasts.

    Args:
        fcst: Ensemble forecast data
        obs: Observation data
        ensemble_member_dim: Name of ensemble member dimension
        threshold_weight: Weight function over threshold values
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve
        weights: Optional weights

    Returns:
        Threshold-weighted CRPS values

    Notes:
        - Emphasizes specific value ranges via weighting
        - Weight function must be non-negative
        - Reduces to standard CRPS when weights are uniform
        - Used for extreme value evaluation
    """
```
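The threshold-weighted CRPS inserts a non-negative weight `w(x)` into the CRPS integral, giving the integral of `w(x) * (F(x) - 1{x >= y})²`. The sketch below evaluates that integral numerically for one ensemble, using the empirical CDF of the members and a midpoint rule. The helper names are hypothetical, and the weight is passed as a plain Python callable for illustration rather than as the `xr.DataArray` the signature above expects.

```python
def empirical_cdf(members, x):
    """Fraction of ensemble members less than or equal to x."""
    return sum(1 for m in members if m <= x) / len(members)


def tw_crps_numeric(members, observation, weight, lower, upper, n_cells):
    """Midpoint-rule estimate of the weighted CRPS integral on [lower, upper]."""
    width = (upper - lower) / n_cells
    total = 0.0
    for k in range(n_cells):
        mid = lower + (k + 0.5) * width
        indicator = 1.0 if mid >= observation else 0.0
        total += weight(mid) * (empirical_cdf(members, mid) - indicator) ** 2 * width
    return total


members = [1.0, 2.0, 4.0]

# A uniform weight recovers the standard CRPS of the ensemble.
uniform_crps = tw_crps_numeric(members, 3.0, lambda x: 1.0, 0.0, 5.0, 10)

# An upper-tail weight only counts thresholds at or above 2.5.
upper_tail_crps = tw_crps_numeric(
    members, 3.0, lambda x: 1.0 if x >= 2.5 else 0.0, 0.0, 5.0, 10
)
print(round(uniform_crps, 4))     # 0.6667
print(round(upper_tail_crps, 4))  # 0.3333
```

The midpoint rule is exact here because every jump of the integrand falls on a cell edge; for arbitrary data a finer grid or an exact formula is needed.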

#### Tail-Weighted CRPS

Focuses evaluation on extreme values (tails of the distribution).

```python { .api }
def tail_tw_crps_for_ensemble(
    fcst: xr.DataArray,
    obs: xr.DataArray,
    ensemble_member_dim: str,
    tail_weight: float,
    *,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[xr.DataArray] = None,
) -> xr.DataArray:
    """
    Calculate tail-weighted CRPS for extreme values.

    Args:
        fcst: Ensemble forecast data
        obs: Observation data
        ensemble_member_dim: Name of ensemble member dimension
        tail_weight: Weight parameter for tail emphasis (> 0)
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve
        weights: Optional weights

    Returns:
        Tail-weighted CRPS values

    Notes:
        - Higher tail_weight emphasizes extreme values more
        - tail_weight = 0 reduces to standard CRPS
        - Useful for evaluating forecast skill for extreme events
    """
```

#### Interval-Weighted CRPS

Focuses evaluation on a specific value range.

```python { .api }
def interval_tw_crps_for_ensemble(
    fcst: xr.DataArray,
    obs: xr.DataArray,
    ensemble_member_dim: str,
    lower_threshold: float,
    upper_threshold: float,
    *,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[xr.DataArray] = None,
) -> xr.DataArray:
    """
    Calculate interval-weighted CRPS for specific value range.

    Args:
        fcst: Ensemble forecast data
        obs: Observation data
        ensemble_member_dim: Name of ensemble member dimension
        lower_threshold: Lower bound of evaluation interval
        upper_threshold: Upper bound of evaluation interval
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve
        weights: Optional weights

    Returns:
        Interval-weighted CRPS values

    Notes:
        - Only values within [lower_threshold, upper_threshold] contribute
        - Useful for evaluating specific ranges (e.g., moderate rainfall)
        - Outside interval, weight = 0
    """
```
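With a weight of 1 inside the interval and 0 outside, the weighted CRPS integral has an equivalent closed form: clip the members and the observation to `[lower_threshold, upper_threshold]`, then apply the ordinary ensemble CRPS formula to the clipped values. The sketch below shows this "chaining function" view in plain Python for one ensemble. The helper name is hypothetical, and whether `scores` computes the interval variant this way is an assumption.

```python
from itertools import product


def interval_tw_crps_by_hand(members, observation, lower, upper):
    """Interval-weighted CRPS via clipping, for one ensemble forecast."""

    def chain(x):
        # Values outside the interval collapse onto its nearest bound.
        return min(max(x, lower), upper)

    chained = [chain(x) for x in members]
    chained_obs = chain(observation)
    m = len(chained)
    accuracy = sum(abs(x - chained_obs) for x in chained) / m
    spread = sum(abs(a - b) for a, b in product(chained, repeat=2)) / (2 * m * m)
    return accuracy - spread


interval_value = interval_tw_crps_by_hand(
    [1.0, 2.0, 4.0], observation=3.0, lower=2.5, upper=3.5
)
print(round(interval_value, 4))  # 0.2778, i.e. 5/18
```

Widening the interval until it covers every member and the observation leaves the clipped values unchanged, so the result then equals the standard ensemble CRPS.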

### CDF Processing Utilities

Utilities for preparing and manipulating CDF forecasts.

#### Forecast Adjustment for CRPS

```python { .api }
def adjust_fcst_for_crps(
    fcst: xr.DataArray,
    threshold_dim: str,
    *,
    threshold_weight: Optional[xr.DataArray] = None,
    additional_thresholds: Optional[xr.DataArray] = None,
    fcst_fill_method: str = "linear",
    threshold_weight_fill_method: str = "forward",
) -> xr.DataArray:
    """
    Prepare forecast CDF for CRPS calculation.

    Args:
        fcst: Raw CDF forecast data
        threshold_dim: Name of threshold dimension
        threshold_weight: Optional threshold weighting
        additional_thresholds: Additional threshold points
        fcst_fill_method: CDF interpolation method
        threshold_weight_fill_method: Weight interpolation method

    Returns:
        Processed CDF forecast ready for CRPS calculation

    Notes:
        - Ensures CDF is properly formatted and monotonic
        - Handles missing values and interpolation
        - Adds additional threshold points if needed
    """
```
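The notes above describe two kinds of cleanup: filling missing CDF values by interpolation and making the CDF monotonic. The sketch below shows one simple way to do each on a single one-dimensional CDF in plain Python. Both helpers are hypothetical illustrations; the actual adjustment performed by `adjust_fcst_for_crps` may differ.

```python
import math


def fill_missing_linear(thresholds, cdf_values):
    """Replace NaN entries by linear interpolation between known neighbours."""
    known = [(x, f) for x, f in zip(thresholds, cdf_values) if not math.isnan(f)]
    filled = []
    for x, f in zip(thresholds, cdf_values):
        if not math.isnan(f):
            filled.append(f)
            continue
        left = [(kx, kf) for kx, kf in known if kx < x]
        right = [(kx, kf) for kx, kf in known if kx > x]
        if not left or not right:
            raise ValueError("cannot interpolate a missing value at the grid edge")
        (x0, f0), (x1, f1) = left[-1], right[0]
        filled.append(f0 + (f1 - f0) * (x - x0) / (x1 - x0))
    return filled


def enforce_monotonic(cdf_values):
    """Clip to [0, 1] and take a running maximum so the CDF never decreases."""
    running_max = 0.0
    adjusted = []
    for f in cdf_values:
        running_max = max(running_max, min(f, 1.0))
        adjusted.append(running_max)
    return adjusted


thresholds = [0.0, 1.0, 2.0, 3.0, 4.0]
raw_cdf = [0.0, float("nan"), 0.5, 0.45, 1.0]  # one gap, one small decrease

clean_cdf = enforce_monotonic(fill_missing_linear(thresholds, raw_cdf))
print(clean_cdf)  # [0.0, 0.25, 0.5, 0.5, 1.0]
```

A running maximum is only one way to restore monotonicity; it always raises values, so it can bias the CDF upwards where the raw forecast dips.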

#### Step Threshold Weighting

Creates step-function threshold weights for CRPS.

```python { .api }
def crps_step_threshold_weight(
    thresholds: xr.DataArray,
    threshold_bins: Sequence[float],
) -> xr.DataArray:
    """
    Create step-function threshold weights.

    Args:
        thresholds: Threshold values for CDF
        threshold_bins: Bin edges for step function

    Returns:
        Step-function weights matching threshold dimension

    Notes:
        - Creates piecewise constant weighting
        - Each bin can have different weight
        - Used for interval-based evaluation emphasis
    """
```
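A piecewise constant weighting can be pictured as a lookup: each threshold falls into one bin, and every bin carries its own weight. The sketch below builds such weights in plain Python with `bisect`. The `bin_weights` argument is hypothetical (the signature above does not list one), so treat this as an illustration of the concept rather than of the function's behaviour.

```python
from bisect import bisect_right


def step_weights(thresholds, bin_edges, bin_weights):
    """Assign each threshold the weight of the bin it falls into.

    `bin_edges` must be sorted; n edges define n + 1 bins, and each bin
    includes its lower edge.
    """
    if len(bin_weights) != len(bin_edges) + 1:
        raise ValueError("need exactly one weight per bin (len(bin_edges) + 1)")
    return [bin_weights[bisect_right(bin_edges, t)] for t in thresholds]


thresholds = [0.0, 5.0, 10.0, 15.0, 20.0]

# Weight 1 on [5, 15) and 0 elsewhere, e.g. to focus on moderate values.
weights = step_weights(thresholds, bin_edges=[5.0, 15.0], bin_weights=[0.0, 1.0, 0.0])
print(weights)  # [0.0, 1.0, 1.0, 0.0, 0.0]
```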

## Usage Patterns

### Basic Probabilistic Evaluation

```python
import numpy as np
import xarray as xr

from scores.probability import brier_score, crps_for_ensemble

# Binary probability forecast evaluation
prob_values = np.random.beta(2, 2, 100)  # Probabilities in [0, 1]
prob_forecast = xr.DataArray(prob_values, dims=["time"])
binary_obs = xr.DataArray(np.random.binomial(1, prob_values), dims=["time"])  # Binary outcomes

bs = brier_score(prob_forecast, binary_obs)
print(f"Brier Score: {bs.values:.4f}")

# Ensemble CRPS evaluation: wrap the arrays so the "member" dimension has a name
ensemble = xr.DataArray(
    np.random.normal(10, 2, (100, 20)),  # 100 times, 20 members
    dims=["time", "member"],
)
observations = xr.DataArray(np.random.normal(10, 2, 100), dims=["time"])

crps = crps_for_ensemble(ensemble, observations, ensemble_member_dim="member")
print(f"CRPS: {crps.values:.4f}")
```

### Multi-threshold Evaluation

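```python
import numpy as np
import xarray as xr

from scores.probability import brier_score_for_ensemble

# Synthetic inputs: 100 time steps, 20 ensemble members
ensemble_forecast = xr.DataArray(
    np.random.normal(10, 2, (100, 20)), dims=["time", "member"]
)
observations = xr.DataArray(np.random.normal(10, 2, 100), dims=["time"])

# Evaluate multiple thresholds simultaneously
thresholds = [5, 10, 15, 20, 25]
ensemble_bs = brier_score_for_ensemble(
    ensemble_forecast, observations,
    ensemble_member_dim="member",
    event_thresholds=thresholds
)

# Results have a "threshold" dimension
for i, thresh in enumerate(thresholds):
    score = ensemble_bs.isel(threshold=i)
    print(f"Threshold {thresh}: BS = {score.values:.4f}")
```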

### Extreme Value Focus

```python
from scores.probability import (
    crps_for_ensemble,
    interval_tw_crps_for_ensemble,
    tail_tw_crps_for_ensemble,
)

# `ensemble_forecast` and `observations` are xr.DataArray inputs, with a
# "member" dimension on the forecast, as in the earlier examples.
crps = crps_for_ensemble(ensemble_forecast, observations, ensemble_member_dim="member")

# Emphasize extreme values using tail-weighted CRPS
tail_crps = tail_tw_crps_for_ensemble(
    ensemble_forecast, observations,
    ensemble_member_dim="member",
    tail_weight=2.0  # Strong emphasis on extremes
)

# Focus on specific range (e.g., moderate rainfall 5-15mm)
interval_crps = interval_tw_crps_for_ensemble(
    ensemble_forecast, observations,
    ensemble_member_dim="member",
    lower_threshold=5.0,
    upper_threshold=15.0
)

print(f"Standard CRPS: {crps.values:.4f}")
print(f"Tail-weighted CRPS: {tail_crps.values:.4f}")
print(f"Interval CRPS (5-15): {interval_crps.values:.4f}")
```

### CDF Forecast Evaluation

```python
# For CDF forecasts (probability vs threshold)
from scores.probability import crps_cdf, crps_cdf_brier_decomposition
from scores.sample_data import cdf_forecast, cdf_observations

cdf_fcst = cdf_forecast()  # CDF values at different thresholds
cdf_obs = cdf_observations()  # Corresponding observations

# Standard CRPS for CDF
cdf_crps = crps_cdf(cdf_fcst, cdf_obs, threshold_dim="threshold")

# With decomposition
decomp = crps_cdf_brier_decomposition(
    cdf_fcst, cdf_obs, threshold_dim="threshold"
)

print(f"CDF CRPS: {cdf_crps.values:.4f}")
print(f"Reliability: {decomp.reliability.values:.4f}")
print(f"Resolution: {decomp.resolution.values:.4f}")
```