# Probability Scores

Metrics for evaluating probability forecasts, ensemble forecasts, and distributional predictions. Includes Brier score for binary events, CRPS (Continuous Ranked Probability Score) for continuous distributions, and threshold-weighted variants for focused evaluation.

## Capabilities

### Brier Score

A strictly proper scoring rule for binary probability forecasts: the mean squared difference between the forecast probability and the binary outcome.

#### Basic Brier Score

```python { .api }
def brier_score(
    fcst: XarrayLike,
    obs: XarrayLike,
    *,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[xr.DataArray] = None,
    check_args: bool = True,
) -> XarrayLike:
    """
    Calculate Brier Score for probability forecasts.

    Args:
        fcst: Probability forecasts [0, 1]
        obs: Binary observations {0, 1}
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve
        weights: Optional weights
        check_args: Validate input data ranges

    Returns:
        Brier scores

    Formula:
        BS = (1/n) * Σ(forecast_i - observed_i)²

    Notes:
        - Perfect forecast has BS = 0
        - Range: [0, 1]
        - Lower scores indicate better performance
        - Forecasts must be probabilities [0, 1]
        - Observations must be binary {0, 1}
    """
```
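
To make the formula concrete, here is a minimal pure-Python sketch that evaluates it by hand on five forecast/observation pairs. The helper name `brier_score_by_hand` is hypothetical and only illustrates the arithmetic; it is not part of the library and skips dimension handling, weights, and input validation.

```python
def brier_score_by_hand(forecasts, observations):
    """Mean squared difference between forecast probabilities and binary outcomes."""
    if len(forecasts) != len(observations):
        raise ValueError("forecasts and observations must have the same length")
    squared_errors = [(f - o) ** 2 for f, o in zip(forecasts, observations)]
    return sum(squared_errors) / len(squared_errors)


forecasts = [0.8, 0.6, 0.3, 0.1, 0.9]
observations = [1, 1, 0, 0, 1]

# Squared errors are 0.04, 0.16, 0.09, 0.01, 0.01, so the mean is 0.31 / 5
bs = brier_score_by_hand(forecasts, observations)
print(round(bs, 3))  # 0.062
```

A forecast that assigns probability 1 to every event that occurs and 0 to every event that does not scores exactly 0.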

#### Ensemble Brier Score

Brier score calculation for ensemble forecasts with optional fair correction.

```python { .api }
def brier_score_for_ensemble(
    fcst: XarrayLike,
    obs: XarrayLike,
    ensemble_member_dim: str,
    event_thresholds: Union[Real, Sequence[Real]],
    *,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[xr.DataArray] = None,
    fair_correction: bool = True,
    event_threshold_operator: Callable = operator.ge,
    threshold_dim: str = "threshold",
) -> XarrayLike:
    """
    Calculate Brier Score for ensemble forecasts.

    Args:
        fcst: Ensemble forecast data
        obs: Observation data
        ensemble_member_dim: Name of ensemble member dimension
        event_thresholds: Threshold values for binary events
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve
        weights: Optional weights
        fair_correction: Apply fair correction for finite ensemble size
        event_threshold_operator: Comparison operator (ge, le, gt, lt)
        threshold_dim: Name for threshold dimension in output

    Returns:
        Brier scores for each threshold

    Notes:
        - Converts ensemble to probabilities using threshold exceedance
        - Fair correction accounts for finite ensemble size effects
        - Multiple thresholds evaluated simultaneously
    """
```
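
The sketch below shows, for a single forecast case, how an ensemble can be turned into an event probability and how a fair correction changes the score. It uses the standard fair Brier score from the forecast-verification literature; that the library applies exactly this expression is an assumption, and `ensemble_brier_score` is a hypothetical helper written for illustration.

```python
import operator


def ensemble_brier_score(members, obs, threshold, fair=True, op=operator.ge):
    """Brier score for one ensemble forecast of the event `value op threshold`."""
    m = len(members)
    i = sum(1 for x in members if op(x, threshold))  # members forecasting the event
    o = 1.0 if op(obs, threshold) else 0.0           # 1 if the event was observed
    score = (i / m - o) ** 2
    if fair:
        # Standard fair correction; requires at least two members
        score -= i * (m - i) / (m ** 2 * (m - 1))
    return score


members = [7.5, 9.0, 10.5, 12.0]

# Two of four members reach the threshold, so the forecast probability is 0.5
unadjusted = ensemble_brier_score(members, obs=11.0, threshold=10, fair=False)
adjusted = ensemble_brier_score(members, obs=11.0, threshold=10, fair=True)
print(unadjusted)  # 0.25
print(adjusted)    # 0.25 - 4/48, roughly 0.1667
```

The correction term vanishes when all members agree (`i = 0` or `i = m`) and shrinks as the ensemble grows.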

**Usage Example:**

```python
from scores.probability import brier_score, brier_score_for_ensemble
import xarray as xr
import numpy as np

# Basic Brier score for probability forecasts
prob_forecast = xr.DataArray([0.8, 0.6, 0.3, 0.1, 0.9])
binary_obs = xr.DataArray([1, 1, 0, 0, 1])
bs = brier_score(prob_forecast, binary_obs)

# Ensemble Brier score
ensemble_fcst = xr.DataArray(
    np.random.normal(10, 2, (100, 20)),  # 100 time steps, 20 members
    dims=["time", "member"]
)
obs = xr.DataArray(np.random.normal(10, 2, 100), dims=["time"])
thresholds = [8, 10, 12, 15]

ensemble_bs = brier_score_for_ensemble(
    ensemble_fcst, obs, "member", thresholds
)
```

### Continuous Ranked Probability Score (CRPS)

Generalizes the Brier score to continuous variables: the CRPS is the Brier score integrated over all thresholds, so it evaluates the full forecast distribution rather than a single event.

#### CRPS for CDF Forecasts

```python { .api }
def crps_cdf(
    fcst: xr.DataArray,
    obs: xr.DataArray,
    threshold_dim: str,
    *,
    threshold_weight: Optional[xr.DataArray] = None,
    additional_thresholds: Optional[xr.DataArray] = None,
    fcst_fill_method: str = "linear",
    threshold_weight_fill_method: str = "forward",
    integration_method: str = "exact",
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[xr.DataArray] = None,
) -> xr.DataArray:
    """
    Calculate CRPS for CDF forecasts.

    Args:
        fcst: CDF forecast values [0, 1]
        obs: Observation values
        threshold_dim: Name of threshold dimension in CDF
        threshold_weight: Optional threshold weighting function
        additional_thresholds: Additional evaluation thresholds
        fcst_fill_method: Method for interpolating CDF ("linear", "step")
        threshold_weight_fill_method: Weight interpolation method
        integration_method: Integration approach ("exact", "trapz")
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve
        weights: Optional weights

    Returns:
        CRPS values

    Notes:
        - Evaluates complete probabilistic forecast
        - CDF must be monotonically increasing
        - Threshold dimension contains evaluation points
        - Lower scores indicate better performance
    """
```
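
For intuition, the CRPS of a CDF forecast F against an observation y is the integral of (F(x) - 1{x >= y})² over x. The sketch below evaluates that integral exactly for a piecewise-linear CDF given at a few thresholds. The helper `crps_from_cdf` is hypothetical; it assumes increasing thresholds, a CDF running from 0 to 1 across them, and an observation inside the threshold range, and it is not a description of how the library integrates.

```python
def crps_from_cdf(thresholds, cdf_values, obs):
    """Integrate (F(x) - 1{x >= obs})^2 for a piecewise-linear CDF F."""
    nodes = list(zip(thresholds, cdf_values))

    def cdf_at(x):
        # Linear interpolation between neighbouring thresholds
        for (a, fa), (b, fb) in zip(nodes, nodes[1:]):
            if a <= x <= b:
                return fa + (fb - fa) * (x - a) / (b - a)
        raise ValueError("x outside the threshold range")

    # Split the range at the observation so each segment has a smooth integrand
    points = sorted(set(thresholds) | {obs})
    total = 0.0
    for a, b in zip(points, points[1:]):
        if b <= obs:   # left of the observation the integrand is F^2
            u, v = cdf_at(a), cdf_at(b)
        else:          # right of the observation it is (1 - F)^2
            u, v = 1 - cdf_at(a), 1 - cdf_at(b)
        # Exact integral of a squared linear function over [a, b]
        total += (b - a) * (u * u + u * v + v * v) / 3
    return total


# Uniform forecast on [0, 1] with the observation in the middle: CRPS = 1/12
crps_uniform = crps_from_cdf([0.0, 1.0], [0.0, 1.0], obs=0.5)
print(crps_uniform)
```

Moving the observation to the edge of the forecast range (for example `obs=0.0`) raises the score to 1/3, which reflects the "lower is better" orientation.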

#### CRPS for Ensemble Forecasts

```python { .api }
def crps_for_ensemble(
    fcst: xr.DataArray,
    obs: xr.DataArray,
    ensemble_member_dim: str,
    *,
    method: str = "closed_form",
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[xr.DataArray] = None,
) -> xr.DataArray:
    """
    Calculate CRPS for ensemble forecasts.

    Args:
        fcst: Ensemble forecast data
        obs: Observation data
        ensemble_member_dim: Name of ensemble member dimension
        method: Calculation method ("closed_form", "fair")
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve
        weights: Optional weights

    Returns:
        CRPS values

    Formula (closed form):
        CRPS = E|X - Y| - 0.5 * E|X - X'|

    Where:
        - X: random draw from the forecast distribution (an ensemble member)
        - Y: observation
        - X': independent copy of X

    Notes:
        - "closed_form": Exact calculation for ensembles
        - "fair": Applies fair correction for finite ensemble size
        - Computational complexity: O(n log n) where n = ensemble size
    """
```
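
The closed-form expression can be evaluated directly from the ensemble members. The sketch below is a hypothetical helper, `crps_ensemble`, that does so for one forecast case. It assumes the usual convention that the plain estimate averages |X - X'| over all m² ordered pairs while the fair variant averages over the m(m-1) pairs of distinct members. It uses a simple O(m²) double loop for clarity rather than a sorted O(m log m) formulation.

```python
def crps_ensemble(members, obs, fair=False):
    """CRPS = E|X - Y| - 0.5 * E|X - X'| estimated from ensemble members."""
    m = len(members)
    # E|X - Y|: mean absolute error of the members against the observation
    mae = sum(abs(x - obs) for x in members) / m
    # Sum of |X - X'| over all ordered pairs of members
    pair_sum = sum(abs(a - b) for a in members for b in members)
    # Plain estimate divides by m^2; the fair variant by m * (m - 1)
    spread = pair_sum / (m * (m - 1)) if fair else pair_sum / (m * m)
    return mae - 0.5 * spread


members = [1.0, 2.0, 3.0]
plain = crps_ensemble(members, obs=2.0)            # 2/3 - 4/9 = 2/9
fair = crps_ensemble(members, obs=2.0, fair=True)  # 2/3 - 2/3, about 0
print(plain, fair)
```

The fair variant is never larger than the plain one, because it credits the ensemble with more spread.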

#### CRPS CDF Brier Decomposition

Decomposes the CRPS for CDF forecasts into reliability, resolution, and uncertainty components.

```python { .api }
def crps_cdf_brier_decomposition(
    fcst: xr.DataArray,
    obs: xr.DataArray,
    threshold_dim: str,
    *,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[xr.DataArray] = None,
) -> xr.Dataset:
    """
    Calculate CRPS-CDF with Brier decomposition.

    Args:
        fcst: CDF forecast values
        obs: Observation values
        threshold_dim: Name of threshold dimension
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve
        weights: Optional weights

    Returns:
        Dataset with CRPS and decomposition components

    Components:
        - crps: Total CRPS score
        - reliability: Reliability component (smaller is better)
        - resolution: Resolution component (larger is better)
        - uncertainty: Uncertainty component (climatological)
    """
```

### Threshold-Weighted CRPS

CRPS variants that focus evaluation on specific value ranges or extremes.

#### Basic Threshold-Weighted CRPS

```python { .api }
def tw_crps_for_ensemble(
    fcst: xr.DataArray,
    obs: xr.DataArray,
    ensemble_member_dim: str,
    threshold_weight: xr.DataArray,
    *,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[xr.DataArray] = None,
) -> xr.DataArray:
    """
    Calculate threshold-weighted CRPS for ensemble forecasts.

    Args:
        fcst: Ensemble forecast data
        obs: Observation data
        ensemble_member_dim: Name of ensemble member dimension
        threshold_weight: Weight function over threshold values
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve
        weights: Optional weights

    Returns:
        Threshold-weighted CRPS values

    Notes:
        - Emphasizes specific value ranges via weighting
        - Weight function must be non-negative
        - Reduces to standard CRPS when weights are uniform
        - Used for extreme value evaluation
    """
```
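
One common way to compute a threshold-weighted CRPS for an ensemble is to transform members and observation with a chaining function v, an antiderivative of the weight function, and then apply the ordinary ensemble CRPS formula to the transformed values. The sketch below illustrates this. The helper `tw_crps_ensemble` and its `chain` argument are hypothetical and use a different interface from the `threshold_weight` array in the signature above, so treat it as an illustration of the idea rather than of the library's implementation.

```python
def tw_crps_ensemble(members, obs, chain):
    """Threshold-weighted CRPS using a chaining function (antiderivative of the weight)."""
    m = len(members)
    v_members = [chain(x) for x in members]
    v_obs = chain(obs)
    mae = sum(abs(v - v_obs) for v in v_members) / m
    spread = sum(abs(a - b) for a in v_members for b in v_members) / (m * m)
    return mae - 0.5 * spread


members = [1.0, 2.0, 3.0]

# Uniform weight w(x) = 1 has v(x) = x, which recovers the standard ensemble CRPS
standard = tw_crps_ensemble(members, 2.0, chain=lambda x: x)

# Upper-tail weight w(x) = 1 for x >= 2.5 (else 0) has v(x) = max(x, 2.5)
upper_tail = tw_crps_ensemble(members, 2.0, chain=lambda x: max(x, 2.5))

print(standard)    # 2/9, about 0.2222
print(upper_tail)  # 1/18, about 0.0556
```

Because the tail weight never exceeds 1, the tail-weighted score cannot exceed the standard CRPS for the same forecast case.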

#### Tail-Weighted CRPS

Focuses evaluation on extreme values (tails of the distribution).

```python { .api }
def tail_tw_crps_for_ensemble(
    fcst: xr.DataArray,
    obs: xr.DataArray,
    ensemble_member_dim: str,
    tail_weight: float,
    *,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[xr.DataArray] = None,
) -> xr.DataArray:
    """
    Calculate tail-weighted CRPS for extreme values.

    Args:
        fcst: Ensemble forecast data
        obs: Observation data
        ensemble_member_dim: Name of ensemble member dimension
        tail_weight: Weight parameter for tail emphasis (>= 0)
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve
        weights: Optional weights

    Returns:
        Tail-weighted CRPS values

    Notes:
        - Higher tail_weight emphasizes extreme values more
        - tail_weight = 0 reduces to standard CRPS
        - Useful for evaluating forecast skill for extreme events
    """
```

#### Interval-Weighted CRPS

Focuses evaluation on a specific value range.

```python { .api }
def interval_tw_crps_for_ensemble(
    fcst: xr.DataArray,
    obs: xr.DataArray,
    ensemble_member_dim: str,
    lower_threshold: float,
    upper_threshold: float,
    *,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[xr.DataArray] = None,
) -> xr.DataArray:
    """
    Calculate interval-weighted CRPS for specific value range.

    Args:
        fcst: Ensemble forecast data
        obs: Observation data
        ensemble_member_dim: Name of ensemble member dimension
        lower_threshold: Lower bound of evaluation interval
        upper_threshold: Upper bound of evaluation interval
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve
        weights: Optional weights

    Returns:
        Interval-weighted CRPS values

    Notes:
        - Only values within [lower_threshold, upper_threshold] contribute
        - Useful for evaluating specific ranges (e.g., moderate rainfall)
        - Outside interval, weight = 0
    """
```
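
With a weight of 1 inside the interval and 0 outside, the score is the CRPS integrand restricted to that interval. The sketch below integrates (F_ens(x) - 1{x >= obs})² over the interval, using the empirical CDF of the ensemble. The helper `interval_tw_crps` is hypothetical; since the integrand is constant between breakpoints, the integral is exact for this unadjusted (non-fair) estimate.

```python
def interval_tw_crps(members, obs, lower, upper):
    """Integrate (F_ens(x) - 1{x >= obs})^2 over [lower, upper] only."""
    m = len(members)
    # The integrand only changes at members and at the observation
    inner = [x for x in [*members, obs] if lower < x < upper]
    points = sorted({lower, upper, *inner})
    total = 0.0
    for a, b in zip(points, points[1:]):
        mid = (a + b) / 2                                # any point inside the segment
        ecdf = sum(1 for x in members if x <= mid) / m   # empirical forecast CDF
        step = 1.0 if mid >= obs else 0.0                # CDF of the observation
        total += (b - a) * (ecdf - step) ** 2
    return total


members = [1.0, 2.0, 3.0]

# An interval covering all values reproduces the standard ensemble CRPS (2/9)
wide = interval_tw_crps(members, obs=2.0, lower=0.0, upper=10.0)

# A narrow interval keeps only the part of the integral inside [1.5, 2.5]
narrow = interval_tw_crps(members, obs=2.0, lower=1.5, upper=2.5)

print(wide)    # about 0.2222
print(narrow)  # about 0.1111
```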

### CDF Processing Utilities

Utilities for preparing and manipulating CDF forecasts.

#### Forecast Adjustment for CRPS

```python { .api }
def adjust_fcst_for_crps(
    fcst: xr.DataArray,
    threshold_dim: str,
    *,
    threshold_weight: Optional[xr.DataArray] = None,
    additional_thresholds: Optional[xr.DataArray] = None,
    fcst_fill_method: str = "linear",
    threshold_weight_fill_method: str = "forward",
) -> xr.DataArray:
    """
    Prepare forecast CDF for CRPS calculation.

    Args:
        fcst: Raw CDF forecast data
        threshold_dim: Name of threshold dimension
        threshold_weight: Optional threshold weighting
        additional_thresholds: Additional threshold points
        fcst_fill_method: CDF interpolation method
        threshold_weight_fill_method: Weight interpolation method

    Returns:
        Processed CDF forecast ready for CRPS calculation

    Notes:
        - Ensures CDF is properly formatted and monotonic
        - Handles missing values and interpolation
        - Adds additional threshold points if needed
    """
```
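
As an illustration of what this kind of CDF tidying can involve, the sketch below fills missing values by linear interpolation, clips to [0, 1], and enforces a non-decreasing CDF with a running maximum. The helper `tidy_cdf` is hypothetical, and the running-maximum repair is only one possible choice; the library may handle these steps differently.

```python
def tidy_cdf(thresholds, cdf_values):
    """Fill gaps (None) linearly, clip to [0, 1], and make the CDF non-decreasing."""
    known = [(t, c) for t, c in zip(thresholds, cdf_values) if c is not None]
    if len(known) < 2:
        raise ValueError("need at least two known CDF values")

    def interpolate(t):
        # Outside the known range, hold the nearest known value
        if t <= known[0][0]:
            return known[0][1]
        if t >= known[-1][0]:
            return known[-1][1]
        for (a, fa), (b, fb) in zip(known, known[1:]):
            if a <= t <= b:
                return fa + (fb - fa) * (t - a) / (b - a)

    filled = [c if c is not None else interpolate(t)
              for t, c in zip(thresholds, cdf_values)]

    result, running_max = [], 0.0
    for c in filled:
        clipped = min(max(c, 0.0), 1.0)
        running_max = max(running_max, clipped)  # never let the CDF decrease
        result.append(running_max)
    return result


# The gap at threshold 1 is interpolated; the dip at threshold 3 is flattened
tidied = tidy_cdf([0, 1, 2, 3, 4], [0.0, None, 0.6, 0.5, 1.0])
print(tidied)  # approximately [0.0, 0.3, 0.6, 0.6, 1.0]
```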

#### Step Threshold Weighting

Creates step-function threshold weights for CRPS.

```python { .api }
def crps_step_threshold_weight(
    thresholds: xr.DataArray,
    threshold_bins: Sequence[float],
) -> xr.DataArray:
    """
    Create step-function threshold weights.

    Args:
        thresholds: Threshold values for CDF
        threshold_bins: Bin edges for step function

    Returns:
        Step-function weights matching threshold dimension

    Notes:
        - Creates piecewise constant weighting
        - Each bin can have different weight
        - Used for interval-based evaluation emphasis
    """
```

## Usage Patterns

### Basic Probabilistic Evaluation

```python
from scores.probability import brier_score, crps_for_ensemble
import numpy as np
import xarray as xr

# Binary probability forecast evaluation
probs = np.random.beta(2, 2, 100)  # Probabilities in [0, 1]
prob_forecast = xr.DataArray(probs, dims=["time"])
binary_obs = xr.DataArray(np.random.binomial(1, probs), dims=["time"])  # Binary outcomes

bs = brier_score(prob_forecast, binary_obs)
print(f"Brier Score: {bs.values:.4f}")

# Ensemble CRPS evaluation; the forecast needs a named ensemble dimension
ensemble_forecast = xr.DataArray(
    np.random.normal(10, 2, (100, 20)),  # 100 times, 20 members
    dims=["time", "member"],
)
observations = xr.DataArray(np.random.normal(10, 2, 100), dims=["time"])

crps = crps_for_ensemble(ensemble_forecast, observations, ensemble_member_dim="member")
print(f"CRPS: {crps.values:.4f}")
```

### Multi-threshold Evaluation

```python
from scores.probability import brier_score_for_ensemble

# ensemble_forecast (with a "member" dimension) and observations are
# xr.DataArray objects prepared beforehand

# Evaluate multiple thresholds simultaneously
thresholds = [5, 10, 15, 20, 25]
ensemble_bs = brier_score_for_ensemble(
    ensemble_forecast, observations,
    ensemble_member_dim="member",
    event_thresholds=thresholds
)

# Results have a threshold dimension
for i, thresh in enumerate(thresholds):
    score = ensemble_bs.isel(threshold=i)
    print(f"Threshold {thresh}: BS = {score.values:.4f}")
```

### Extreme Value Focus

```python
from scores.probability import (
    crps_for_ensemble,
    interval_tw_crps_for_ensemble,
    tail_tw_crps_for_ensemble,
)

# ensemble_forecast (with a "member" dimension) and observations are
# xr.DataArray objects prepared beforehand

# Standard CRPS on the same data, for comparison
crps = crps_for_ensemble(ensemble_forecast, observations, ensemble_member_dim="member")

# Emphasize extreme values using tail-weighted CRPS
tail_crps = tail_tw_crps_for_ensemble(
    ensemble_forecast, observations,
    ensemble_member_dim="member",
    tail_weight=2.0  # Strong emphasis on extremes
)

# Focus on specific range (e.g., moderate rainfall 5-15mm)
interval_crps = interval_tw_crps_for_ensemble(
    ensemble_forecast, observations,
    ensemble_member_dim="member",
    lower_threshold=5.0,
    upper_threshold=15.0
)

print(f"Standard CRPS: {crps.values:.4f}")
print(f"Tail-weighted CRPS: {tail_crps.values:.4f}")
print(f"Interval CRPS (5-15): {interval_crps.values:.4f}")
```

### CDF Forecast Evaluation

```python
# For CDF forecasts (probability vs threshold)
from scores.probability import crps_cdf, crps_cdf_brier_decomposition
from scores.sample_data import cdf_forecast, cdf_observations

cdf_fcst = cdf_forecast()  # CDF values at different thresholds
cdf_obs = cdf_observations()  # Corresponding observations

# Standard CRPS for CDF
cdf_crps = crps_cdf(cdf_fcst, cdf_obs, threshold_dim="threshold")

# With decomposition
decomp = crps_cdf_brier_decomposition(
    cdf_fcst, cdf_obs, threshold_dim="threshold"
)

print(f"CDF CRPS: {cdf_crps.values:.4f}")
print(f"Reliability: {decomp.reliability.values:.4f}")
print(f"Resolution: {decomp.resolution.values:.4f}")
```