# Statistical Indicators

Generic statistical analysis tools applicable to any climate variable. These indicators provide distribution fitting, frequency analysis, spell detection, and other statistical methods for comprehensive climate data analysis.

## Capabilities

### Basic Statistical Operations

Fundamental statistical measures and aggregation functions for climate data analysis.

```python { .api }
def stats(da, op="mean", freq="YS"):
    """
    Statistical operation on climate data.

    Parameters:
    - da: xr.DataArray, input climate data
    - op: str, statistical operation ("mean", "max", "min", "std", "var", "sum")
    - freq: str, resampling frequency (default "YS")

    Returns:
    xr.DataArray: Statistical result
    """

def daily_statistics(da, op="mean", freq="YS"):
    """
    Daily statistical operations aggregated over time period.

    Parameters:
    - da: xr.DataArray, daily climate data
    - op: str or list, statistical operations to apply
    - freq: str, resampling frequency

    Returns:
    xr.DataArray or xr.Dataset: Daily statistics
    """

def percentiles(da, values=[10, 50, 90], freq="YS"):
    """
    Calculate percentiles of climate data.

    Parameters:
    - da: xr.DataArray, input climate data
    - values: list, percentile values to calculate (0-100)
    - freq: str, resampling frequency

    Returns:
    xr.DataArray: Percentile values
    """
```
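To make the role of `freq="YS"` concrete, the following is a minimal sketch in plain Python of what a yearly aggregation does conceptually: group the time series by calendar year, then reduce each group. The helper `aggregate_by_year` and the synthetic data are hypothetical and only illustrate the idea; they are not the library's implementation.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean


def aggregate_by_year(dates, values, op=mean):
    """Group values by calendar year and reduce each group with `op`.

    Hypothetical helper for illustration only.
    """
    groups = defaultdict(list)
    for d, v in zip(dates, values):
        groups[d.year].append(v)
    return {year: op(vals) for year, vals in sorted(groups.items())}


# Two non-leap years of synthetic daily data: each value is the day of year.
start = date(2001, 1, 1)
dates = [start + timedelta(days=i) for i in range(730)]
values = [float(d.timetuple().tm_yday) for d in dates]

annual_mean = aggregate_by_year(dates, values, op=mean)
annual_max = aggregate_by_year(dates, values, op=max)
```

Passing a different reducer as `op` mirrors how the `op` parameter selects the statistic in the signatures above.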

### Distribution Fitting

Statistical distribution fitting and parameter estimation for climate extremes analysis.

```python { .api }
def fit(da, dist="norm", freq="YS"):
    """
    Fit statistical distribution to climate data.

    Parameters:
    - da: xr.DataArray, input climate data
    - dist: str, distribution name ("norm", "gamma", "weibull", "gev", etc.)
    - freq: str, resampling frequency

    Returns:
    xr.Dataset: Distribution parameters and goodness-of-fit statistics
    """

def return_level(da, return_period=[2, 5, 10, 20, 50, 100], dist="gev", freq="YS"):
    """
    Calculate return levels for specified return periods.

    Parameters:
    - da: xr.DataArray, extreme values data (e.g., annual maxima)
    - return_period: list, return periods in years
    - dist: str, extreme value distribution ("gev", "weibull", "gamma")
    - freq: str, resampling frequency

    Returns:
    xr.DataArray: Return level estimates
    """

def frequency_analysis(da, threshold, above=True, dist="genpareto", freq="YS"):
    """
    Frequency analysis using peaks-over-threshold method.

    Parameters:
    - da: xr.DataArray, daily climate data
    - threshold: float, threshold for extreme event detection
    - above: bool, whether to analyze values above (True) or below (False) threshold
    - dist: str, distribution for threshold exceedances ("genpareto", "exp")
    - freq: str, resampling frequency

    Returns:
    xr.Dataset: Distribution parameters and return level estimates
    """

def standardized_index(da, distribution="gamma", freq="MS", window=1):
    """
    Calculate standardized index (e.g., SPI, SPEI).

    Parameters:
    - da: xr.DataArray, climate data (e.g., precipitation for SPI)
    - distribution: str, distribution for fitting ("gamma", "pearson3", "norm")
    - freq: str, resampling frequency (typically "MS" for monthly)
    - window: int, accumulation window in months

    Returns:
    xr.DataArray: Standardized index values
    """
```
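As a rough illustration of how a return level follows from a fitted distribution, the sketch below fits a Gumbel distribution (the GEV special case with zero shape) by the method of moments and inverts its CDF. This is a deliberately simpler technique than a full GEV fit and is not claimed to be what `fit` or `return_level` do internally; the helper names and the sample maxima are made up for the example.

```python
import math
from statistics import mean, stdev

EULER_GAMMA = 0.5772156649015329


def gumbel_fit_moments(maxima):
    """Estimate Gumbel location and scale by the method of moments.

    Uses mean = loc + gamma * scale and variance = (pi * scale)**2 / 6.
    """
    scale = stdev(maxima) * math.sqrt(6) / math.pi
    loc = mean(maxima) - EULER_GAMMA * scale
    return loc, scale


def gumbel_return_level(loc, scale, return_period):
    """Value exceeded on average once every `return_period` blocks."""
    p_non_exceed = 1.0 - 1.0 / return_period
    return loc - scale * math.log(-math.log(p_non_exceed))


# Made-up annual maxima, for illustration only.
annual_maxima = [31.2, 33.5, 30.1, 35.8, 32.4, 34.0, 29.9, 36.3, 33.1, 32.7]
loc, scale = gumbel_fit_moments(annual_maxima)
levels = {t: gumbel_return_level(loc, scale, t) for t in (2, 5, 10, 50, 100)}
```

The 2-year level is the median of the fitted distribution, and the levels grow with the return period, which is a quick sanity check for any return level output.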

### Spell and Run Analysis

Consecutive event detection and characterization for drought, heat waves, and other persistent conditions.

```python { .api }
def spell_length(da, threshold, op=operator.ge, freq="YS"):
    """
    Calculate spell lengths for threshold exceedances.

    Parameters:
    - da: xr.DataArray, daily climate data
    - threshold: float, threshold value for spell detection
    - op: callable, comparison operator (operator.ge, operator.le, etc.)
    - freq: str, resampling frequency

    Returns:
    xr.DataArray: Spell length statistics (max, mean, total)
    """

def threshold_count(da, threshold, op=operator.ge, freq="YS"):
    """
    Count threshold exceedances over time period.

    Parameters:
    - da: xr.DataArray, daily climate data
    - threshold: float, threshold value
    - op: callable, comparison operator (default operator.ge for >=)
    - freq: str, resampling frequency

    Returns:
    xr.DataArray: Count of threshold exceedances
    """

def run_length(da, window=1, freq="YS"):
    """
    Run length encoding for consecutive identical values.

    Parameters:
    - da: xr.DataArray, input data (typically boolean)
    - window: int, minimum run length to consider
    - freq: str, resampling frequency

    Returns:
    xr.DataArray: Run length statistics
    """

def longest_run(da, threshold, op=operator.ge, freq="YS"):
    """
    Find longest consecutive run meeting condition.

    Parameters:
    - da: xr.DataArray, daily climate data
    - threshold: float, threshold value
    - op: callable, comparison operator
    - freq: str, resampling frequency

    Returns:
    xr.DataArray: Length of longest run
    """
```
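The relationship between threshold counts, run lengths, and the longest run can be sketched on a plain list. The helpers `run_lengths` and `spell_summary` below are hypothetical names used only to show the underlying run-length idea on one time series; the documented functions additionally handle resampling and multi-dimensional data.

```python
import operator
from itertools import groupby


def run_lengths(flags):
    """Return the lengths of consecutive runs of True values."""
    return [sum(1 for _ in group) for is_true, group in groupby(flags) if is_true]


def spell_summary(values, threshold, op=operator.ge, min_length=1):
    """Summarize spells where `op(value, threshold)` holds."""
    flags = [op(v, threshold) for v in values]
    runs = [n for n in run_lengths(flags) if n >= min_length]
    return {
        "count": sum(flags),              # time steps meeting the condition
        "longest": max(runs, default=0),  # longest qualifying run
        "n_spells": len(runs),            # number of qualifying runs
    }


# Made-up daily maximum temperatures in degrees Celsius.
daily_tmax = [28, 31, 32, 29, 30, 30, 33, 27, 31]
summary = spell_summary(daily_tmax, threshold=30)
```

Here `min_length` plays a role similar to the `window` parameter of `run_length`: runs shorter than it are ignored when counting spells.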

### Trend and Change Detection

Trend analysis and change point detection for climate change assessment.

```python { .api }
def trend_slope(da, freq="YS"):
    """
    Calculate linear trend slope using least squares regression.

    Parameters:
    - da: xr.DataArray, time series climate data
    - freq: str, resampling frequency for trend calculation

    Returns:
    xr.DataArray: Trend slope values (units per time)
    """

def mann_kendall_trend(da, alpha=0.05, freq="YS"):
    """
    Mann-Kendall trend test for monotonic trends.

    Parameters:
    - da: xr.DataArray, time series climate data
    - alpha: float, significance level (default 0.05)
    - freq: str, resampling frequency

    Returns:
    xr.Dataset: Trend statistics (tau, p-value, slope)
    """

def change_point_detection(da, method="pettitt", freq="YS"):
    """
    Detect change points in climate time series.

    Parameters:
    - da: xr.DataArray, time series climate data
    - method: str, detection method ("pettitt", "buishand", "snht")
    - freq: str, resampling frequency

    Returns:
    xr.Dataset: Change point statistics and location
    """
```
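For readers unfamiliar with the two trend measures named above, here is a minimal standard-library sketch of an ordinary least squares slope and a Mann-Kendall test on an evenly spaced series. It assumes no tied values (so no tie correction is applied to the variance) and uses the normal approximation for the p-value; the function names and the sample series are illustrative and not taken from the library.

```python
import math
from statistics import NormalDist


def least_squares_slope(y):
    """OLS slope of y against its index 0, 1, 2, ... (units per time step)."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(y))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def mann_kendall(y, alpha=0.05):
    """Two-sided Mann-Kendall test, assuming no ties in `y`."""
    n = len(y)
    # S counts increasing pairs minus decreasing pairs.
    s = sum(
        (y[j] > y[i]) - (y[j] < y[i])
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    var_s = n * (n - 1) * (2 * n + 5) / 18  # variance without tie correction
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    tau = s / (n * (n - 1) / 2)
    return {"s": s, "tau": tau, "z": z, "p_value": p_value,
            "significant": p_value < alpha}


# Made-up annual mean temperatures with an upward drift.
series = [10.1, 10.4, 10.3, 10.9, 11.2, 11.0, 11.6, 11.9, 12.3, 12.1]
slope = least_squares_slope(series)
result = mann_kendall(series)
```

The least squares slope gives the magnitude of the trend, while the Mann-Kendall test only assesses whether a monotonic trend is present, which is why the two are often reported together.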

### Extreme Value Analysis

Advanced extreme value statistics for climate risk assessment.

```python { .api }
def block_maxima(da, block_size="YS"):
    """
    Extract block maxima for extreme value analysis.

    Parameters:
    - da: xr.DataArray, daily climate data
    - block_size: str, size of blocks for maxima extraction ("YS", "MS", "QS")

    Returns:
    xr.DataArray: Block maximum values
    """

def peaks_over_threshold(da, threshold, min_separation=1):
    """
    Extract peaks over threshold for extreme value analysis.

    Parameters:
    - da: xr.DataArray, daily climate data
    - threshold: float, threshold value for peak detection
    - min_separation: int, minimum separation between peaks in days

    Returns:
    xr.DataArray: Peak values above threshold
    """

def extreme_events(da, threshold, duration=1, freq="YS"):
    """
    Identify and characterize extreme events.

    Parameters:
    - da: xr.DataArray, daily climate data
    - threshold: float, threshold for extreme event definition
    - duration: int, minimum duration for event detection
    - freq: str, resampling frequency

    Returns:
    xr.Dataset: Event characteristics (count, duration, intensity)
    """
```
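The effect of `min_separation` is easiest to see on a small list. The sketch below shows one simple declustering rule, assumed here for illustration: an exceedance closer than `min_separation` steps to the previously kept peak is merged into that cluster, and the larger value is kept. The helper `decluster_peaks` is hypothetical, and the documented function may use a different rule.

```python
def decluster_peaks(values, threshold, min_separation=1):
    """Return (index, value) pairs of declustered peaks above `threshold`.

    Assumed rule: exceedances within `min_separation` steps of the last
    kept peak belong to the same cluster; only the largest is retained.
    """
    peaks = []
    for i, v in enumerate(values):
        if v <= threshold:
            continue
        if peaks and i - peaks[-1][0] < min_separation:
            # Same cluster: keep whichever exceedance is larger.
            if v > peaks[-1][1]:
                peaks[-1] = (i, v)
        else:
            peaks.append((i, v))
    return peaks


# Made-up daily values for illustration.
daily = [1, 5, 7, 2, 1, 1, 6, 1, 9, 8]
independent_peaks = decluster_peaks(daily, threshold=4, min_separation=3)
all_exceedances = decluster_peaks(daily, threshold=4, min_separation=1)
```

Declustering matters because peaks-over-threshold models generally assume independent exceedances, and consecutive days of one event are not independent.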

## Usage Examples

### Basic Statistical Analysis

```python
import xarray as xr
import xclim.generic as xcg
import operator

# Load climate data
ds = xr.tutorial.open_dataset("air_temperature")
tas = ds.air.rename("tas")

# Basic statistics
annual_mean = xcg.stats(tas, op="mean", freq="YS")
annual_max = xcg.stats(tas, op="max", freq="YS")
temp_percentiles = xcg.percentiles(tas, values=[5, 25, 75, 95], freq="YS")
```

### Distribution Fitting and Return Levels

```python
# Fit distribution to annual maxima
annual_max = xcg.stats(tas, op="max", freq="YS")
fit_result = xcg.fit(annual_max, dist="gev")

# Calculate return levels
return_levels = xcg.return_level(
    annual_max,
    return_period=[2, 5, 10, 25, 50, 100],
    dist="gev"
)
```

### Spell Analysis

```python
# Heat wave analysis with a Celsius threshold.
# The tutorial air temperature data is in kelvin, so convert it first.
tas_c = tas - 273.15
heat_threshold = 30.0  # 30°C
heat_spells = xcg.spell_length(
    tas_c,
    threshold=heat_threshold,
    op=operator.ge,
    freq="YS"
)

# Count hot days
hot_days = xcg.threshold_count(
    tas_c,
    threshold=heat_threshold,
    op=operator.ge,
    freq="YS"
)

# Find longest heat wave
longest_heat_wave = xcg.longest_run(
    tas_c,
    threshold=heat_threshold,
    op=operator.ge,
    freq="YS"
)
```

### Standardized Index Calculation

```python
# Calculate Standardized Precipitation Index (SPI)
pr = ds.precip.rename("pr") if "precip" in ds else None
if pr is not None:
    spi_3 = xcg.standardized_index(
        pr,
        distribution="gamma",
        freq="MS",
        window=3  # 3-month SPI
    )

    spi_12 = xcg.standardized_index(
        pr,
        distribution="gamma",
        freq="MS",
        window=12  # 12-month SPI
    )
```

### Trend Analysis

```python
# Calculate temperature trends
temp_trend = xcg.trend_slope(tas, freq="YS")

# Mann-Kendall trend test
mk_result = xcg.mann_kendall_trend(tas, alpha=0.05, freq="YS")
significant_trends = mk_result.p_value < 0.05

# Change point detection
change_points = xcg.change_point_detection(tas, method="pettitt", freq="YS")
```

### Extreme Event Analysis

```python
# Extract annual maxima
annual_extremes = xcg.block_maxima(tas, block_size="YS")

# Peaks over threshold analysis
# Convert the 0-d quantile result to a plain float, as `threshold` expects
threshold = float(tas.quantile(0.95))  # 95th percentile threshold
peaks = xcg.peaks_over_threshold(tas, threshold=threshold, min_separation=3)

# Characterize extreme events
extreme_events = xcg.extreme_events(
    tas,
    threshold=threshold,
    duration=3,  # At least 3 days
    freq="YS"
)
```