or run

npx @tessl/cli init
Log in

Version

Tile

Overview

Evals

Files

Files

docs

categorical-scores.mdcontinuous-scores.mdemerging-scores.mdindex.mdpandas-integration.mdplot-data.mdprobability-scores.mdprocessing-tools.mdsample-data.mdspatial-scores.mdstatistical-tests.md

plot-data.mddocs/

0

# Plot Data

1

2

Functions for generating data structures needed for diagnostic plots and visualization in forecast verification. These tools create plot-ready data for Murphy diagrams, Q-Q plots, ROC curves, and other verification visualizations.

3

4

## Capabilities

5

6

### Murphy Diagram Data

7

8

Murphy diagrams provide comprehensive visualization of forecast performance across all possible decision thresholds, showing the relationship between forecast value and expected score.

9

10

#### Murphy Score Calculation

11

12

Generates the core data for Murphy diagram plotting.

13

14

```python { .api }

15

def murphy_score(

16

fcst: XarrayLike,

17

obs: XarrayLike,

18

*,

19

reduce_dims: Optional[FlexibleDimensionTypes] = None,

20

preserve_dims: Optional[FlexibleDimensionTypes] = None,

21

weights: Optional[xr.DataArray] = None,

22

) -> XarrayLike:

23

"""

24

Calculate Murphy score for Murphy diagram generation.

25

26

Args:

27

fcst: Forecast values

28

obs: Observation values

29

reduce_dims: Dimensions to reduce

30

preserve_dims: Dimensions to preserve

31

weights: Optional weights

32

33

Returns:

34

Murphy scores across forecast value thresholds

35

36

Notes:

37

- Creates data for Murphy diagram plotting

38

- Shows expected score as function of forecast value

39

- Useful for understanding forecast value relationships

40

- Typically plotted as scatter or line plot

41

"""

42

```

43

44

#### Murphy Theta Values

45

46

Computes theta parameters for Murphy diagram construction.

47

48

```python { .api }

49

def murphy_thetas(

50

fcst: XarrayLike,

51

obs: XarrayLike,

52

*,

53

reduce_dims: Optional[FlexibleDimensionTypes] = None,

54

preserve_dims: Optional[FlexibleDimensionTypes] = None,

55

weights: Optional[xr.DataArray] = None,

56

) -> XarrayLike:

57

"""

58

Calculate Murphy theta values for Murphy diagrams.

59

60

Args:

61

fcst: Forecast values

62

obs: Observation values

63

reduce_dims: Dimensions to reduce

64

preserve_dims: Dimensions to preserve

65

weights: Optional weights

66

67

Returns:

68

Theta values for Murphy diagram axes

69

70

Notes:

71

- Provides theta parameters for Murphy diagram

72

- Used as x-axis values in Murphy plots

73

- Represents decision threshold parameters

74

"""

75

```

76

77

### Quantile-Quantile Plot Data

78

79

Q-Q plots compare the distributions of forecasts and observations by plotting their quantiles against each other.

80

81

```python { .api }

82

def qq(

83

fcst: XarrayLike,

84

obs: XarrayLike,

85

*,

86

quantiles: Optional[Sequence[float]] = None,

87

reduce_dims: Optional[FlexibleDimensionTypes] = None,

88

preserve_dims: Optional[FlexibleDimensionTypes] = None,

89

) -> xr.Dataset:

90

"""

91

Generate quantile-quantile plot data.

92

93

Args:

94

fcst: Forecast values

95

obs: Observation values

96

quantiles: Quantile levels to compute (default: 0.01 to 0.99 by 0.01)

97

reduce_dims: Dimensions to reduce

98

preserve_dims: Dimensions to preserve

99

100

Returns:

101

Dataset with forecast and observation quantiles

102

103

Output Variables:

104

- fcst_quantiles: Forecast quantile values

105

- obs_quantiles: Observation quantile values

106

- quantile_levels: Quantile probability levels

107

108

Notes:

109

- Perfect forecast would show fcst_quantiles = obs_quantiles

110

- Deviations indicate distributional biases

111

- Useful for identifying systematic forecast errors

112

- Default quantiles: [0.01, 0.02, ..., 0.99]

113

"""

114

```

115

116

### ROC Curve Data

117

118

ROC (Receiver Operating Characteristic) curves evaluate binary classification performance across all probability thresholds.

119

120

```python { .api }

121

def roc(

122

fcst: XarrayLike,

123

obs: XarrayLike,

124

*,

125

bin_edges: Union[str, Sequence[float]] = "continuous",

126

reduce_dims: Optional[FlexibleDimensionTypes] = None,

127

preserve_dims: Optional[FlexibleDimensionTypes] = None,

128

weights: Optional[xr.DataArray] = None,

129

) -> xr.Dataset:

130

"""

131

Generate ROC curve data for binary classification.

132

133

Args:

134

fcst: Probability forecast values [0, 1] or continuous values

135

obs: Binary observation values {0, 1}

136

bin_edges: Bin edges for probability discretization or "continuous"

137

reduce_dims: Dimensions to reduce

138

preserve_dims: Dimensions to preserve

139

weights: Optional weights

140

141

Returns:

142

Dataset with ROC curve coordinates and metrics

143

144

Output Variables:

145

- pod: Probability of Detection (True Positive Rate)

146

- pofd: Probability of False Detection (False Positive Rate)

147

- threshold: Probability thresholds used

148

- auc: Area Under Curve (if computed)

149

150

Notes:

151

- POD (y-axis) vs POFD (x-axis) creates ROC curve

152

- Perfect forecast: ROC curve passes through (0,1)

153

- No skill: ROC curve follows diagonal (0,0) to (1,1)

154

- Area Under Curve (AUC): scalar skill measure

155

- bin_edges="continuous" uses unique forecast values

156

"""

157

```

158

159

## Usage Patterns

160

161

### Basic Murphy Diagram

162

163

```python

164

from scores.plotdata import murphy_score, murphy_thetas

165

import matplotlib.pyplot as plt

166

import numpy as np

167

168

# Generate sample data

169

forecast = np.random.normal(10, 3, 1000)

170

observation = forecast + np.random.normal(0, 1, 1000)

171

172

# Calculate Murphy diagram data

173

murphy_scores = murphy_score(forecast, observation)

174

theta_values = murphy_thetas(forecast, observation)

175

176

# Create Murphy diagram plot

177

plt.figure(figsize=(10, 6))

178

plt.scatter(theta_values, murphy_scores, alpha=0.6)

179

plt.xlabel('Theta (Decision Threshold)')

180

plt.ylabel('Murphy Score')

181

plt.title('Murphy Diagram')

182

plt.grid(True)

183

plt.show()

184

```

185

186

### Quantile-Quantile Analysis

187

188

```python

189

from scores.plotdata import qq

190

import matplotlib.pyplot as plt

191

192

# Generate forecast and observation data with different distributions

193

forecast = np.random.gamma(2, 2, 1000) # Gamma distribution

194

observation = np.random.exponential(3, 1000) # Exponential distribution

195

196

# Calculate Q-Q plot data

197

qq_data = qq(forecast, observation)

198

199

# Extract quantile values

200

fcst_q = qq_data.fcst_quantiles

201

obs_q = qq_data.obs_quantiles

202

203

# Create Q-Q plot

204

plt.figure(figsize=(8, 8))

205

plt.scatter(fcst_q, obs_q, alpha=0.7)

206

plt.plot([0, max(fcst_q.max().values, obs_q.max().values)],

207

[0, max(fcst_q.max().values, obs_q.max().values)],

208

'r--', label='Perfect Forecast Line')

209

plt.xlabel('Forecast Quantiles')

210

plt.ylabel('Observation Quantiles')

211

plt.title('Quantile-Quantile Plot')

212

plt.legend()

213

plt.grid(True)

214

plt.show()

215

216

print(f"Q-Q data contains {len(qq_data.quantile_levels)} quantile levels")

217

```

218

219

### ROC Curve Analysis

220

221

```python

222

from scores.plotdata import roc

223

import matplotlib.pyplot as plt

224

225

# Create probability forecast and binary observations

226

prob_forecast = np.random.beta(2, 3, 1000) # Probability values [0,1]

227

# Generate observations with some skill

228

binary_obs = np.random.binomial(1, 0.3 + 0.4 * prob_forecast)

229

230

# Calculate ROC curve data

231

roc_data = roc(prob_forecast, binary_obs)

232

233

# Extract ROC coordinates

234

pod_values = roc_data.pod

235

pofd_values = roc_data.pofd

236

auc_value = roc_data.auc if 'auc' in roc_data else None

237

238

# Create ROC curve plot

239

plt.figure(figsize=(8, 8))

240

plt.plot(pofd_values, pod_values, 'b-', linewidth=2, label='ROC Curve')

241

plt.plot([0, 1], [0, 1], 'r--', label='No Skill Line')

242

plt.xlabel('Probability of False Detection (False Positive Rate)')

243

plt.ylabel('Probability of Detection (True Positive Rate)')

244

plt.title(f'ROC Curve (AUC = {auc_value.values:.3f})' if auc_value else 'ROC Curve')

245

plt.legend()

246

plt.grid(True)

247

plt.axis('equal')

248

plt.xlim([0, 1])

249

plt.ylim([0, 1])

250

plt.show()

251

```

252

253

### Custom Quantile Levels

254

255

```python

256

# Q-Q plot with custom quantile levels

257

custom_quantiles = [0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95]

258

259

qq_custom = qq(forecast, observation, quantiles=custom_quantiles)

260

261

print("Custom quantile analysis:")

262

for i, q_level in enumerate(custom_quantiles):

263

fcst_val = qq_custom.fcst_quantiles.values[i]

264

obs_val = qq_custom.obs_quantiles.values[i]

265

print(f"Q{q_level:4.2f}: forecast={fcst_val:6.2f}, observation={obs_val:6.2f}")

266

```

267

268

### Multi-dimensional Plot Data

269

270

```python

271

# Generate plot data for multi-dimensional arrays

272

forecast_3d = xr.DataArray(

273

np.random.normal(10, 2, (100, 5, 8)), # time, lat, lon

274

dims=["time", "lat", "lon"]

275

)

276

observation_3d = xr.DataArray(

277

forecast_3d.values + np.random.normal(0, 1, (100, 5, 8)),

278

dims=["time", "lat", "lon"]

279

)

280

281

# Q-Q data for each spatial point (reduce time dimension)

282

qq_spatial = qq(forecast_3d, observation_3d, reduce_dims="time")

283

print(f"Spatial Q-Q data shape: {qq_spatial.fcst_quantiles.shape}")

284

285

# Q-Q data for each time step (reduce spatial dimensions)

286

qq_temporal = qq(forecast_3d, observation_3d, reduce_dims=["lat", "lon"])

287

print(f"Temporal Q-Q data shape: {qq_temporal.fcst_quantiles.shape}")

288

289

# Overall Q-Q data (reduce all dimensions)

290

qq_overall = qq(forecast_3d, observation_3d)

291

print(f"Overall Q-Q data points: {len(qq_overall.quantile_levels)}")

292

```

293

294

### ROC Analysis with Binning

295

296

```python

297

# ROC curve with specified probability bins

298

prob_bins = np.linspace(0, 1, 21) # 20 probability bins

299

300

roc_binned = roc(prob_forecast, binary_obs, bin_edges=prob_bins)

301

302

print(f"ROC analysis with {len(prob_bins)-1} probability bins")

303

print(f"Number of ROC points: {len(roc_binned.threshold)}")

304

305

# Plot binned ROC curve

306

plt.figure(figsize=(8, 8))

307

plt.plot(roc_binned.pofd, roc_binned.pod, 'bo-', markersize=4, label='Binned ROC')

308

plt.plot([0, 1], [0, 1], 'r--', label='No Skill')

309

plt.xlabel('POFD')

310

plt.ylabel('POD')

311

plt.title('ROC Curve (Binned Probabilities)')

312

plt.legend()

313

plt.grid(True)

314

plt.show()

315

316

# Show threshold values

317

for i in range(0, len(roc_binned.threshold), 5): # Every 5th threshold

318

thresh = roc_binned.threshold.values[i]

319

pod = roc_binned.pod.values[i]

320

pofd = roc_binned.pofd.values[i]

321

print(f"Threshold {thresh:.2f}: POD={pod:.3f}, POFD={pofd:.3f}")

322

```

323

324

### Comprehensive Visualization Workflow

325

326

```python

327

from scores.plotdata import murphy_score, murphy_thetas, qq, roc

328

import matplotlib.pyplot as plt

329

330

# Sample forecast system evaluation

331

forecast_prob = np.random.beta(3, 2, 2000) # Probability forecasts

332

observations = np.random.binomial(1, forecast_prob * 0.8) # Binary outcomes with some skill

333

334

# Generate all plot data

335

murphy_data = murphy_score(forecast_prob, observations)

336

theta_data = murphy_thetas(forecast_prob, observations)

337

qq_data = qq(forecast_prob, observations)

338

roc_data = roc(forecast_prob, observations)

339

340

# Create comprehensive verification plot

341

fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(15, 12))

342

343

# Murphy diagram

344

ax1.scatter(theta_data, murphy_data, alpha=0.6)

345

ax1.set_xlabel('Theta')

346

ax1.set_ylabel('Murphy Score')

347

ax1.set_title('Murphy Diagram')

348

ax1.grid(True)

349

350

# Q-Q plot

351

ax2.scatter(qq_data.fcst_quantiles, qq_data.obs_quantiles, alpha=0.7)

352

ax2.plot([0, 1], [0, 1], 'r--', label='Perfect Forecast')

353

ax2.set_xlabel('Forecast Quantiles')

354

ax2.set_ylabel('Observation Quantiles')

355

ax2.set_title('Q-Q Plot')

356

ax2.legend()

357

ax2.grid(True)

358

359

# ROC curve

360

ax3.plot(roc_data.pofd, roc_data.pod, 'b-', linewidth=2)

361

ax3.plot([0, 1], [0, 1], 'r--', alpha=0.7)

362

ax3.set_xlabel('POFD')

363

ax3.set_ylabel('POD')

364

ax3.set_title('ROC Curve')

365

ax3.grid(True)

366

367

# Distribution comparison

368

ax4.hist(forecast_prob, bins=20, alpha=0.6, label='Forecast', density=True)

369

ax4.hist(observations, bins=20, alpha=0.6, label='Observation', density=True)

370

ax4.set_xlabel('Value')

371

ax4.set_ylabel('Density')

372

ax4.set_title('Distribution Comparison')

373

ax4.legend()

374

ax4.grid(True)

375

376

plt.tight_layout()

377

plt.show()

378

```

379

380

### Weighted Plot Data

381

382

```python

383

from scores.functions import create_latitude_weights

384

385

# Create spatially-weighted plot data

386

lats = np.linspace(-60, 60, 25)

387

area_weights = create_latitude_weights(lats)

388

389

# Expand weights to match data dimensions

390

weights_3d = area_weights.broadcast_like(forecast_3d.isel(time=0, drop=True))

391

392

# Generate area-weighted plot data

393

qq_weighted = qq(forecast_3d, observation_3d,

394

reduce_dims=["time", "lat", "lon"])

395

396

roc_weighted = roc(forecast_3d, observation_3d.where(observation_3d > 5, 0).where(observation_3d <= 5, 1),

397

reduce_dims=["time", "lat", "lon"],

398

weights=weights_3d)

399

400

print("Area-weighted verification complete")

401

print(f"Weighted Q-Q quantiles: {len(qq_weighted.quantile_levels)}")

402

print(f"Weighted ROC points: {len(roc_weighted.threshold)}")

403

```