0
# Variogram Estimation
1
2
Variogram estimation functions calculate empirical variograms from spatial data to characterize spatial correlation structure. These functions support structured and unstructured data, directional analysis, and various estimator types.
3
4
## Capabilities
5
6
### General Variogram Estimation
7
8
Primary function for calculating empirical variograms from spatial data with flexible binning and estimation options.
9
10
```python { .api }
11
def vario_estimate(pos, field, bin_edges=None, sampling_size=None,
12
sampling_seed=None, estimator='matheron', latlon=False,
13
direction=None, angles=None, angles_tol=np.pi/8, bandwidth=None,
14
no_data=np.nan, mask=np.ma.nomask, mesh_type='unstructured',
15
return_counts=False, mean=None, normalizer=None, trend=None,
16
fit_normalizer=False, geo_scale=RADIAN_SCALE, **std_bins):
17
"""
18
Estimate empirical variogram from spatial data.
19
20
Parameters:
21
- pos (array-like): Spatial positions of data points, shape (dim, n_points)
22
- field (array-like): Field values at positions, shape (n_points,)
23
- bin_edges (array-like): Distance bin edges for variogram calculation
24
- sampling_size (int): Subsample data for large datasets (default: use all)
25
- sampling_seed (int): Random seed for subsampling
26
- estimator (str): Variogram estimator ('matheron', 'cressie')
27
- direction (array-like): Direction for anisotropic analysis
28
- angles (array-like): Directional angles for anisotropic analysis
29
- angles_tol (float): Angular tolerance for directional analysis (default: π/8)
30
- bandwidth (float): Angular bandwidth for directional analysis
31
- no_data (float): No-data value to ignore (default: np.nan)
32
- mask (array-like): Data mask (default: np.ma.nomask)
33
- mesh_type (str): Mesh type ('structured', 'unstructured')
34
- return_counts (bool): Return number of point pairs per bin
35
- mean (float or callable): Mean value or function
36
- normalizer (Normalizer): Data normalization method
37
- trend (callable): Trend function
38
- fit_normalizer (bool): Fit normalizer to data
39
- geo_scale (float): Geographic scaling factor (default: RADIAN_SCALE)
40
- **std_bins: Additional parameters for standard binning
41
42
Returns:
43
- bin_centers (array): Distance bin centers
44
- variogram (array): Empirical variogram values
45
- counts (array): Point pair counts per bin (if return_count=True)
46
"""
47
48
def vario_estimate_axis(field, direction='x', estimator='matheron', no_data=np.nan):
49
"""
50
Estimate directional variogram along specified axis.
51
52
Parameters:
53
- field (array-like): Field values on structured grid
54
- direction (str): Axis direction ('x', 'y', 'z' or 0, 1, 2)
55
- estimator (str): Variogram estimator type ('matheron', 'cressie')
56
- no_data (float): No-data value to ignore (default: np.nan)
57
58
Returns:
59
- bin_centers (array): Distance bin centers
60
- variogram (array): Directional variogram values
61
- counts (array): Point pair counts (if return_count=True)
62
"""
63
64
def vario_estimate_structured(pos, field, direction='x', bin_edges=None,
65
estimator='matheron', return_count=False):
66
"""
67
Estimate variogram on structured grid data.
68
69
Parameters:
70
- pos (list): Grid coordinate arrays [x, y, z]
71
- field (array): Field values on structured grid
72
- direction (str or int): Direction for variogram ('x', 'y', 'z' or 0, 1, 2)
73
- bin_edges (array-like): Distance bin edges
74
- estimator (str): Variogram estimator type
75
- return_count (bool): Return point pair counts
76
77
Returns:
78
- bin_centers (array): Distance bin centers
79
- variogram (array): Structured grid variogram values
80
- counts (array): Point pair counts (if return_count=True)
81
"""
82
83
def vario_estimate_unstructured(pos, field, bin_edges=None, sampling_size=None,
84
estimator='matheron', return_count=False,
85
num_threads=None):
86
"""
87
Estimate variogram from unstructured point data.
88
89
Parameters:
90
- pos (array-like): Point positions, shape (dim, n_points)
91
- field (array-like): Field values, shape (n_points,)
92
- bin_edges (array-like): Distance bin edges
93
- sampling_size (int): Subsample size for large datasets
94
- estimator (str): Variogram estimator type
95
- return_count (bool): Return point pair counts
96
- num_threads (int): Number of parallel threads
97
98
Returns:
99
- bin_centers (array): Distance bin centers
100
- variogram (array): Unstructured variogram values
101
- counts (array): Point pair counts (if return_count=True)
102
"""
103
```
104
105
### Binning Utilities
106
107
Functions for creating appropriate distance bins for variogram estimation.
108
109
```python { .api }
110
def standard_bins(pos=None, dim=2, latlon=False, mesh_type='unstructured',
111
bin_no=None, max_dist=None, geo_scale=RADIAN_SCALE):
112
"""
113
Generate standard distance bins for variogram estimation.
114
115
Parameters:
116
- pos (array-like): Spatial positions, shape (dim, n_points) (default: None)
117
- dim (int): Spatial dimension (default: 2)
118
- latlon (bool): Use geographic coordinates (default: False)
119
- mesh_type (str): Mesh type ('structured', 'unstructured')
120
- bin_no (int): Number of distance bins (default: None for auto)
121
- max_dist (float): Maximum distance for binning (default: auto-detect)
122
- geo_scale (float): Geographic scaling factor (default: RADIAN_SCALE)
123
124
Returns:
125
- bin_edges (array): Distance bin edges, shape (bin_no+1,)
126
"""
127
```
128
129
## Variogram Estimators
130
131
### Matheron Estimator
132
133
Classical variogram estimator (default):
134
135
```
136
γ(h) = 1/(2N(h)) * Σ[Z(xi) - Z(xi+h)]²
137
```
138
139
### Cressie Estimator
140
141
Robust estimator less sensitive to outliers:
142
143
```
144
γ(h) = 1/(2N(h)) * (Σ|Z(xi) - Z(xi+h)|^0.5)^4 / (0.457 + 0.494/N(h))
145
```
146
147
## Usage Examples
148
149
### Basic Variogram Estimation
150
151
```python
152
import gstools as gs
153
import numpy as np
154
155
# Generate sample data
156
model = gs.Exponential(dim=2, var=2.0, len_scale=10.0)
157
srf = gs.SRF(model, seed=12345)
158
159
# Create spatial positions
160
x = np.arange(0, 50, 1.0)
161
y = np.arange(0, 30, 1.0)
162
pos = [np.meshgrid(x, y)[i].flatten() for i in range(2)]
163
field = srf.unstructured(pos)
164
165
# Generate standard bins
166
bin_edges = gs.standard_bins(pos, bin_no=30, max_dist=25)
167
168
# Estimate empirical variogram
169
bin_centers, gamma, counts = gs.vario_estimate(
170
pos, field, bin_edges, return_count=True
171
)
172
173
# Plot results
174
import matplotlib.pyplot as plt
175
plt.plot(bin_centers, gamma, 'o-', label='Empirical')
176
plt.plot(bin_centers, model.variogram(bin_centers), '-', label='Theoretical')
177
plt.xlabel('Distance')
178
plt.ylabel('Variogram')
179
plt.legend()
180
plt.show()
181
```
182
183
### Directional Variogram Analysis
184
185
```python
186
# Estimate variogram in different directions
187
angles = [0, 45, 90, 135] # Degrees
188
bandwidth = 22.5 # Angular bandwidth
189
190
directional_variograms = {}
191
for angle in angles:
192
angle_rad = np.deg2rad(angle)
193
bin_centers, gamma = gs.vario_estimate(
194
pos, field, bin_edges,
195
angles=[angle_rad],
196
bandwidth=np.deg2rad(bandwidth)
197
)
198
directional_variograms[angle] = (bin_centers, gamma)
199
200
# Plot directional variograms
201
for angle, (bins, gamma) in directional_variograms.items():
202
plt.plot(bins, gamma, 'o-', label=f'{angle}°')
203
plt.xlabel('Distance')
204
plt.ylabel('Variogram')
205
plt.legend()
206
plt.show()
207
```
208
209
### Axis-Aligned Variogram
210
211
```python
212
# Variogram along specific axes
213
x_bins, x_gamma = gs.vario_estimate_axis(pos, field, direction=0) # X-direction
214
y_bins, y_gamma = gs.vario_estimate_axis(pos, field, direction=1) # Y-direction
215
216
plt.plot(x_bins, x_gamma, 'o-', label='X-direction')
217
plt.plot(y_bins, y_gamma, 's-', label='Y-direction')
218
plt.xlabel('Distance')
219
plt.ylabel('Variogram')
220
plt.legend()
221
plt.show()
222
```
223
224
### Structured Grid Variogram
225
226
```python
227
# For data on regular grid
228
X, Y = np.meshgrid(x, y)
229
grid_field = srf.structured([x, y])
230
231
# Estimate along grid directions
232
x_bins, x_gamma = gs.vario_estimate_structured(
233
[x, y], grid_field, direction='x'
234
)
235
y_bins, y_gamma = gs.vario_estimate_structured(
236
[x, y], grid_field, direction='y'
237
)
238
```
239
240
### Large Dataset Subsampling
241
242
```python
243
# For very large datasets, use subsampling
244
large_pos = np.random.rand(2, 100000) * 100
245
large_field = srf.unstructured(large_pos)
246
247
# Subsample for variogram estimation
248
bin_centers, gamma = gs.vario_estimate(
249
large_pos, large_field, bin_edges,
250
sampling_size=10000, # Use 10k points for estimation
251
sampling_seed=42 # Reproducible subsampling
252
)
253
```
254
255
### Robust Variogram Estimation
256
257
```python
258
# Use Cressie estimator for data with outliers
259
bin_centers, gamma_matheron = gs.vario_estimate(
260
pos, field, bin_edges, estimator='matheron'
261
)
262
bin_centers, gamma_cressie = gs.vario_estimate(
263
pos, field, bin_edges, estimator='cressie'
264
)
265
266
plt.plot(bin_centers, gamma_matheron, 'o-', label='Matheron')
267
plt.plot(bin_centers, gamma_cressie, 's-', label='Cressie')
268
plt.xlabel('Distance')
269
plt.ylabel('Variogram')
270
plt.legend()
271
plt.show()
272
```
273
274
### Geographic Coordinates
275
276
```python
277
# For lat/lon data
278
lat = np.random.uniform(40, 45, 1000) # Latitude
279
lon = np.random.uniform(-75, -70, 1000) # Longitude
280
geo_pos = [lon, lat] # Note: lon first, then lat
281
geo_field = np.random.randn(1000)
282
283
# Variogram with geographic distances
284
bin_centers, gamma = gs.vario_estimate(
285
geo_pos, geo_field, bin_edges,
286
latlon=True # Use great circle distances
287
)
288
```
289
290
### Anisotropic Analysis
291
292
```python
293
# Define multiple directions for anisotropy analysis
294
n_dirs = 8
295
angles = np.linspace(0, np.pi, n_dirs, endpoint=False)
296
bandwidth = np.pi / (2 * n_dirs)
297
298
# Calculate variogram for each direction
299
aniso_results = []
300
for angle in angles:
301
bins, gamma = gs.vario_estimate(
302
pos, field, bin_edges,
303
angles=[angle],
304
bandwidth=bandwidth,
305
separate_dirs=False
306
)
307
aniso_results.append((np.rad2deg(angle), bins, gamma))
308
309
# Plot anisotropy pattern
310
fig, axes = plt.subplots(2, 4, figsize=(15, 8))
311
for i, (angle, bins, gamma) in enumerate(aniso_results):
312
ax = axes[i//4, i%4]
313
ax.plot(bins, gamma, 'o-')
314
ax.set_title(f'Direction: {angle:.0f}°')
315
ax.set_xlabel('Distance')
316
ax.set_ylabel('Variogram')
317
plt.tight_layout()
318
plt.show()
319
```
320
321
## Advanced Features
322
323
### Parallel Computation
324
325
```python
326
# Use multiple threads for faster computation
327
bin_centers, gamma = gs.vario_estimate(
328
pos, field, bin_edges,
329
num_threads=4 # Use 4 CPU cores
330
)
331
```
332
333
### Custom Binning
334
335
```python
336
# Create custom distance bins
337
# Finer resolution at short distances
338
short_bins = np.linspace(0, 5, 20)
339
long_bins = np.linspace(5, 50, 15)
340
custom_bins = np.concatenate([short_bins, long_bins[1:]])
341
342
bin_centers, gamma = gs.vario_estimate(pos, field, custom_bins)
343
```
344
345
### Variogram Statistics
346
347
```python
348
# Get detailed statistics about variogram estimation
349
bin_centers, gamma, counts = gs.vario_estimate(
350
pos, field, bin_edges, return_count=True
351
)
352
353
# Calculate weights based on pair counts
354
weights = counts / np.sum(counts)
355
356
# Weighted average variogram value
357
weighted_mean = np.average(gamma, weights=weights)
358
359
print(f"Total point pairs: {np.sum(counts)}")
360
print(f"Average pairs per bin: {np.mean(counts)}")
361
print(f"Weighted mean variogram: {weighted_mean:.3f}")
362
```
363
364
## Integration with Model Fitting
365
366
### Automatic Model Fitting
367
368
```python
369
# Estimate variogram and fit model
370
bin_centers, gamma = gs.vario_estimate(pos, field, bin_edges)
371
372
# Fit exponential model to empirical variogram
373
fit_model = gs.Exponential(dim=2)
374
fit_results = fit_model.fit_variogram(bin_centers, gamma)
375
376
print(f"Fitted variance: {fit_model.var:.3f}")
377
print(f"Fitted length scale: {fit_model.len_scale:.3f}")
378
print(f"Fitted nugget: {fit_model.nugget:.3f}")
379
380
# Compare empirical and fitted variograms
381
plt.plot(bin_centers, gamma, 'o', label='Empirical')
382
plt.plot(bin_centers, fit_model.variogram(bin_centers), '-', label='Fitted')
383
plt.xlabel('Distance')
384
plt.ylabel('Variogram')
385
plt.legend()
386
plt.show()
387
```
388
389
## Properties and Return Values
390
391
Variogram estimation functions return:
392
393
- **bin_centers**: Distance values at bin centers
394
- **variogram**: Empirical variogram values at each distance
395
- **counts**: Number of point pairs used for each bin (optional)
396
397
Common properties:
398
- Bin centers represent average distances within each bin
399
- Variogram values use specified estimator formula
400
- Point pair counts indicate reliability of each estimate
401
- Missing values are handled by skipping affected pairs