# Spatial Operations

Clipping, reprojection, padding, and geometric operations on raster data. These functions enable spatial analysis, coordinate transformations, and geometric manipulation of raster datasets.

## Capabilities

### Reprojection

Transform raster data from one coordinate reference system to another with control over resolution, shape, and resampling methods.

```python { .api }
def reproject(
    self,
    dst_crs: Any,
    *,
    resolution: Optional[Union[float, tuple[float, float]]] = None,
    shape: Optional[tuple[int, int]] = None,
    transform: Optional[rasterio.Affine] = None,
    resampling: rasterio.enums.Resampling = rasterio.enums.Resampling.nearest,
    nodata: Optional[float] = None,
    **kwargs
) -> Union[xarray.DataArray, xarray.Dataset]:
    """
    Reproject DataArray or Dataset to a new coordinate system.

    Parameters:
    - dst_crs: Destination CRS (EPSG, PROJ, WKT string, or CRS object)
    - resolution: Pixel size in destination CRS units (float or (x_res, y_res))
    - shape: Output shape (height, width) - cannot use with resolution
    - transform: Destination affine transform
    - resampling: Resampling algorithm (default: nearest)
    - nodata: NoData value for destination (uses source nodata if None)
    - **kwargs: Additional arguments for rasterio.warp.reproject

    Returns:
    Reprojected DataArray or Dataset
    """
```

#### Usage Examples

```python
import rioxarray
from rasterio.enums import Resampling

# Open data in one CRS
da = rioxarray.open_rasterio('utm_data.tif')  # UTM projection

# Reproject to geographic coordinates
geo_da = da.rio.reproject('EPSG:4326')

# Reproject with specific resolution
geo_da = da.rio.reproject('EPSG:4326', resolution=0.01)  # 0.01 degree pixels

# Reproject with custom shape
geo_da = da.rio.reproject('EPSG:4326', shape=(1000, 2000))

# Reproject with different resampling
geo_da = da.rio.reproject(
    'EPSG:4326',
    resolution=0.01,
    resampling=Resampling.bilinear
)

# Reproject with custom nodata
geo_da = da.rio.reproject('EPSG:4326', nodata=-9999)
```
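
The `shape` and `resolution` options describe the same output grid in two different ways, which is presumably why only one of them may be given. The sketch below works through that arithmetic for a known extent. It assumes a hypothetical helper, `shape_from_resolution`, that is not part of rioxarray; the library derives its own grid through rasterio, so real output shapes can differ.

```python
import math


def shape_from_resolution(bounds, resolution):
    """Return (height, width) of a grid covering `bounds` at the given pixel size.

    Hypothetical helper for illustration only; not a rioxarray function.
    `bounds` is (minx, miny, maxx, maxy); `resolution` is a float or (x_res, y_res).
    """
    minx, miny, maxx, maxy = bounds
    if isinstance(resolution, tuple):
        x_res, y_res = resolution
    else:
        x_res = y_res = resolution
    # Round up so the grid always covers the full extent.
    width = math.ceil((maxx - minx) / abs(x_res))
    height = math.ceil((maxy - miny) / abs(y_res))
    return height, width


# A 20 x 20 degree extent at 0.5 degree pixels
print(shape_from_resolution((-10.0, 40.0, 10.0, 60.0), 0.5))
```

Going the other way (choosing `shape` and letting the pixel size follow) is the same division with the roles swapped.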

### Match Reprojection

Reproject one raster to match the grid of another raster exactly.

```python { .api }
def reproject_match(
    self,
    match_data_array: Union[xarray.DataArray, xarray.Dataset],
    *,
    resampling: rasterio.enums.Resampling = rasterio.enums.Resampling.nearest,
    **reproject_kwargs
) -> Union[xarray.DataArray, xarray.Dataset]:
    """
    Reproject DataArray/Dataset to match another DataArray/Dataset grid.

    Parameters:
    - match_data_array: Target grid to match (DataArray or Dataset)
    - resampling: Resampling algorithm (default: nearest)
    - **reproject_kwargs: Additional arguments for reproject

    Returns:
    Reprojected data matching target grid
    """
```

#### Usage Examples

```python
import rioxarray
from rasterio.enums import Resampling

# Open two datasets with different grids
da1 = rioxarray.open_rasterio('data1.tif')  # 30m resolution
da2 = rioxarray.open_rasterio('data2.tif')  # 10m resolution

# Reproject da1 to match da2's grid exactly
da1_matched = da1.rio.reproject_match(da2)

# Now they have identical grids for analysis
assert da1_matched.rio.crs == da2.rio.crs
assert da1_matched.rio.shape == da2.rio.shape
assert da1_matched.rio.transform() == da2.rio.transform()

# Reproject with bilinear resampling
da1_matched = da1.rio.reproject_match(da2, resampling=Resampling.bilinear)
```

### Geometry Clipping

Clip raster data using vector geometries (polygons, etc.) with support for various masking options.

```python { .api }
def clip(
    self,
    geometries: Iterable,
    crs: Optional[Any] = None,
    *,
    all_touched: bool = False,
    drop: bool = True,
    invert: bool = False,
    from_disk: bool = False
) -> Union[xarray.DataArray, xarray.Dataset]:
    """
    Clip raster by geojson-like geometry objects.

    Parameters:
    - geometries: List of GeoJSON geometry dicts or objects with __geo_interface__
    - crs: CRS of input geometries (assumes same as dataset if None)
    - all_touched: Include all pixels touched by geometries (default: False)
    - drop: Drop data outside geometries vs. mask with nodata (default: True)
    - invert: Invert mask (set inside geometries to nodata) (default: False)
    - from_disk: Clip directly from disk for memory efficiency (default: False)

    Returns:
    Clipped DataArray or Dataset
    """
```

#### Usage Examples

```python
import rioxarray
import json

# Load raster data
da = rioxarray.open_rasterio('large_raster.tif')

# Define clipping geometry
geometry = {
    "type": "Polygon",
    "coordinates": [[
        [-94.07955, 41.69086],
        [-94.06082, 41.69103],
        [-94.06063, 41.67932],
        [-94.07936, 41.67915],
        [-94.07955, 41.69086]
    ]]
}

# Clip to polygon
clipped = da.rio.clip([geometry], crs='EPSG:4326')

# Clip with all touched pixels included
clipped_all = da.rio.clip([geometry], crs='EPSG:4326', all_touched=True)

# Mask (don't drop) area outside geometry
masked = da.rio.clip([geometry], crs='EPSG:4326', drop=False)

# Invert mask (mask inside geometry)
inverted = da.rio.clip([geometry], crs='EPSG:4326', invert=True)

# Clip large data from disk for memory efficiency
clipped_disk = da.rio.clip([geometry], crs='EPSG:4326', from_disk=True)
```
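
The effect of `all_touched` can be illustrated without any raster library. The sketch below is a toy model rather than rasterio's actual rasterization: it assumes unit-sized pixels, an axis-aligned box as the clipping geometry, and a hypothetical helper named `pixels_in_box`. With `all_touched=False` a pixel is kept only when its center falls inside the box; with `all_touched=True` any overlap is enough.

```python
def pixels_in_box(box, n_cols, n_rows, all_touched=False):
    """Return (col, row) indices of unit pixels selected by an axis-aligned box.

    Toy model for illustration only. Pixel (col, row) covers
    [col, col + 1] x [row, row + 1]; `box` is (minx, miny, maxx, maxy).
    """
    minx, miny, maxx, maxy = box
    selected = []
    for row in range(n_rows):
        for col in range(n_cols):
            if all_touched:
                # Keep the pixel if its area overlaps the box at all.
                hit = col < maxx and col + 1 > minx and row < maxy and row + 1 > miny
            else:
                # Keep the pixel only if its center lies inside the box.
                cx, cy = col + 0.5, row + 0.5
                hit = minx <= cx <= maxx and miny <= cy <= maxy
            if hit:
                selected.append((col, row))
    return selected


box = (0.6, 0.6, 2.4, 2.4)
centers_only = pixels_in_box(box, 4, 4)
touched = pixels_in_box(box, 4, 4, all_touched=True)
print(len(centers_only), len(touched))
```

For geometries that are small relative to the pixel size, the difference between the two modes can decide whether a clip keeps any pixels at all.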

### Bounding Box Clipping

Clip raster data to rectangular bounding boxes defined by coordinate extents.

```python { .api }
def clip_box(
    self,
    minx: float,
    miny: float,
    maxx: float,
    maxy: float,
    crs: Optional[Any] = None,
    auto_expand: bool = False,
    auto_expand_limit: int = 3
) -> Union[xarray.DataArray, xarray.Dataset]:
    """
    Clip raster data to a bounding box.

    Parameters:
    - minx: Minimum x coordinate
    - miny: Minimum y coordinate
    - maxx: Maximum x coordinate
    - maxy: Maximum y coordinate
    - crs: CRS of bounding box coordinates (assumes same as dataset if None)
    - auto_expand: Expand the clip search if the clip would return only a 1D raster (default: False)
    - auto_expand_limit: Maximum number of times the clip is retried before raising an exception (default: 3)

    Returns:
    Clipped DataArray or Dataset
    """
```

#### Usage Examples

```python
import rioxarray

# Load raster data
da = rioxarray.open_rasterio('raster.tif')

# Clip to bounding box in same CRS as data
clipped = da.rio.clip_box(
    minx=100000, miny=200000,
    maxx=150000, maxy=250000
)

# Clip using geographic coordinates
clipped_geo = da.rio.clip_box(
    minx=-10, miny=40,
    maxx=10, maxy=60,
    crs='EPSG:4326'
)

# Auto-expand the search if the clip would return a 1D raster
clipped_expanded = da.rio.clip_box(
    minx=-10, miny=40, maxx=10, maxy=60,
    crs='EPSG:4326',
    auto_expand=True
)
```

### Padding Operations

Expand raster data by adding pixels around the edges or to encompass specific areas.

```python { .api }
def pad_xy(
    self,
    minx: float,
    miny: float,
    maxx: float,
    maxy: float,
    constant_values: Union[int, float] = 0
) -> xarray.DataArray:
    """
    Pad DataArray in x/y directions to include specified coordinates.

    Parameters:
    - minx: Minimum x coordinate to include
    - miny: Minimum y coordinate to include
    - maxx: Maximum x coordinate to include
    - maxy: Maximum y coordinate to include
    - constant_values: Value to use for padding (default: 0)

    Returns:
    Padded DataArray
    """

def pad_box(
    self,
    minx: float,
    miny: float,
    maxx: float,
    maxy: float,
    crs: Optional[Any] = None,
    constant_values: Union[int, float] = 0
) -> Union[xarray.DataArray, xarray.Dataset]:
    """
    Pad Dataset/DataArray to encompass a bounding box.

    Parameters:
    - minx, miny, maxx, maxy: Bounding box coordinates
    - crs: CRS of bounding box (assumes same as dataset if None)
    - constant_values: Value for padded pixels (default: 0)

    Returns:
    Padded Dataset or DataArray
    """
```

#### Usage Examples

```python
import rioxarray

da = rioxarray.open_rasterio('small_raster.tif')

# Pad to include specific coordinates
padded = da.rio.pad_xy(
    minx=da.rio.bounds()[0] - 1000,  # Extend 1000m in each direction
    miny=da.rio.bounds()[1] - 1000,
    maxx=da.rio.bounds()[2] + 1000,
    maxy=da.rio.bounds()[3] + 1000,
    constant_values=-9999
)

# Pad to encompass a larger area
padded_box = da.rio.pad_box(
    minx=-10, miny=40, maxx=10, maxy=60,
    crs='EPSG:4326',
    constant_values=0
)
```

### Window and Coordinate Selection

Select data by pixel windows or coordinate ranges.

```python { .api }
def isel_window(
    self,
    window: rasterio.windows.Window
) -> Union[xarray.DataArray, xarray.Dataset]:
    """
    Index selection using a rasterio Window.

    Parameters:
    - window: rasterio Window object defining pixel ranges

    Returns:
    Subset of data corresponding to window
    """

def slice_xy(
    self,
    minx: Optional[float] = None,
    miny: Optional[float] = None,
    maxx: Optional[float] = None,
    maxy: Optional[float] = None
) -> Union[xarray.DataArray, xarray.Dataset]:
    """
    Slice data by x/y coordinate bounds.

    Parameters:
    - minx, miny, maxx, maxy: Coordinate bounds (None to skip bound)

    Returns:
    Sliced Dataset or DataArray
    """
```

#### Usage Examples

```python
import rioxarray
from rasterio.windows import Window

da = rioxarray.open_rasterio('raster.tif')

# Select by pixel window
window = Window(col_off=100, row_off=50, width=200, height=150)
windowed = da.rio.isel_window(window)

# Slice by coordinates
subset = da.rio.slice_xy(minx=100000, maxx=200000)

# Slice with partial bounds
subset_y = da.rio.slice_xy(miny=40, maxy=60)  # Only constrain Y
```
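
A pixel window and a coordinate range are two views of the same subset, linked by the raster's affine transform. The sketch below spells out that mapping for the simplest case. It assumes a north-up grid with no rotation, and the helper `window_bounds` together with the origin and pixel-size values are hypothetical. For real data, `rasterio.windows.bounds` performs this conversion from an actual transform.

```python
def window_bounds(col_off, row_off, width, height, x_origin, y_origin, x_res, y_res):
    """Return (minx, miny, maxx, maxy) covered by a pixel window.

    Illustrative helper, assuming a north-up grid: (x_origin, y_origin) is the
    top-left corner of the raster, x grows with column, y shrinks with row.
    """
    minx = x_origin + col_off * x_res
    maxx = minx + width * x_res
    maxy = y_origin - row_off * y_res
    miny = maxy - height * y_res
    return minx, miny, maxx, maxy


# Same window as above, on a hypothetical 30 m grid
bounds = window_bounds(
    col_off=100, row_off=50, width=200, height=150,
    x_origin=500000.0, y_origin=4600000.0, x_res=30.0, y_res=30.0,
)
print(bounds)
```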

### Interpolation

Fill missing or nodata values using spatial interpolation methods.

```python { .api }
def interpolate_na(
    self,
    method: str = "nearest"
) -> Union[xarray.DataArray, xarray.Dataset]:
    """
    Interpolate missing/nodata values using scipy.interpolate.griddata.

    Parameters:
    - method: Interpolation method ('linear', 'nearest', or 'cubic'; default: nearest)

    Returns:
    Data with interpolated values
    """
```

#### Usage Examples

```python
import rioxarray

# Open data with missing values
da = rioxarray.open_rasterio('data_with_gaps.tif')

# Linear interpolation of missing values
interpolated = da.rio.interpolate_na(method='linear')

# Nearest neighbor interpolation
interpolated_nn = da.rio.interpolate_na(method='nearest')

# Cubic interpolation
interpolated_cubic = da.rio.interpolate_na(method='cubic')
```

## Resampling Methods

Available resampling algorithms from rasterio for reprojection operations:

```python { .api }
# Common resampling methods
Resampling.nearest       # Nearest neighbor (default, preserves values)
Resampling.bilinear      # Bilinear interpolation (smooth, good for continuous data)
Resampling.cubic         # Cubic convolution (smoother than bilinear)
Resampling.cubic_spline  # Cubic spline interpolation
Resampling.lanczos       # Lanczos windowed sinc resampling
Resampling.average       # Average of all pixels (good for downsampling)
Resampling.mode          # Most common value (good for categorical data)
Resampling.gauss         # Gaussian kernel resampling
```

### Choosing Resampling Methods

```python
import rioxarray
from rasterio.enums import Resampling

da = rioxarray.open_rasterio('data.tif')

# For categorical data (land cover, classes)
categorical = da.rio.reproject('EPSG:4326', resampling=Resampling.nearest)

# For continuous data (temperature, elevation)
continuous = da.rio.reproject('EPSG:4326', resampling=Resampling.bilinear)

# For downsampling (reducing resolution)
downsampled = da.rio.reproject('EPSG:4326', resampling=Resampling.average)

# For high-quality upsampling
upsampled = da.rio.reproject('EPSG:4326', resampling=Resampling.lanczos)
```
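
Why averaging is a poor fit for categorical data is easiest to see on a tiny grid. The sketch below is a toy 2x2 block reduction in plain Python, not GDAL's resampling kernels; the helper `downsample_2x2` and the class codes are made up for illustration. Taking the mode keeps only class codes that occur in the input, while the mean can produce a code that none of the source pixels had.

```python
from statistics import mean, mode


def downsample_2x2(grid, reducer):
    """Reduce each non-overlapping 2x2 block of `grid` to one value.

    Toy illustration only; `grid` is a list of equal-length rows with even
    dimensions, and `reducer` maps a list of four values to one value.
    """
    out = []
    for r in range(0, len(grid), 2):
        row = []
        for c in range(0, len(grid[0]), 2):
            block = [grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1]]
            row.append(reducer(block))
        out.append(row)
    return out


# Hypothetical land-cover codes: 1, 2 and 5 are distinct classes
land_cover = [
    [1, 1, 5, 5],
    [1, 5, 5, 5],
    [2, 2, 1, 1],
    [2, 2, 1, 5],
]

by_mode = downsample_2x2(land_cover, mode)  # every output is an existing class
by_mean = downsample_2x2(land_cover, mean)  # top-left block of 1s and a 5 becomes 2
print(by_mode)
print(by_mean)
```

In the top-left block the mean yields 2, a class that does not occur in that block, which is the kind of artifact `Resampling.nearest` and `Resampling.mode` avoid.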

## Advanced Spatial Operations

### Multi-step Processing Workflows

```python
import rioxarray
from rasterio.enums import Resampling

# Complex spatial processing workflow
da = rioxarray.open_rasterio('source.tif')

# 1. Reproject to standard CRS
reprojected = da.rio.reproject('EPSG:4326', resampling=Resampling.bilinear)

# 2. Clip to area of interest
geometry = {...}  # Your geometry
clipped = reprojected.rio.clip([geometry], crs='EPSG:4326')

# 3. Pad to ensure full coverage
padded = clipped.rio.pad_box(
    minx=clipped.rio.bounds()[0] - 0.1,
    miny=clipped.rio.bounds()[1] - 0.1,
    maxx=clipped.rio.bounds()[2] + 0.1,
    maxy=clipped.rio.bounds()[3] + 0.1,
    crs='EPSG:4326'
)

# 4. Interpolate any remaining gaps
final = padded.rio.interpolate_na(method='linear')
```

### Performance Considerations

```python
import numpy
import rioxarray

da = rioxarray.open_rasterio('large_raster.tif')
geometry = {...}  # Your geometry

# For large datasets, use from_disk clipping
large_clipped = da.rio.clip([geometry], from_disk=True)

# Use appropriate chunking before spatial operations
chunked = da.chunk({'x': 2048, 'y': 2048})
processed = chunked.rio.reproject('EPSG:4326')

# For memory-intensive operations, process in chunks
bounds = da.rio.bounds()
chunk_size = 0.1  # degrees; assumes the data is in a geographic CRS
for minx in numpy.arange(bounds[0], bounds[2], chunk_size):
    for miny in numpy.arange(bounds[1], bounds[3], chunk_size):
        chunk = da.rio.clip_box(minx, miny, minx + chunk_size, miny + chunk_size)
        # Process chunk...
```
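
The tiling loop above steps through the extent with a floating-point stride, so the last row and column of tiles can extend past the raster bounds. One possible refinement, sketched here in plain Python, counts tiles with integers and clamps each tile to the extent. The generator `iter_tiles` is a hypothetical helper, not a rioxarray function.

```python
import math


def iter_tiles(bounds, tile_size):
    """Yield (minx, miny, maxx, maxy) tiles covering `bounds`, clamped to its edges.

    Hypothetical helper for illustration; `bounds` is (minx, miny, maxx, maxy)
    and `tile_size` is in the same units as the coordinates.
    """
    minx, miny, maxx, maxy = bounds
    n_cols = math.ceil((maxx - minx) / tile_size)
    n_rows = math.ceil((maxy - miny) / tile_size)
    for row in range(n_rows):
        for col in range(n_cols):
            # Derive each origin from the integer index to avoid accumulated float error.
            x0 = minx + col * tile_size
            y0 = miny + row * tile_size
            yield (x0, y0, min(x0 + tile_size, maxx), min(y0 + tile_size, maxy))


tiles = list(iter_tiles((0.0, 0.0, 2.5, 1.0), 1.0))
print(tiles)
```

Each tuple can then be unpacked into a call such as `da.rio.clip_box(*tile)`. A tile that contains no data may still raise an error, so real code would likely wrap that call in exception handling.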