# I/O Operations

File input/output operations for reading and writing geospatial raster data. These functions are the primary interface for loading raster files into xarray objects and saving raster data to various formats.

## Capabilities

### Reading Raster Files

Opens raster files through the rasterio backend, with parameters that control chunking and caching, coordinate parsing, and masking and scaling.

```python { .api }
def open_rasterio(
    filename: Union[str, os.PathLike, rasterio.io.DatasetReader, rasterio.vrt.WarpedVRT],
    *,
    parse_coordinates: Optional[bool] = None,
    chunks: Optional[Union[int, tuple, dict]] = None,
    cache: Optional[bool] = None,
    lock: Optional[Any] = None,
    masked: bool = False,
    mask_and_scale: bool = False,
    variable: Optional[Union[str, list[str], tuple[str, ...]]] = None,
    group: Optional[Union[str, list[str], tuple[str, ...]]] = None,
    default_name: Optional[str] = None,
    decode_times: bool = True,
    decode_timedelta: Optional[bool] = None,
    band_as_variable: bool = False,
    **open_kwargs
) -> Union[xarray.Dataset, xarray.DataArray, list[xarray.Dataset]]:
    """
    Open a file with rasterio (experimental).

    Parameters:
    - filename: Path to file or already open rasterio dataset
    - parse_coordinates: Whether to parse x/y coordinates from transform (default: True for rectilinear)
    - chunks: Chunk sizes for dask arrays (int, tuple, dict, True, or "auto")
    - cache: Cache data in memory (default: True unless chunks specified)
    - lock: Synchronization for parallel access (True, False, or lock instance)
    - masked: Read mask and set values to NaN (default: False)
    - mask_and_scale: Apply scales/offsets and masking (default: False)
    - variable: Variable name(s) to filter loading
    - group: Group name(s) to filter loading
    - default_name: Name for data array if none exists
    - decode_times: Decode time-encoded variables (default: True)
    - decode_timedelta: Decode timedelta variables (default: same as decode_times)
    - band_as_variable: Load bands as separate variables (default: False)
    - **open_kwargs: Additional arguments passed to rasterio.open()

    Returns:
    xarray.Dataset, xarray.DataArray, or list of Datasets
    """
```

#### Usage Examples

```python
import rioxarray

# Basic usage - open a GeoTIFF file
da = rioxarray.open_rasterio('path/to/file.tif')

# Open with chunking for large files
da = rioxarray.open_rasterio('large_file.tif', chunks={'x': 1024, 'y': 1024})

# Open without parsing coordinates for performance
da = rioxarray.open_rasterio('file.tif', parse_coordinates=False)

# Open with masking applied
da = rioxarray.open_rasterio('file.tif', masked=True)

# Open specific variables from a multi-variable file (returns a Dataset)
ds = rioxarray.open_rasterio('file.nc', variable=['temperature', 'precipitation'])

# Load bands as separate variables
ds = rioxarray.open_rasterio('multi_band.tif', band_as_variable=True)
```
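
Because `open_rasterio` can return a `DataArray`, a `Dataset`, or a list of `Dataset` objects, downstream code sometimes wants one shape to iterate over. The following is a minimal sketch of that idea; `as_list` is a hypothetical helper and not part of rioxarray.

```python
from typing import Any, List


def as_list(result: Any) -> List[Any]:
    """Return `result` as a list.

    Hypothetical helper (not part of rioxarray): a list is shallow-copied,
    and any other object is wrapped in a single-element list.
    """
    if isinstance(result, list):
        return list(result)
    return [result]
```

With this in place, a loop such as `for obj in as_list(rioxarray.open_rasterio(path)): ...` handles all three return types the same way.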

### Writing Raster Files

Saves DataArrays and Datasets to raster file formats using the `.rio.to_raster()` method, which becomes available on `xarray.DataArray` and `xarray.Dataset` objects through the `.rio` accessor once rioxarray is imported.

```python { .api }
def to_raster(
    self,
    raster_path: Union[str, os.PathLike],
    driver: Optional[str] = None,
    dtype: Optional[Union[str, numpy.dtype]] = None,
    tags: Optional[dict] = None,
    windowed: bool = False,
    lock: Optional[Any] = None,
    compute: bool = True,
    **profile_kwargs
) -> None:
    """
    Export DataArray to raster file.

    Parameters:
    - raster_path: Output file path
    - driver: GDAL driver name (auto-detected from extension if None)
    - dtype: Output data type (uses source dtype if None)
    - tags: Metadata tags to write to file
    - windowed: Write data in windows for memory efficiency (default: False)
    - lock: Synchronization for parallel writes
    - compute: Whether to compute dask arrays immediately (default: True); if False, a dask Delayed object is returned instead
    - **profile_kwargs: Additional rasterio profile parameters

    Returns:
    None, or a dask Delayed object when the data is a dask array and compute=False
    """
```

#### Usage Examples

```python
import rioxarray
import xarray as xr

# Open and process data
da = rioxarray.open_rasterio('input.tif')
processed = da * 2  # Some processing

# Save to GeoTIFF
processed.rio.to_raster('output.tif')

# Save with specific driver and compression
processed.rio.to_raster(
    'output.tif',
    driver='GTiff',
    compress='lzw',
    tiled=True
)

# Save a Dataset (multiple variables); each variable must be 2D,
# so the first band of each array is selected here
ds = xr.Dataset({'var1': da.sel(band=1), 'var2': processed.sel(band=1)})
ds.rio.to_raster('multi_var.tif')

# Save with custom tags
processed.rio.to_raster(
    'tagged.tif',
    tags={'processing': 'doubled values', 'created_by': 'rioxarray'}
)
```
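
The `**profile_kwargs` shown above are forwarded as rasterio profile options, so a set of GeoTIFF creation options can be collected in one place and reused. The sketch below assumes a hypothetical helper named `gtiff_profile` (not part of rioxarray); the keys are GDAL GTiff creation options in their lowercase rasterio spelling, and the multiple-of-16 check reflects GDAL's requirement for tiled GeoTIFF block sizes.

```python
from typing import Any, Dict


def gtiff_profile(compress: str = "lzw", blocksize: int = 512) -> Dict[str, Any]:
    """Build keyword arguments for a tiled, compressed GeoTIFF.

    Hypothetical helper, not part of rioxarray.
    """
    # GDAL requires tile dimensions to be multiples of 16 when tiled=True.
    if blocksize <= 0 or blocksize % 16 != 0:
        raise ValueError("blocksize must be a positive multiple of 16")
    return {
        "driver": "GTiff",
        "compress": compress,
        "tiled": True,
        "blockxsize": blocksize,
        "blockysize": blocksize,
    }
```

A call such as `processed.rio.to_raster('output.tif', **gtiff_profile(blocksize=256))` would then pass these options through in one step.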

### Subdataset Filtering

Helper function for filtering subdatasets in complex raster files like HDF or NetCDF with multiple groups and variables.

```python { .api }
def build_subdataset_filter(
    group_names: Optional[Union[str, list, tuple]] = None,
    variable_names: Optional[Union[str, list, tuple]] = None
):
    """
    Build regex pattern for filtering subdatasets.

    Parameters:
    - group_names: Name(s) of groups to filter by
    - variable_names: Name(s) of variables to filter by

    Returns:
    re.Pattern: Compiled regex pattern for subdataset filtering
    """
```

#### Usage Examples

```python
import rioxarray
# Assumption: build_subdataset_filter is imported from the private
# rioxarray._io module, where it is defined.
from rioxarray._io import build_subdataset_filter

# Filter by variable name
pattern = build_subdataset_filter(group_names=None, variable_names='temperature')

# Filter by group and variable
pattern = build_subdataset_filter(
    group_names='climate_data',
    variable_names=['temp', 'precip']
)

# For subdataset files, open_rasterio takes the same filters
# through its group and variable arguments
da = rioxarray.open_rasterio('file.hdf', variable='temperature')
```
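
To make the filtering idea concrete, the sketch below applies a compiled regex to a list of subdataset names using only the standard library. The pattern is a simplified stand-in, not the one rioxarray builds, and the subdataset names are hypothetical examples modeled on GDAL's `NETCDF:"path":variable` naming convention.

```python
import re
from typing import Iterable, List


def filter_by_variable(names: Iterable[str], variables: List[str]) -> List[str]:
    """Keep names whose final ':'- or '/'-separated part is one of `variables`.

    Simplified illustration; not the pattern used inside rioxarray.
    """
    # Escape each name so characters like '.' are matched literally.
    alternatives = "|".join(re.escape(v) for v in variables)
    pattern = re.compile(rf".*[:/](?:{alternatives})$")
    return [name for name in names if pattern.match(name)]


# Hypothetical subdataset names for illustration only.
subdatasets = [
    'NETCDF:"file.nc":temperature',
    'NETCDF:"file.nc":precipitation',
    'NETCDF:"file.nc":/climate_data/temperature_max',
]
selected = filter_by_variable(subdatasets, ["temperature"])
```

Anchoring the pattern with `$` is what keeps `temperature_max` from matching a request for `temperature`.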

## File Format Support

rioxarray supports any file format that rasterio can open, including:

- **GeoTIFF** (`.tif`, `.tiff`) - Primary format with full feature support
- **NetCDF** (`.nc`) - With geospatial extensions
- **HDF** (`.hdf`, `.h5`) - Including subdataset access
- **JPEG2000** (`.jp2`) - Compressed format support
- **PNG/JPEG** - With world files for georeferencing
- **GDAL Virtual Format** (`.vrt`) - Virtual datasets, plus already open `rasterio.vrt.WarpedVRT` objects
- **Cloud Optimized GeoTIFF** - Optimized for cloud storage
- **Many others** - Any raster format supported by the installed GDAL build

## Performance Considerations

### Chunking Strategy
```python
import rioxarray

# For large files, use appropriate chunk sizes
da = rioxarray.open_rasterio('large.tif', chunks={'x': 2048, 'y': 2048})

# Auto-chunking based on dask configuration
da = rioxarray.open_rasterio('large.tif', chunks='auto')
```
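
One rough way to choose explicit chunk sizes like those above is to work backwards from a per-chunk memory budget. The sketch below assumes a hypothetical helper named `square_chunk_edge` (not part of rioxarray or dask); the 128 MiB budget and the 256-pixel rounding are illustrative assumptions, not recommendations from the library.

```python
import math


def square_chunk_edge(
    target_bytes: int, itemsize: int, bands: int = 1, multiple: int = 256
) -> int:
    """Edge length of a square chunk sized to fit `target_bytes`.

    The edge is rounded down to a multiple of `multiple`, with a floor of
    one `multiple` (so very small budgets can be exceeded).
    Hypothetical helper for illustration.
    """
    max_pixels = target_bytes // (itemsize * bands)
    edge = math.isqrt(max_pixels)
    return max(multiple, (edge // multiple) * multiple)


# float32 (4 bytes per value), single band, 128 MiB budget
edge = square_chunk_edge(target_bytes=128 * 1024**2, itemsize=4)
```

The result can then be passed as `chunks={'x': edge, 'y': edge}`; whether that beats `chunks='auto'` depends on the file's internal tiling and the workload.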

### Caching and Locking
```python
import rioxarray

# Disable caching when using chunks
da = rioxarray.open_rasterio('file.tif', chunks=True, cache=False)

# Parallel access without locks (use carefully)
da = rioxarray.open_rasterio('file.tif', lock=False)
```

### Coordinate Parsing
```python
import rioxarray

# Skip coordinate parsing for performance when coordinates not needed
da = rioxarray.open_rasterio('file.tif', parse_coordinates=False)
```