# Dataset I/O

Core functionality for opening, reading, and writing raster datasets. Rasterio supports numerous formats through GDAL, including GeoTIFF, NetCDF, HDF5, JPEG2000, and many others.

## Capabilities

### Opening Datasets

Opens a raster dataset for reading or writing. In write mode, the driver, raster dimensions, band count, data type, and georeferencing are supplied as keyword arguments.

```python { .api }
def open(fp, mode='r', driver=None, width=None, height=None, count=None,
         dtype=None, crs=None, transform=None, nodata=None, sharing=False,
         opener=None, **kwargs):
    """
    Open a dataset for reading or writing.

    Parameters:
    - fp (str or Path): Dataset path or URI
    - mode (str): 'r' for read, 'r+' for read-write, 'w' for write, 'w+' for write-read
    - driver (str): Format driver name (e.g., 'GTiff', 'NetCDF')
    - width (int): Raster width in pixels (write mode)
    - height (int): Raster height in pixels (write mode)
    - count (int): Number of bands (write mode)
    - dtype (str or numpy.dtype): Data type (write mode)
    - crs (CRS): Coordinate reference system (write mode)
    - transform (Affine): Geospatial transformation (write mode)
    - nodata (number): NoData value (write mode)
    - sharing (bool): Enable dataset sharing between threads
    - opener (callable): Custom file opener function
    - **kwargs: Additional driver-specific options

    Returns:
    DatasetReader or DatasetWriter: Dataset object
    """
```

Usage examples:

```python
import numpy as np
import rasterio
from rasterio.transform import from_bounds

# Read-only access: read() with no arguments returns all bands
# as an array of shape (count, height, width)
with rasterio.open('input.tif') as dataset:
    data = dataset.read()

# Create a new raster; the array written must match the profile's
# count, height, width, and dtype
profile = {
    'driver': 'GTiff',
    'dtype': 'float32',
    'width': 256, 'height': 256, 'count': 3,
    'crs': 'EPSG:4326',
    'transform': from_bounds(-180, -90, 180, 90, 256, 256)
}
new_data = np.zeros((3, 256, 256), dtype='float32')
with rasterio.open('output.tif', 'w', **profile) as dst:
    dst.write(new_data)

# Open with a specific driver and options
with rasterio.open('data.nc', driver='NetCDF', sharing=False) as dataset:
    band_data = dataset.read(1)
```

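The `transform` entry in the profile maps pixel positions to coordinates. As a rough illustration of the arithmetic that `from_bounds` is expected to perform for a north-up raster, the sketch below derives the six affine coefficients by hand. The helper `affine_from_bounds` is hypothetical and not part of rasterio; it assumes no rotation or shear.

```python
def affine_from_bounds(west, south, east, north, width, height):
    """Return (a, b, c, d, e, f) for a north-up raster covering the bounds.

    The coefficients are used as:
        x = a * col + b * row + c
        y = d * col + e * row + f
    """
    pixel_width = (east - west) / width
    pixel_height = (north - south) / height
    # Rows increase downward, so the y scale is negative and the
    # origin (c, f) is the top-left corner of the raster.
    return (pixel_width, 0.0, west, 0.0, -pixel_height, north)


# Same bounds and size as the profile in the usage example
coeffs = affine_from_bounds(-180, -90, 180, 90, 256, 256)
print(coeffs)
```

With these bounds each pixel spans 360/256 degrees of longitude and 180/256 degrees of latitude, which is why the two scale terms differ.
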
### Copying Datasets

Copies a raster dataset to a new path, optionally converting its format or applying driver-specific options such as compression.

```python { .api }
def copy(src_path, dst_path, driver=None, dtype=None, compress=None,
         photometric=None, **kwargs):
    """
    Copy a raster dataset to a new location with optional transformations.

    Parameters:
    - src_path (str): Source dataset path
    - dst_path (str): Destination dataset path
    - driver (str): Output format driver
    - dtype (str or numpy.dtype): Output data type
    - compress (str): Compression method ('lzw', 'deflate', 'jpeg', etc.)
    - photometric (str): Photometric interpretation
    - **kwargs: Additional driver-specific options

    Returns:
    None
    """
```

Usage example:

```python
import rasterio.shutil

# In rasterio 1.x, copy is provided by the rasterio.shutil module

# Simple copy
rasterio.shutil.copy('input.tif', 'output.tif')

# Copy with compression
rasterio.shutil.copy('input.tif', 'compressed.tif', compress='lzw')

# Convert format and data type
rasterio.shutil.copy('input.tif', 'output.jpg', driver='JPEG', dtype='uint8')
```

### Dataset Classes

#### DatasetReader

Provides read-only access to a raster dataset: metadata properties, pixel data reads, and conversions between pixel and geographic coordinates.

```python { .api }
class DatasetReader:
    """Read-only raster dataset."""

    # Properties
    profile: dict
    meta: dict
    driver: str
    mode: str
    name: str
    width: int
    height: int
    shape: tuple[int, int]
    count: int
    dtypes: list[str]
    nodatavals: list[float]
    crs: CRS
    transform: Affine
    bounds: BoundingBox
    res: tuple[float, float]

    def read(self, indexes=None, out=None, window=None, masked=False,
             out_shape=None, resampling=Resampling.nearest, fill_value=None,
             out_dtype=None):
        """
        Read raster data.

        Parameters:
        - indexes (int or sequence): Band index(es) to read (1-based)
        - out (numpy.ndarray): Pre-allocated output array
        - window (Window): Spatial subset to read
        - masked (bool): Return masked array with nodata values masked
        - out_shape (tuple): Output array shape for resampling
        - resampling (Resampling): Resampling algorithm
        - fill_value (number): Fill value for areas outside dataset bounds
        - out_dtype (numpy.dtype): Output data type

        Returns:
        numpy.ndarray: Raster data array
        """

    def sample(self, xy, indexes=None, masked=False):
        """
        Sample raster values at coordinates.

        Parameters:
        - xy (iterable): (x, y) coordinate pairs
        - indexes (sequence): Band indexes to sample
        - masked (bool): Return masked values

        Returns:
        generator: Sampled values
        """

    def index(self, x, y, op=math.floor, precision=None):
        """Convert geographic coordinates to pixel coordinates."""

    def xy(self, row, col, offset='center'):
        """Convert pixel coordinates to geographic coordinates."""

    def window(self, left, bottom, right, top):
        """Create window from geographic bounds."""

    def window_transform(self, window):
        """Get transform for windowed data."""

    def block_windows(self, bidx=0):
        """Generate block windows for efficient processing."""

    def block_shapes(self):
        """Get internal block shapes."""

    def colormap(self, bidx):
        """Get colormap for band."""

    def checksum(self, bidx):
        """Calculate checksum for band."""
```

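The `index` and `xy` methods convert between geographic coordinates and pixel positions. The following is a minimal sketch of that arithmetic, assuming a north-up transform with square pixels and no rotation. The constants and the helpers `pixel_to_xy` and `xy_to_pixel` are hypothetical stand-ins, not rasterio functions.

```python
import math

# Assumed north-up transform: 10-unit square pixels,
# top-left corner of the raster at (500000, 4100000)
PIXEL = 10.0
ORIGIN_X, ORIGIN_Y = 500000.0, 4100000.0


def pixel_to_xy(row, col, offset=0.5):
    """Coordinates of a pixel; offset=0.5 corresponds to the pixel center."""
    x = ORIGIN_X + (col + offset) * PIXEL
    y = ORIGIN_Y - (row + offset) * PIXEL  # y decreases as rows increase
    return x, y


def xy_to_pixel(x, y):
    """(row, col) of the pixel containing (x, y), rounding down like math.floor."""
    col = math.floor((x - ORIGIN_X) / PIXEL)
    row = math.floor((ORIGIN_Y - y) / PIXEL)
    return row, col


center = pixel_to_xy(2, 3)
print(center, xy_to_pixel(*center))
```

Converting a pixel to its center coordinate and back returns the same `(row, col)` pair, which is the round trip one would expect from `xy` followed by `index`.
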
#### DatasetWriter

Write-enabled raster dataset for creating and modifying raster files.

```python { .api }
class DatasetWriter(DatasetReader):
    """Write-enabled raster dataset."""

    def write(self, arr, indexes=None, window=None):
        """
        Write raster data.

        Parameters:
        - arr (numpy.ndarray): Data to write
        - indexes (int or sequence): Band index(es) to write to
        - window (Window): Spatial subset to write to

        Returns:
        None
        """

    def write_colormap(self, bidx, colormap):
        """Write colormap for band."""

    def write_mask(self, mask_array, window=None):
        """Write dataset mask."""

    def update_tags(self, **kwargs):
        """Update dataset tags/metadata."""

    def set_band_description(self, bidx, value):
        """Set band description."""

    def set_band_unit(self, bidx, value):
        """Set band unit."""

    def build_overviews(self, factors, resampling=Resampling.nearest):
        """Build overview images."""
```

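The `window` argument of `write` (and of `read`) restricts the operation to a sub-rectangle, which makes it possible to process a large raster tile by tile. The sketch below only generates the tile geometry; the helper `tile_windows` is hypothetical. Its tuples follow the `(col_off, row_off, width, height)` argument order of `rasterio.windows.Window`, so each one could plausibly be turned into a window with `Window(*tile)`.

```python
def tile_windows(width, height, tile_size):
    """Yield (col_off, row_off, win_width, win_height) tiles covering a raster.

    Tiles on the right and bottom edges are clipped so that no
    window extends past the raster.
    """
    for row_off in range(0, height, tile_size):
        for col_off in range(0, width, tile_size):
            win_width = min(tile_size, width - col_off)
            win_height = min(tile_size, height - row_off)
            yield (col_off, row_off, win_width, win_height)


# A 600 x 300 raster split into 256-pixel tiles
tiles = list(tile_windows(width=600, height=300, tile_size=256))
for tile in tiles:
    print(tile)
```

When the dataset has an internal block layout, iterating over `block_windows()` is usually preferable to a fixed tile size, since it aligns reads and writes with how the data is stored.
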
### Memory Files

In-memory raster operations for processing data without disk I/O.

```python { .api }
class MemoryFile:
    """In-memory file-like raster dataset."""

    def __init__(self, file_or_bytes=None):
        """
        Initialize memory file.

        Parameters:
        - file_or_bytes (bytes or file-like): Initial data
        """

    def open(self, **kwargs):
        """Open as raster dataset."""

    def getvalue(self):
        """Get bytes content."""

    def close(self):
        """Close memory file."""

class ZipMemoryFile(MemoryFile):
    """Compressed in-memory raster file."""

    def __init__(self, file_or_bytes=None):
        """Initialize compressed memory file."""
```

Usage examples:

```python
import numpy as np
from rasterio.io import MemoryFile, ZipMemoryFile

# Single-band array matching the dataset created below
data = np.zeros((100, 100), dtype='uint8')

# Create raster in memory
with MemoryFile() as memfile:
    with memfile.open(driver='GTiff', height=100, width=100,
                      count=1, dtype='uint8') as dataset:
        dataset.write(data, 1)

    # Get bytes for storage or transmission, once the dataset is closed
    raster_bytes = memfile.getvalue()

# Work with compressed data
with ZipMemoryFile() as memfile:
    # Process compressed raster data
    pass
```

### Utility Functions

Helpers for padding arrays, referencing dataset bands, and deleting datasets.

```python { .api }
def pad(array, transform, pad_width, mode=None, **kwargs):
    """
    Pad array and adjust affine transform matrix.

    Parameters:
    - array (numpy.ndarray): Input array, for best results a 2D array
    - transform (Affine): Transform object mapping pixel space to coordinates
    - pad_width (int): Number of pixels to pad array on all four sides
    - mode (str or function): Method for determining padded values
    - **kwargs: Additional options (see numpy.pad for details)

    Returns:
    tuple: (padded_array, padded_transform) tuple
    """

def band(ds, bidx):
    """
    A dataset and one or more of its bands.

    Parameters:
    - ds (DatasetReader): An opened rasterio dataset object
    - bidx (int or sequence): Band number(s), index starting at 1

    Returns:
    Band: Named tuple with dataset, band index, dtype, and shape
    """

def delete(path):
    """
    Delete raster dataset and associated files.

    Parameters:
    - path (str): Dataset path to delete

    Returns:
    None
    """
```

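Usage examples:

```python
import rasterio
import rasterio.shutil

# Pad a 2D array with zeros on all sides and adjust the transform to match
# (data is a 2D numpy array; original_transform is its Affine transform)
padded_array, padded_transform = rasterio.pad(
    data, original_transform, 10, mode='constant', constant_values=0)

# Delete dataset and auxiliary files (delete lives in rasterio.shutil in rasterio 1.x)
rasterio.shutil.delete('temporary.tif')
```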
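
Padding adds pixels on every side, so the top-left corner of the padded array lies outside the original extent and the transform origin has to move with it. A minimal sketch of that adjustment, assuming a north-up transform expressed as the coefficient tuple `(a, b, c, d, e, f)`; the helper `shift_origin_for_padding` is hypothetical and only illustrates the arithmetic:

```python
def shift_origin_for_padding(coeffs, pad_width):
    """Move the transform origin to the corner of the padded array.

    coeffs is (a, b, c, d, e, f), where x = a*col + b*row + c and
    y = d*col + e*row + f. Pixel size (a, e) is unchanged by padding.
    """
    a, b, c, d, e, f = coeffs
    # The new origin sits pad_width pixels left of and above the old one.
    # Because e is negative for a north-up raster, subtracting
    # pad_width * e increases y, which moves the origin up.
    return (a, b, c - pad_width * a, d, e, f - pad_width * e)


# 30-unit pixels with the original top-left corner at (1000, 2000)
original = (30.0, 0.0, 1000.0, 0.0, -30.0, 2000.0)
padded = shift_origin_for_padding(original, 10)
print(padded)
```

The padded array itself grows by `2 * pad_width` in each dimension, while every original pixel keeps its geographic position under the shifted transform.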