
tessl/pypi-rioxarray

Geospatial xarray extension powered by rasterio for raster data manipulation and analysis


I/O Operations

File input/output operations for reading and writing geospatial raster data. These functions provide the primary interface for loading raster files into xarray objects and saving raster data to various formats.

Capabilities

Reading Raster Files

Opens a raster file through the rasterio backend and returns it as an xarray object. Keyword arguments control lazy loading with dask (chunks, cache, lock), coordinate parsing, masking and scaling, and which variables or groups are read.

def open_rasterio(
    filename: Union[str, os.PathLike, rasterio.io.DatasetReader, rasterio.vrt.WarpedVRT],
    *,
    parse_coordinates: Optional[bool] = None,
    chunks: Optional[Union[int, tuple, dict]] = None,
    cache: Optional[bool] = None,
    lock: Optional[Any] = None,
    masked: bool = False,
    mask_and_scale: bool = False,
    variable: Optional[Union[str, list[str], tuple[str, ...]]] = None,
    group: Optional[Union[str, list[str], tuple[str, ...]]] = None,
    default_name: Optional[str] = None,
    decode_times: bool = True,
    decode_timedelta: Optional[bool] = None,
    band_as_variable: bool = False,
    **open_kwargs
) -> Union[xarray.Dataset, xarray.DataArray, list[xarray.Dataset]]:
    """
    Open a file with rasterio (experimental).

    Parameters:
    - filename: Path to file or already open rasterio dataset
    - parse_coordinates: Whether to parse x/y coordinates from transform (default: True for rectilinear)
    - chunks: Chunk sizes for dask arrays (int, tuple, dict, True, or "auto")
    - cache: Cache data in memory (default: True unless chunks specified)
    - lock: Synchronization for parallel access (True, False, or lock instance)
    - masked: Read mask and set values to NaN (default: False)
    - mask_and_scale: Apply scales/offsets and masking (default: False)
    - variable: Variable name(s) to filter loading
    - group: Group name(s) to filter loading
    - default_name: Name for data array if none exists
    - decode_times: Decode time-encoded variables (default: True)
    - decode_timedelta: Decode timedelta variables (default: same as decode_times)
    - band_as_variable: Load bands as separate variables (default: False)
    - **open_kwargs: Additional arguments passed to rasterio.open()

    Returns:
    xarray.Dataset, xarray.DataArray, or list of Datasets
    """

Usage Examples

import rioxarray

# Basic usage - open a GeoTIFF file
da = rioxarray.open_rasterio('path/to/file.tif')

# Open with chunking for large files
da = rioxarray.open_rasterio('large_file.tif', chunks={'x': 1024, 'y': 1024})

# Open without parsing coordinates for performance
da = rioxarray.open_rasterio('file.tif', parse_coordinates=False)

# Open with masking applied
da = rioxarray.open_rasterio('file.tif', masked=True)

# Open specific variables from a multi-variable file (typically returns a Dataset)
ds = rioxarray.open_rasterio('file.nc', variable=['temperature', 'precipitation'])

# Load bands as separate variables
ds = rioxarray.open_rasterio('multi_band.tif', band_as_variable=True)
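
The `masked` and `mask_and_scale` options are easier to reason about with the arithmetic written out. The following is a conceptual sketch in plain Python, not rioxarray's implementation: the helper name `mask_and_scale`, the nodata value, and the scale and offset values are all made up for illustration. It assumes the usual convention that nodata cells become NaN and the remaining values are decoded as `raw * scale_factor + add_offset`.

```python
import math


def mask_and_scale(raw_values, nodata, scale_factor=1.0, add_offset=0.0):
    """Conceptual sketch: replace nodata with NaN, then apply scale and offset."""
    decoded = []
    for raw in raw_values:
        if raw == nodata:
            decoded.append(math.nan)  # masked cell
        else:
            decoded.append(raw * scale_factor + add_offset)
    return decoded


# Hypothetical band stored as integers with nodata=-9999, scale 0.5, offset 10.
raw_band = [100, -9999, 250]
decoded_band = mask_and_scale(raw_band, nodata=-9999, scale_factor=0.5, add_offset=10.0)
print(decoded_band)  # [60.0, nan, 135.0]
```

With `masked=True` alone, only the nodata replacement step would apply; the scale and offset step corresponds to `mask_and_scale=True`.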

Writing Raster Files

Saves DataArrays and Datasets to raster file formats using the .rio.to_raster() method available on all xarray objects after importing rioxarray.

def to_raster(
    self, 
    raster_path: Union[str, os.PathLike], 
    driver: Optional[str] = None,
    dtype: Optional[Union[str, numpy.dtype]] = None,
    tags: Optional[dict] = None,
    windowed: bool = False,
    lock: Optional[Any] = None,
    compute: bool = True,
    **profile_kwargs
) -> None:
    """
    Export DataArray to raster file.

    Parameters:
    - raster_path: Output file path
    - driver: GDAL driver name (auto-detected from extension if None)
    - dtype: Output data type (uses source dtype if None)
    - tags: Metadata tags to write to file
    - windowed: Write data in windows for memory efficiency (default: False)
    - lock: Synchronization for parallel writes
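    - compute: If True, compute dask-backed data and write it immediately; if False, defer the write (default: True)
    - **profile_kwargs: Additional rasterio profile parameters

    Returns:
    None, or a dask Delayed object when the data is a dask array and compute=False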
    """

Usage Examples

import rioxarray
import xarray as xr

# Open and process data
da = rioxarray.open_rasterio('input.tif')
processed = da * 2  # Some processing

# Save to GeoTIFF
processed.rio.to_raster('output.tif')

# Save with specific driver and compression
processed.rio.to_raster(
    'output.tif', 
    driver='GTiff',
    compress='lzw',
    tiled=True
)

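# Save Dataset (multiple 2D variables); select one band from each array first
ds = xr.Dataset({
    'original': da.isel(band=0, drop=True),
    'doubled': processed.isel(band=0, drop=True),
})
ds.rio.to_raster('multi_var.tif')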

# Save with custom tags
processed.rio.to_raster(
    'tagged.tif',
    tags={'processing': 'doubled values', 'created_by': 'rioxarray'}
)
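
The `windowed` option writes the raster one block at a time instead of all at once. As a rough illustration of what "writing in windows" means, the sketch below tiles a raster of a given shape into rectangular windows. `iter_windows` is a hypothetical helper written for this example; it is not part of rioxarray or rasterio, which derive their windows from the output file's own block layout.

```python
def iter_windows(height, width, block_height, block_width):
    """Yield (row_off, col_off, rows, cols) tiles that cover a height x width raster."""
    for row_off in range(0, height, block_height):
        rows = min(block_height, height - row_off)  # last row of tiles may be shorter
        for col_off in range(0, width, block_width):
            cols = min(block_width, width - col_off)  # last column may be narrower
            yield (row_off, col_off, rows, cols)


# A tiny 5 x 7 raster split into 4 x 4 blocks.
windows = list(iter_windows(height=5, width=7, block_height=4, block_width=4))
print(windows)
# [(0, 0, 4, 4), (0, 4, 4, 3), (4, 0, 1, 4), (4, 4, 1, 3)]
```

Only one window's worth of data needs to be in memory at a time, which is where the memory saving comes from.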

Subdataset Filtering

Helper function for filtering subdatasets in complex raster files like HDF or NetCDF with multiple groups and variables.

def build_subdataset_filter(
    group_names: Optional[Union[str, list, tuple]] = None,
    variable_names: Optional[Union[str, list, tuple]] = None
):
    """
    Build regex pattern for filtering subdatasets.

    Parameters:
    - group_names: Name(s) of groups to filter by
    - variable_names: Name(s) of variables to filter by

    Returns:
    re.Pattern: Compiled regex pattern for subdataset filtering
    """

Usage Examples

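import rioxarray
# Assumption: in upstream rioxarray this helper lives in the private module
# rioxarray._io and is not exported from the top-level package.
from rioxarray._io import build_subdataset_filter

# Filter by variable name
pattern = build_subdataset_filter(group_names=None, variable_names='temperature')

# Filter by group and variable
pattern = build_subdataset_filter(
    group_names='climate_data',
    variable_names=['temp', 'precip']
)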

# Use with subdataset files
da = rioxarray.open_rasterio('file.hdf', variable='temperature')
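
To show the idea behind regex-based subdataset filtering without depending on rioxarray internals, here is a simplified stand-in built on the standard library. `simple_subdataset_filter` and the subdataset names are hypothetical, and the pattern is deliberately simpler than whatever the library actually compiles; it only assumes that GDAL-style subdataset names end in a `:` or `/` separator followed by the variable name.

```python
import re


def simple_subdataset_filter(variable_names):
    """Hypothetical stand-in: match subdataset names that end in one of the variables."""
    if isinstance(variable_names, str):
        variable_names = [variable_names]
    escaped = "|".join(re.escape(name) for name in variable_names)
    # A ':' or '/' separator followed by one of the names at the end of the string.
    return re.compile(rf".*[:/](?:{escaped})$")


# Made-up GDAL-style subdataset names for a NetCDF file.
subdatasets = [
    'NETCDF:"climate.nc":temperature',
    'NETCDF:"climate.nc":precipitation',
    'NETCDF:"climate.nc":wind_speed',
]
pattern = simple_subdataset_filter(["temperature", "precipitation"])
selected = [name for name in subdatasets if pattern.match(name)]
print(selected)
```

Escaping each name with `re.escape` keeps characters such as `.` or `+` in a variable name from being read as regex syntax.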

File Format Support

rioxarray supports any file format that rasterio can open, including:

  • GeoTIFF (.tif, .tiff) - Primary format with full feature support
  • NetCDF (.nc) - With geospatial extensions
  • HDF (.hdf, .h5) - Including subdataset access
  • JPEG2000 (.jp2) - Compressed format support
  • PNG/JPEG - With world files for georeferencing
  • GDAL Virtual Formats - VRT files, and rasterio WarpedVRT objects passed in directly
  • Cloud Optimized GeoTIFF - Optimized for cloud storage
  • Many others - Any GDAL-supported raster format

Performance Considerations

Chunking Strategy

# For large files, use appropriate chunk sizes
da = rioxarray.open_rasterio('large.tif', chunks={'x': 2048, 'y': 2048})

# Auto-chunking based on dask configuration
da = rioxarray.open_rasterio('large.tif', chunks='auto')
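
A quick way to judge whether a chunk size is "appropriate" is to work out how much memory one chunk occupies. The helper below is a hypothetical back-of-the-envelope calculator, not a rioxarray function; the 4-byte and 1-byte item sizes correspond to float32 and uint8 data.

```python
import math


def chunk_mib(chunk_shape, itemsize):
    """Return the in-memory size of one chunk in MiB."""
    return math.prod(chunk_shape) * itemsize / 2**20


# One band of a 2048 x 2048 chunk stored as float32 (4 bytes per value).
float32_chunk = chunk_mib((1, 2048, 2048), itemsize=4)
# The same chunk stored as uint8 (1 byte per value).
uint8_chunk = chunk_mib((1, 2048, 2048), itemsize=1)
print(float32_chunk, uint8_chunk)  # 16.0 4.0
```

Dimensions left out of the `chunks` dict (here, `band`) are generally not split, so a file with several bands multiplies these figures by the band count.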

Caching and Locking

# cache already defaults to False when chunks is set; passing it makes the intent explicit
da = rioxarray.open_rasterio('file.tif', chunks=True, cache=False)

# Parallel reads without a lock (use carefully); lock only matters for chunked (dask) reads
da = rioxarray.open_rasterio('file.tif', chunks=True, lock=False)
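
The reason a lock is involved at all is that a single open file handle is not safe to share between threads: one thread can move the file position while another is about to read. The standard-library sketch below illustrates that general problem with an ordinary file and `threading.Lock`. It is an analogy only and says nothing about how rioxarray or dask implement their locking.

```python
import os
import tempfile
import threading

# Write a small temporary file with four fixed-size records.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"AAAABBBBCCCCDDDD")
    path = tmp.name

handle = open(path, "rb")       # one handle shared by every thread
handle_lock = threading.Lock()  # plays the role of the `lock` argument
results = {}


def read_record(index, size=4):
    # seek + read must happen as one unit, otherwise another thread
    # could move the file position between the two calls.
    with handle_lock:
        handle.seek(index * size)
        results[index] = handle.read(size)


threads = [threading.Thread(target=read_record, args=(i,)) for i in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

handle.close()
os.remove(path)
print(results)
```

Dropping the lock is only safe when each reader works with its own file handle, which is the trade-off to keep in mind when passing `lock=False`.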

Coordinate Parsing

# Skip coordinate parsing for performance when coordinates not needed
da = rioxarray.open_rasterio('file.tif', parse_coordinates=False)
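
Parsing coordinates amounts to turning the raster's affine transform into one x value per column and one y value per row. For an unrotated (rectilinear) raster this is simple arithmetic, sketched below in plain Python. The helper `pixel_center_coords` and the origin and pixel-size values are hypothetical, and the sketch assumes coordinates refer to pixel centres.

```python
def pixel_center_coords(origin, pixel_size, count):
    """Return the centre coordinate of each pixel along one axis."""
    return [origin + (index + 0.5) * pixel_size for index in range(count)]


# Hypothetical north-up raster: upper-left corner at (100.0, 50.0), 10-unit pixels.
# The y pixel size is negative because rows run from north to south.
x_coords = pixel_center_coords(origin=100.0, pixel_size=10.0, count=3)
y_coords = pixel_center_coords(origin=50.0, pixel_size=-10.0, count=2)
print(x_coords)  # [105.0, 115.0, 125.0]
print(y_coords)  # [45.0, 35.0]
```

Skipping this step saves building one coordinate array per axis, at the cost of losing label-based selection on x and y.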
