tessl/pypi-zarr

An implementation of chunked, compressed, N-dimensional arrays for Python

Data Access

Functions for opening existing zarr arrays and groups from local, in-memory, ZIP, and remote storage backends. Each function takes a path or store object plus an access mode, so the same call pattern works regardless of where the data lives.

Capabilities

Generic Opening Functions

def open(
    store: StoreLike,
    mode: str = 'r',
    cache_attrs: bool = True,
    cache_metadata: bool = True,
    path: str = None,
    **kwargs
) -> Union[Array, Group]

Open an array or group from storage, automatically detecting the type.

Parameters:

  • store: Storage location (path, store object, or store-like mapping)
  • mode: Access mode ('r' read-only, 'r+' read-write on existing data, 'w' overwrite; the full list is under Access Modes)
  • cache_attrs: Whether to cache attributes in memory
  • cache_metadata: Whether to cache metadata in memory
  • path: Path within the store to open

Returns: Array or Group depending on what's stored at the location
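
How the type is detected is not spelled out here. As a rough sketch, assuming the standard on-disk layouts (Zarr v2 marks an array with a `.zarray` key and a group with a `.zgroup` key, while Zarr v3 uses one `zarr.json` document per node with a `node_type` field), detection over a plain mapping could look like the following. The `detect_node_type` helper and the toy `store` dict are hypothetical and are not part of zarr.

```python
import json
from collections.abc import Mapping


def detect_node_type(store: Mapping, path: str = "") -> str:
    """Return 'array' or 'group' for the node stored under `path`.

    Hypothetical helper for illustration; zarr's own detection logic differs.
    """
    clean = path.strip("/")
    prefix = f"{clean}/" if clean else ""

    # Zarr v3: one zarr.json document per node, with a node_type field.
    v3_key = prefix + "zarr.json"
    if v3_key in store:
        return json.loads(store[v3_key])["node_type"]

    # Zarr v2: the marker key itself identifies the node type.
    if prefix + ".zarray" in store:
        return "array"
    if prefix + ".zgroup" in store:
        return "group"

    raise KeyError(f"no array or group found at {path!r}")


# Toy in-memory "store": a v2 root group, a v2 array, and a v3 array.
store = {
    ".zgroup": b'{"zarr_format": 2}',
    "temps/.zarray": b'{"zarr_format": 2, "shape": [10]}',
    "v3node/zarr.json": b'{"zarr_format": 3, "node_type": "array"}',
}

root_type = detect_node_type(store)
temps_type = detect_node_type(store, "temps")
v3_type = detect_node_type(store, "v3node")
```

With the real library, the equivalent check is an `isinstance` test on the object that `zarr.open` returns.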

Array-Specific Opening

def open_array(
    store: StoreLike,
    mode: str = 'r',
    cache_attrs: bool = True,
    cache_metadata: bool = True,
    path: str = None,
    chunk_store: StoreLike = None,
    storage_options: dict = None,
    zarr_version: int = None,
    **kwargs
) -> Array

Open an existing zarr array from storage.

Parameters:

  • store: Storage location containing the array
  • mode: Access mode
  • chunk_store: Separate storage for chunk data (optional)
  • storage_options: Additional options for storage backend
  • zarr_version: Zarr format version (2 or 3)

def open_like(
    a: ArrayLike,
    path: str,
    mode: str = 'r',
    **kwargs
) -> Array

Open an array with the same properties as an existing array template.

Parameters:

  • a: Template array to copy properties from
  • path: Path to the array to open
  • mode: Access mode

Group-Specific Opening

def open_group(
    store: StoreLike = None,
    mode: str = 'r',
    cache_attrs: bool = True,
    cache_metadata: bool = True,
    synchronizer: Any = None,
    path: str = None,
    chunk_store: StoreLike = None,
    storage_options: dict = None,
    zarr_version: int = None,
    **kwargs
) -> Group

Open an existing zarr group from storage.

Parameters:

  • store: Storage location containing the group
  • mode: Access mode
  • synchronizer: Synchronization primitive for concurrent access
  • zarr_version: Zarr format version (2 or 3)

Consolidated Metadata Access

def open_consolidated(
    store: StoreLike,
    mode: str = 'r',
    cache_attrs: bool = True,
    cache_metadata: bool = True,
    use_consolidated: bool = True,
    **kwargs
) -> Group

Open a group that has consolidated metadata for improved performance.

Parameters:

  • store: Storage location with consolidated metadata
  • use_consolidated: Whether to use consolidated metadata (must be True)

Consolidated metadata stores all child array/group metadata in the parent group's metadata, reducing the number of storage operations needed to access nested structures.
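
To make the saving concrete, here is a minimal sketch that counts store reads with and without consolidation. It uses a plain dict as the store and a simplified metadata layout; `CountingStore`, `consolidate`, `read_all_metadata`, and the exact JSON structure are assumptions for illustration, not zarr's implementation.

```python
import json


class CountingStore(dict):
    """Dict-backed store that counts reads, standing in for a slow backend."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.reads = 0

    def __getitem__(self, key):
        self.reads += 1
        return super().__getitem__(key)


def consolidate(store, children):
    """Copy each child's metadata document into the root metadata document."""
    root = json.loads(store["zarr.json"])
    root["consolidated_metadata"] = {
        "metadata": {name: json.loads(store[f"{name}/zarr.json"]) for name in children}
    }
    store["zarr.json"] = json.dumps(root)


def read_all_metadata(store, children):
    """Return {child: metadata}, using consolidated metadata when present."""
    root = json.loads(store["zarr.json"])
    consolidated = root.get("consolidated_metadata")
    if consolidated is not None:
        return consolidated["metadata"]  # the single root read covers every child
    return {name: json.loads(store[f"{name}/zarr.json"]) for name in children}


children = ["a", "b", "c"]
store = CountingStore({"zarr.json": json.dumps({"node_type": "group"})})
for name in children:
    store[f"{name}/zarr.json"] = json.dumps({"node_type": "array", "shape": [4]})

store.reads = 0
plain = read_all_metadata(store, children)
reads_without = store.reads  # one root read plus one read per child

consolidate(store, children)
store.reads = 0
merged = read_all_metadata(store, children)
reads_with = store.reads  # one root read only
```

On a local disk the difference is small; on object storage, where each read is a network round trip, it grows with the number of child nodes.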

Type Definitions

StoreLike = Union[str, os.PathLike, Store, MutableMapping]
ArrayLike = Union[np.ndarray, Array, list, tuple]

Access Modes

  • 'r': Read-only access (default)
  • 'r+': Read-write access to existing arrays/groups
  • 'w': Write mode (overwrites existing data)
  • 'w-': Write mode, fails if array/group exists
  • 'a': Append mode (read-write; creates the array/group if it does not exist)
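
The modes differ mainly in how they treat a target that already exists or is missing. The sketch below encodes the list above as a small decision function; `resolve_mode` is a hypothetical helper that models the documented behavior and is not part of zarr, and the exception types it raises are illustrative choices.

```python
def resolve_mode(mode: str, exists: bool) -> str:
    """Describe what opening with `mode` does, given whether the node exists.

    Hypothetical helper: it models the mode list above, not zarr's internals.
    """
    if mode not in {"r", "r+", "w", "w-", "a"}:
        raise ValueError(f"unknown mode: {mode!r}")

    if mode in {"r", "r+"}:
        # Both modes require existing data and never create anything.
        if not exists:
            raise FileNotFoundError("nothing to open")
        return "open read-only" if mode == "r" else "open read-write"

    if mode == "w":
        # Always starts fresh, replacing whatever was there.
        return "overwrite" if exists else "create"

    if mode == "w-":
        # Creates, but refuses to clobber existing data.
        if exists:
            raise FileExistsError("node already exists")
        return "create"

    # mode == "a": reuse existing data, otherwise create it.
    return "open read-write" if exists else "create"


summary = {
    (mode, exists): resolve_mode(mode, exists)
    for mode in ("w", "a")
    for exists in (True, False)
}
```

A practical takeaway: prefer 'r' when only reading, and reach for 'w' only when losing the existing contents is acceptable.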

Usage Examples

Basic Data Access

import zarr

# Open array or group (auto-detect type)
data = zarr.open('data.zarr')

# Open specific array
arr = zarr.open_array('temperature_data.zarr')

# Open specific group  
grp = zarr.open_group('experiment_results.zarr')

# Open with write access
writable_arr = zarr.open_array('data.zarr', mode='r+')

Working with Different Storage Backends

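import zarr
from zarr.storage import LocalStore, MemoryStore, ZipStore

# Open from local filesystem
arr = zarr.open_array(LocalStore('path/to/data'))

# Open from ZIP file (read-only)
arr = zarr.open_array(ZipStore('data.zip', mode='r'))

# Open from memory store: a new MemoryStore is empty, so create the
# array first (mode='w'), then reopen it from the same store
store = MemoryStore()
zarr.open_array(store, mode='w', shape=(100,), chunks=(10,), dtype='f8')
arr = zarr.open_array(store, mode='r+')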

Cloud Storage Access

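# Open from S3-compatible storage using fsspec (requires s3fs installed)
import zarr
from zarr.storage import FsspecStore

# Build the store from a URL; the FsspecStore constructor itself expects
# a filesystem object, so use the from_url classmethod for URLs
store = FsspecStore.from_url('s3://bucket/path/to/data.zarr', read_only=True)
arr = zarr.open_array(store)

# Direct URL (also requires s3fs installed)
arr = zarr.open('s3://bucket/path/to/data.zarr')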

Performance Optimization

# Open with consolidated metadata for faster access
grp = zarr.open_consolidated('large_dataset.zarr')

# Disable metadata caching for memory-constrained environments
arr = zarr.open_array('data.zarr', cache_metadata=False)

# Use separate chunk store for performance
main_store = zarr.storage.LocalStore('metadata/')
chunk_store = zarr.storage.LocalStore('chunks/')
arr = zarr.open_array(main_store, chunk_store=chunk_store)

Template-Based Opening

# Open array with same properties as template
template = zarr.open_array('template.zarr')
new_arr = zarr.open_like(template, 'similar_data.zarr')

# Properties like dtype, chunks, compression are inherited from template

Error Handling

import zarr
from zarr.errors import ArrayNotFoundError, GroupNotFoundError

try:
    arr = zarr.open_array('nonexistent.zarr')
except ArrayNotFoundError:
    print("Array not found")

try:
    grp = zarr.open_group('missing_group.zarr') 
except GroupNotFoundError:
    print("Group not found")
