tessl/pypi-zarr

An implementation of chunked, compressed, N-dimensional arrays for Python

Workspace: tessl
Visibility: Public
Describes: pypipkg:pypi/zarr@3.1.x

To install, run

npx @tessl/cli install tessl/pypi-zarr@3.1.0

# Zarr

Zarr is a Python library that implements chunked, compressed, N-dimensional arrays designed for parallel computing and large-scale data storage. It can create N-dimensional arrays with any NumPy dtype, chunk arrays along any dimension, compress and filter chunks using any NumCodecs codec, and store arrays in a variety of backends, including memory, disk, zip files, and cloud storage such as S3.

Zarr supports concurrent reads and writes from multiple threads or processes, and it organizes arrays hierarchically through groups. It is particularly useful for scientific computing, data analysis, and other applications that need efficient storage of and access to large multidimensional datasets.

## Package Information

- **Package Name**: zarr
- **Language**: Python
- **Installation**: `pip install zarr`
- **Version**: 3.1.2
- **Python Requirements**: >=3.11

## Core Imports

```python
import zarr
```

Common imports for array operations:

```python
from zarr import Array, Group
from zarr import open, create, save, load
```

## Basic Usage

```python
import zarr
import numpy as np

# Create a zarr array from a NumPy array
data = np.random.random((1000, 1000))
z = zarr.array(data, chunks=(100, 100))

# Create an array directly
z = zarr.zeros((10, 10), chunks=(5, 5), dtype='float64')

# Store and retrieve data
z[:5, :5] = 1.0
print(z[:5, :5])

# Save the array's contents (read out as a NumPy array) to storage
zarr.save('data.zarr', z[:])

# Load from storage
loaded = zarr.load('data.zarr')

# Create a group with multiple arrays (dtype is required when no data is given)
grp = zarr.group()
grp.create_array('temperature', shape=(365, 100, 100), chunks=(1, 50, 50), dtype='float64')
grp.create_array('humidity', shape=(365, 100, 100), chunks=(1, 50, 50), dtype='float64')
```

## Architecture

Zarr follows a hierarchical data model with several key components:

- **Arrays**: N-dimensional chunked arrays with compression and filtering capabilities
- **Groups**: Hierarchical containers for organizing arrays and sub-groups
- **Stores**: Storage backends (memory, filesystem, cloud, etc.) that persist array data and metadata
- **Codecs**: Compression and encoding algorithms for optimizing storage and I/O
- **Chunks**: Fixed-size blocks that arrays are divided into for parallel processing

This architecture enables efficient storage and retrieval of large datasets while supporting concurrent access patterns essential for high-performance computing and cloud-native applications.

## Capabilities

### Array Creation and Initialization

Functions for creating zarr arrays with various initialization patterns. These provide the primary entry points for creating new arrays with different fill patterns and from existing data sources.

```python { .api }
def array(data, **kwargs) -> Array: ...
def create(shape, **kwargs) -> Array: ...
def empty(shape, **kwargs) -> Array: ...
def zeros(shape, **kwargs) -> Array: ...
def ones(shape, **kwargs) -> Array: ...
def full(shape, fill_value, **kwargs) -> Array: ...
def from_array(store, *, data, **kwargs) -> Array: ...
```

[Array Creation](./array-creation.md)

### Array and Group Access

Functions for opening and accessing existing zarr arrays and groups from various storage backends. These functions provide flexible ways to load existing data structures.

```python { .api }
def open(store, **kwargs) -> Array | Group: ...
def open_array(store, **kwargs) -> Array: ...
def open_group(store, **kwargs) -> Group: ...
def open_consolidated(store, **kwargs) -> Group: ...
def open_like(a, path, **kwargs) -> Array: ...
```

[Data Access](./data-access.md)

### Data I/O Operations

High-level functions for saving and loading zarr data structures to and from storage. These provide convenient interfaces for persistence operations.

```python { .api }
def save(file, *args, **kwargs) -> None: ...
def save_array(store, arr, **kwargs) -> None: ...
def save_group(store, **kwargs) -> None: ...
def load(store, **kwargs) -> Any: ...
```

[Data I/O](./data-io.md)

### Group Management

Functions for creating and managing hierarchical group structures. Groups provide organizational capabilities for complex datasets with multiple related arrays.

```python { .api }
def group(store=None, **kwargs) -> Group: ...
def create_group(store, **kwargs) -> Group: ...
def create_hierarchy(path, **kwargs) -> None: ...
```

[Group Management](./group-management.md)

### Core Classes

The fundamental array and group classes that form the core of zarr's object-oriented interface. These classes provide comprehensive functionality for array manipulation and hierarchical data organization.

```python { .api }
class Array:
    shape: tuple[int, ...]
    dtype: np.dtype
    chunks: tuple[int, ...]
    attrs: dict
    def __getitem__(self, selection): ...
    def __setitem__(self, selection, value): ...
    def resize(self, *args): ...

class Group:
    attrs: dict
    def create_array(self, name, **kwargs) -> Array: ...
    def create_group(self, name, **kwargs) -> Group: ...
    def __getitem__(self, key): ...
    def __setitem__(self, key, value): ...
```

[Core Classes](./core-classes.md)

### Storage Backends

Storage backend classes for persisting zarr data across different storage systems. These provide the flexibility to use zarr with various storage infrastructures.

```python { .api }
class MemoryStore: ...
class LocalStore: ...
class ZipStore: ...
class FsspecStore: ...
class ObjectStore: ...
```

[Storage Backends](./storage-backends.md)

### Compression and Codecs

Codec classes for data compression, transformation, and encoding. These enable efficient storage through various compression algorithms and data transformations.

```python { .api }
class BloscCodec: ...
class GzipCodec: ...
class ZstdCodec: ...
class BytesCodec: ...
class TransposeCodec: ...
class ShardingCodec: ...
```

[Codecs](./codecs.md)

### Configuration and Utilities

Configuration system and utility functions for zarr settings, metadata management, and debugging operations.

```python { .api }
config: Config
def consolidate_metadata(store, **kwargs) -> Group: ...
def copy(source, dest, **kwargs) -> tuple[int, int, int]: ...
def tree(grp, **kwargs) -> Any: ...
def print_debug_info() -> None: ...
```

[Configuration](./configuration.md)