# ijson

An iterative JSON parser with standard Python iterator interfaces. ijson allows you to process large JSON data streams without loading entire documents into memory, making it ideal for handling massive JSON files, streaming APIs, and memory-constrained environments.

## Package Information

- **Package Name**: ijson
- **Language**: Python
- **Installation**: `pip install ijson`
- **Version**: 3.4.0

## Core Imports

```python
import ijson
```

For specific parsing functions:

```python
from ijson import parse, items, kvitems, basic_parse
```

For exceptions and utilities:

```python
from ijson.common import JSONError, IncompleteJSONError, ObjectBuilder
from ijson.utils import coroutine, sendable_list
from ijson import __version__
```

## Basic Usage

```python
import ijson

# Parse a JSON file iteratively (open it in binary mode)
with open('large_file.json', 'rb') as file:
    # Extract all items from an array under 'data'
    objects = ijson.items(file, 'data.item')
    for obj in objects:
        print(obj)

# Parse JSON data held in memory (bytes are preferred over str)
json_data = b'{"users": [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]}'
users = ijson.items(json_data, 'users.item')
for user in users:
    print(f"Name: {user['name']}, Age: {user['age']}")

# Get key-value pairs from JSON objects
json_data = b'{"config": {"debug": true, "timeout": 30, "retries": 3}}'
config_items = ijson.kvitems(json_data, 'config')
for key, value in config_items:
    print(f"{key}: {value}")
```

## Architecture

ijson uses a multi-backend architecture for optimal performance across different environments:

- **Backend System**: Automatically selects the fastest available backend (yajl2_c, yajl2_cffi, yajl2, yajl, python)
- **Event-Driven Parsing**: Low-level events bubble up through coroutine pipelines to higher-level interfaces
- **Coroutine Pipeline**: Modular design allows chaining of parsing, filtering, and transformation coroutines
- **Memory Efficiency**: Streaming approach processes JSON incrementally without loading full documents

The library provides multiple parsing levels from low-level events to high-level Python objects, supporting both synchronous and asynchronous operation modes.

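To make the event-driven, coroutine-based design more concrete, here is a toy pipeline written in plain Python. It is only a sketch of the idea and not ijson's actual implementation: the `coroutine` decorator, `add_prefix`, and `collect` are hypothetical names defined here for illustration. One stage receives low-level `(event, value)` pairs and forwards `(prefix, event, value)` tuples to the next stage.

```python
def coroutine(func):
    """Prime a generator-based coroutine so it is ready to receive send()."""
    def start(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)
        return gen
    return start


@coroutine
def add_prefix(target):
    """Toy stage: turn (event, value) into (prefix, event, value)."""
    path = []
    while True:
        event, value = yield
        if event == 'map_key':
            prefix = '.'.join(path[:-1])
            path[-1] = value          # remember the key we are now inside
        elif event == 'start_map':
            prefix = '.'.join(path)
            path.append(None)         # placeholder until the first key arrives
        elif event == 'start_array':
            prefix = '.'.join(path)
            path.append('item')       # array elements live under "<prefix>.item"
        elif event in ('end_map', 'end_array'):
            path.pop()
            prefix = '.'.join(path)
        else:
            prefix = '.'.join(path)   # scalar value
        target.send((prefix, event, value))


@coroutine
def collect(results):
    """Toy sink: append everything it receives to a list."""
    while True:
        results.append((yield))


# Hand-written low-level events for the document {"data": [1, 2]}
basic_events = [
    ('start_map', None),
    ('map_key', 'data'),
    ('start_array', None),
    ('number', 1),
    ('number', 2),
    ('end_array', None),
    ('end_map', None),
]

results = []
pipeline = add_prefix(collect(results))
for basic_event in basic_events:
    pipeline.send(basic_event)
```

After the loop, `results` holds one prefixed tuple per input event, for example `('data.item', 'number', 1)` for the first array element. Further stages, such as one that builds objects under a chosen prefix, could be chained in the same way.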

## Capabilities


### High-Level Parsing

Core parsing functions that yield Python objects and key-value pairs from JSON streams. These functions cover the most common use cases for processing JSON data without loading the whole document into memory.

```python { .api }
def items(source, prefix, map_type=None, buf_size=64*1024, **config):
    """Yield complete Python objects found under specified prefix."""

def kvitems(source, prefix, map_type=None, buf_size=64*1024, **config):
    """Yield (key, value) pairs from JSON objects under prefix."""

def parse(source, buf_size=64*1024, **config):
    """Yield (prefix, event, value) tuples with path context."""

def basic_parse(source, buf_size=64*1024, **config):
    """Yield low-level (event, value) parsing events."""
```

[High-Level Parsing](./high-level-parsing.md)


### Asynchronous Processing

Async variants of all parsing functions for use with asyncio and async file objects. Enables non-blocking JSON processing in concurrent applications.

```python { .api }
async def items_async(source, prefix, map_type=None, buf_size=64*1024, **config):
    """Async version of items() for async file objects."""

async def kvitems_async(source, prefix, map_type=None, buf_size=64*1024, **config):
    """Async version of kvitems() for async file objects."""

async def parse_async(source, buf_size=64*1024, **config):
    """Async version of parse() for async file objects."""

async def basic_parse_async(source, buf_size=64*1024, **config):
    """Async version of basic_parse() for async file objects."""
```

[Asynchronous Processing](./async-processing.md)


### Low-Level Coroutines

Coroutine-based parsing pipeline components for building custom JSON processing workflows. These provide maximum flexibility for advanced use cases.

```python { .api }
def basic_parse_coro(target, **config):
    """Coroutine for low-level parsing events."""

def parse_coro(target, **config):
    """Coroutine for parsing with path context."""

def items_coro(target, prefix, map_type=None, **config):
    """Coroutine for extracting objects under prefix."""

def kvitems_coro(target, prefix, map_type=None, **config):
    """Coroutine for extracting key-value pairs under prefix."""
```

[Low-Level Coroutines](./coroutines.md)


### Backend Management

Backend selection and configuration utilities for optimizing performance based on available libraries and specific requirements.

```python { .api }
def get_backend(backend):
    """Import and return specified backend module."""

ALL_BACKENDS: tuple  # All available backends in speed order
backend: object      # Currently selected backend instance
backend_name: str    # Name of current backend
```

[Backend Management](./backends.md)


## Types

```python { .api }
class JSONError(Exception):
    """Base exception for all parsing errors."""

class IncompleteJSONError(JSONError):
    """Raised when parser can't read expected data from stream."""

class ObjectBuilder:
    """Incrementally builds objects from JSON parser events."""
    def __init__(self, map_type=None): ...
    def event(self, event, value): ...
    value: object  # The object being built

__version__: str  # Package version string (e.g., "3.4.0")
```


## Configuration Options

Keyword arguments accepted by the parsing functions that affect parsing behavior:

- **buf_size** (int): Buffer size for reading data (default: 64*1024)
- **multiple_values** (bool): Allow multiple top-level JSON values
- **use_float** (bool): Use float instead of Decimal for numbers (backend-dependent)
- **map_type** (type): Custom mapping type for JSON objects (default: dict)


## Prefix Syntax

Path expressions for targeting specific parts of JSON documents:

- **Root level**: `""` (empty string)
- **Object properties**: `"property"`
- **Nested properties**: `"parent.child"`
- **Array items**: `"array.item"`
- **Complex paths**: `"data.users.item.address.street"`

The prefix system enables precise extraction of data from deeply nested JSON structures without building Python objects for the rest of the document.


## Command-Line Utility

ijson includes a command-line utility for dumping JSON parsing events:

```python { .api }
def dump():
    """Command-line utility entry point for dumping ijson events."""
```

**Usage:**

```bash
# Basic event dumping
python -c "from ijson.dump import dump; dump()" < data.json

# Parse with specific method and prefix
python -c "from ijson.dump import dump; import sys; sys.argv=['dump', '-m', 'items', '-p', 'data.item']; dump()" < data.json
```

The utility supports:

- **Methods**: `basic_parse`, `parse`, `items`, `kvitems`
- **Prefix filtering**: For `items` and `kvitems` methods
- **Multiple values**: Support for multiple top-level JSON values