# Indexed Reading (Node.js)

File-based indexed reading provides random access to large CAR files. `CarIndexedReader` pre-indexes a CAR file and keeps a file descriptor open so that blocks can be retrieved efficiently by CID. **This functionality is only available in Node.js.**

## Capabilities

### CarIndexedReader Class

Provides memory-efficient random access to large CAR files through pre-built indices.

```typescript { .api }
/**
 * File-based CAR reader with pre-built index for random access
 * Maintains open file descriptor and in-memory CID-to-location mapping
 * Significantly more memory efficient than CarReader for large files
 * Node.js only - not available in browser environments
 */
class CarIndexedReader {
  /** CAR version number (1 or 2) */
  readonly version: number;

  /** Get the list of root CIDs from the CAR header */
  getRoots(): Promise<CID[]>;

  /** Check whether a given CID exists within the CAR */
  has(key: CID): Promise<boolean>;

  /** Fetch a Block from the CAR by CID, returns undefined if not found */
  get(key: CID): Promise<Block | undefined>;

  /** Returns async iterator over all blocks in the CAR */
  blocks(): AsyncGenerator<Block>;

  /** Returns async iterator over all CIDs in the CAR */
  cids(): AsyncGenerator<CID>;

  /** Close the underlying file descriptor - must be called for cleanup */
  close(): Promise<void>;

  /** Create indexed reader from file path, builds complete index in memory */
  static fromFile(path: string): Promise<CarIndexedReader>;
}
```

**Usage Examples:**

```typescript
import { CarIndexedReader } from "@ipld/car/indexed-reader";

// Create indexed reader from file
const reader = await CarIndexedReader.fromFile('large-archive.car');

// Random access to blocks (very efficient)
const roots = await reader.getRoots();
for (const root of roots) {
  if (await reader.has(root)) {
    const block = await reader.get(root);
    console.log(`Root block ${root}: ${block.bytes.length} bytes`);
  }
}

// Iterate through all blocks (uses index for efficient access)
for await (const block of reader.blocks()) {
  console.log(`Block ${block.cid}: ${block.bytes.length} bytes`);
}

// Important: Always close when done
await reader.close();
```

### Efficient Random Access Patterns

Optimize random access patterns for large CAR files.

```typescript
import { CarIndexedReader } from "@ipld/car/indexed-reader";

// Pattern 1: Bulk CID lookups
const reader = await CarIndexedReader.fromFile('massive-archive.car');
// cid1..cid5 are placeholders for CID instances obtained elsewhere
const targetCids = [cid1, cid2, cid3, cid4, cid5];

// Efficient bulk checking
const existingCids = [];
for (const cid of targetCids) {
  if (await reader.has(cid)) {
    existingCids.push(cid);
  }
}

// Bulk retrieval
const blocks = new Map();
for (const cid of existingCids) {
  const block = await reader.get(cid);
  blocks.set(cid.toString(), block);
}

await reader.close();
console.log(`Retrieved ${blocks.size} of ${targetCids.length} requested blocks`);
```
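
Because `get()` resolves to `undefined` for a missing CID, the separate `has()` pass can be folded into the retrieval loop so each CID is looked up once. The sketch below relies only on the documented `get()` signature; the `BlockSource` interface and the `getMany` helper are hypothetical names introduced here, and the generic key type stands in for `CID`.

```typescript
/** Minimal shape of a reader: anything with the documented get() signature. */
interface BlockSource<K, B> {
  get(key: K): Promise<B | undefined>;
}

/**
 * Hypothetical helper: fetch many blocks in one pass, skipping keys the
 * source does not contain. Results are keyed by each key's string form.
 */
async function getMany<K extends { toString(): string }, B>(
  source: BlockSource<K, B>,
  keys: Iterable<K>
): Promise<Map<string, B>> {
  const found = new Map<string, B>();
  for (const key of keys) {
    const block = await source.get(key); // undefined means "not in this archive"
    if (block !== undefined) {
      found.set(key.toString(), block);
    }
  }
  return found;
}
```

With a `CarIndexedReader`, this would be called as `await getMany(reader, targetCids)`; the size of the returned map then gives the number of blocks actually found.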

### Index-Based Processing

Use the pre-built index for efficient processing patterns.

```typescript
import { CarIndexedReader } from "@ipld/car/indexed-reader";

// Pattern 2: Selective processing with CID-first approach
// isInterestingCid and processBlock are application-defined placeholders
const reader = await CarIndexedReader.fromFile('data.car');

// First, identify all CIDs of interest
const targetCids = [];
for await (const cid of reader.cids()) {
  if (isInterestingCid(cid)) {
    targetCids.push(cid);
  }
}

// Then efficiently retrieve only the blocks we need
for (const cid of targetCids) {
  const block = await reader.get(cid);
  await processBlock(block);
}

await reader.close();
```

### Memory vs. CarReader Comparison

Compare memory usage between CarReader and CarIndexedReader.

```typescript
import { CarReader } from "@ipld/car/reader";
import { CarIndexedReader } from "@ipld/car/indexed-reader";
import fs from 'fs';

// CarReader: Loads entire CAR into memory
const carBytes = fs.readFileSync('large-archive.car'); // Full file in memory
const memoryReader = await CarReader.fromBytes(carBytes); // All blocks in memory

// CarIndexedReader: Only index in memory, blocks loaded on demand
const indexedReader = await CarIndexedReader.fromFile('large-archive.car'); // Only index in memory

// Both provide same interface, but very different memory usage
const block1 = await memoryReader.get(someCid); // Retrieved from memory
const block2 = await indexedReader.get(someCid); // Read from disk on demand

// Cleanup
await indexedReader.close(); // CarReader has no cleanup needed
```

### Integration with File Processing

Combine indexed reading with file operations.

```typescript
import { CarIndexedReader } from "@ipld/car/indexed-reader";

// Process multiple CAR files efficiently
const carFiles = ['archive1.car', 'archive2.car', 'archive3.car'];
const allBlocks = new Map();

for (const filePath of carFiles) {
  const reader = await CarIndexedReader.fromFile(filePath);

  try {
    // Collect all blocks from this archive
    for await (const block of reader.blocks()) {
      allBlocks.set(block.cid.toString(), {
        block,
        source: filePath
      });
    }
  } finally {
    await reader.close(); // Always close, even on errors
  }
}

console.log(`Collected ${allBlocks.size} blocks from ${carFiles.length} archives`);
```

### Resource Management

Proper resource management with file descriptors.

```typescript
import { CarIndexedReader } from "@ipld/car/indexed-reader";

// Pattern: Using try/finally for cleanup
let reader;
try {
  reader = await CarIndexedReader.fromFile('archive.car');

  // Do work with reader
  const roots = await reader.getRoots();
  for (const root of roots) {
    const block = await reader.get(root);
    await processBlock(block);
  }

} finally {
  // Always cleanup file descriptor
  if (reader) {
    await reader.close();
  }
}

// Pattern: Using async/await with explicit cleanup
async function processCarFile(filePath) {
  const reader = await CarIndexedReader.fromFile(filePath);

  try {
    // Process file
    return await doWorkWithReader(reader);
  } finally {
    await reader.close();
  }
}
```
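
The open/try/finally shape shown in both patterns can be factored into a small reusable helper. This is a sketch rather than part of `@ipld/car`: `Closable` and `withResource` are hypothetical names, and the helper assumes only that the resource exposes the documented `close(): Promise<void>` method.

```typescript
/** Anything that must be released with an async close(). */
interface Closable {
  close(): Promise<void>;
}

/**
 * Hypothetical helper: open a resource, run `work` with it, and close it
 * afterwards whether `work` resolved or threw.
 */
async function withResource<R extends Closable, T>(
  open: () => Promise<R>,
  work: (resource: R) => Promise<T>
): Promise<T> {
  const resource = await open(); // if opening fails there is nothing to close
  try {
    return await work(resource);
  } finally {
    await resource.close(); // runs on success and on error
  }
}
```

With the reader documented here, a call might look like `await withResource(() => CarIndexedReader.fromFile('archive.car'), (reader) => reader.getRoots())`. Keeping the open call outside the `try` means a failed open never reaches `close()`.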

### Error Handling

Handle errors specific to file-based operations.

```typescript
import { CarIndexedReader } from "@ipld/car/indexed-reader";

// File access errors
try {
  const reader = await CarIndexedReader.fromFile('nonexistent.car');
} catch (error) {
  if (error.code === 'ENOENT') {
    console.log('CAR file not found');
  } else if (error.code === 'EACCES') {
    console.log('Permission denied accessing CAR file');
  }
}

// Invalid file format errors
try {
  const reader = await CarIndexedReader.fromFile('invalid.car');
} catch (error) {
  if (error.message.includes('Invalid CAR')) {
    console.log('File is not a valid CAR archive');
  }
}

// File descriptor errors during operation
let reader;
try {
  reader = await CarIndexedReader.fromFile('archive.car');
  const block = await reader.get(someCid);
} catch (error) {
  if (error.code === 'EBADF') {
    console.log('File descriptor error - file may have been closed');
  }
} finally {
  if (reader) {
    try {
      await reader.close();
    } catch (closeError) {
      console.log('Error closing reader:', closeError.message);
    }
  }
}
```

## Performance Considerations

### Memory Usage

- **Index Size**: Uses memory for CID-to-location map (typically much smaller than full file)
- **Block Loading**: Loads blocks on demand, not all at once
- **Large Files**: Can handle CAR files larger than available memory
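
To make this trade-off concrete, here is a toy sketch of the general idea behind an offset index: only a small key-to-location map lives in memory, and bytes are read from an open file handle on demand. It is not the `@ipld/car` implementation. The `ToyIndexedFile` class, its string keys (standing in for CIDs), and the flat record layout are all assumptions made for illustration, and no CAR parsing is shown.

```typescript
import { open } from "node:fs/promises";
import type { FileHandle } from "node:fs/promises";

/** Where one record lives inside the file. */
interface RecordLocation {
  offset: number;
  length: number;
}

/** Toy illustration only: a caller-supplied index over a flat file of records. */
class ToyIndexedFile {
  private readonly handle: FileHandle;
  private readonly index: Map<string, RecordLocation>;

  private constructor(handle: FileHandle, index: Map<string, RecordLocation>) {
    this.handle = handle;
    this.index = index;
  }

  /** Open the file read-only and keep just the index in memory. */
  static async open(path: string, index: Map<string, RecordLocation>): Promise<ToyIndexedFile> {
    return new ToyIndexedFile(await open(path, "r"), index);
  }

  /** Answered from the in-memory index; no disk access. */
  has(key: string): boolean {
    return this.index.has(key);
  }

  /** Read one record from disk on demand; undefined if the key is unknown. */
  async get(key: string): Promise<Uint8Array | undefined> {
    const loc = this.index.get(key);
    if (!loc) return undefined;
    const buf = new Uint8Array(loc.length);
    await this.handle.read(buf, 0, loc.length, loc.offset);
    return buf;
  }

  /** Release the file descriptor. */
  close(): Promise<void> {
    return this.handle.close();
  }
}
```

In this sketch, memory use grows with the number of index entries rather than with the size of the file, which is the property the list above describes.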

### Access Patterns

- **Random Access**: Extremely efficient for CID-based lookups
- **Sequential Access**: Less efficient than streaming iterators
- **Mixed Access**: Good balance for applications needing both patterns

### File System Considerations

- **File Descriptor Limits**: Each CarIndexedReader uses one file descriptor
- **Concurrent Access**: Multiple readers can access same file simultaneously
- **File Locking**: No file locking - safe for read-only access
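
Since each reader holds one descriptor, opening many archives at once can run into the per-process descriptor limit. One way to stay under it is to cap how many files are processed concurrently. The `mapWithLimit` helper below is a hypothetical sketch built on standard promises only; it is not part of `@ipld/car`.

```typescript
/**
 * Hypothetical helper: run `task` over `items` with at most `limit` tasks
 * in flight at any time. Results are returned in input order.
 */
async function mapWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  task: (item: T) => Promise<R>
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;

  // Each worker repeatedly claims the next unprocessed index.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++; // claimed synchronously, so no two workers share an index
      results[i] = await task(items[i]);
    }
  }

  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

If `task` opens one reader and closes it in a `finally` block, then at most `limit` descriptors are open at a time. If any task rejects, the returned promise rejects with that error while tasks already in flight run to completion.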

### Use Cases

- **Large CAR Analysis**: Process CAR files too large for memory
- **Block Servers**: Serve blocks by CID from large archives
- **Data Mining**: Random access patterns over archived data
- **Content Verification**: Validate specific blocks without full loading
- **Selective Extraction**: Extract specific blocks from large archives