Tessl Tile for npm/@ipld/car@5.4.0

# Indexed Reading (Node.js)

File-based indexed reading with random access capabilities for large CAR files. CarIndexedReader pre-indexes CAR files and maintains an open file descriptor for efficient block retrieval by CID. **This functionality is only available in Node.js.**

## Capabilities

### CarIndexedReader Class

Provides memory-efficient random access to large CAR files through pre-built indices.

```typescript { .api }
/**
 * File-based CAR reader with pre-built index for random access
 * Maintains open file descriptor and in-memory CID-to-location mapping
 * Significantly more memory efficient than CarReader for large files
 * Node.js only - not available in browser environments
 */
class CarIndexedReader {
  /** CAR version number (1 or 2) */
  readonly version: number;

  /** Get the list of root CIDs from the CAR header */
  getRoots(): Promise<CID[]>;

  /** Check whether a given CID exists within the CAR */
  has(key: CID): Promise<boolean>;

  /** Fetch a Block from the CAR by CID, returns undefined if not found */
  get(key: CID): Promise<Block | undefined>;

  /** Returns async iterator over all blocks in the CAR */
  blocks(): AsyncGenerator<Block>;

  /** Returns async iterator over all CIDs in the CAR */
  cids(): AsyncGenerator<CID>;

  /** Close the underlying file descriptor - must be called for cleanup */
  close(): Promise<void>;

  /** Create indexed reader from file path, builds complete index in memory */
  static fromFile(path: string): Promise<CarIndexedReader>;
}
```

**Usage Examples:**

```typescript
import { CarIndexedReader } from "@ipld/car/indexed-reader";

// Create indexed reader from file
const reader = await CarIndexedReader.fromFile('large-archive.car');

// Random access to blocks by CID
const roots = await reader.getRoots();
for (const root of roots) {
  // get() resolves to undefined when the CID is not in the CAR,
  // so check the result before using it
  const block = await reader.get(root);
  if (block) {
    console.log(`Root block ${root}: ${block.bytes.length} bytes`);
  }
}

// Iterate through all blocks (uses index for efficient access)
for await (const block of reader.blocks()) {
  console.log(`Block ${block.cid}: ${block.bytes.length} bytes`);
}

// Important: Always close when done
await reader.close();
```

### Efficient Random Access Patterns

Look up and retrieve many blocks by CID from a large CAR file.

```typescript
import { CarIndexedReader } from "@ipld/car/indexed-reader";

// Pattern 1: Bulk CID lookups
const reader = await CarIndexedReader.fromFile('massive-archive.car');
// cid1..cid5 are placeholders for CID instances obtained elsewhere
const targetCids = [cid1, cid2, cid3, cid4, cid5];

// Efficient bulk checking
const existingCids = [];
for (const cid of targetCids) {
  if (await reader.has(cid)) {
    existingCids.push(cid);
  }
}

// Bulk retrieval
const blocks = new Map();
for (const cid of existingCids) {
  const block = await reader.get(cid);
  blocks.set(cid.toString(), block);
}

await reader.close();
console.log(`Retrieved ${blocks.size} of ${targetCids.length} requested blocks`);
```
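
Because `get()` resolves to `undefined` for a missing CID, the separate `has()` pass can be folded into the retrieval loop. The following is a minimal sketch of that idea, not part of the library: `getMany` and the `BlockSource` interface are hypothetical names, written structurally so the snippet runs on its own. A `CarIndexedReader` should fit the `BlockSource` shape, with `CID` as the key type.

```typescript
// Hypothetical structural type: anything with an async get() that
// resolves to undefined for unknown keys.
interface BlockSource<K, B> {
  get(key: K): Promise<B | undefined>;
}

// Hypothetical helper: fetch each key once and sort the results into
// found blocks (keyed by the key's string form) and missing keys.
async function getMany<K extends { toString(): string }, B>(
  source: BlockSource<K, B>,
  keys: K[]
): Promise<{ found: Map<string, B>; missing: string[] }> {
  const found = new Map<string, B>();
  const missing: string[] = [];
  for (const key of keys) {
    const block = await source.get(key);
    if (block !== undefined) {
      found.set(key.toString(), block);
    } else {
      missing.push(key.toString());
    }
  }
  return { found, missing };
}
```

This halves the number of index lookups compared with calling `has()` and then `get()`, and it reports which CIDs were absent instead of silently skipping them.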

### Index-Based Processing

Use the pre-built index for efficient processing patterns.

```typescript
import { CarIndexedReader } from "@ipld/car/indexed-reader";

// Pattern 2: Selective processing with CID-first approach
const reader = await CarIndexedReader.fromFile('data.car');

// First, identify all CIDs of interest
const targetCids = [];
for await (const cid of reader.cids()) {
  if (isInterestingCid(cid)) {
    targetCids.push(cid);
  }
}

// Then efficiently retrieve only the blocks we need
for (const cid of targetCids) {
  const block = await reader.get(cid);
  await processBlock(block);
}

await reader.close();
```

### Memory vs. CarReader Comparison

Compare memory usage between CarReader and CarIndexedReader.

```typescript
import { CarReader } from "@ipld/car/reader";
import { CarIndexedReader } from "@ipld/car/indexed-reader";
import fs from 'fs';

// CarReader: Loads entire CAR into memory
const carBytes = fs.readFileSync('large-archive.car'); // Full file in memory
const memoryReader = await CarReader.fromBytes(carBytes); // All blocks in memory

// CarIndexedReader: Only index in memory, blocks loaded on demand
const indexedReader = await CarIndexedReader.fromFile('large-archive.car'); // Only index in memory

// Both expose the same read methods, but with very different memory usage
// (someCid is a placeholder for a CID obtained elsewhere)
const block1 = await memoryReader.get(someCid);   // Retrieved from memory
const block2 = await indexedReader.get(someCid);  // Read from disk on demand

// Cleanup
await indexedReader.close(); // CarReader has no cleanup needed
```

### Integration with File Processing

Combine indexed reading with file operations.

```typescript
import { CarIndexedReader } from "@ipld/car/indexed-reader";

// Process multiple CAR files efficiently
const carFiles = ['archive1.car', 'archive2.car', 'archive3.car'];
const allBlocks = new Map();

for (const filePath of carFiles) {
  const reader = await CarIndexedReader.fromFile(filePath);

  try {
    // Collect all blocks from this archive
    for await (const block of reader.blocks()) {
      allBlocks.set(block.cid.toString(), {
        block,
        source: filePath
      });
    }
  } finally {
    await reader.close(); // Always close, even on errors
  }
}

console.log(`Collected ${allBlocks.size} blocks from ${carFiles.length} archives`);
```

### Resource Management

Each reader holds an open file descriptor until `close()` is called, so pair every `fromFile()` with a `close()`.

```typescript
import { CarIndexedReader } from "@ipld/car/indexed-reader";

// Pattern: Using try/finally for cleanup
let reader;
try {
  reader = await CarIndexedReader.fromFile('archive.car');

  // Do work with reader
  const roots = await reader.getRoots();
  for (const root of roots) {
    const block = await reader.get(root);
    await processBlock(block);
  }

} finally {
  // Always cleanup file descriptor
  if (reader) {
    await reader.close();
  }
}

// Pattern: Using async/await with explicit cleanup
async function processCarFile(filePath) {
  const reader = await CarIndexedReader.fromFile(filePath);

  try {
    // Process file
    return await doWorkWithReader(reader);
  } finally {
    await reader.close();
  }
}
```
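
The try/finally pattern can be factored into a small reusable wrapper so call sites cannot forget the `close()` call. This is a sketch under assumptions: `withReader` and `Closable` are hypothetical names, not part of `@ipld/car`, and the helper only relies on the reader having an async `close()` method.

```typescript
// Hypothetical structural type: anything with an async close().
// A CarIndexedReader fits this shape.
interface Closable {
  close(): Promise<void>;
}

// Hypothetical helper: open a resource, run `work` with it, and close it
// afterwards whether `work` resolves or throws.
async function withReader<R extends Closable, T>(
  open: () => Promise<R>,
  work: (reader: R) => Promise<T>
): Promise<T> {
  const reader = await open();
  try {
    return await work(reader);
  } finally {
    await reader.close();
  }
}

// Usage, assuming @ipld/car is installed:
// const roots = await withReader(
//   () => CarIndexedReader.fromFile('archive.car'),
//   (reader) => reader.getRoots()
// );
```

If `open` itself rejects, there is nothing to close and the error propagates unchanged.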

### Error Handling

Handle errors specific to file-based operations.

```typescript
import { CarIndexedReader } from "@ipld/car/indexed-reader";

// File access errors
try {
  const reader = await CarIndexedReader.fromFile('nonexistent.car');
} catch (error) {
  if (error.code === 'ENOENT') {
    console.log('CAR file not found');
  } else if (error.code === 'EACCES') {
    console.log('Permission denied accessing CAR file');
  }
}

// Invalid file format errors
try {
  const reader = await CarIndexedReader.fromFile('invalid.car');
} catch (error) {
  if (error.message.includes('Invalid CAR')) {
    console.log('File is not a valid CAR archive');
  }
}

// File descriptor errors during operation
let reader;
try {
  reader = await CarIndexedReader.fromFile('archive.car');
  const block = await reader.get(someCid);
} catch (error) {
  if (error.code === 'EBADF') {
    console.log('File descriptor error - file may have been closed');
  }
} finally {
  if (reader) {
    try {
      await reader.close();
    } catch (closeError) {
      console.log('Error closing reader:', closeError.message);
    }
  }
}
```
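
Under TypeScript's strict settings a caught value is typed `unknown`, so reading `error.code` directly does not compile. One way to handle that, sketched below, is a single classifier that narrows the value first. `describeCarError` is a hypothetical helper, not a library function. The codes are the Node.js file system codes from the examples above, and the `'Invalid CAR'` substring check mirrors those examples and may not match every message the library produces.

```typescript
// Hypothetical helper: turn an unknown thrown value into a short description.
function describeCarError(error: unknown): string {
  if (typeof error === 'object' && error !== null) {
    // Narrow to an object that may carry a Node.js error code and a message
    const { code, message } = error as { code?: unknown; message?: unknown };
    if (code === 'ENOENT') return 'CAR file not found';
    if (code === 'EACCES') return 'Permission denied accessing CAR file';
    if (code === 'EBADF') return 'File descriptor error - file may have been closed';
    if (typeof message === 'string') {
      // Assumption: format errors contain this substring, as in the example above
      if (message.includes('Invalid CAR')) return 'File is not a valid CAR archive';
      return `Unexpected error: ${message}`;
    }
  }
  return `Unexpected error: ${String(error)}`;
}
```

A catch block can then log `describeCarError(error)` and rethrow anything it does not want to swallow.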

## Performance Considerations

### Memory Usage
- **Index Size**: Uses memory for CID-to-location map (typically much smaller than full file)
- **Block Loading**: Loads blocks on demand, not all at once
- **Large Files**: Can handle CAR files larger than available memory
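
To make the index idea concrete, here is a toy illustration of a key-to-location map with on-demand reads. It is not the library's implementation and does not parse the CAR format: `TinyIndexedStore` and `BlockLocation` are hypothetical names, and a `Uint8Array` stands in for the file on disk.

```typescript
// Where a block's bytes live inside the backing storage.
interface BlockLocation {
  offset: number;
  length: number;
}

// Illustrative only: keeps a small map of locations and reads byte ranges
// on demand instead of holding every block as a separate object.
class TinyIndexedStore {
  private readonly index = new Map<string, BlockLocation>();
  private readonly storage: Uint8Array;

  constructor(storage: Uint8Array) {
    // `storage` stands in for the file; a real reader would use a file descriptor
    this.storage = storage;
  }

  // Record where a block lives without copying its bytes.
  add(key: string, offset: number, length: number): void {
    this.index.set(key, { offset, length });
  }

  // Look up the location, then return a view of just that byte range.
  get(key: string): Uint8Array | undefined {
    const location = this.index.get(key);
    if (!location) return undefined;
    return this.storage.subarray(location.offset, location.offset + location.length);
  }
}
```

The index grows with the number of blocks rather than with their total size, which is why it is typically much smaller than the file it describes.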

### Access Patterns
- **Random Access**: Efficient for CID-based lookups, since the location comes from the in-memory index and only the requested block is read from disk
- **Sequential Access**: Less efficient than streaming iterators
- **Mixed Access**: Good balance for applications needing both patterns

### File System Considerations
- **File Descriptor Limits**: Each CarIndexedReader uses one file descriptor
- **Concurrent Access**: Multiple readers can access same file simultaneously
- **File Locking**: No file locking - safe for read-only access
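
Since each reader holds one file descriptor, opening many CAR files at once can run into the process's descriptor limit. A minimal sketch of one mitigation is to cap how many readers are open at a time. `mapWithLimit` is a hypothetical helper, not part of `@ipld/car`, and the limit of 8 in the usage comment is an arbitrary example value.

```typescript
// Hypothetical helper: run `task` over `items` with at most `limit` tasks
// in flight, returning results in input order.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  task: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each worker claims the next unprocessed index until none remain.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await task(items[i]);
    }
  }

  const size = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: size }, () => worker()));
  return results;
}

// Usage, assuming @ipld/car is installed: open at most 8 readers at a time.
// const rootCounts = await mapWithLimit(carFiles, 8, async (path) => {
//   const reader = await CarIndexedReader.fromFile(path);
//   try {
//     return (await reader.getRoots()).length;
//   } finally {
//     await reader.close();
//   }
// });
```

The first rejected task rejects the whole call, so wrap the task body in its own try/catch if partial results are wanted.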

### Use Cases
- **Large CAR Analysis**: Process CAR files too large for memory
- **Block Servers**: Serve blocks by CID from large archives
- **Data Mining**: Random access patterns over archived data
- **Content Verification**: Validate specific blocks without full loading
- **Selective Extraction**: Extract specific blocks from large archives
