Tessl Tile for npm/@ipld/car@5.4.0

# CAR Indexing

Efficient indexing for building block indices and enabling random access to large CAR files. `CarIndexer` processes a CAR archive and generates location metadata for each block without retaining block data in memory.

## Capabilities

### CarIndexer Class

Provides efficient indexing of CAR archives with streaming block location generation.

```typescript { .api }
/**
 * Creates block indices for CAR archives
 * Processes header and generates BlockIndex entries for each block
 * Implements AsyncIterable for streaming index generation
 */
class CarIndexer {
  /** CAR version number (1 or 2) */
  readonly version: number;

  /** Get the list of root CIDs from the CAR header */
  getRoots(): Promise<CID[]>;

  /** Iterate over all block indices in the CAR */
  [Symbol.asyncIterator](): AsyncIterator<BlockIndex>;

  /** Create indexer from Uint8Array */
  static fromBytes(bytes: Uint8Array): Promise<CarIndexer>;

  /** Create indexer from async stream */
  static fromIterable(asyncIterable: AsyncIterable<Uint8Array>): Promise<CarIndexer>;
}

/**
 * Block index containing location and size information
 */
interface BlockIndex {
  /** CID of the block */
  cid: CID;
  /** Total length including CID encoding */
  length: number;
  /** Length of block data only (excludes CID) */
  blockLength: number;
  /** Byte offset of entire block entry in CAR */
  offset: number;
  /** Byte offset of block data (after CID) in CAR */
  blockOffset: number;
}
```
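The four numeric fields of `BlockIndex` are related by simple arithmetic. The sketch below assumes, from the field descriptions above, that `length` covers the whole entry starting at `offset` and that the block data occupies the final `blockLength` bytes of that entry. `IndexSpan`, `entryEnd`, `headBytes`, and `isConsistent` are hypothetical names used for illustration, not part of `@ipld/car`, and the numbers are made up.

```typescript
// Structural subset of BlockIndex (the CID is not needed for the arithmetic).
interface IndexSpan {
  length: number;
  blockLength: number;
  offset: number;
  blockOffset: number;
}

// Byte offset just past the end of the entry (assumed start of the next entry).
function entryEnd(span: IndexSpan): number {
  return span.offset + span.length;
}

// Bytes that precede the block data within the entry.
function headBytes(span: IndexSpan): number {
  return span.blockOffset - span.offset;
}

// True when the fields agree with the assumed layout.
function isConsistent(span: IndexSpan): boolean {
  const head = headBytes(span);
  return head >= 0 && head + span.blockLength === span.length;
}

// Made-up entry: starts at byte 100, with 38 bytes before 62 bytes of data.
const example: IndexSpan = { offset: 100, length: 100, blockOffset: 138, blockLength: 62 };
console.log(entryEnd(example), headBytes(example), isConsistent(example));
```

A check like `isConsistent` may be useful as a sanity test when index entries are loaded from somewhere other than a live `CarIndexer`.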

**Usage Examples:**

```typescript
import { CarIndexer } from "@ipld/car/indexer";
import fs from 'fs';

// Index from bytes
const carBytes = fs.readFileSync('archive.car');
const indexer = await CarIndexer.fromBytes(carBytes);

// Index from stream (more memory efficient)
const stream = fs.createReadStream('large-archive.car');
const streamIndexer = await CarIndexer.fromIterable(stream);

// Access roots
const roots = await indexer.getRoots();
console.log(`Indexing CAR with ${roots.length} roots`);

// Iterate through block indices
for await (const blockIndex of indexer) {
  console.log(`Block ${blockIndex.cid}:`);
  console.log(`  Total length: ${blockIndex.length}`);
  console.log(`  Block data length: ${blockIndex.blockLength}`);
  console.log(`  Starts at byte: ${blockIndex.offset}`);
  console.log(`  Block data at byte: ${blockIndex.blockOffset}`);
}
```

### Building Block Location Maps

Create lookup maps for random access to blocks by CID.

```typescript
import { CarIndexer } from "@ipld/car/indexer";
import fs from 'fs';

// Build complete index map
const stream = fs.createReadStream('archive.car');
const indexer = await CarIndexer.fromIterable(stream);

// Location info keyed by CID string
const blockMap = new Map<string, { offset: number; blockOffset: number; blockLength: number }>();
const sizeStats = { totalBlocks: 0, totalBytes: 0 };

for await (const blockIndex of indexer) {
  // Store location info by CID string
  blockMap.set(blockIndex.cid.toString(), {
    offset: blockIndex.offset,
    blockOffset: blockIndex.blockOffset,
    blockLength: blockIndex.blockLength
  });

  // Collect statistics (totalBytes counts block data only)
  sizeStats.totalBlocks++;
  sizeStats.totalBytes += blockIndex.blockLength;
}

console.log(`Indexed ${sizeStats.totalBlocks} blocks, ${sizeStats.totalBytes} total bytes`);

// Use map for random access
// `someTargetCid` is a placeholder for a CID obtained elsewhere
const targetCid = someTargetCid;
const location = blockMap.get(targetCid.toString());
if (location) {
  console.log(`Block ${targetCid} found at offset ${location.blockOffset}`);
}
```
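A location map like the one above can be saved so that a large archive does not have to be re-indexed on every run. The following is a minimal sketch of one way to do that with JSON; `BlockLocation`, `serializeBlockMap`, and `deserializeBlockMap` are hypothetical helpers, not part of `@ipld/car`, and the key in the round trip is a placeholder string rather than a real CID.

```typescript
// Location data kept per block; mirrors the object stored in the map above.
interface BlockLocation {
  offset: number;
  blockOffset: number;
  blockLength: number;
}

// Serialize a CID-string-keyed map to JSON text.
function serializeBlockMap(map: Map<string, BlockLocation>): string {
  return JSON.stringify(Array.from(map.entries()));
}

// Rebuild the map from text produced by serializeBlockMap.
function deserializeBlockMap(json: string): Map<string, BlockLocation> {
  const entries = JSON.parse(json) as Array<[string, BlockLocation]>;
  return new Map(entries);
}

// Round trip with a made-up entry.
const original = new Map<string, BlockLocation>([
  ["example-cid", { offset: 100, blockOffset: 138, blockLength: 62 }],
]);
const restored = deserializeBlockMap(serializeBlockMap(original));
console.log(restored.get("example-cid"));
```

The serialized text could be written next to the CAR file as a sidecar index. A saved index is only valid while the CAR file itself is unchanged, so storing the file size or modification time alongside it may be worthwhile.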

### Integration with Raw Reading

Combine indexing with raw block reading for efficient random access.

```typescript
import { CarIndexer } from "@ipld/car/indexer";
import { CarReader } from "@ipld/car/reader";
import fs from 'fs';

// Index and read specific blocks
const fd = await fs.promises.open('large-archive.car', 'r');
const stream = fs.createReadStream('large-archive.car');
const indexer = await CarIndexer.fromIterable(stream);

// Find and read specific blocks
// `cid1`, `cid2`, `cid3` are placeholders for CIDs obtained elsewhere
const targetCids = new Set([cid1, cid2, cid3].map(cid => cid.toString()));
const foundBlocks = new Map();

try {
  for await (const blockIndex of indexer) {
    const cidStr = blockIndex.cid.toString();

    if (targetCids.has(cidStr)) {
      // Read only the blocks we need (CarReader.readRaw is Node.js only)
      const block = await CarReader.readRaw(fd, blockIndex);
      foundBlocks.set(cidStr, block);

      // Stop early if we found all targets
      if (foundBlocks.size === targetCids.size) {
        break;
      }
    }
  }
} finally {
  // Release the file handle and the stream even if reading fails
  await fd.close();
  stream.destroy();
}

console.log(`Found ${foundBlocks.size} of ${targetCids.size} target blocks`);
```
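To make the role of `blockOffset` and `blockLength` concrete, here is a rough sketch of a positional read that fetches only the data bytes of one block. It is not the library's `readRaw` implementation: it returns bare bytes, does not decode or verify the CID, and `PositionalReader` and `readBlockData` are hypothetical names. The reader shape is modeled on the `read(buffer, offset, length, position)` method of a Node.js file handle.

```typescript
// Minimal shape of a positional reader (assumed to match a Node.js FileHandle).
interface PositionalReader {
  read(
    buffer: Uint8Array,
    offset: number,
    length: number,
    position: number
  ): Promise<{ bytesRead: number }>;
}

// Read exactly blockLength bytes starting at blockOffset.
async function readBlockData(
  reader: PositionalReader,
  location: { blockOffset: number; blockLength: number }
): Promise<Uint8Array> {
  const bytes = new Uint8Array(location.blockLength);
  let filled = 0;
  // A single read may return fewer bytes than requested, so loop until full.
  while (filled < bytes.length) {
    const { bytesRead } = await reader.read(
      bytes,
      filled,
      bytes.length - filled,
      location.blockOffset + filled
    );
    if (bytesRead === 0) {
      throw new Error("Unexpected end of data while reading block");
    }
    filled += bytesRead;
  }
  return bytes;
}
```

In practice `CarReader.readRaw(fd, blockIndex)` is the more convenient choice; the sketch only shows why an index entry is enough to locate a block without scanning the archive.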

### Memory-Efficient Large File Processing

Process large CAR files without loading entire contents into memory.

```typescript
import { CarIndexer } from "@ipld/car/indexer";
import fs from 'fs';

// Process very large CAR file efficiently
const stream = fs.createReadStream('massive-archive.car');
const indexer = await CarIndexer.fromIterable(stream);

let processedCount = 0;
let processedBytes = 0;

// `shouldProcessBlock` and `processBlockIndex` are application-defined
// placeholders; they are not provided by @ipld/car
for await (const blockIndex of indexer) {
  // Process blocks in chunks or apply filtering
  if (shouldProcessBlock(blockIndex.cid)) {
    await processBlockIndex(blockIndex);
    processedCount++;
    processedBytes += blockIndex.blockLength;

    // Progress reporting
    if (processedCount % 1000 === 0) {
      console.log(`Processed ${processedCount} blocks, ${processedBytes} bytes`);
    }
  }
}

console.log(`Completed processing: ${processedCount} blocks`);
```
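The loop above relies on two application-defined functions. As one possible illustration, the sketch below filters by IPLD codec and tallies data bytes per codec. It assumes the CID object exposes its codec as a numeric `code` property, as CIDs from the `multiformats` package do; `HasCodec`, `wantedCodecs`, and `bytesByCodec` are hypothetical names, and the bodies of `shouldProcessBlock` and `processBlockIndex` are examples only.

```typescript
// Multicodec codes for two common IPLD codecs.
const RAW = 0x55;
const DAG_PB = 0x70;

// Only the codec code is inspected, so a structural type is enough here.
interface HasCodec {
  code: number;
}

const wantedCodecs = new Set<number>([RAW, DAG_PB]);

// Example filter: keep blocks whose codec is in the wanted set.
function shouldProcessBlock(cid: HasCodec): boolean {
  return wantedCodecs.has(cid.code);
}

// Running tally of data bytes seen per codec.
const bytesByCodec = new Map<number, number>();

// Example processing step: add the block's data length to its codec's tally.
async function processBlockIndex(entry: { cid: HasCodec; blockLength: number }): Promise<void> {
  const previous = bytesByCodec.get(entry.cid.code) ?? 0;
  bytesByCodec.set(entry.cid.code, previous + entry.blockLength);
}
```

Because both functions work from index entries alone, the filter runs without reading any block data.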

### Error Handling

Common errors when indexing CAR files:

- **TypeError**: Invalid input type (not a `Uint8Array` for `fromBytes()`, or not an async iterable for `fromIterable()`)
- **Error**: Malformed CAR data, invalid headers, unexpected end of data
- **Single-pass iteration**: A `CarIndexer` instance can be iterated only once

```typescript
import { CarIndexer } from "@ipld/car/indexer";

// `invalidData` and `carBytes` are placeholders for input obtained elsewhere
try {
  const indexer = await CarIndexer.fromBytes(invalidData);
} catch (error) {
  if (error instanceof TypeError) {
    console.log('Invalid input format');
  } else if (error instanceof Error && error.message.includes('Invalid CAR')) {
    console.log('Malformed CAR file');
  } else {
    // Unrecognized failure: rethrow rather than swallow it
    throw error;
  }
}

// Iteration can only be performed once
const indexer = await CarIndexer.fromBytes(carBytes);

// First iteration works
for await (const blockIndex of indexer) {
  // Process blocks
}

// Second iteration will not work; a new indexer instance is needed
// for await (const blockIndex of indexer) { // Won't iterate
```
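When the same index has to be walked more than once, one option is to drain the indexer into an array on the first pass and reuse the array afterwards. This is a minimal sketch under that approach; `collectEntries` is a hypothetical helper, not part of `@ipld/car`, and it accepts any async iterable so it can be exercised without the library.

```typescript
// Drain an async iterable into an array so the entries can be reused.
async function collectEntries<T>(source: AsyncIterable<T>): Promise<T[]> {
  const entries: T[] = [];
  for await (const entry of source) {
    entries.push(entry);
  }
  return entries;
}
```

With a real indexer this would be called as `await collectEntries(indexer)`. The trade-off is memory: the array holds one small object per block, which is usually acceptable but grows with the number of blocks in the archive.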

## Performance Considerations

### Memory Usage
- `CarIndexer` uses minimal memory: it processes one block index at a time
- Block data is not retained in memory during indexing
- Suitable for indexing very large CAR files

### Processing Speed
- Indexing speed depends on stream/disk I/O performance
- Processing thousands of blocks per second is typical
- Use `fromIterable()` with file streams for best memory efficiency

### Use Cases
- **Random Access Preparation**: Build indices for later block lookups
- **CAR Analysis**: Analyze CAR structure without loading block data
- **Selective Processing**: Identify blocks of interest before reading data
- **Statistics Generation**: Count blocks, analyze size distributions
- **Validation**: Verify CAR structure integrity
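
As an illustration of the statistics use case, the sketch below groups blocks into power-of-two size buckets using only `blockLength`. `bucketFor` and `sizeHistogram` are hypothetical helpers, not part of `@ipld/car`, and the bucketing scheme is an arbitrary choice made for the example.

```typescript
// Bucket label: the smallest power of two >= blockLength (empty blocks use 0).
function bucketFor(blockLength: number): number {
  if (blockLength <= 0) return 0;
  let bucket = 1;
  while (bucket < blockLength) bucket *= 2;
  return bucket;
}

// Count how many entries fall into each size bucket.
async function sizeHistogram(
  entries: AsyncIterable<{ blockLength: number }>
): Promise<Map<number, number>> {
  const histogram = new Map<number, number>();
  for await (const { blockLength } of entries) {
    const bucket = bucketFor(blockLength);
    histogram.set(bucket, (histogram.get(bucket) ?? 0) + 1);
  }
  return histogram;
}
```

Passing a `CarIndexer` instance as `entries` would consume it, so a fresh indexer is needed for any further pass over the same archive.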
