# Tarball Processing

Extract and process package tarballs with integrity verification and CAFS integration for efficient package storage and management.

## Capabilities

### Add Files From Tarball

Extracts the files from a tarball buffer and adds them to the CAFS store, with optional integrity verification. This is the primary function for processing downloaded package tarballs.

```typescript { .api }
/**
 * Extracts and adds files from tarball buffer to CAFS store
 * @param opts - Configuration options for tarball extraction
 * @returns Promise resolving to extraction results with file mappings and metadata
 */
function addFilesFromTarball(opts: AddFilesFromTarballOptions): Promise<AddFilesResult>;

type AddFilesFromTarballOptions = Pick<TarballExtractMessage, 'buffer' | 'storeDir' | 'filesIndexFile' | 'integrity' | 'readManifest' | 'pkg'> & {
  /** Source URL for error reporting */
  url: string;
};

interface TarballExtractMessage {
  type: 'extract';
  /** Tarball buffer data to extract */
  buffer: Buffer;
  /** Store directory path where files will be stored */
  storeDir: string;
  /** Integrity hash for verification (format: algorithm-hash) */
  integrity?: string;
  /** Path to the files index file for metadata storage */
  filesIndexFile: string;
  /** Whether to read and parse the package manifest */
  readManifest?: boolean;
  /** Package name and version information */
  pkg?: PkgNameVersion;
}

interface AddFilesResult {
  /** Mapping of file paths to their CAFS store locations */
  filesIndex: Record<string, string>;
  /** Parsed package manifest (package.json) */
  manifest: DependencyManifest;
  /** Whether the package requires a build step */
  requiresBuild: boolean;
}
```

**Usage Examples:**

```typescript
import { addFilesFromTarball } from "@pnpm/worker";
import fs from "fs";

// Basic tarball extraction
const tarballBuffer = fs.readFileSync("package.tgz");
const result = await addFilesFromTarball({
  buffer: tarballBuffer,
  storeDir: "/path/to/pnpm/store",
  filesIndexFile: "/path/to/index.json",
  readManifest: true, // parse package.json so that result.manifest can be read below
  url: "https://registry.npmjs.org/lodash/-/lodash-4.17.21.tgz"
});

console.log("Extracted files:", Object.keys(result.filesIndex));
console.log("Package name:", result.manifest.name);
console.log("Requires build:", result.requiresBuild);

// With integrity verification
const resultWithIntegrity = await addFilesFromTarball({
  buffer: tarballBuffer,
  storeDir: "/path/to/pnpm/store",
  filesIndexFile: "/path/to/index.json",
  // Placeholder value: substitute the SRI string published for the tarball
  integrity: "sha512-9h7e6r3a5x5b2b5f3fdc2c7d8e1f6r7a9e7q8d4f1d7a8c6b7e8d9f0a1b2c3d4e5f6",
  readManifest: true,
  pkg: { name: "lodash", version: "4.17.21" },
  url: "https://registry.npmjs.org/lodash/-/lodash-4.17.21.tgz"
});
```

## Integrity Verification

When the `integrity` parameter is provided, the function verifies the tarball's checksum:

1. **Format**: Expects the value in SRI (Subresource Integrity) format: `algorithm-base64hash`
2. **Hash Calculation**: Calculates the hash of the tarball buffer using the specified algorithm
3. **Verification**: Compares the calculated hash with the expected hash
4. **Error Handling**: Throws `TarballIntegrityError` if verification fails

**Supported Hash Algorithms:**
- `sha1` - SHA-1 (legacy)
- `sha256` - SHA-256 (recommended)
- `sha512` - SHA-512 (most secure)

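
The verification steps above can be sketched with Node's built-in `crypto` module. This is a minimal illustration, not the code that `@pnpm/worker` runs: `checkIntegrity` and `IntegrityCheck` are hypothetical names, and the sketch assumes a single `algorithm-base64hash` value with no SRI options or multiple hashes.

```typescript
import { createHash } from "node:crypto";

// Hypothetical result shape for this sketch; not part of @pnpm/worker.
interface IntegrityCheck {
  ok: boolean;
  algorithm: string;
  expected: string;
  found: string;
}

// Split "algorithm-base64hash" at the first dash, hash the buffer with that
// algorithm, and compare the base64 digests.
function checkIntegrity(buffer: Buffer, sri: string): IntegrityCheck {
  const dash = sri.indexOf("-");
  if (dash <= 0) {
    throw new Error(`Malformed SRI string: ${sri}`);
  }
  const algorithm = sri.slice(0, dash);
  const expected = sri.slice(dash + 1);
  const found = createHash(algorithm).update(buffer).digest("base64");
  return { ok: found === expected, algorithm, expected, found };
}

// Build a matching SRI string for sample data, then verify it.
const data = Buffer.from("example tarball bytes");
const sri = `sha512-${createHash("sha512").update(data).digest("base64")}`;

console.log(checkIntegrity(data, sri).ok); // true
console.log(checkIntegrity(Buffer.from("tampered"), sri).ok); // false
```

The returned `expected`, `found`, and `algorithm` values correspond to the fields that an integrity error would need to report.
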
## Error Handling

### Tarball Integrity Error

Specialized error class for integrity verification failures:

```typescript { .api }
class TarballIntegrityError extends Error {
  /** Actual hash found during verification */
  readonly found: string;
  /** Expected hash from integrity parameter */
  readonly expected: string;
  /** Hash algorithm used (sha1, sha256, sha512) */
  readonly algorithm: string;
  /** Original SRI string */
  readonly sri: string;
  /** Source URL where tarball was downloaded */
  readonly url: string;

  constructor(opts: {
    attempts?: number;
    found: string;
    expected: string;
    algorithm: string;
    sri: string;
    url: string;
  });
}
```

**Error Handling Example:**

```typescript
import { addFilesFromTarball, TarballIntegrityError } from "@pnpm/worker";
import fs from "fs";

const tarballBuffer = fs.readFileSync("package.tgz");

try {
  const result = await addFilesFromTarball({
    buffer: tarballBuffer,
    storeDir: "/path/to/store",
    filesIndexFile: "/path/to/index.json",
    integrity: "sha512-invalid-hash", // deliberately wrong, to trigger the error
    url: "https://registry.npmjs.org/package/-/package-1.0.0.tgz"
  });
} catch (error) {
  if (error instanceof TarballIntegrityError) {
    console.error(`Integrity check failed for ${error.url}`);
    console.error(`Expected: ${error.expected}, Found: ${error.found}`);
    console.error(`Algorithm: ${error.algorithm}`);

    // Recommended action: clear cache and retry
    console.log("Try running: pnpm store prune");
  } else {
    // `error` is `unknown` under strict TypeScript settings, so narrow it first
    const message = error instanceof Error ? error.message : String(error);
    console.error("Other extraction error:", message);
  }
}
```

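
The "retry" part of the recommended action can be sketched as a small wrapper. This is an illustration under stated assumptions rather than pnpm's own retry logic: `IntegrityMismatchError` is a hypothetical stand-in for `TarballIntegrityError` so that the sketch runs on its own, and `retryOnIntegrityError` and the simulated download are invented for the example.

```typescript
// Hypothetical stand-in for TarballIntegrityError so the sketch is self-contained.
class IntegrityMismatchError extends Error {
  readonly url: string;

  constructor(url: string) {
    super(`Integrity check failed for ${url}`);
    this.url = url;
  }
}

// Run `task` again when it fails with an integrity mismatch, up to `maxAttempts` times.
async function retryOnIntegrityError<T>(
  task: (attempt: number) => Promise<T>,
  maxAttempts = 3
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task(attempt);
    } catch (error) {
      // Only integrity mismatches justify a fresh download; rethrow everything else.
      if (!(error instanceof IntegrityMismatchError)) throw error;
      lastError = error;
    }
  }
  throw lastError;
}

// Simulated download: the first attempt is corrupted, the second succeeds.
async function main(): Promise<void> {
  const outcome = await retryOnIntegrityError(async (attempt) => {
    if (attempt === 1) {
      throw new IntegrityMismatchError("https://example.test/pkg.tgz");
    }
    return `extracted on attempt ${attempt}`;
  });
  console.log(outcome); // extracted on attempt 2
}

main();
```

In real code the task would download the tarball again and call `addFilesFromTarball` with the fresh buffer, and the `instanceof` check would target `TarballIntegrityError`.
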
## CAFS Integration

Tarball processing integrates with pnpm's Content-Addressable File System (CAFS):

- **File Storage**: Files are stored by their content hash for deduplication
- **Index Files**: Metadata is stored in JSON files for fast lookups
- **Build Detection**: Automatically detects whether a package requires a build step
- **Manifest Processing**: Extracts and parses `package.json` from the tarball
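
To make the deduplication and build-detection ideas concrete, here is a minimal in-memory sketch. It does not reproduce pnpm's on-disk store layout or index file format. The `Map`-based store, the hex-encoded SHA-512 keys, and the names `addFile`, `addPackage`, and `mayRequireBuild` are all assumptions made for illustration, as is the build heuristic (install lifecycle scripts or a `binding.gyp` file).

```typescript
import { createHash } from "node:crypto";

// In-memory stand-in for the store directory: content hash -> file bytes.
const store = new Map<string, Buffer>();

// Store one file under the hash of its contents and return that hash.
function addFile(content: Buffer): string {
  const digest = createHash("sha512").update(content).digest("hex");
  if (!store.has(digest)) {
    store.set(digest, content); // identical content is stored only once
  }
  return digest;
}

// Build a path -> content-hash index for one package.
function addPackage(files: Record<string, Buffer>): Record<string, string> {
  const filesIndex: Record<string, string> = {};
  for (const [path, content] of Object.entries(files)) {
    filesIndex[path] = addFile(content);
  }
  return filesIndex;
}

// Assumed heuristic: a package needs a build step if it declares an install
// lifecycle script or ships a binding.gyp file.
function mayRequireBuild(
  manifest: { scripts?: Record<string, string> },
  filesIndex: Record<string, string>
): boolean {
  const scripts = manifest.scripts ?? {};
  const hasInstallScript = ["preinstall", "install", "postinstall"].some(
    (name) => Boolean(scripts[name])
  );
  return hasInstallScript || "binding.gyp" in filesIndex;
}

// Two packages that ship the same LICENSE file share one stored copy.
const license = Buffer.from("MIT License");
const indexA = addPackage({ "package.json": Buffer.from('{"name":"a"}'), LICENSE: license });
const indexB = addPackage({ "package.json": Buffer.from('{"name":"b"}'), LICENSE: license });

console.log(indexA["LICENSE"] === indexB["LICENSE"]); // true
console.log(store.size); // 3
console.log(mayRequireBuild({ scripts: { postinstall: "node build.js" } }, indexA)); // true
```

Keying files by content rather than by path is what lets two packages with an identical file point at the same stored entry.
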