# Frame Analysis

Utilities for analyzing zstd frames and extracting metadata without full decompression, enabling efficient frame inspection and validation.

## Capabilities

### Frame Content Size

Extract the original content size from a zstd frame header without decompressing the data.

```python { .api }
def frame_content_size(data: bytes) -> int:
    """
    Get the original content size from a zstd frame.

    Parameters:
    - data: bytes, zstd frame data (at least frame header)

    Returns:
    int: Original content size in bytes, or -1 if the content size
    is not stored in the frame

    Raises:
    ZstdError: Invalid frame or unable to determine size
    """
```

**Usage Example:**

```python
import zstandard as zstd

# Compressed data with content size in header
compressor = zstd.ZstdCompressor(write_content_size=True)
original_data = b"Hello, World!" * 1000
compressed = compressor.compress(original_data)

# Get content size without decompressing
try:
    content_size = zstd.frame_content_size(compressed)
except zstd.ZstdError:
    print("Error reading frame")
else:
    if content_size == -1:
        print("Content size not stored in frame")
    else:
        print(f"Original size: {content_size} bytes")
        print(f"Compressed size: {len(compressed)} bytes")
        print(f"Compression ratio: {len(original_data)/len(compressed):.2f}:1")
```
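
To illustrate where this value comes from, the following pure-Python sketch reads the Frame_Content_Size field directly, following the frame layout described in RFC 8878. The helper `read_content_size` and the constants are hypothetical names for illustration, not part of `zstandard`; the sketch assumes a standard frame that starts with the magic number and returns `None` when no size is stored.

```python
from typing import Optional

ZSTD_MAGIC = b"\x28\xb5\x2f\xfd"

# Bytes used by the Dictionary_ID field, indexed by the 2-bit Dictionary_ID_flag.
DICT_ID_FIELD_SIZES = (0, 1, 2, 4)


def read_content_size(frame: bytes) -> Optional[int]:
    """Return the content size stored in a frame header, or None if absent.

    Hypothetical helper for illustration; not part of the zstandard package.
    """
    if len(frame) < 5 or frame[:4] != ZSTD_MAGIC:
        raise ValueError("not a standard zstd frame")

    descriptor = frame[4]
    fcs_flag = descriptor >> 6               # bits 7-6: Frame_Content_Size_flag
    single_segment = (descriptor >> 5) & 1   # bit 5: Single_Segment_flag
    dict_id_flag = descriptor & 0b11         # bits 1-0: Dictionary_ID_flag

    # Size of the Frame_Content_Size field in bytes.
    if fcs_flag == 0:
        fcs_size = 1 if single_segment else 0
    else:
        fcs_size = (2, 4, 8)[fcs_flag - 1]
    if fcs_size == 0:
        return None  # the compressor did not record the size

    # The field follows the descriptor, the optional Window_Descriptor
    # (absent in single-segment frames) and the optional Dictionary_ID.
    start = 5 + (0 if single_segment else 1) + DICT_ID_FIELD_SIZES[dict_id_flag]
    field = frame[start:start + fcs_size]
    if len(field) < fcs_size:
        raise ValueError("truncated frame header")

    value = int.from_bytes(field, "little")
    # Two-byte fields store the size minus 256.
    return value + 256 if fcs_size == 2 else value


# Hand-built frame: single-segment header with a 1-byte content size of 5,
# followed by one raw block holding b"hello".
frame = ZSTD_MAGIC + b"\x20\x05" + b"\x29\x00\x00" + b"hello"
print(read_content_size(frame))  # 5
```

In practice `zstd.frame_content_size()` should be preferred; the sketch only shows that the size is a plain header field, which is why no decompression is needed.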

### Frame Header Size

Get the size of a zstd frame header to skip to the compressed payload.

```python { .api }
def frame_header_size(data: bytes) -> int:
    """
    Get the size of a zstd frame header.

    Parameters:
    - data: bytes, zstd frame data (at least frame header)

    Returns:
    int: Frame header size in bytes
    """
```

**Usage Example:**

```python
import zstandard as zstd

# Sample zstd compressed data
compressed_data = zstd.ZstdCompressor().compress(b"Sample data for header analysis")

# Get header size
header_size = zstd.frame_header_size(compressed_data)
print(f"Frame header size: {header_size} bytes")

# Split header and payload (data blocks plus optional checksum)
header = compressed_data[:header_size]
payload = compressed_data[header_size:]

print(f"Header: {len(header)} bytes")
print(f"Payload: {len(payload)} bytes")
```
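
As a rough illustration of how the header size follows from the first header byte, the sketch below recomputes it in pure Python from the field sizes given in RFC 8878. It assumes, as in the reference zstd library, that the reported size counts the 4-byte magic number. The function name `header_size_from_bytes` is hypothetical and not part of `zstandard`.

```python
ZSTD_MAGIC = b"\x28\xb5\x2f\xfd"

DICT_ID_FIELD_SIZES = (0, 1, 2, 4)   # indexed by Dictionary_ID_flag
FCS_FIELD_SIZES = (0, 2, 4, 8)       # indexed by Frame_Content_Size_flag


def header_size_from_bytes(frame: bytes) -> int:
    """Compute the header size (magic number included) of a standard frame.

    Hypothetical helper for illustration; not part of the zstandard package.
    """
    if len(frame) < 5 or frame[:4] != ZSTD_MAGIC:
        raise ValueError("not a standard zstd frame")

    descriptor = frame[4]
    fcs_flag = descriptor >> 6
    single_segment = (descriptor >> 5) & 1
    dict_id_flag = descriptor & 0b11

    size = 4 + 1                                  # magic number + descriptor
    size += 0 if single_segment else 1            # Window_Descriptor
    size += DICT_ID_FIELD_SIZES[dict_id_flag]     # Dictionary_ID
    size += FCS_FIELD_SIZES[fcs_flag]             # Frame_Content_Size
    if single_segment and fcs_flag == 0:
        size += 1  # single-segment frames always carry at least a 1-byte size
    return size


# Smallest header: descriptor 0x00 plus a Window_Descriptor byte.
print(header_size_from_bytes(ZSTD_MAGIC + b"\x00"))  # 6
# Largest header: window descriptor, 4-byte dictionary ID, 8-byte content size.
print(header_size_from_bytes(ZSTD_MAGIC + b"\xc3"))  # 18
```

Only the descriptor byte is needed to know the header length, so a caller can read five bytes first and then fetch the rest of the header.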

### Frame Parameters

Extract detailed parameters and metadata from a zstd frame header.

```python { .api }
def get_frame_parameters(data: bytes, format: int = FORMAT_ZSTD1) -> FrameParameters:
    """
    Extract frame parameters from zstd frame header.

    Parameters:
    - data: bytes, zstd frame data (at least frame header)
    - format: int, expected frame format (FORMAT_ZSTD1, FORMAT_ZSTD1_MAGICLESS)

    Returns:
    FrameParameters: Object containing frame metadata
    """

class FrameParameters:
    """Container for zstd frame parameters and metadata."""

    @property
    def content_size(self) -> int:
        """Original content size (CONTENTSIZE_UNKNOWN if not stored in the frame)."""

    @property
    def window_size(self) -> int:
        """Window size used for compression."""

    @property
    def dict_id(self) -> int:
        """Dictionary ID (0 if no dictionary)."""

    @property
    def has_checksum(self) -> bool:
        """Whether frame includes content checksum."""
```

**Usage Example:**

```python
import zstandard as zstd

# Create compressed data with various options
compressor = zstd.ZstdCompressor(
    level=5,
    write_content_size=True,
    write_checksum=True,
    write_dict_id=True
)

data = b"Sample data for frame analysis"
compressed = compressor.compress(data)

# Analyze frame parameters
params = zstd.get_frame_parameters(compressed)

print(f"Content size: {params.content_size}")
print(f"Window size: {params.window_size}")
print(f"Dictionary ID: {params.dict_id}")
print(f"Has checksum: {params.has_checksum}")

# Validate expectations
assert params.content_size == len(data)
assert params.has_checksum
```
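
For readers curious how `window_size` and `has_checksum` are encoded, here is a minimal pure-Python sketch of the relevant bit fields as described in RFC 8878. The function names are hypothetical and not part of `zstandard`. The window decoding applies only to frames that carry a Window_Descriptor byte; single-segment frames omit it and use the content size as the window.

```python
def decode_window_size(window_descriptor: int) -> int:
    """Decode the 1-byte Window_Descriptor into a window size in bytes."""
    exponent = window_descriptor >> 3        # upper 5 bits
    mantissa = window_descriptor & 0b111     # lower 3 bits
    window_base = 1 << (10 + exponent)       # the minimum window is 1 KiB
    window_add = (window_base // 8) * mantissa
    return window_base + window_add


def has_content_checksum(frame_header_descriptor: int) -> bool:
    """Bit 2 of the Frame_Header_Descriptor is the Content_Checksum_flag."""
    return bool(frame_header_descriptor & 0b100)


print(decode_window_size(0x00))      # 1024
print(decode_window_size(0x09))      # 2304 (exponent 1, mantissa 1)
print(has_content_checksum(0x24))    # True
```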

### Frame Format Detection

Handle different zstd frame formats including standard and magicless frames.

**Usage Example:**

```python
import zstandard as zstd

# Standard frame with magic number
standard_compressor = zstd.ZstdCompressor()
standard_compressed = standard_compressor.compress(b"Standard frame data")

# Magicless frame
magicless_compression_params = zstd.ZstdCompressionParameters(format=zstd.FORMAT_ZSTD1_MAGICLESS)
magicless_compressor = zstd.ZstdCompressor(compression_params=magicless_compression_params)
magicless_compressed = magicless_compressor.compress(b"Magicless frame data")

# Analyze different formats
standard_params = zstd.get_frame_parameters(standard_compressed, zstd.FORMAT_ZSTD1)
magicless_params = zstd.get_frame_parameters(magicless_compressed, zstd.FORMAT_ZSTD1_MAGICLESS)

print("Standard frame:")
print(f"  Content size: {standard_params.content_size}")
print(f"  Window size: {standard_params.window_size}")

print("Magicless frame:")
print(f"  Content size: {magicless_params.content_size}")
print(f"  Window size: {magicless_params.window_size}")
```
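
A magicless frame carries no marker of its own, so the caller has to know the format in advance. For data that may or may not start with a magic number, a first-pass check could look like the sketch below. It is a pure-Python illustration with hypothetical names, using the standard and skippable-frame magic values from RFC 8878.

```python
ZSTD_MAGIC_NUMBER = 0xFD2FB528
# Skippable frames use sixteen magic values: 0x184D2A50 through 0x184D2A5F.
SKIPPABLE_MAGIC_BASE = 0x184D2A50


def classify_frame_start(data: bytes) -> str:
    """Classify the first four bytes of a buffer (hypothetical helper)."""
    if len(data) < 4:
        return "too short"
    magic = int.from_bytes(data[:4], "little")
    if magic == ZSTD_MAGIC_NUMBER:
        return "standard"
    if (magic & 0xFFFFFFF0) == SKIPPABLE_MAGIC_BASE:
        return "skippable"
    # Could be a magicless frame or unrelated data; the bytes alone cannot tell.
    return "unknown"


print(classify_frame_start(b"\x28\xb5\x2f\xfd\x20\x00"))   # standard
print(classify_frame_start(b"\x5a\x2a\x4d\x18\x00\x00"))   # skippable
print(classify_frame_start(b"plain text"))                 # unknown
```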

### Multi-Frame Analysis

Analyze compressed data containing multiple zstd frames. The header-only functions above do not report a frame's compressed length, so this example only approximates where each frame ends.

**Usage Example:**

```python
import zstandard as zstd

def analyze_multi_frame_data(data: bytes):
    """Analyze compressed data that may contain multiple frames."""
    frames = []
    offset = 0

    while offset < len(data):
        try:
            # Try to get frame parameters
            remaining_data = data[offset:]
            params = zstd.get_frame_parameters(remaining_data)

            # Get frame header size
            header_size = zstd.frame_header_size(remaining_data)

            # Calculate frame size (header + compressed payload)
            # This is simplified - real implementation would need to parse the frame
            # Skip frames whose content size is empty or unknown
            if 0 < params.content_size < zstd.CONTENTSIZE_ERROR:
                # Estimate compressed size (not exact): the next offset will
                # usually miss the real frame boundary, which ends the loop
                # through the error handler below
                estimated_compressed_size = params.content_size // 4  # rough estimate
                frame_size = header_size + estimated_compressed_size
            else:
                # For unknown content size, would need full frame parsing
                break

            frame_info = {
                'offset': offset,
                'header_size': header_size,
                'content_size': params.content_size,
                'window_size': params.window_size,
                'dict_id': params.dict_id,
                'has_checksum': params.has_checksum
            }
            frames.append(frame_info)

            offset += frame_size

        except Exception as e:
            print(f"Error analyzing frame at offset {offset}: {e}")
            break

    return frames

# Example usage
compressor = zstd.ZstdCompressor(write_content_size=True)
frame1 = compressor.compress(b"First frame data")
frame2 = compressor.compress(b"Second frame data")
frame3 = compressor.compress(b"Third frame data")

multi_frame_data = frame1 + frame2 + frame3
frames = analyze_multi_frame_data(multi_frame_data)

for i, frame in enumerate(frames):
    print(f"Frame {i+1}:")
    print(f"  Offset: {frame['offset']}")
    print(f"  Header size: {frame['header_size']}")
    print(f"  Content size: {frame['content_size']}")
    print(f"  Window size: {frame['window_size']}")
```
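
Exact frame boundaries can be found without decompressing by walking the 3-byte block headers that follow the frame header. The sketch below does this in pure Python, following the block layout in RFC 8878. All names are hypothetical and not part of `zstandard`, and the sketch handles only standard frames (no skippable or magicless frames). The demo builds tiny frames by hand from raw blocks so that it runs without any compressor.

```python
ZSTD_MAGIC = b"\x28\xb5\x2f\xfd"
DICT_ID_FIELD_SIZES = (0, 1, 2, 4)   # indexed by Dictionary_ID_flag
FCS_FIELD_SIZES = (0, 2, 4, 8)       # indexed by Frame_Content_Size_flag


def frame_compressed_size(data: bytes, offset: int = 0) -> int:
    """Return the total byte length of the standard frame starting at offset."""
    if len(data) < offset + 5 or data[offset:offset + 4] != ZSTD_MAGIC:
        raise ValueError(f"no standard zstd frame at offset {offset}")

    descriptor = data[offset + 4]
    fcs_flag = descriptor >> 6
    single_segment = (descriptor >> 5) & 1
    has_checksum = (descriptor >> 2) & 1
    dict_id_flag = descriptor & 0b11

    # Skip the frame header: magic, descriptor, then the optional fields.
    pos = offset + 5
    pos += 0 if single_segment else 1
    pos += DICT_ID_FIELD_SIZES[dict_id_flag]
    pos += FCS_FIELD_SIZES[fcs_flag]
    if single_segment and fcs_flag == 0:
        pos += 1

    # Walk the blocks until the one flagged as last.
    while True:
        if pos + 3 > len(data):
            raise ValueError("truncated block header")
        block_header = int.from_bytes(data[pos:pos + 3], "little")
        last_block = block_header & 1
        block_type = (block_header >> 1) & 0b11
        block_size = block_header >> 3
        if block_type == 3:
            raise ValueError("reserved block type")
        # An RLE block (type 1) stores a single byte regardless of Block_Size.
        pos += 3 + (1 if block_type == 1 else block_size)
        if last_block:
            break

    pos += 4 if has_checksum else 0   # optional content checksum
    if pos > len(data):
        raise ValueError("truncated frame")
    return pos - offset


def split_frames(data: bytes) -> list:
    """Return (offset, size) pairs for each concatenated frame."""
    frames = []
    offset = 0
    while offset < len(data):
        size = frame_compressed_size(data, offset)
        frames.append((offset, size))
        offset += size
    return frames


def raw_frame(payload: bytes) -> bytes:
    """Build a minimal frame holding one raw (uncompressed) block, for the demo."""
    assert len(payload) < 256
    block_header = (1 | (len(payload) << 3)).to_bytes(3, "little")
    return ZSTD_MAGIC + b"\x20" + bytes([len(payload)]) + block_header + payload


data = raw_frame(b"first") + raw_frame(b"second!") + raw_frame(b"")
print(split_frames(data))  # [(0, 14), (14, 16), (30, 9)]
```

With the offsets from `split_frames`, each slice `data[offset:offset + size]` can be passed to `zstd.get_frame_parameters()` individually.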

### Frame Validation

Validate a frame's header and format without full decompression. These checks do not verify the content checksum, which requires decompressing the data.

**Usage Example:**

```python
import zstandard as zstd

def validate_frame(data: bytes) -> dict:
    """Validate a zstd frame and return analysis results."""
    result = {
        'valid': False,
        'error': None,
        'analysis': None
    }

    try:
        # Check minimum size
        if len(data) < 4:
            result['error'] = "Data too short for zstd frame"
            return result

        # Check magic number
        if data[:4] != zstd.FRAME_HEADER:
            result['error'] = "Invalid zstd magic number"
            return result

        # Get frame parameters
        params = zstd.get_frame_parameters(data)

        # Validate parameters
        if params.content_size == zstd.CONTENTSIZE_ERROR:
            result['error'] = "Error reading frame parameters"
            return result

        # Get header size
        header_size = zstd.frame_header_size(data)

        if header_size <= 0 or header_size > len(data):
            result['error'] = f"Invalid header size: {header_size}"
            return result

        result['valid'] = True
        result['analysis'] = {
            'header_size': header_size,
            'content_size': params.content_size,
            'window_size': params.window_size,
            'dict_id': params.dict_id,
            'has_checksum': params.has_checksum,
            'total_size': len(data)
        }

    except Exception as e:
        result['error'] = str(e)

    return result

# Example usage
compressor = zstd.ZstdCompressor(write_checksum=True)
valid_data = compressor.compress(b"Valid frame data")
invalid_data = b"Invalid frame data"

# Validate frames
valid_result = validate_frame(valid_data)
invalid_result = validate_frame(invalid_data)

print("Valid frame:", valid_result['valid'])
if valid_result['valid']:
    analysis = valid_result['analysis']
    print(f"  Header size: {analysis['header_size']}")
    print(f"  Content size: {analysis['content_size']}")
    print(f"  Has checksum: {analysis['has_checksum']}")

print("Invalid frame:", invalid_result['valid'])
if not invalid_result['valid']:
    print(f"  Error: {invalid_result['error']}")
```

### Decompression Context Estimation

Estimate memory requirements for decompression without actually decompressing.

```python { .api }
def estimate_decompression_context_size() -> int:
    """
    Estimate memory usage for decompression context.

    Returns:
    int: Estimated memory usage in bytes
    """
```

**Usage Example:**

```python
import zstandard as zstd

# Estimate memory usage
estimated_memory = zstd.estimate_decompression_context_size()
print(f"Estimated decompression context size: {estimated_memory} bytes")

# Use for memory planning
def plan_decompression(compressed_frames: list[bytes]) -> dict:
    """Plan memory usage for batch decompression."""
    base_memory = zstd.estimate_decompression_context_size()

    total_compressed = sum(len(frame) for frame in compressed_frames)
    total_content_size = 0

    for frame in compressed_frames:
        try:
            content_size = zstd.frame_content_size(frame)
        except zstd.ZstdError:
            # Treat an unreadable header like an unknown size
            content_size = -1

        if content_size >= 0:
            total_content_size += content_size
        else:
            # Estimate if content size unknown
            total_content_size += len(frame) * 4  # rough estimate

    return {
        'base_memory': base_memory,
        'total_compressed': total_compressed,
        'estimated_decompressed': total_content_size,
        'peak_memory_estimate': base_memory + total_content_size
    }

# Example
compressor = zstd.ZstdCompressor()
compressed1 = compressor.compress(b"First frame data")
compressed2 = compressor.compress(b"Second frame data")
compressed3 = compressor.compress(b"Third frame data")

frames = [compressed1, compressed2, compressed3]
plan = plan_decompression(frames)
print(f"Peak memory estimate: {plan['peak_memory_estimate']} bytes")
```

## Constants

Frame analysis uses several constants for special values and format identification:

```python { .api }
# Content size special values
CONTENTSIZE_UNKNOWN: int  # Content size not stored in frame
CONTENTSIZE_ERROR: int    # Error reading content size

# Frame format constants
FORMAT_ZSTD1: int            # Standard zstd format with magic number
FORMAT_ZSTD1_MAGICLESS: int  # Zstd format without magic number

# Frame header magic number
FRAME_HEADER: bytes  # b"\x28\xb5\x2f\xfd"
MAGIC_NUMBER: int    # Magic number as integer
```
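
The two magic-number constants describe the same four bytes: the zstd format stores the magic number in little-endian order. The short sketch below shows the relationship using local stand-in values rather than the `zstandard` constants themselves.

```python
# Local stand-ins for illustration; the byte value matches FRAME_HEADER above.
FRAME_HEADER = b"\x28\xb5\x2f\xfd"

# Interpreting the bytes as a little-endian integer gives the magic number.
MAGIC_NUMBER = int.from_bytes(FRAME_HEADER, "little")
print(hex(MAGIC_NUMBER))  # 0xfd2fb528

# The conversion round-trips back to the on-disk byte sequence.
assert MAGIC_NUMBER.to_bytes(4, "little") == FRAME_HEADER
```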

## Performance Notes

- Frame analysis operations read only the frame header, so their cost does not depend on the size of the compressed payload
- No decompression is performed, making these operations suitable for large-scale analysis
- Use frame analysis to reject malformed headers before attempting decompression
- Content size information enables memory pre-allocation for better performance
- Frame parameter analysis helps choose appropriate decompression settings