# zipp

A pathlib-compatible zipfile object wrapper that provides an intuitive, Path-like interface for working with ZIP archives. This library serves as the official backport of the standard library Path object for zipfile operations, enabling seamless integration between file system operations and ZIP archive manipulation using familiar pathlib syntax.

## Package Information

- **Package Name**: zipp
- **Package Type**: pypi
- **Language**: Python
- **Installation**: `pip install zipp`
- **Python Version**: >= 3.9

## Core Imports

```python
import zipp
```

Standard usage:

```python
from zipp import Path
```

Advanced usage:

```python
from zipp import Path, CompleteDirs, FastLookup
from zipp.glob import Translator
```

Compatibility functions:

```python
from zipp.compat.py310 import text_encoding
```

## Basic Usage

```python
import zipfile
from zipp import Path

# Create or open a zip file
with zipfile.ZipFile('example.zip', 'w') as zf:
    zf.writestr('data/file1.txt', 'content of file1')
    zf.writestr('data/subdir/file2.txt', 'content of file2')
    zf.writestr('config.json', '{"key": "value"}')

# Use zipp.Path to work with the zip file
zip_path = Path('example.zip')

# Check if paths exist
print(zip_path.exists())  # True
print((zip_path / 'data').exists())  # True
print((zip_path / 'missing.txt').exists())  # False

# Read file contents
config_path = zip_path / 'config.json'
config_content = config_path.read_text()
print(config_content)  # {"key": "value"}

# Iterate through directory contents
data_dir = zip_path / 'data'
for item in data_dir.iterdir():
    print(f"{item.name}: {'directory' if item.is_dir() else 'file'}")

# Use glob patterns to find files
txt_files = list(zip_path.glob('**/*.txt'))
for txt_file in txt_files:
    print(f"Found: {txt_file}")
    content = txt_file.read_text()
    print(f"Content: {content}")
```

## Architecture

zipp implements a layered architecture that extends `zipfile` functionality:

- **Path**: Main user-facing class providing a pathlib-compatible interface
- **CompleteDirs**: `ZipFile` subclass that automatically includes implied directories in file listings
- **FastLookup**: Performance-optimized subclass with cached name lookups
- **Translator**: Glob-pattern-to-regex conversion for file pattern matching

This design lets ZIP archives behave consistently with file system paths, while cached lookups keep repeated name queries cheap on large archives.

85

86

## Capabilities

### Path Operations

Core pathlib-compatible interface for navigating and manipulating paths within ZIP archives.

```python { .api }
class Path:
    def __init__(self, root, at: str = ""):
        """
        Construct a Path from a ZipFile or filename.

        Note: When the source is an existing ZipFile object, its type
        (__class__) will be mutated to a specialized type. If the caller
        wishes to retain the original type, create a separate ZipFile
        object or pass a filename.

        Args:
            root: ZipFile object or path to zip file
            at (str): Path within the zip file, defaults to root
        """

    def __eq__(self, other) -> bool:
        """
        Test path equality.

        Args:
            other: Other object to compare

        Returns:
            bool: True if paths are equal, NotImplemented for different types
        """

    def __hash__(self) -> int:
        """Return hash of path for use in sets and dicts."""

    @property
    def name(self) -> str:
        """Name of the path entry (final component)."""

    @property
    def suffix(self) -> str:
        """File suffix (extension including the dot)."""

    @property
    def suffixes(self) -> list[str]:
        """List of all file suffixes."""

    @property
    def stem(self) -> str:
        """Filename without the final suffix."""

    @property
    def filename(self) -> pathlib.Path:
        """Full filesystem path including zip file path and internal path."""

    @property
    def parent(self) -> "Path":
        """Parent directory path within the ZIP file."""

    def joinpath(self, *other) -> "Path":
        """
        Join path components.

        Args:
            *other: Path components to join

        Returns:
            Path: New Path object with joined components
        """

    def __truediv__(self, other) -> "Path":
        """
        Path joining using / operator.

        Args:
            other: Path component to join

        Returns:
            Path: New Path object with joined component
        """

    def relative_to(self, other, *extra) -> str:
        """
        Return relative path from other path.

        Args:
            other: Base path for relative calculation
            *extra: Additional path components for base

        Returns:
            str: Relative path string
        """

    def __str__(self) -> str:
        """String representation combining zip filename and internal path."""

    def __repr__(self) -> str:
        """Detailed string representation showing class, zip file, and internal path."""
```

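A short sketch of the joining and naming members listed above. It builds the archive in memory, and, purely as an assumption so the snippet runs where zipp is not installed, falls back to the standard library's `zipfile.Path`, which zipp backports. String conversion is left out because an in-memory archive has no filename.

```python
import io
import zipfile

try:
    from zipp import Path  # the backport described in this document
except ImportError:  # assumption: use the stdlib equivalent instead
    from zipfile import Path

# In-memory archive so the example leaves no files behind.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("data/file1.txt", "content of file1")
    zf.writestr("data/subdir/file2.txt", "content of file2")

# Passing a ZipFile object works as well as passing a filename.
root = Path(zipfile.ZipFile(buffer))

file1 = root / "data" / "file1.txt"              # joining with the / operator
file2 = root.joinpath("data/subdir/file2.txt")   # joining with joinpath()

print(file1.name)         # final path component
print(file1.parent.name)  # name of the containing directory
print(file2.at)           # location inside the archive
```

Both joining styles return new `Path` objects, so the results can be chained further or handed to the file and directory operations described in the following sections.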

### File Operations

Read and write operations for files within ZIP archives.

```python { .api }
def open(self, mode: str = 'r', *args, pwd=None, **kwargs):
    """
    Open file for reading or writing following pathlib.Path.open() semantics.

    Text mode arguments are passed through to io.TextIOWrapper().

    Args:
        mode (str): File mode ('r', 'rb', 'w', 'wb'). Defaults to 'r'.
        pwd (bytes, optional): Password for encrypted ZIP files
        *args: Additional positional arguments for TextIOWrapper (text mode only)
        **kwargs: Additional keyword arguments for TextIOWrapper (text mode only)

    Returns:
        IO: File-like object (TextIOWrapper for text mode, raw stream for binary)

    Raises:
        IsADirectoryError: If path is a directory
        FileNotFoundError: If file doesn't exist in read mode
        ValueError: If encoding args provided for binary mode
    """

def read_text(self, *args, **kwargs) -> str:
    """
    Read file contents as text with proper encoding handling.

    Args:
        *args: Positional arguments for text encoding (encoding, errors, newline)
        **kwargs: Keyword arguments for text processing

    Returns:
        str: File contents as decoded text
    """

def read_bytes(self) -> bytes:
    """
    Read file contents as bytes.

    Returns:
        bytes: Raw file contents without any encoding
    """
```

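The sketch below exercises the read paths and two of the documented error cases. As an assumption made only for portability, it falls back to the standard library's `zipfile.Path` when zipp is not importable; the archive contents are invented for the example.

```python
import io
import zipfile

try:
    from zipp import Path
except ImportError:  # assumption: stdlib fallback so the sketch runs anywhere
    from zipfile import Path

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("notes/hello.txt", "caf\u00e9\n".encode("utf-8"))

root = Path(zipfile.ZipFile(buffer))
hello = root / "notes" / "hello.txt"

# Text and binary reads of the same entry.
text = hello.read_text(encoding="utf-8")
raw = hello.read_bytes()

# Binary mode returns a raw stream.
with hello.open("rb") as stream:
    first_bytes = stream.read(3)

# Encoding arguments are only meaningful in text mode.
try:
    hello.open("rb", encoding="utf-8")
except ValueError:
    binary_rejects_encoding = True
else:
    binary_rejects_encoding = False

# Directories cannot be opened as files.
try:
    (root / "notes").open()
except IsADirectoryError:
    directory_rejected = True
else:
    directory_rejected = False
```

Passing `encoding` explicitly to `read_text` avoids depending on the platform's default encoding.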

### Path Testing

Methods to test path properties and existence.

```python { .api }
def exists(self) -> bool:
    """Check if path exists in the zip file."""

def is_file(self) -> bool:
    """Check if path is a file."""

def is_dir(self) -> bool:
    """Check if path is a directory."""

def is_symlink(self) -> bool:
    """Check if path is a symbolic link."""
```

### Directory Operations

Navigate and list directory contents within ZIP archives.

```python { .api }
def iterdir(self) -> Iterator["Path"]:
    """
    Iterate over immediate children of this directory.

    Returns:
        Iterator[Path]: Path objects for immediate directory contents only

    Raises:
        ValueError: If path is not a directory
    """
```

268

269

### Pattern Matching

Find files using glob patterns and path matching.

```python { .api }
def match(self, path_pattern: str) -> bool:
    """
    Test if path matches the given pattern using pathlib-style matching.

    Args:
        path_pattern (str): Pattern to match against (e.g., '*.txt', 'data/*')

    Returns:
        bool: True if path matches pattern
    """

def glob(self, pattern: str) -> Iterator["Path"]:
    """
    Find all paths matching a glob pattern starting from this path.

    Args:
        pattern (str): Glob pattern to match (e.g., '*.txt', 'data/*.json')

    Returns:
        Iterator[Path]: Path objects matching the pattern

    Raises:
        ValueError: If pattern is empty or invalid
    """

def rglob(self, pattern: str) -> Iterator["Path"]:
    """
    Recursively find all paths matching a glob pattern.

    Equivalent to calling glob(f'**/{pattern}').

    Args:
        pattern (str): Glob pattern to match recursively

    Returns:
        Iterator[Path]: Path objects matching the pattern recursively
    """
```

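The "pathlib-style matching" that `match` refers to can be explored without an archive, using `pathlib.PurePosixPath` on archive-internal names. This is a sketch of the matching rules only; that `Path.match` behaves identically in every case is an assumption, and `matching` is a hypothetical helper.

```python
from pathlib import PurePosixPath

# Archive-internal names, as they would appear in a ZIP listing.
entries = ["config.json", "data/file1.txt", "data/subdir/file2.txt"]


def matching(names, pattern):
    """Return the names that match `pattern` under pathlib's rules."""
    return [name for name in names if PurePosixPath(name).match(pattern)]


# A relative pattern is matched from the right-hand side of the path,
# so '*.txt' matches text files at any depth.
txt_entries = matching(entries, "*.txt")

# 'data/*' needs the last two components to be 'data' and anything,
# so the file nested one level deeper does not match.
data_children = matching(entries, "data/*")

print(txt_entries)
print(data_children)
```

This differs from `glob`, which searches the archive starting from a given path rather than testing a single path against a pattern.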

### Advanced ZipFile Classes

Enhanced ZipFile subclasses for specialized use cases.

```python { .api }
class InitializedState:
    """
    Mix-in to save the initialization state for pickling.

    Preserves constructor arguments for proper serialization/deserialization.
    """

    def __init__(self, *args, **kwargs):
        """Initialize and save constructor arguments."""

    def __getstate__(self):
        """Return state for pickling."""

    def __setstate__(self, state):
        """Restore state from pickle."""

class CompleteDirs(InitializedState, zipfile.ZipFile):
    """
    ZipFile subclass that ensures implied directories are included.

    Automatically includes parent directories for files in the namelist,
    enabling proper directory traversal even when directories aren't
    explicitly stored in the ZIP file.
    """

    @classmethod
    def make(cls, source):
        """
        Create appropriate CompleteDirs subclass from source.

        Args:
            source: ZipFile object or filename

        Returns:
            CompleteDirs: CompleteDirs or FastLookup instance
        """

    @classmethod
    def inject(cls, zf: zipfile.ZipFile) -> zipfile.ZipFile:
        """
        Inject directory entries for implied directories.

        Args:
            zf (zipfile.ZipFile): Writable ZipFile to modify

        Returns:
            zipfile.ZipFile: Modified zip file with directory entries
        """

    def namelist(self) -> list[str]:
        """Return file list including implied directories."""

    def resolve_dir(self, name: str) -> str:
        """
        Resolve directory name with proper trailing slash.

        Args:
            name (str): Directory name to resolve

        Returns:
            str: Directory name with trailing slash if it's a directory
        """

    def getinfo(self, name: str) -> zipfile.ZipInfo:
        """
        Get ZipInfo for file, including implied directories.

        Args:
            name (str): File or directory name

        Returns:
            zipfile.ZipInfo: File information object

        Raises:
            KeyError: If file doesn't exist and isn't an implied directory
        """

    @staticmethod
    def _implied_dirs(names: list[str]):
        """
        Generate implied parent directories from file list.

        Args:
            names (list[str]): List of file names in ZIP

        Returns:
            Iterator[str]: Implied directory names with trailing slashes
        """

class FastLookup(CompleteDirs):
    """
    CompleteDirs subclass with cached lookups for performance.

    Uses functools.cached_property for efficient repeated access
    to namelist and name set operations.
    """

    def namelist(self) -> list[str]:
        """Cached access to file list."""

    @property
    def _namelist(self) -> list[str]:
        """Cached property for namelist."""

    def _name_set(self) -> set[str]:
        """Cached access to name set."""

    @property
    def _name_set_prop(self) -> set[str]:
        """Cached property for name set."""
```

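The idea behind `InitializedState`, saving constructor arguments so an object can be rebuilt instead of copied field by field, can be sketched in plain Python. Everything below is hypothetical and for illustration only: `SavedInitArgs`, `Connection`, and `RestorableConnection` are invented names, and the real mix-in may differ in detail.

```python
class SavedInitArgs:
    """Hypothetical mix-in that remembers constructor arguments."""

    def __init__(self, *args, **kwargs):
        self._init_args = args
        self._init_kwargs = kwargs
        super().__init__(*args, **kwargs)

    def __getstate__(self):
        # Only the constructor arguments are serialized.
        return self._init_args, self._init_kwargs

    def __setstate__(self, state):
        # Rebuild the object by running the constructor again.
        args, kwargs = state
        self.__init__(*args, **kwargs)


class Connection:
    """Stand-in for a class holding state that cannot be pickled directly."""

    def __init__(self, target, mode="r"):
        self.target = target
        self.mode = mode
        self.handle = lambda: target  # lambdas cannot be pickled


class RestorableConnection(SavedInitArgs, Connection):
    pass


original = RestorableConnection("archive.zip", mode="r")
state = original.__getstate__()

# pickle performs roughly these two steps when loading an object:
# create an empty instance, then hand it the saved state.
clone = RestorableConnection.__new__(RestorableConnection)
clone.__setstate__(state)

print(state)
print(clone.target, clone.mode)
```

The same pattern suits objects that wrap open file handles: the handle itself is never serialized, only the arguments needed to open it again.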

### Pattern Translation

Convert glob patterns to regular expressions for file matching.

```python { .api }
class Translator:
    """
    Translate glob patterns to regex patterns for ZIP file path matching.

    Handles platform-specific path separators and converts shell-style
    wildcards into regular expressions suitable for matching ZIP entries.
    """

    def __init__(self, seps: str = None):
        """
        Initialize translator with path separators.

        Args:
            seps (str, optional): Path separator characters.
                Defaults to os.sep + os.altsep if available.

        Raises:
            AssertionError: If separators are invalid or empty
        """

    def translate(self, pattern: str) -> str:
        """
        Convert glob pattern to regex.

        Args:
            pattern (str): Glob pattern to convert (e.g., '*.txt', '**/data/*.json')

        Returns:
            str: Regular expression pattern with full match semantics

        Raises:
            ValueError: If ** appears incorrectly in pattern (not alone in path segment)
        """

    def extend(self, pattern: str) -> str:
        """
        Extend regex for pattern-wide concerns.

        Applies a non-capturing group with the DOTALL flag (so that '.'
        also matches newlines) and appends an end-of-string anchor for
        fullmatch semantics.

        Args:
            pattern (str): Base regex pattern

        Returns:
            str: Extended regex with (?s:pattern)\\Z format
        """

    def match_dirs(self, pattern: str) -> str:
        """
        Ensure ZIP directory names are matched.

        ZIP directory names always end with '/'; this makes patterns match
        both with and without the trailing slash.

        Args:
            pattern (str): Regex pattern

        Returns:
            str: Pattern with optional trailing slash
        """

    def translate_core(self, pattern: str) -> str:
        """
        Core glob to regex translation logic.

        Args:
            pattern (str): Glob pattern

        Returns:
            str: Base regex pattern before extension
        """

    def replace(self, match) -> str:
        """
        Perform regex replacements for glob wildcards.

        Args:
            match: Regex match object from separate()

        Returns:
            str: Replacement string for the match
        """

    def restrict_rglob(self, pattern: str) -> None:
        """
        Validate ** usage in pattern.

        Args:
            pattern (str): Glob pattern to validate

        Raises:
            ValueError: If ** appears in partial path segments
        """

    def star_not_empty(self, pattern: str) -> str:
        """
        Ensure * will not match empty segments.

        Args:
            pattern (str): Glob pattern

        Returns:
            str: Modified pattern where * becomes ?*
        """

# Module-level helper in zipp.glob (not a Translator method)
def separate(pattern: str):
    """
    Separate character sets to avoid translating their contents.

    Args:
        pattern (str): Glob pattern with potential character sets

    Returns:
        Iterator: Match objects for pattern segments
    """
```

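To make the glob-to-regex idea concrete, here is a deliberately simplified sketch. It is not zipp's `Translator`: `simple_glob_to_regex` is a hypothetical function that handles only `*` and `?` within a single separator style, ignores `**` and character sets, and wraps the result in the `(?s:...)\Z` form mentioned above.

```python
import re


def simple_glob_to_regex(pattern, sep="/"):
    """Translate a glob with only * and ? into a regex string.

    Hypothetical, simplified helper; wildcards never cross `sep`.
    """
    not_sep = f"[^{re.escape(sep)}]"
    parts = []
    for char in pattern:
        if char == "*":
            parts.append(f"{not_sep}*")   # any run of non-separator characters
        elif char == "?":
            parts.append(not_sep)         # exactly one non-separator character
        else:
            parts.append(re.escape(char))  # literal text
    body = "".join(parts)
    # (?s:...) lets '.' match newlines; \Z anchors the end so that
    # re.match behaves like a full match.
    return rf"(?s:{body})\Z"


top_level_txt = simple_glob_to_regex("*.txt")
data_txt = simple_glob_to_regex("data/*.txt")

print(top_level_txt)
print(bool(re.match(top_level_txt, "file1.txt")))
print(bool(re.match(top_level_txt, "data/file1.txt")))
print(bool(re.match(data_txt, "data/file1.txt")))
```

Keeping `*` from crossing the separator is what distinguishes path globbing from plain wildcard matching, where `*` would also swallow directory boundaries.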

## Usage Examples

### Working with Complex Directory Structures

```python
from zipp import Path
import zipfile

# Create a zip with complex structure
with zipfile.ZipFile('project.zip', 'w') as zf:
    zf.writestr('src/main.py', 'print("Hello World")')
    zf.writestr('src/utils/helpers.py', 'def helper(): pass')
    zf.writestr('tests/test_main.py', 'def test_main(): assert True')
    zf.writestr('docs/README.md', '# Project Documentation')
    zf.writestr('config/settings.json', '{"debug": true}')

# Navigate the zip file structure
project = Path('project.zip')

# Find all Python files
python_files = list(project.rglob('*.py'))
print(f"Found {len(python_files)} Python files:")
for py_file in python_files:
    print(f"  {py_file}")

# Read configuration
config = project / 'config' / 'settings.json'
if config.exists():
    settings = config.read_text()
    print(f"Settings: {settings}")

# List directory contents with details
src_dir = project / 'src'
print(f"Contents of {src_dir}:")
for item in src_dir.iterdir():
    item_type = "directory" if item.is_dir() else "file"
    print(f"  {item.name} ({item_type})")
```

### Pattern Matching and Filtering

```python
from zipp import Path

zip_path = Path('archive.zip')

# Find files by extension
text_files = list(zip_path.glob('**/*.txt'))
# Brace expansion such as '*.{jpg,png,gif}' is not supported by glob,
# so run one pattern per extension
image_files = [
    match
    for ext in ('jpg', 'png', 'gif')
    for match in zip_path.glob(f'**/*.{ext}')
]

# Find files in specific directories
src_files = list(zip_path.glob('src/**/*'))
test_files = list(zip_path.glob('**/test_*.py'))

# Check for specific patterns
has_readme = any(zip_path.glob('**/README*'))
config_files = list(zip_path.glob('**/config.*'))

print(f"Text files: {len(text_files)}")
print(f"Image files: {len(image_files)}")
print(f"Has README: {has_readme}")
```

### Error Handling

```python
from zipp import Path

try:
    zip_path = Path('example.zip')

    # Check if file exists before reading
    target_file = zip_path / 'data' / 'important.txt'
    if target_file.exists():
        content = target_file.read_text()
        print(content)
    else:
        print("File not found in archive")

    # Handle directory operations
    try:
        directory = zip_path / 'folder'
        # Raises IsADirectoryError when 'folder' is a directory in the archive
        with directory.open('r') as f:
            content = f.read()
    except IsADirectoryError:
        print("Cannot open directory as file")
        # List directory contents instead
        for item in directory.iterdir():
            print(f"Directory contains: {item.name}")

except FileNotFoundError:
    # Raised when example.zip is missing, or when an entry opened
    # for reading does not exist in the archive
    print("Zip file not found")
except Exception as e:
    print(f"Error working with zip file: {e}")
```

### Utility Functions

Low-level utility functions for path manipulation and data processing.

```python { .api }
def _parents(path: str):
    """
    Generate all parent paths of the given path.

    Args:
        path (str): Path with posixpath.sep-separated elements

    Returns:
        Iterator[str]: Parent paths in order from immediate to root

    Examples:
        >>> list(_parents('b/d/f/'))
        ['b/d', 'b']
        >>> list(_parents('b'))
        []
    """

def _ancestry(path: str):
    """
    Generate all elements of a path including itself.

    Args:
        path (str): Path with posixpath.sep-separated elements

    Returns:
        Iterator[str]: Path elements from full path to root

    Examples:
        >>> list(_ancestry('b/d/f/'))
        ['b/d/f', 'b/d', 'b']
    """

def _difference(minuend, subtrahend):
    """
    Return items in minuend not in subtrahend, retaining order.

    Uses O(1) lookup for efficient filtering of large sequences.

    Args:
        minuend: Items to filter from
        subtrahend: Items to exclude

    Returns:
        Iterator: Filtered items in original order
    """

def _dedupe(iterable):
    """
    Deduplicate an iterable in original order.

    Implemented as dict.fromkeys for efficiency.

    Args:
        iterable: Items to deduplicate

    Returns:
        dict: Dictionary whose keys are the unique items in original order
    """
```

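The behaviour documented for these helpers can be reproduced with a few lines of standard-library code. The sketch below uses public-looking names (`ancestry`, `parents`, `difference`, `dedupe`) to make clear that it is an illustration written from the docstrings above, not the library's private functions.

```python
import itertools
import posixpath


def ancestry(path):
    """Yield `path` and each of its parents, deepest first."""
    path = path.rstrip(posixpath.sep)
    while path.rstrip(posixpath.sep):
        yield path
        path, _tail = posixpath.split(path)


def parents(path):
    """Yield every parent of `path`, skipping the path itself."""
    return itertools.islice(ancestry(path), 1, None)


def difference(minuend, subtrahend):
    """Yield items of `minuend` absent from `subtrahend`, keeping order."""
    excluded = set(subtrahend)  # set membership gives O(1) lookups
    return itertools.filterfalse(excluded.__contains__, minuend)


def dedupe(iterable):
    """Drop duplicates while keeping first-seen order (dict keys are ordered)."""
    return dict.fromkeys(iterable)


print(list(parents("b/d/f/")))
print(list(ancestry("b/d/f/")))
print(list(difference(["a/", "b/", "c/"], ["b/"])))
print(list(dedupe(["a/", "b/", "a/"])))
```

Together these cover the steps needed to derive implied directories: expand each name into its ancestry, remove duplicates, and subtract the names that are already stored.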

### Compatibility Functions

Cross-version compatibility utilities for different Python versions.

```python { .api }
def text_encoding(encoding=None, stacklevel=2):
    """
    Handle text encoding with proper warnings (Python 3.10+ compatibility).

    Args:
        encoding (str, optional): Text encoding to use
        stacklevel (int): Stack level for warnings

    Returns:
        str: Encoding string to use for text operations
    """

def save_method_args(method):
    """
    Decorator to save method arguments for serialization.

    Used by InitializedState mixin for pickle support.

    Args:
        method: Method to wrap

    Returns:
        function: Wrapped method that saves args/kwargs
    """
```

## Types

```python { .api }
class Path:
    """
    A pathlib-compatible interface for zip file paths.

    Main user-facing class that provides familiar pathlib.Path-like
    operations for navigating and manipulating ZIP file contents.
    """

class CompleteDirs(InitializedState, zipfile.ZipFile):
    """
    ZipFile subclass ensuring implied directories are included.

    Extends zipfile.ZipFile to automatically handle parent directories
    that aren't explicitly stored in ZIP files.
    """

class FastLookup(CompleteDirs):
    """
    Performance-optimized CompleteDirs with cached operations.

    Uses functools.cached_property for efficient repeated access
    to file listings and lookups in large ZIP archives.
    """

class Translator:
    """
    Glob pattern to regex translator for ZIP file matching.

    Converts shell-style wildcard patterns into regular expressions
    suitable for matching ZIP file paths.
    """

class InitializedState:
    """
    Mixin class for preserving initialization state in pickle operations.

    Saves constructor arguments to enable proper serialization and
    deserialization of ZipFile subclasses.
    """

# Exception types that may be raised:
#   IsADirectoryError: Raised when trying to open a directory as a file
#   FileNotFoundError: Raised when a file doesn't exist
#   ValueError: Raised for invalid patterns or operations
#   KeyError: Raised for missing zip file entries (handled internally)
#   TypeError: Raised when zipfile has no filename for certain operations
```