# PyBase64

PyBase64 is a fast Base64 encoding/decoding library that wraps the optimized libbase64 C library. It offers the same API as Python's built-in `base64` module for easy integration, while delivering significantly faster performance through SIMD optimizations (AVX2, AVX512-VBMI, Neon) and a native C implementation.

## Package Information

- **Package Name**: pybase64
- **Language**: Python
- **Installation**: `pip install pybase64`
- **Documentation**: https://pybase64.readthedocs.io/en/stable
- **License**: BSD-2-Clause
- **CLI Tool**: Available as `pybase64` command or `python -m pybase64`

## Core Imports

```python
import pybase64
```

For specific functions:

```python
from pybase64 import b64encode, b64decode, standard_b64encode, urlsafe_b64decode
```

## Basic Usage

```python
import pybase64

# Basic encoding/decoding
data = b'Hello, World!'
encoded = pybase64.b64encode(data)
decoded = pybase64.b64decode(encoded)

print(encoded)  # b'SGVsbG8sIFdvcmxkIQ=='
print(decoded)  # b'Hello, World!'

# URL-safe encoding
url_encoded = pybase64.urlsafe_b64encode(data)
url_decoded = pybase64.urlsafe_b64decode(url_encoded)

# Custom alphabet
custom_encoded = pybase64.b64encode(data, altchars=b'_:')
custom_decoded = pybase64.b64decode(custom_encoded, altchars=b'_:')

# Validation for security-critical applications
secure_decoded = pybase64.b64decode(encoded, validate=True)

# Version and performance info
print(pybase64.get_version())  # Shows SIMD optimizations in use
```

## Architecture

PyBase64 provides a dual-implementation architecture for optimal performance:

- **C Extension** (`_pybase64`): High-performance implementation using libbase64 with SIMD optimizations
- **Python Fallback** (`_fallback`): Pure Python implementation built on the standard `base64` module, used when the C extension is unavailable
- **Automatic Selection**: Runtime detection automatically chooses the best available implementation
- **SIMD Detection**: Runtime CPU feature detection selects the best supported instruction set (AVX2, AVX512-VBMI, Neon)

This design delivers maximum performance where possible while maintaining compatibility across all Python environments, including PyPy and free-threaded builds.
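
The same "prefer the fast path, fall back to a compatible one" idea can also be applied at the application level. The sketch below is not PyBase64's internal selection code; it is a minimal illustration that assumes only the `base64`-compatible API described above. The `BACKEND` name is hypothetical and exists only for this example.

```python
# Prefer pybase64 when it is installed; otherwise use the standard library,
# which exposes the same b64encode/b64decode call signatures.
try:
    import pybase64 as b64
    BACKEND = "pybase64"  # hypothetical label, used only in this sketch
except ImportError:
    import base64 as b64
    BACKEND = "base64"

payload = b"Hello, World!"
encoded = b64.b64encode(payload)
decoded = b64.b64decode(encoded, validate=True)

print(BACKEND, encoded, decoded)
```

Because both modules accept the same arguments here, the rest of the application does not need to know which backend was imported.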

## Capabilities

### Core Encoding Functions

Primary Base64 encoding functions with full alphabet customization and optimal performance through C extensions.

```python { .api }
def b64encode(s: Buffer, altchars: str | Buffer | None = None) -> bytes:
    """
    Encode bytes using Base64 alphabet.

    Parameters:
    - s: bytes-like object to encode
    - altchars: optional 2-character string/bytes for custom alphabet (replaces '+' and '/')

    Returns:
    bytes: Base64 encoded data

    Raises:
    BufferError: if buffer is not C-contiguous
    TypeError: for invalid input types
    ValueError: for non-ASCII strings in altchars
    """

def b64encode_as_string(s: Buffer, altchars: str | Buffer | None = None) -> str:
    """
    Encode bytes using Base64 alphabet, return as string.

    Parameters:
    - s: bytes-like object to encode
    - altchars: optional 2-character string/bytes for custom alphabet

    Returns:
    str: Base64 encoded data as ASCII string
    """

def encodebytes(s: Buffer) -> bytes:
    """
    Encode bytes with MIME-style line breaks every 76 characters.

    Parameters:
    - s: bytes-like object to encode

    Returns:
    bytes: Base64 encoded data with newlines per RFC 2045 (MIME)
    """
```

### Core Decoding Functions

Base64 decoding functions with validation options and alternative alphabet support for maximum security and flexibility.

```python { .api }
def b64decode(s: str | Buffer, altchars: str | Buffer | None = None, validate: bool = False) -> bytes:
    """
    Decode Base64 encoded data.

    Parameters:
    - s: string or bytes-like object to decode
    - altchars: optional 2-character alternative alphabet
    - validate: if True, strictly validate input (recommended for security)

    Returns:
    bytes: decoded data

    Raises:
    binascii.Error: for invalid padding or characters (when validate=True)
    """

def b64decode_as_bytearray(s: str | Buffer, altchars: str | Buffer | None = None, validate: bool = False) -> bytearray:
    """
    Decode Base64 encoded data, return as bytearray.

    Parameters:
    - s: string or bytes-like object to decode
    - altchars: optional 2-character alternative alphabet
    - validate: if True, strictly validate input

    Returns:
    bytearray: decoded data as mutable bytearray

    Raises:
    binascii.Error: for invalid padding or characters (when validate=True)
    """
```
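
To make the effect of `validate` concrete, the sketch below decodes the same slightly corrupted input in both modes. It relies on the documented `base64`-compatible behaviour (non-alphabet characters are discarded when `validate=False`), and the import falls back to the standard library so the snippet also runs where pybase64 is not installed. The variable names are illustrative only.

```python
import binascii

try:
    import pybase64 as b64
except ImportError:  # fallback so the sketch runs without pybase64 installed
    import base64 as b64

# "Hello" encodes to SGVsbG8=; a newline is injected to make the input "dirty".
dirty = b"SGVs\nbG8="

# validate=False (the default): characters outside the alphabet are discarded.
lenient = b64.b64decode(dirty)
print(lenient)  # b'Hello'

# validate=True: the same input is rejected with binascii.Error.
try:
    b64.b64decode(dirty, validate=True)
    strict_rejected = False
except binascii.Error:
    strict_rejected = True
print(strict_rejected)  # True
```

For untrusted input, the strict mode is usually the safer choice because malformed data fails loudly instead of being silently cleaned up.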

### Standard Base64 Functions

Convenience functions for standard Base64 alphabet encoding/decoding, compatible with Python's base64 module.

```python { .api }
def standard_b64encode(s: Buffer) -> bytes:
    """
    Encode using standard Base64 alphabet (+/).

    Parameters:
    - s: bytes-like object to encode

    Returns:
    bytes: standard Base64 encoded data
    """

def standard_b64decode(s: str | Buffer) -> bytes:
    """
    Decode standard Base64 encoded data.

    Parameters:
    - s: string or bytes-like object to decode

    Returns:
    bytes: decoded data

    Raises:
    binascii.Error: for invalid input
    """
```

### URL-Safe Base64 Functions

URL and filesystem safe Base64 encoding/decoding using a modified alphabet (`-_` instead of `+/`) for web applications and file names.

```python { .api }
def urlsafe_b64encode(s: Buffer) -> bytes:
    """
    Encode using URL-safe Base64 alphabet (-_).

    Parameters:
    - s: bytes-like object to encode

    Returns:
    bytes: URL-safe Base64 encoded data
    """

def urlsafe_b64decode(s: str | Buffer) -> bytes:
    """
    Decode URL-safe Base64 encoded data.

    Parameters:
    - s: string or bytes-like object to decode

    Returns:
    bytes: decoded data

    Raises:
    binascii.Error: for invalid input
    """
```
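
The difference between the two alphabets only shows up for inputs whose encoding contains the last two alphabet characters. As a small illustration (the input bytes are chosen purely for this purpose, and the import falls back to the API-compatible standard library when pybase64 is not installed):

```python
try:
    import pybase64 as b64
except ImportError:  # fallback so the sketch runs without pybase64 installed
    import base64 as b64

raw = b"\xfb\xff"  # chosen because its encoding uses both special characters

standard = b64.standard_b64encode(raw)
url_safe = b64.urlsafe_b64encode(raw)
print(standard)  # b'+/8='
print(url_safe)  # b'-_8='

# Each variant must be decoded with its matching function.
round_trip = b64.urlsafe_b64decode(url_safe)
print(round_trip == raw)  # True
```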

### Utility Functions

Version and license information functions for runtime introspection and compliance reporting.

```python { .api }
def get_version() -> str:
    """
    Get pybase64 version with optimization status.

    Returns:
    str: version string with C extension and SIMD status
    e.g., "1.4.2 (C extension active - AVX2)"
    """

def get_license_text() -> str:
    """
    Get complete license information.

    Returns:
    str: license text including libbase64 license information
    """
```

### SIMD Detection Functions

Internal functions for SIMD optimization control and introspection (available when C extension is active).

```python { .api }
def _get_simd_flags_compile() -> int:
    """
    Get compile-time SIMD flags used when building the C extension.

    Returns:
    int: bitmask of SIMD instruction sets available at compile time
    """

def _get_simd_flags_runtime() -> int:
    """
    Get runtime SIMD flags detected on current CPU.

    Returns:
    int: bitmask of SIMD instruction sets available at runtime
    """

def _get_simd_name(flags: int) -> str:
    """
    Get human-readable name for SIMD instruction set.

    Parameters:
    - flags: SIMD flags bitmask

    Returns:
    str: SIMD instruction set name (e.g., "AVX2", "fallback")
    """

def _get_simd_path() -> int:
    """
    Get currently active SIMD path flags.

    Returns:
    int: active SIMD flags for current execution path
    """

def _set_simd_path(flags: int) -> None:
    """
    Set SIMD path for optimization (advanced users only).

    Parameters:
    - flags: SIMD flags to activate

    Note: Only available when C extension is active
    """
```

### Command-Line Interface

PyBase64 provides a command-line tool for encoding, decoding, and benchmarking Base64 operations.

```bash { .api }
# Main command with version and help
pybase64 --version
pybase64 --license
pybase64 -h

# Encoding subcommand
pybase64 encode <input_file> [-o <output_file>] [-u|--url] [-a <altchars>]

# Decoding subcommand
pybase64 decode <input_file> [-o <output_file>] [-u|--url] [-a <altchars>] [--no-validation]

# Benchmarking subcommand
pybase64 benchmark <input_file> [-d <duration>]
```

The CLI can also be invoked using Python module syntax:

```bash { .api }
python -m pybase64 <subcommand> [arguments...]
```

### Module Attributes

Package version and exported symbols for version checking and introspection.

```python { .api }
__version__: str  # Package version string
__all__: tuple[str, ...]  # Exported public API symbols
```

## Type Definitions

```python { .api }
import sys
from typing import Protocol

# Type alias for bytes-like objects (version-dependent import)
if sys.version_info < (3, 12):
    from typing_extensions import Buffer
else:
    from collections.abc import Buffer

# Protocol for decode functions
class Decode(Protocol):
    __name__: str
    __module__: str
    def __call__(self, s: str | Buffer, altchars: str | Buffer | None = None, validate: bool = False) -> bytes: ...

# Protocol for encode functions
class Encode(Protocol):
    __name__: str
    __module__: str
    def __call__(self, s: Buffer, altchars: Buffer | None = None) -> bytes: ...

# Protocol for encodebytes-style functions
class EncodeBytes(Protocol):
    __name__: str
    __module__: str
    def __call__(self, s: Buffer) -> bytes: ...
```
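
Callable protocols like these are useful for writing code that accepts any compatible encoder. The sketch below shows the idea with a simplified, hypothetical `EncodeLike` protocol (it uses `bytes` rather than `Buffer` to stay dependency-free) and a hypothetical `encode_all` helper; neither name is part of PyBase64.

```python
import base64
from typing import Optional, Protocol


class EncodeLike(Protocol):
    """Hypothetical stand-in for the Encode protocol, using bytes instead of Buffer."""

    def __call__(self, s: bytes, altchars: Optional[bytes] = None) -> bytes: ...


def encode_all(encode: EncodeLike, chunks: list[bytes]) -> list[bytes]:
    """Encode every chunk with whichever compatible encoder is passed in."""
    return [encode(chunk) for chunk in chunks]


# base64.b64encode matches the shape; pybase64.b64encode could be passed the same way.
result = encode_all(base64.b64encode, [b"Hello", b"World"])
print(result)  # [b'SGVsbG8=', b'V29ybGQ=']
```

Typing the parameter structurally means the helper does not import or depend on a specific Base64 backend.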

## Usage Examples

### Performance-Optimized Decoding

```python
import pybase64

# For maximum security and performance, use validate=True.
# This enables optimized validation in the C extension.
data = b'SGVsbG8sIFdvcmxkIQ=='
decoded = pybase64.b64decode(data, validate=True)
```

### Custom Alphabet Usage

```python
import pybase64

# Create data with custom alphabet for specific protocols
data = b'binary data here'
encoded = pybase64.b64encode(data, altchars=b'@&')
# Result uses @ and & instead of + and /

# Decode with same custom alphabet
decoded = pybase64.b64decode(encoded, altchars=b'@&')
```

### MIME-Compatible Encoding

```python
import pybase64

# Encode with line breaks for email/MIME compatibility
large_data = b'x' * 200  # 200 bytes of sample binary data
mime_encoded = pybase64.encodebytes(large_data)
# Result has a newline after every 76 output characters, per RFC 2045
```
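
The line structure of that output can be checked directly. In this sketch the import falls back to the standard library's `base64.encodebytes`, which the overview describes as API-compatible, so it runs even where pybase64 is not installed. 200 input bytes produce 268 Base64 characters, which wrap into three full 76-character lines plus a 40-character remainder.

```python
try:
    import pybase64 as b64
except ImportError:  # fallback so the sketch runs without pybase64 installed
    import base64 as b64

large_data = b"x" * 200
mime_encoded = b64.encodebytes(large_data)

# Split on the inserted newlines and inspect each line's length.
line_lengths = [len(line) for line in mime_encoded.splitlines()]
print(line_lengths)                   # [76, 76, 76, 40]
print(mime_encoded.endswith(b"\n"))   # True: the last line is newline-terminated too
```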

### Runtime Performance Information

```python
import pybase64

# Check if C extension and SIMD optimizations are active
version_info = pybase64.get_version()
print(version_info)
# Output examples:
# "1.4.2 (C extension active - AVX2)"
# "1.4.2 (C extension inactive)"  # Fallback mode
```

### Command-Line Usage Examples

```bash
# Encode a file using standard Base64
pybase64 encode input.txt -o encoded.txt

# Decode (validation is applied unless --no-validation is passed)
pybase64 decode encoded.txt -o decoded.txt

# URL-safe encoding for web applications
pybase64 encode data.bin -u -o urlsafe.txt

# Custom alphabet encoding
pybase64 encode data.bin -a '@&' -o custom.txt

# Benchmark performance on your system
pybase64 benchmark test_data.bin

# Pipe operations (using stdin/stdout)
echo "Hello World" | pybase64 encode -
cat encoded.txt | pybase64 decode - > decoded.txt

# Check version and license
pybase64 --version
pybase64 --license

# Using Python module syntax
python -m pybase64 encode input.txt
```

## Error Handling

All decoding functions may raise `binascii.Error` for:

- Incorrect Base64 padding
- Invalid characters in input (when `validate=True`)
- Malformed Base64 strings

Encoding functions may raise:

- `BufferError` for non-contiguous memory buffers
- `TypeError` for invalid input types
- `ValueError` for non-ASCII characters in custom alphabets
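
A minimal sketch of handling two of these cases follows. The `try_decode` helper is hypothetical (it is not part of PyBase64), and the import falls back to the API-compatible standard library so the snippet runs without pybase64 installed.

```python
import binascii
from typing import Optional

try:
    import pybase64 as b64
except ImportError:  # fallback so the sketch runs without pybase64 installed
    import base64 as b64


def try_decode(text: bytes) -> Optional[bytes]:
    """Return decoded bytes, or None when the input is not valid Base64."""
    try:
        return b64.b64decode(text, validate=True)
    except binascii.Error:
        return None


ok = try_decode(b"SGVsbG8=")          # well-formed input
bad_padding = try_decode(b"SGVsbG8")  # same data with the padding removed
print(ok, bad_padding)  # b'Hello' None

# Encoders expect bytes-like input; passing a str raises TypeError.
try:
    b64.b64encode("not bytes")
    raised_type_error = False
except TypeError:
    raised_type_error = True
print(raised_type_error)  # True
```

Returning `None` is only one possible policy; callers that need the failure reason can let `binascii.Error` propagate instead.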

## Performance Notes

- Use `validate=True` for security-critical applications; it is optimized in the C extension
- The C extension is reported to be roughly 5-20x faster than Python's built-in `base64`; the actual gain depends on CPU, input size, and operation
- SIMD optimizations (AVX2, AVX512-VBMI, Neon) are automatically detected and used when available
- For maximum performance, use `b64decode` and `b64encode` directly rather than wrapper functions
- PyPy and free-threaded Python builds are fully supported with automatic fallback
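
Speedups vary by machine, so it is worth measuring on your own hardware. The `pybase64 benchmark` subcommand listed earlier is the library's own tool for this; the sketch below is a rough, hand-rolled alternative using `timeit`. The input size and repeat counts are arbitrary assumptions, and pybase64 is only timed when it is installed.

```python
import base64
import timeit

try:
    import pybase64
except ImportError:
    pybase64 = None  # the comparison then covers the standard library only

data = b"x" * 1_000_000  # 1 MB of sample input (arbitrary size for this sketch)

encoders = {"base64": base64.b64encode}
if pybase64 is not None:
    encoders["pybase64"] = pybase64.b64encode

# Best of three runs, five calls each, to reduce timing noise.
timings = {
    name: min(timeit.repeat(lambda f=func: f(data), number=5, repeat=3))
    for name, func in encoders.items()
}

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s for 5 encodes of 1 MB")
```

Taking the minimum of several repeats is a common way to filter out interference from other processes; it gives a comparison, not a rigorous benchmark.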