tessl/pypi-torrentool

A Python library for creating, reading, and modifying torrent files with bencoding utilities and CLI tools.

Bencoding

A complete bencoding implementation for encoding and decoding BitTorrent data structures. Bencode is the serialization format the BitTorrent protocol uses to store and transmit structured data. It supports four types: byte strings, integers, lists, and dictionaries.

Capabilities

Encoding to Bencode

Convert Python objects to bencoded byte strings.

@classmethod
def encode(cls, value: TypeEncodable) -> bytes:
    """
    Encode Python object to bencoded bytes.
    
    Supports encoding of strings, integers, lists, tuples, sets, dictionaries,
    bytes, and bytearrays. Dictionaries are automatically sorted by key, as
    required by the bencode specification.
    
    Parameters:
    - value (TypeEncodable): Python object to encode
                           Union[str, int, list, set, tuple, dict, bytes, bytearray]
    
    Returns:
    bytes: Bencoded data ready for storage or transmission
    
    Raises:
    BencodeEncodingError: If value type cannot be encoded
    """

Decoding from Bencode

Parse bencoded data back into Python objects.

@classmethod
def decode(cls, encoded: bytes, *, byte_keys: Set[str] = None) -> TypeEncodable:
    """
    Decode bencoded bytes to Python objects.
    
    Reconstructs the data structure from its bencoded form. Byte strings are
    decoded as UTF-8 text by default; use byte_keys to name dictionary keys
    whose values must stay as raw bytes.
    
    Parameters:
    - encoded (bytes): Bencoded data to decode
    - byte_keys (Set[str], optional): Keys whose values should remain as bytes
                                     instead of being decoded as UTF-8 strings.
                                     Commonly used for 'pieces' field in torrents.
    
    Returns:
    TypeEncodable: Decoded Python object (dict, list, str, int, or bytes)
    
    Raises:
    BencodeDecodingError: If data is malformed or cannot be parsed
    """

String Decoding

Decode bencoded strings directly.

@classmethod
def read_string(cls, string: Union[str, bytes], *, byte_keys: Set[str] = None) -> TypeEncodable:
    """
    Decode a bencoded string or byte string.
    
    Convenience method for decoding bencoded data that is already in memory.
    A str argument is converted to bytes before decoding.
    
    Parameters:
    - string (Union[str, bytes]): Bencoded data as string or bytes
    - byte_keys (Set[str], optional): Keys to keep as bytes rather than decode as UTF-8
    
    Returns:
    TypeEncodable: Decoded Python object
    
    Raises:
    BencodeDecodingError: If string is malformed
    """

File Decoding

Decode bencoded files directly from disk.

@classmethod
def read_file(cls, filepath: Union[str, Path], *, byte_keys: Set[str] = None) -> TypeEncodable:
    """
    Decode bencoded data from file.
    
    Reads entire file into memory and decodes the bencoded content.
    Commonly used for reading .torrent files.
    
    Parameters:
    - filepath (Union[str, Path]): Path to file containing bencoded data
    - byte_keys (Set[str], optional): Keys to preserve as bytes
    
    Returns:
    TypeEncodable: Decoded file contents as Python objects
    
    Raises:
    BencodeDecodingError: If file contains malformed bencoded data
    FileNotFoundError: If file does not exist
    """

Usage Examples

Basic Encoding and Decoding

from torrentool.bencode import Bencode

# Encode various Python objects
data = {
    'announce': 'http://tracker.example.com/announce',
    'info': {
        'name': 'example.txt',
        'length': 12345,
        'pieces': b'\x01\x02\x03\x04\x05'  # Binary hash data
    },
    'trackers': ['http://t1.com', 'http://t2.com'],
    'created': 1234567890
}

# Encode to bencoded bytes
encoded = Bencode.encode(data)
print(f"Encoded size: {len(encoded)} bytes")

# Decode back to Python objects
# Specify 'pieces' as byte key to prevent UTF-8 decoding
decoded = Bencode.decode(encoded, byte_keys={'pieces'})
print(f"Decoded: {decoded}")

# Verify round-trip
assert decoded == data

Working with Torrent Files

from torrentool.bencode import Bencode
from pathlib import Path

# Read a .torrent file
torrent_path = Path('example.torrent')
torrent_data = Bencode.read_file(torrent_path, byte_keys={'pieces'})

print(f"Torrent name: {torrent_data['info']['name']}")
print(f"Announce URL: {torrent_data['announce']}")
print(f"Piece length: {torrent_data['info']['piece length']}")

# Modify and save back
torrent_data['comment'] = 'Modified by Python script'
encoded_data = Bencode.encode(torrent_data)

with open('modified.torrent', 'wb') as f:
    f.write(encoded_data)
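
Keeping 'pieces' as bytes matters because, in the BitTorrent metainfo format, that field is a concatenation of 20-byte SHA-1 digests, one per piece. The sketch below shows one way to split such a blob once it has been read as bytes. It does not use torrentool; `split_pieces` is a hypothetical helper and the sample blob is stand-in data built with `hashlib`.

```python
import hashlib
from typing import List


def split_pieces(pieces: bytes, digest_size: int = 20) -> List[bytes]:
    """Split a 'pieces' blob into per-piece digests (hypothetical helper)."""
    if len(pieces) % digest_size:
        raise ValueError("pieces length is not a multiple of the digest size")
    return [pieces[i:i + digest_size] for i in range(0, len(pieces), digest_size)]


# Stand-in for torrent_data['info']['pieces']; real data would come from a .torrent file.
pieces_blob = hashlib.sha1(b"first piece").digest() + hashlib.sha1(b"second piece").digest()

hashes = split_pieces(pieces_blob)
for index, digest in enumerate(hashes):
    print(index, digest.hex())
```

If 'pieces' had been decoded as text instead, this slicing would operate on characters rather than bytes and the digests would no longer line up.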

String and Bytes Handling

from torrentool.bencode import Bencode

# Working with strings vs bytes
string_data = "Hello, world!"
bytes_data = b"Binary data \x00\x01\x02"

# Both can be encoded
encoded_string = Bencode.encode(string_data)
encoded_bytes = Bencode.encode(bytes_data)

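# Decode - bencode has a single byte-string type, so the result is str
# whenever the payload is valid UTF-8, and bytes only when it is not
decoded_string = Bencode.decode(encoded_string)  # Returns str
decoded_bytes = Bencode.decode(encoded_bytes)    # Payload is valid UTF-8, so expect str here as well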

print(f"String: {decoded_string}")
print(f"Bytes: {decoded_bytes}")

# Complex structure with mixed types
mixed_data = {
    'text': 'This is text',
    'binary': b'\x89PNG\r\n\x1a\n',  # PNG header
    'number': 42,
    'list': ['item1', 'item2', b'binary_item']
}

encoded_mixed = Bencode.encode(mixed_data)
decoded_mixed = Bencode.decode(encoded_mixed)
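
Because bencode stores text and binary data in the same byte-string type, a decoder that tries UTF-8 first can only hand back bytes when decoding fails. The following sketch, which is independent of torrentool, checks which of the byte values used above are valid UTF-8. `is_valid_utf8` is a hypothetical helper introduced for illustration.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


samples = {
    "ascii_with_control_bytes": b"Binary data \x00\x01\x02",
    "png_header": b"\x89PNG\r\n\x1a\n",
    "plain_item": b"binary_item",
}

# Only the PNG header fails: 0x89 cannot start a UTF-8 sequence.
results = {name: is_valid_utf8(raw) for name, raw in samples.items()}
print(results)
```

Values that pass this check are indistinguishable from text after a round trip, which is why binary fields inside dictionaries are best named explicitly through byte_keys.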

Error Handling

from torrentool.bencode import Bencode, BencodeDecodingError, BencodeEncodingError

# Handle encoding errors
try:
    invalid_data = object()  # Objects cannot be encoded
    Bencode.encode(invalid_data)
except BencodeEncodingError as e:
    print(f"Encoding failed: {e}")

# Handle decoding errors
try:
    malformed_data = b"invalid bencode data"
    Bencode.decode(malformed_data)
except BencodeDecodingError as e:
    print(f"Decoding failed: {e}")

# Graceful handling of corrupted files
try:
    corrupted_torrent = Bencode.read_file('corrupted.torrent')
except BencodeDecodingError:
    print("Torrent file is corrupted or not a valid torrent")
except FileNotFoundError:
    print("Torrent file not found")

Bencode Format Details

The bencode format uses the following encoding rules:

  • Strings: <length>:<content> (e.g., 4:spam for "spam")
  • Integers: i<number>e (e.g., i42e for 42)
  • Lists: l<contents>e (e.g., l4:spam4:eggse for ['spam', 'eggs'])
  • Dictionaries: d<contents>e with keys sorted (e.g., d3:key5:valuee for {'key': 'value'})
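
The four rules above are small enough to sketch as a single recursive function. The code below is an illustration of the format only, not torrentool's implementation: `bencode` and `_key_bytes` are hypothetical names, sets are left out, and rejecting bool is a choice made for this sketch.

```python
def _key_bytes(key) -> bytes:
    """Normalize a dictionary key to bytes so keys sort by raw byte value."""
    if isinstance(key, str):
        return key.encode("utf-8")
    if isinstance(key, (bytes, bytearray)):
        return bytes(key)
    raise TypeError("dictionary keys must be str or bytes")


def bencode(value) -> bytes:
    """Encode a value using the four bencode rules (illustrative sketch only)."""
    if isinstance(value, bool):
        # bool is a subclass of int; this sketch rejects it rather than guess.
        raise TypeError("bool has no bencode representation in this sketch")
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, (bytes, bytearray)):
        # The length prefix counts bytes, not characters.
        return str(len(value)).encode("ascii") + b":" + bytes(value)
    if isinstance(value, (list, tuple)):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        pairs = [(_key_bytes(k), v) for k, v in value.items()]
        pairs.sort(key=lambda pair: pair[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in pairs) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


print(bencode({"spam": "eggs", "cow": "moo"}))
print(bencode("h\u00e9llo"))  # 5 characters, 6 bytes once UTF-8 encoded
```

Sorting on the encoded key bytes keeps the output deterministic, so the same dictionary always produces the same byte string.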

The implementation handles edge cases including:

  • Empty strings and containers
  • Negative integers
  • Binary data mixed with text
  • Nested structures of arbitrary depth
  • UTF-8 decoding of byte strings, with a fallback to raw bytes when the data is not valid UTF-8
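
To make these cases concrete, here is a minimal decoder sketch that covers them. It is not torrentool's code: `bdecode` is a hypothetical function, it raises plain `ValueError` rather than the library's exceptions, and its `byte_keys` parameter only imitates the behaviour described for `Bencode.decode`.

```python
from typing import Any, Optional, Set, Tuple


def bdecode(data: bytes, byte_keys: Optional[Set[str]] = None) -> Any:
    """Decode bencoded bytes (illustrative sketch, not the library's decoder)."""
    keep = byte_keys or set()

    def to_text(raw: bytes) -> Any:
        # Text is a best-effort guess: fall back to raw bytes on invalid UTF-8.
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            return raw

    def parse(pos: int, keep_bytes: bool = False) -> Tuple[Any, int]:
        if pos >= len(data):
            raise ValueError("unexpected end of data")
        lead = data[pos:pos + 1]
        if lead == b"i":
            end = data.index(b"e", pos)
            return int(data[pos + 1:end]), end + 1
        if lead == b"l":
            pos += 1
            items = []
            while data[pos:pos + 1] != b"e":
                item, pos = parse(pos, keep_bytes)
                items.append(item)
            return items, pos + 1
        if lead == b"d":
            pos += 1
            result = {}
            while data[pos:pos + 1] != b"e":
                key, pos = parse(pos)
                value, pos = parse(pos, keep_bytes=key in keep)
                result[key] = value
            return result, pos + 1
        if lead.isdigit():
            colon = data.index(b":", pos)
            length = int(data[pos:colon])
            start = colon + 1
            raw = data[start:start + length]
            if len(raw) != length:
                raise ValueError("string is shorter than its length prefix")
            return (raw if keep_bytes else to_text(raw)), start + length
        raise ValueError(f"unexpected byte {lead!r} at offset {pos}")

    value, end = parse(0)
    if end != len(data):
        raise ValueError("trailing data after bencoded value")
    return value


print(bdecode(b"d4:infod6:pieces2:\x01\x02ee", byte_keys={"pieces"}))
print(bdecode(b"l0:i-3elee"))
```

The recursion mirrors the nesting of the format, so depth is limited only by Python's recursion limit in this sketch.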

Install with Tessl CLI

npx tessl i tessl/pypi-torrentool
