# Bencoding

Complete bencoding implementation for encoding and decoding BitTorrent data structures. Bencode is the encoding format used by the BitTorrent protocol for storing and transmitting structured data, supporting strings, integers, lists, and dictionaries.

## Capabilities

### Encoding to Bencode

Convert Python objects to bencoded byte strings.

```python { .api }
@classmethod
def encode(cls, value: TypeEncodable) -> bytes:
    """
    Encode Python object to bencoded bytes.

    Supports encoding of strings, integers, lists, tuples, sets, dictionaries,
    bytes, and bytearrays. Dictionaries are automatically sorted by key as
    required by bencode specification.

    Parameters:
    - value (TypeEncodable): Python object to encode
      Union[str, int, list, set, tuple, dict, bytes, bytearray]

    Returns:
    bytes: Bencoded data ready for storage or transmission

    Raises:
    BencodeEncodingError: If value type cannot be encoded
    """
```
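
To make the encoding behaviour described above more concrete, here is a minimal standalone sketch of a bencode encoder. It is not torrentool's implementation: `sketch_encode` and the `Encodable` alias are hypothetical names, the sketch assumes dictionary keys are `str`, and it raises a plain `TypeError` where the library raises `BencodeEncodingError`.

```python
from typing import Union

# Hypothetical alias mirroring the types listed for TypeEncodable above.
Encodable = Union[str, int, list, set, tuple, dict, bytes, bytearray]


def sketch_encode(value: Encodable) -> bytes:
    """Illustrative bencode encoder (a sketch, not torrentool's code)."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        # Text is stored as UTF-8 bytes.
        value = value.encode("utf-8")
    if isinstance(value, (bytes, bytearray)):
        # The length prefix counts bytes, not characters.
        return b"%d:%s" % (len(value), bytes(value))
    if isinstance(value, (list, tuple, set)):
        # Tuples and sets are written as lists; set order is not defined.
        return b"l" + b"".join(sketch_encode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Keys are emitted in sorted order (assumes str keys in this sketch).
        parts = [sketch_encode(key) + sketch_encode(value[key]) for key in sorted(value)]
        return b"d" + b"".join(parts) + b"e"
    raise TypeError(f"cannot encode {type(value).__name__}")


unsorted_result = sketch_encode({"b": 1, "a": 2})   # b'd1:ai2e1:bi1ee'
list_result = sketch_encode(["spam", "eggs"])        # b'l4:spam4:eggse'
```

Because keys are sorted before writing, two dictionaries with the same contents but different insertion order produce identical bytes, which matters when the encoded data is hashed.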

### Decoding from Bencode

Parse bencoded data back into Python objects.

```python { .api }
@classmethod
def decode(cls, encoded: bytes, *, byte_keys: Set[str] = None) -> TypeEncodable:
    """
    Decode bencoded bytes to Python objects.

    Automatically reconstructs the original data structure from bencoded format.
    Handles special case of binary data that should remain as bytes rather than
    being decoded as UTF-8 strings.

    Parameters:
    - encoded (bytes): Bencoded data to decode
    - byte_keys (Set[str], optional): Keys whose values should remain as bytes
      instead of being decoded as UTF-8 strings.
      Commonly used for 'pieces' field in torrents.

    Returns:
    TypeEncodable: Decoded Python object (dict, list, str, int, or bytes)

    Raises:
    BencodeDecodingError: If data is malformed or cannot be parsed
    """
```
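
The role of `byte_keys` is easier to see in a small standalone decoder. The sketch below is an illustration under assumptions, not torrentool's actual code: `sketch_decode` is a hypothetical name, it raises `ValueError` rather than `BencodeDecodingError`, and it assumes that string values are decoded as UTF-8 unless their dictionary key is listed in `byte_keys`, falling back to raw bytes when the data is not valid UTF-8.

```python
from typing import Any, Optional, Set, Tuple


def sketch_decode(data: bytes, byte_keys: Optional[Set[str]] = None) -> Any:
    """Illustrative bencode decoder (a sketch, not torrentool's code)."""
    keep = byte_keys or set()

    def parse(pos: int, keep_bytes: bool) -> Tuple[Any, int]:
        lead = data[pos:pos + 1]
        if lead == b"i":                      # integer: i<number>e
            end = data.index(b"e", pos)
            return int(data[pos + 1:end]), end + 1
        if lead == b"l":                      # list: l<contents>e
            items, pos = [], pos + 1
            while data[pos:pos + 1] != b"e":
                item, pos = parse(pos, keep_bytes)
                items.append(item)
            return items, pos + 1
        if lead == b"d":                      # dictionary: d<contents>e
            result, pos = {}, pos + 1
            while data[pos:pos + 1] != b"e":
                key, pos = parse(pos, False)
                # Values under a listed key stay as raw bytes.
                value, pos = parse(pos, key in keep)
                result[key] = value
            return result, pos + 1
        if lead.isdigit():                    # string: <length>:<content>
            colon = data.index(b":", pos)
            start = colon + 1
            end = start + int(data[pos:colon])
            raw = data[start:end]
            if keep_bytes:
                return raw, end
            try:
                return raw.decode("utf-8"), end
            except UnicodeDecodeError:
                return raw, end               # fallback for non-UTF-8 data
        raise ValueError(f"unexpected byte at offset {pos}")

    value, _ = parse(0, False)
    return value


encoded = b"d4:name3:abc6:pieces3:\x01\x02\x03e"
with_keys = sketch_decode(encoded, byte_keys={"pieces"})
without_keys = sketch_decode(encoded)
```

In this sketch, `with_keys["pieces"]` stays `bytes`, while `without_keys["pieces"]` becomes a `str` because those three bytes happen to be valid UTF-8. That ambiguity is the reason to list binary fields such as `pieces` explicitly.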

### String Decoding

Decode bencoded strings directly.

```python { .api }
@classmethod
def read_string(cls, string: Union[str, bytes], *, byte_keys: Set[str] = None) -> TypeEncodable:
    """
    Decode bencoded string or byte string.

    Convenience method for decoding bencoded data provided as string.
    Automatically converts string to bytes before decoding.

    Parameters:
    - string (Union[str, bytes]): Bencoded data as string or bytes
    - byte_keys (Set[str], optional): Keys to keep as bytes rather than decode as UTF-8

    Returns:
    TypeEncodable: Decoded Python object

    Raises:
    BencodeDecodingError: If string is malformed
    """
```

### File Decoding

Decode bencoded files directly from disk.

```python { .api }
@classmethod
def read_file(cls, filepath: Union[str, Path], *, byte_keys: Set[str] = None) -> TypeEncodable:
    """
    Decode bencoded data from file.

    Reads entire file into memory and decodes the bencoded content.
    Commonly used for reading .torrent files.

    Parameters:
    - filepath (Union[str, Path]): Path to file containing bencoded data
    - byte_keys (Set[str], optional): Keys to preserve as bytes

    Returns:
    TypeEncodable: Decoded file contents as Python objects

    Raises:
    BencodeDecodingError: If file contains malformed bencoded data
    FileNotFoundError: If file does not exist
    """
```
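
As a rough illustration of the read step, the sketch below loads a file in binary mode and runs a cheap structural check before any decoding is attempted. `looks_like_bencoded_dict` is a hypothetical helper, not part of torrentool. It relies only on the fact that a bencoded dictionary, the top-level structure of a .torrent file, starts with `d` and ends with `e`.

```python
import tempfile
from pathlib import Path
from typing import Union


def looks_like_bencoded_dict(filepath: Union[str, Path]) -> bool:
    """Cheap pre-check (hypothetical helper): does the file start with b'd' and end with b'e'?"""
    raw = Path(filepath).read_bytes()  # binary mode, whole file read into memory
    return len(raw) >= 2 and raw[:1] == b"d" and raw[-1:] == b"e"


# Demonstrate on two throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "good.torrent"
    good.write_bytes(b"d8:announce3:urle")
    bad = Path(tmp) / "bad.torrent"
    bad.write_bytes(b"not bencode")

    good_ok = looks_like_bencoded_dict(good)
    bad_ok = looks_like_bencoded_dict(bad)
```

A check like this only rules out obviously wrong files. Passing it does not guarantee that the content decodes cleanly.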

## Usage Examples

### Basic Encoding and Decoding

```python
from torrentool.bencode import Bencode

# Encode various Python objects
data = {
    'announce': 'http://tracker.example.com/announce',
    'info': {
        'name': 'example.txt',
        'length': 12345,
        'pieces': b'\x01\x02\x03\x04\x05'  # Binary hash data
    },
    'trackers': ['http://t1.com', 'http://t2.com'],
    'created': 1234567890
}

# Encode to bencoded bytes
encoded = Bencode.encode(data)
print(f"Encoded size: {len(encoded)} bytes")

# Decode back to Python objects
# Specify 'pieces' as byte key to prevent UTF-8 decoding
decoded = Bencode.decode(encoded, byte_keys={'pieces'})
print(f"Decoded: {decoded}")

# Verify round-trip
assert decoded == data
```

### Working with Torrent Files

```python
from torrentool.bencode import Bencode
from pathlib import Path

# Read a .torrent file
torrent_path = Path('example.torrent')
torrent_data = Bencode.read_file(torrent_path, byte_keys={'pieces'})

print(f"Torrent name: {torrent_data['info']['name']}")
print(f"Announce URL: {torrent_data['announce']}")
print(f"Piece length: {torrent_data['info']['piece length']}")

# Modify and save back
torrent_data['comment'] = 'Modified by Python script'
encoded_data = Bencode.encode(torrent_data)

with open('modified.torrent', 'wb') as f:
    f.write(encoded_data)
```

### String and Bytes Handling

```python
from torrentool.bencode import Bencode

# Working with strings vs bytes
string_data = "Hello, world!"
bytes_data = b"Binary data \x00\x01\x02"

# Both can be encoded
encoded_string = Bencode.encode(string_data)
encoded_bytes = Bencode.encode(bytes_data)

# Decode: bencode has a single byte-string type, so the original Python type
# is not recorded. Values are decoded as UTF-8 where possible.
decoded_string = Bencode.decode(encoded_string)  # Returns str
decoded_bytes = Bencode.decode(encoded_bytes)  # Valid UTF-8, so this may come back as str

print(f"String: {decoded_string}")
print(f"Bytes: {decoded_bytes}")

# Complex structure with mixed types
mixed_data = {
    'text': 'This is text',
    'binary': b'\x89PNG\r\n\x1a\n',  # PNG header (not valid UTF-8)
    'number': 42,
    'list': ['item1', 'item2', b'binary_item']
}

encoded_mixed = Bencode.encode(mixed_data)
decoded_mixed = Bencode.decode(encoded_mixed)
```

### Error Handling

```python
from torrentool.bencode import Bencode, BencodeDecodingError, BencodeEncodingError

# Handle encoding errors
try:
    invalid_data = object()  # Objects cannot be encoded
    Bencode.encode(invalid_data)
except BencodeEncodingError as e:
    print(f"Encoding failed: {e}")

# Handle decoding errors
try:
    malformed_data = b"invalid bencode data"
    Bencode.decode(malformed_data)
except BencodeDecodingError as e:
    print(f"Decoding failed: {e}")

# Graceful handling of corrupted files
try:
    corrupted_torrent = Bencode.read_file('corrupted.torrent')
except BencodeDecodingError:
    print("Torrent file is corrupted or not a valid torrent")
except FileNotFoundError:
    print("Torrent file not found")
```

## Bencode Format Details

The bencode format uses the following encoding rules:

- **Strings**: `<length>:<content>` (e.g., `4:spam` for "spam")
- **Integers**: `i<number>e` (e.g., `i42e` for 42)
- **Lists**: `l<contents>e` (e.g., `l4:spam4:eggse` for ['spam', 'eggs'])
- **Dictionaries**: `d<contents>e` with keys sorted (e.g., `d3:key5:valuee`)

The implementation handles edge cases including:

- Empty strings and containers
- Negative integers
- Binary data mixed with text
- Nested structures of arbitrary depth
- UTF-8 encoding/decoding with fallback for malformed data