tessl/pypi-pyfury

Blazingly fast multi-language serialization framework powered by JIT and zero-copy

Describes: pkg:pypi/pyfury@0.10.x

To install, run

npx @tessl/cli install tessl/pypi-pyfury@0.10.0

# PyFury

PyFury is the Python implementation of Apache Fury, a blazingly fast multi-language serialization framework powered by JIT compilation and zero-copy techniques. PyFury provides high-performance serialization for Python objects with support for cross-language compatibility, reference tracking, and row format operations.

## Package Information

- **Package Name**: pyfury
- **Package Type**: pypi
- **Language**: Python
- **Installation**: `pip install pyfury`

## Core Imports

```python
import pyfury
from pyfury import Fury, Language
```

For row format operations:

```python
from pyfury.format import encoder, RowData
```

## Basic Usage

```python
from dataclasses import dataclass
from typing import Any

import pyfury


# Placeholder type for illustration only; substitute your own class
@dataclass
class SomeClass:
    f1: Any = None


# Create Fury instance
fury = pyfury.Fury(ref_tracking=True)

# Register classes for cross-language serialization
fury.register_class(SomeClass, type_tag="example.SomeClass")

# Serialize object
obj = SomeClass()
bytes_data = fury.serialize(obj)

# Deserialize object
restored_obj = fury.deserialize(bytes_data)

# Cross-language serialization
xlang_fury = pyfury.Fury(language=pyfury.Language.XLANG, ref_tracking=True)
xlang_fury.register_class(SomeClass, type_tag="example.SomeClass")
xlang_bytes = xlang_fury.serialize(obj)


# Row format encoding for zero-copy operations
@dataclass
class DataObject:
    field1: int
    field2: str


row_encoder = pyfury.encoder(DataObject)
data_obj = DataObject(field1=42, field2="hello")
row_data = row_encoder.to_row(data_obj)
```

## Architecture

PyFury is built around several key components:

- **Fury Engine**: Core Python serialization engine with configurable language modes
- **Language Support**: Python-native and cross-language (XLANG) serialization modes
- **Reference Tracking**: Optional circular reference and shared object support
- **Class Registration**: Security-focused type system with allowlists
- **Serializer Framework**: Extensible system for custom types and built-in Python types
- **Row Format**: Zero-copy binary row format with Arrow integration
- **Buffer Management**: Efficient binary I/O with memory buffer pooling
- **Meta Strings**: Optimized string encoding and meta compression

## Capabilities

### Core Serialization

Primary serialization operations for converting Python objects to/from binary format with optional reference tracking and circular reference support.

```python { .api }
class Fury:
    def __init__(
        self,
        language: Language = Language.XLANG,
        ref_tracking: bool = False,
        require_class_registration: bool = True,
    ): ...

    def serialize(
        self,
        obj,
        buffer: Buffer = None,
        buffer_callback=None,
        unsupported_callback=None,
    ) -> Union[Buffer, bytes]: ...

    def deserialize(
        self,
        buffer: Union[Buffer, bytes],
        buffers: Iterable = None,
        unsupported_objects: Iterable = None,
    ): ...
```
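
To make the `ref_tracking` flag more concrete, the following sketch shows the general idea behind reference tracking in a serializer: the first time an object is seen it is written in full and assigned an id, and later occurrences are written as back-references to that id. It is a stdlib-only toy over nested lists. The names `encode` and `decode` and the tuple layout are hypothetical, chosen for illustration, and do not reflect PyFury's wire format or implementation.

```python
def encode(obj, seen=None):
    """Encode nested lists, emitting back-references for repeated objects."""
    if seen is None:
        seen = {}  # maps id(obj) -> reference id
    if isinstance(obj, list):
        if id(obj) in seen:
            return ("ref", seen[id(obj)])
        ref_id = len(seen)
        seen[id(obj)] = ref_id
        return ("new", ref_id, [encode(item, seen) for item in obj])
    return ("value", obj)


def decode(node, refs=None):
    """Rebuild the object graph, resolving back-references by id."""
    if refs is None:
        refs = {}
    kind = node[0]
    if kind == "value":
        return node[1]
    if kind == "ref":
        return refs[node[1]]
    _, ref_id, items = node
    result = []
    refs[ref_id] = result  # register before decoding children so cycles resolve
    for item in items:
        result.append(decode(item, refs))
    return result


shared = [1, 2]
cycle = [shared, shared]
cycle.append(cycle)  # circular reference

restored = decode(encode(cycle))
```

After the round trip, `restored[0]` and `restored[1]` are the same list object and `restored[2]` is `restored` itself. Without the `seen` table, the shared list would be duplicated and the cycle would recurse without end, which is the cost/benefit trade-off the flag controls.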

[Core Serialization](./core-serialization.md)

### Class Registration and Type System

Type registration system for security and cross-language compatibility with support for custom type tags and class IDs.

```python { .api }
class Fury:
    def register_class(
        self,
        cls,
        *,
        class_id: int = None,
        type_tag: str = None
    ): ...

    def register_serializer(self, cls: type, serializer): ...

class Language(enum.Enum):
    XLANG = 0
    JAVA = 1

class OpaqueObject:
    def __init__(self, type_id: int, data: bytes): ...
```
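
As a rough illustration of why registration acts as an allowlist, the sketch below maps type tags to classes and refuses to decode any tag that was not registered. `TagRegistry`, its `dump`/`load` methods, and the dict payload are hypothetical stand-ins written with the standard library only. They are not PyFury's class resolver, and only the `register_class(cls, type_tag=...)` call shape is borrowed from the stub above.

```python
from dataclasses import asdict, dataclass


class TagRegistry:
    """Toy allowlist mapping type tags to classes."""

    def __init__(self):
        self._by_tag = {}
        self._by_cls = {}

    def register_class(self, cls, *, type_tag):
        self._by_tag[type_tag] = cls
        self._by_cls[cls] = type_tag

    def dump(self, obj):
        tag = self._by_cls.get(type(obj))
        if tag is None:
            raise TypeError(f"{type(obj).__name__} is not registered")
        return {"tag": tag, "fields": asdict(obj)}

    def load(self, payload):
        cls = self._by_tag.get(payload["tag"])
        if cls is None:
            # Unknown tags are rejected instead of being resolved dynamically.
            raise TypeError(f"type tag {payload['tag']!r} is not registered")
        return cls(**payload["fields"])


@dataclass
class Point:
    x: int
    y: int


registry = TagRegistry()
registry.register_class(Point, type_tag="example.Point")

payload = registry.dump(Point(1, 2))
restored = registry.load(payload)

try:
    registry.load({"tag": "os.system", "fields": {}})
    rejected = False
except TypeError:
    rejected = True
```

Because the decoder only ever instantiates classes found in its own table, the data being read cannot choose which code runs. The same tag on both sides is also what lets two languages agree on a type.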

[Type System](./type-system.md)

### Row Format and Arrow Integration

Zero-copy row format encoding with PyArrow integration for efficient columnar data operations and cross-language data exchange.

```python { .api }
@dataclass
class RowData:
    def __init__(self, schema, data: bytes): ...

def encoder(cls_or_schema):
    """Create row encoder for a dataclass or Arrow schema."""

class Encoder:
    def to_row(self, obj) -> RowData: ...
    def from_row(self, row_data: RowData): ...
    @property
    def schema(self): ...

class ArrowWriter:
    def write(self, obj): ...
```
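
The appeal of a row format is that fields sit at known offsets, so a reader can pull out one field directly from the encoded bytes without rebuilding the whole object. The sketch below illustrates that idea with `struct` and `memoryview` from the standard library. The three-field layout, `to_row`, and `read_field` are assumptions made up for this example and are not Fury's actual row layout or API.

```python
import struct

# Hypothetical fixed-width layout: int64 id, float64 score, int32 count.
ROW = struct.Struct("<qdi")
FIELD_OFFSETS = {"id": 0, "score": 8, "count": 16}
FIELD_FORMATS = {"id": "<q", "score": "<d", "count": "<i"}


def to_row(id_, score, count):
    """Pack the three fields into one contiguous 20-byte row."""
    return ROW.pack(id_, score, count)


def read_field(row, name):
    """Decode a single field in place; the other fields are never touched."""
    view = memoryview(row)  # a view over the bytes, not a copy
    (value,) = struct.unpack_from(FIELD_FORMATS[name], view, FIELD_OFFSETS[name])
    return value


row = to_row(7, 0.5, 3)
count = read_field(row, "count")
score = read_field(row, "score")
```

Reading `count` touches only the four bytes at offset 16. A general-purpose object serializer would typically need to decode the entire payload first.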

[Row Format](./row-format.md)

### Serializer Framework

Extensible serializer system for handling custom types, built-in Python types, and cross-language compatible serialization.

```python { .api }
class Serializer:
    def __init__(self, fury, cls): ...
    def write(self, buffer, value): ...
    def read(self, buffer): ...

class CrossLanguageCompatibleSerializer(Serializer):
    """Base class for serializers that support cross-language serialization."""

class BufferObject:
    """Interface for objects that can provide buffer data."""
    def to_buffer(self) -> bytes: ...
```
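
A custom serializer is essentially a matched `write`/`read` pair that agrees on a byte layout. The sketch below shows that contract for `datetime.date`, using `io.BytesIO` as a stand-in buffer so it runs with the standard library alone. `DateSerializer` is a hypothetical class that only mirrors the method shape of the `Serializer` stub above. It does not subclass PyFury's `Serializer` or use its `Buffer`.

```python
import io
import struct
from datetime import date


class DateSerializer:
    """Stand-in with the same write/read shape as the Serializer stub."""

    def write(self, buffer, value):
        # Store the date as a proleptic Gregorian ordinal in 4 bytes.
        buffer.write(struct.pack("<i", value.toordinal()))

    def read(self, buffer):
        (ordinal,) = struct.unpack("<i", buffer.read(4))
        return date.fromordinal(ordinal)


serializer = DateSerializer()
buffer = io.BytesIO()
serializer.write(buffer, date(2024, 1, 31))
serializer.write(buffer, date(1999, 12, 31))

buffer.seek(0)  # rewind before reading back
first = serializer.read(buffer)
second = serializer.read(buffer)
```

The important property is symmetry: `read` must consume exactly the bytes that `write` produced, in the same order, or every value after it in the stream is misread.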

[Custom Serializers](./custom-serializers.md)

### Buffer and Memory Management

High-performance binary buffer operations with efficient memory allocation and platform-specific optimizations.

```python { .api }
class Buffer:
    @staticmethod
    def allocate(size: int) -> Buffer: ...

    def write_byte(self, value: int): ...
    def read_byte(self) -> int: ...
    def write_int32(self, value: int): ...
    def read_int32(self) -> int: ...
    def write_int64(self, value: int): ...
    def read_int64(self) -> int: ...
    def write_bytes(self, data: bytes): ...
    def read_bytes(self, length: int) -> bytes: ...
```
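
The paired `write_*`/`read_*` methods suggest a cursor-based buffer: writes append at the end, and reads advance a separate reader position. The sketch below is a minimal pure-Python stand-in for that pattern, assuming little-endian fixed-width integers. `SketchBuffer` and its internals are hypothetical and say nothing about how PyFury's `Buffer` is implemented or how it lays out bytes.

```python
import struct


class SketchBuffer:
    """Toy buffer with an append-only writer and an independent read cursor."""

    def __init__(self):
        self._data = bytearray()
        self._reader_index = 0

    def _write(self, fmt, value):
        self._data += struct.pack(fmt, value)

    def _read(self, fmt):
        (value,) = struct.unpack_from(fmt, self._data, self._reader_index)
        self._reader_index += struct.calcsize(fmt)
        return value

    def write_int32(self, value):
        self._write("<i", value)

    def read_int32(self):
        return self._read("<i")

    def write_int64(self, value):
        self._write("<q", value)

    def read_int64(self):
        return self._read("<q")

    def write_bytes(self, data):
        self._data += data

    def read_bytes(self, length):
        start = self._reader_index
        self._reader_index += length
        return bytes(self._data[start:start + length])


buf = SketchBuffer()
payload = b"fury"
buf.write_int32(len(payload))  # length prefix so the reader knows how much to take
buf.write_bytes(payload)
buf.write_int64(2**40)

length = buf.read_int32()
data = buf.read_bytes(length)
big = buf.read_int64()
```

Variable-length data such as `payload` needs a length prefix, because `read_bytes` takes an explicit length and the buffer itself carries no field boundaries.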

[Memory Management](./memory-management.md)

## Exception Handling

```python { .api }
class FuryError(Exception):
    """Base exception for PyFury operations."""

class ClassNotCompatibleError(FuryError):
    """Raised when class compatibility checks fail."""

class CompileError(FuryError):
    """Raised when code generation/compilation fails."""
```

## Performance Considerations

- **Reference Tracking**: Enable only when dealing with circular references or shared objects
- **Class Registration**: Required by default for security; impacts initial setup time but improves runtime performance
- **Language Mode**: Use `Language.XLANG` for cross-language compatibility, Python mode for pure Python scenarios
- **Buffer Reuse**: Reuse Buffer instances across serialization operations for optimal performance
- **Row Format**: Use for zero-copy operations and efficient columnar data processing
- **Serializer Registration**: Pre-register custom serializers to avoid runtime overhead

## Security Considerations

PyFury requires class registration by default to prevent deserialization of untrusted classes. With `require_class_registration=False`, PyFury can deserialize arbitrary Python objects, which may execute malicious code through their `__init__`, `__new__`, `__eq__`, or `__hash__` methods. Disable class registration only in trusted environments.