# DiffSync


DiffSync is a Python utility library for comparing and synchronizing datasets. It acts as an intermediate translation layer between data sources: you define data models plus one adapter per source, and each adapter translates its source into a shared, unified data model. The library is most useful when synchronization must be repeated as data changes over time, because it accounts for created, modified, and deleted records, including records that form hierarchical relationships.


## Package Information

- **Package Name**: diffsync
- **Language**: Python (>=3.9,<4.0)
- **Installation**: `pip install diffsync`
- **Optional Redis Support**: `pip install diffsync[redis]`

## Core Imports

```python
import diffsync
```

For main classes:

```python
from diffsync import DiffSyncModel, Adapter
```

For complete API access:

```python
from diffsync import (
    DiffSyncModel, Adapter, Diff,
    DiffSyncFlags, DiffSyncModelFlags, DiffSyncStatus,
    LocalStore, BaseStore,
    # Exceptions
    ObjectAlreadyExists, ObjectNotFound, ObjectStoreWrongType,
    DiffClassMismatch
)
from diffsync.diff import DiffElement
from diffsync.store.redis import RedisStore
from diffsync.exceptions import (
    ObjectNotCreated, ObjectNotUpdated, ObjectNotDeleted
)
from diffsync.enum import DiffSyncActions
```

## Basic Usage

```python
from diffsync import DiffSyncModel, Adapter

# Define a data model
class Device(DiffSyncModel):
    _modelname = "device"
    _identifiers = ("name",)
    _attributes = ("os_version", "vendor")

    name: str
    os_version: str
    vendor: str

# Create an adapter class for a data source
class NetworkAdapter(Adapter):
    device = Device
    top_level = ["device"]

    def load(self):
        # Load data from your source (database, API, etc.)
        device1 = Device(name="router1", os_version="15.1", vendor="cisco")
        device2 = Device(name="switch1", os_version="12.2", vendor="juniper")
        self.add(device1)
        self.add(device2)

# Create two adapters. Both use the same load() here, so the diff below
# is empty; in practice each adapter would read from its own data source.
source = NetworkAdapter(name="source")
target = NetworkAdapter(name="target")

# Load their respective data
source.load()
target.load()

# Calculate differences
diff = target.diff_from(source)
print(diff.str())

# Synchronize data from source to target
sync_diff = target.sync_from(source)
```

## Architecture

DiffSync uses a hierarchical model-based approach with several key components:

- **DiffSyncModel**: Base class for defining data models with identifiers, attributes, and child relationships
- **Adapter**: Container for managing collections of DiffSyncModel instances and performing diff/sync operations
- **Store Backends**: Storage implementations (LocalStore for in-memory, RedisStore for persistent storage)
- **Diff Objects**: Structured representations of differences between datasets
- **Sync Operations**: Automated creation, update, and deletion of records based on calculated diffs

This design enables systematic comparison and synchronization of complex, hierarchical data structures between disparate systems while maintaining data integrity and providing detailed change tracking.

## Capabilities


### Model Definition


Core functionality for defining data models that represent your domain objects. Models specify unique identifiers, trackable attributes, and parent-child relationships between different object types.

```python { .api }
class DiffSyncModel(BaseModel):
    _modelname: ClassVar[str]
    _identifiers: ClassVar[Tuple[str, ...]]
    _attributes: ClassVar[Tuple[str, ...]]
    _children: ClassVar[Dict[str, str]]
    model_flags: DiffSyncModelFlags
    adapter: Optional["Adapter"]
```

[Model Definition](./model-definition.md)
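
To illustrate the split between identifiers and attributes described above, here is a minimal sketch that uses only the standard library. `DeviceRecord`, `identity()`, and `tracked()` are hypothetical names invented for this example; they are not part of diffsync. The sketch only shows the idea: identifiers decide which records correspond to each other, and attributes decide whether two corresponding records differ.

```python
from dataclasses import dataclass
from typing import Any, ClassVar, Dict, Tuple


@dataclass
class DeviceRecord:
    """Hypothetical stand-in for a model class; not a diffsync class."""

    # Field names that make a record unique (used to match records).
    _identifiers: ClassVar[Tuple[str, ...]] = ("name",)
    # Field names that are compared once two records have been matched.
    _attributes: ClassVar[Tuple[str, ...]] = ("os_version", "vendor")

    name: str
    os_version: str
    vendor: str

    def identity(self) -> Tuple[Any, ...]:
        """Return the identifier values, in declaration order."""
        return tuple(getattr(self, key) for key in self._identifiers)

    def tracked(self) -> Dict[str, Any]:
        """Return the attribute values that a comparison would look at."""
        return {key: getattr(self, key) for key in self._attributes}


old = DeviceRecord(name="router1", os_version="15.1", vendor="cisco")
new = DeviceRecord(name="router1", os_version="15.2", vendor="cisco")

same_object = old.identity() == new.identity()  # matched by identifier
needs_update = old.tracked() != new.tracked()   # attributes differ
```

In this sketch the two records describe the same device (same identifier) with a changed attribute, which is the situation a diff would report as an update.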


### Data Management


Adapter functionality for managing collections of models, loading data from various sources, and providing query and storage operations through configurable storage backends.

```python { .api }
class Adapter:
    top_level: ClassVar[List[str]]

    def __init__(self, name: Optional[str] = None,
                 internal_storage_engine: Union[Type[BaseStore], BaseStore] = LocalStore): ...
    def load(self): ...
    def add(self, obj: DiffSyncModel): ...
    def get(self, obj: Union[str, DiffSyncModel, Type[DiffSyncModel]],
            identifier: Union[str, Dict]) -> DiffSyncModel: ...
    def get_all(self, obj: Union[str, DiffSyncModel, Type[DiffSyncModel]]) -> List[DiffSyncModel]: ...
```

[Data Management](./data-management.md)


### Diff Calculation


Comprehensive difference calculation between datasets, supporting hierarchical data structures, customizable comparison logic, and detailed change tracking with multiple output formats.

```python { .api }
# Method on Adapter
def diff_from(self, source: "Adapter", diff_class: Type[Diff] = Diff,
              flags: DiffSyncFlags = DiffSyncFlags.NONE,
              callback: Optional[Callable[[str, int, int], None]] = None) -> Diff: ...

class Diff:
    def __init__(self): ...
    def add(self, element: "DiffElement"): ...
    def has_diffs(self) -> bool: ...
    def summary(self) -> Dict[str, int]: ...
```

[Diff Calculation](./diff-calculation.md)
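
As a rough illustration of what a diff between two datasets contains, the following standard-library sketch compares two dictionaries keyed by identifier. It is not how diffsync computes diffs internally; `sketch_diff`, the record layout, and the sample data are assumptions made for this example. The direction matches `target.diff_from(source)`: the result lists what would have to change in the target to match the source.

```python
from typing import Any, Dict, List

Records = Dict[str, Dict[str, Any]]


def sketch_diff(source: Records, target: Records) -> Dict[str, List[str]]:
    """Classify identifiers by the action needed to make target match source."""
    create = sorted(key for key in source if key not in target)
    delete = sorted(key for key in target if key not in source)
    update = sorted(
        key for key in source if key in target and source[key] != target[key]
    )
    return {"create": create, "update": update, "delete": delete}


# Hypothetical sample data: identifier -> attributes
source = {
    "router1": {"os_version": "15.2", "vendor": "cisco"},
    "switch1": {"os_version": "12.2", "vendor": "juniper"},
}
target = {
    "router1": {"os_version": "15.1", "vendor": "cisco"},
    "firewall1": {"os_version": "9.0", "vendor": "paloalto"},
}

changes = sketch_diff(source, target)
# A per-action count, similar in spirit to a summary of the diff.
counts = {action: len(keys) for action, keys in changes.items()}
has_changes = any(counts.values())
```

Here `switch1` would be created, `router1` updated, and `firewall1` deleted. The real library additionally handles hierarchical (parent and child) records, which this flat sketch leaves out.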


### Synchronization


Automated synchronization operations that apply calculated differences to update target datasets. Supports creation, modification, and deletion of records with comprehensive error handling and status tracking.

```python { .api }
# Methods on Adapter
def sync_from(self, source: "Adapter", diff_class: Type[Diff] = Diff,
              flags: DiffSyncFlags = DiffSyncFlags.NONE,
              callback: Optional[Callable[[str, int, int], None]] = None,
              diff: Optional[Diff] = None) -> Diff: ...

def sync_to(self, target: "Adapter", diff_class: Type[Diff] = Diff,
            flags: DiffSyncFlags = DiffSyncFlags.NONE,
            callback: Optional[Callable[[str, int, int], None]] = None,
            diff: Optional[Diff] = None) -> Diff: ...
```

[Synchronization](./synchronization.md)
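
The sketch below shows, with the standard library only, the general shape of a sync step and of a progress callback matching the `Callable[[str, int, int], None]` type in the signatures above. `sketch_sync` and `record_progress` are hypothetical names, and reading the three callback arguments as (stage, current, total) is an assumption based on the signature, not something this page states.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Records = Dict[str, Dict[str, Any]]
ProgressCallback = Callable[[str, int, int], None]


def sketch_sync(
    source: Records,
    target: Records,
    callback: Optional[ProgressCallback] = None,
) -> None:
    """Modify target in place so that it matches source (flat records only)."""
    keys = sorted(set(source) | set(target))
    for done, key in enumerate(keys, start=1):
        if key not in source:
            del target[key]                  # delete: only exists in target
        elif target.get(key) != source[key]:
            target[key] = dict(source[key])  # create or update from source
        if callback is not None:
            # Assumed meaning of the arguments: stage name, items done, total.
            callback("sync", done, len(keys))


progress: List[Tuple[str, int, int]] = []


def record_progress(stage: str, current: int, total: int) -> None:
    """Collect progress events; a real callback might log or update a bar."""
    progress.append((stage, current, total))


# Hypothetical sample data: identifier -> attributes
source = {"router1": {"os_version": "15.2"}, "switch1": {"os_version": "12.2"}}
target = {"router1": {"os_version": "15.1"}, "firewall1": {"os_version": "9.0"}}

sketch_sync(source, target, callback=record_progress)
```

After the call, `target` holds the same records as `source`, and `progress` contains one entry per identifier processed. With diffsync itself, a function with this signature would be passed as the `callback` argument of `sync_from` or `sync_to`.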


### Storage Backends


Pluggable storage backend implementations for different persistence requirements, from in-memory storage for temporary operations to Redis-based storage for distributed scenarios.

```python { .api }
class BaseStore:
    def get(self, *, model: Union[str, "DiffSyncModel", Type["DiffSyncModel"]],
            identifier: Union[str, Dict]) -> "DiffSyncModel": ...
    def add(self, *, obj: "DiffSyncModel"): ...
    def remove(self, *, obj: "DiffSyncModel", remove_children: bool = False): ...

class LocalStore(BaseStore): ...
class RedisStore(BaseStore): ...
```

[Storage Backends](./storage-backends.md)


### Flags and Configuration


Behavioral control flags and configuration options for customizing diff calculation and synchronization behavior, including error handling, skipping patterns, and logging verbosity.

```python { .api }
class DiffSyncFlags(enum.Flag):
    NONE = 0
    CONTINUE_ON_FAILURE = 0b1
    SKIP_UNMATCHED_SRC = 0b10
    SKIP_UNMATCHED_DST = 0b100
    LOG_UNCHANGED_RECORDS = 0b1000

class DiffSyncModelFlags(enum.Flag):
    NONE = 0
    IGNORE = 0b1
    SKIP_CHILDREN_ON_DELETE = 0b10
    SKIP_UNMATCHED_SRC = 0b100
    SKIP_UNMATCHED_DST = 0b1000
    NATURAL_DELETION_ORDER = 0b10000
```

[Flags and Configuration](./flags-configuration.md)
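
Because both flag classes derive from `enum.Flag`, individual flags can be combined with `|` and tested with `in`. The sketch below re-declares `DiffSyncFlags` locally with the values listed above so that it runs without diffsync installed; in real code you would import the class from `diffsync` instead.

```python
import enum


class DiffSyncFlags(enum.Flag):
    """Local copy of the values listed above, for illustration only."""

    NONE = 0
    CONTINUE_ON_FAILURE = 0b1
    SKIP_UNMATCHED_SRC = 0b10
    SKIP_UNMATCHED_DST = 0b100
    LOG_UNCHANGED_RECORDS = 0b1000


# Combine two behaviors into a single value.
flags = DiffSyncFlags.CONTINUE_ON_FAILURE | DiffSyncFlags.SKIP_UNMATCHED_DST

# Membership tests show which behaviors are enabled.
continue_on_failure = DiffSyncFlags.CONTINUE_ON_FAILURE in flags
skip_unmatched_src = DiffSyncFlags.SKIP_UNMATCHED_SRC in flags
```

A combined value like `flags` is what the `flags` parameter of `diff_from`, `sync_from`, and `sync_to` accepts, since its type is `DiffSyncFlags`.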