# Diff Calculation

Comprehensive difference calculation between datasets, supporting hierarchical data structures, customizable comparison logic, and detailed change tracking with multiple output formats.

## Capabilities

### Diff Generation

Primary methods for calculating differences between two Adapter instances.

```python { .api }
def diff_from(self, source: "Adapter", diff_class: Type[Diff] = Diff,
              flags: DiffSyncFlags = DiffSyncFlags.NONE,
              callback: Optional[Callable[[str, int, int], None]] = None) -> Diff:
    """
    Generate a Diff describing the difference from the other DiffSync to this one.

    Args:
        source: Object to diff against
        diff_class: Diff or subclass thereof to use for diff calculation and storage
        flags: Flags influencing the behavior of this diff operation
        callback: Function with parameters (stage, current, total), called at intervals as calculation proceeds

    Returns:
        Diff object containing all differences found
    """
```

```python { .api }
def diff_to(self, target: "Adapter", diff_class: Type[Diff] = Diff,
            flags: DiffSyncFlags = DiffSyncFlags.NONE,
            callback: Optional[Callable[[str, int, int], None]] = None) -> Diff:
    """
    Generate a Diff describing the difference from this DiffSync to another one.

    Args:
        target: Object to diff against
        diff_class: Diff or subclass thereof to use for diff calculation and storage
        flags: Flags influencing the behavior of this diff operation
        callback: Function with parameters (stage, current, total), called at intervals as calculation proceeds

    Returns:
        Diff object containing all differences found
    """
```
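
Read together, the two docstrings suggest that `a.diff_to(b)` and `b.diff_from(a)` describe the same comparison from opposite ends. The toy below is a stdlib-only sketch of that relationship, not diffsync code: `ToyAdapter` and its dict-based store are hypothetical stand-ins for a real Adapter.

```python
from typing import Dict


class ToyAdapter:
    """Hypothetical stand-in for an Adapter: a plain name -> attrs mapping."""

    def __init__(self, data: Dict[str, dict]) -> None:
        self.data = data

    def diff_from(self, source: "ToyAdapter") -> Dict[str, str]:
        """Changes needed to make this object look like `source`."""
        changes = {}
        for key in source.data.keys() | self.data.keys():
            if key not in self.data:
                changes[key] = "create"  # only in source
            elif key not in source.data:
                changes[key] = "delete"  # only in this object
            elif source.data[key] != self.data[key]:
                changes[key] = "update"  # present in both, values differ
        return changes

    def diff_to(self, target: "ToyAdapter") -> Dict[str, str]:
        """Same comparison, expressed from the source side."""
        return target.diff_from(self)


source = ToyAdapter({"r1": {"role": "core"}, "r2": {"role": "edge"}})
target = ToyAdapter({"r2": {"role": "leaf"}, "r3": {"role": "edge"}})
changes = target.diff_from(source)
```

Under these assumptions, `source.diff_to(target)` returns the same mapping as `target.diff_from(source)`.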

#### Basic Diff Example

```python
from diffsync import Adapter, DiffSyncModel

# NetworkAdapter is assumed to be your own Adapter subclass that implements load()
# Create two adapters with different data
source = NetworkAdapter(name="source")
target = NetworkAdapter(name="target")

# Load data into both adapters
source.load()
target.load()

# Calculate differences - what changes would make target look like source
diff = target.diff_from(source)

# Print summary; len() counts the elements in the diff, changed or not
print(f"Diff contains {len(diff)} elements")
print(diff.str())

# Get detailed summary
summary = diff.summary()
print(f"Create: {summary['create']}")
print(f"Update: {summary['update']}")
print(f"Delete: {summary['delete']}")
```

### Diff Object

Container for storing and organizing differences between datasets.

```python { .api }
class Diff:
    """Diff Object, designed to store multiple DiffElement objects and organize them in a group."""

    def __init__(self) -> None:
        """Initialize a new, empty Diff object."""

    children: OrderedDefaultDict[str, Dict[str, DiffElement]]
    models_processed: int
```

```python { .api }
def add(self, element: "DiffElement") -> None:
    """
    Add a new DiffElement to the changeset of this Diff.

    Raises:
        ObjectAlreadyExists: if an element of the same type and same name is already stored
    """
```

```python { .api }
def has_diffs(self) -> bool:
    """
    Indicate if at least one of the child elements contains some diff.

    Returns:
        True if at least one child element contains some diff
    """
```

```python { .api }
def summary(self) -> Dict[str, int]:
    """Build a dict summary of this Diff and its child DiffElements."""
```
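
`summary()` returns a flat count per action. The sketch below shows one way such a tally could work. It is stdlib-only and not the library's implementation; the `create`, `update`, and `delete` keys come from the Basic Diff Example, while the `no-change` bucket and the `summarize` helper are assumptions made for illustration.

```python
from typing import Dict, Iterable, Optional


def summarize(actions: Iterable[Optional[str]]) -> Dict[str, int]:
    """Tally actions; None (nothing to do) is counted under the assumed "no-change" key."""
    counts = {"create": 0, "update": 0, "delete": 0, "no-change": 0}
    for action in actions:
        counts[action if action is not None else "no-change"] += 1
    return counts


result = summarize(["create", "update", None, "update", "delete", None])
```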

```python { .api }
def groups(self) -> List[str]:
    """Get the list of all group keys in self.children."""
```

```python { .api }
def get_children(self) -> Iterator["DiffElement"]:
    """
    Iterate over all child elements in all groups in self.children.

    For each group of children, check if an order method is defined,
    Otherwise use the default method.
    """
```
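
The per-group ordering described in the docstring can be pictured as a name-based method lookup. The following is a minimal stdlib-only sketch, not diffsync code: `ToyDiff` is hypothetical, the `order_children_<group>` naming follows the `order_children_device` method shown under Custom Diff Classes, and the fallback name `order_children_default` is an assumption.

```python
from collections import OrderedDict
from typing import Dict, Iterator, List


class ToyDiff:
    """Hypothetical miniature of a Diff: ordered groups of named children."""

    def __init__(self) -> None:
        self.children: "OrderedDict[str, Dict[str, str]]" = OrderedDict()

    def order_children_default(self, children: Dict[str, str]) -> Iterator[str]:
        """Assumed default: keep the insertion order of the group."""
        yield from children.values()

    def order_children_device(self, children: Dict[str, str]) -> List[str]:
        """Group-specific order: devices sorted alphabetically."""
        return sorted(children.values())

    def get_children(self) -> Iterator[str]:
        for group, members in self.children.items():
            # Prefer order_children_<group> when it exists, else the default
            order = getattr(self, f"order_children_{group}", self.order_children_default)
            yield from order(members)


toy = ToyDiff()
toy.children["device"] = {"b": "rtr-b", "a": "rtr-a"}
toy.children["interface"] = {"z": "eth1", "y": "eth0"}
ordered = list(toy.get_children())
```

Here the `device` group is sorted by its dedicated method, while `interface` falls back to insertion order.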

```python { .api }
def complete(self) -> None:
    """
    Method to call when this Diff has been fully populated with data and is "complete".

    The default implementation does nothing, but a subclass could use this,
    for example, to save the completed Diff to a file or database record.
    """
```

### DiffElement

Individual difference item representing a single object that may or may not have changes.

```python { .api }
class DiffElement:
    """DiffElement object, designed to represent a single item/object that may or may not have any diffs."""

    def __init__(self, obj_type: str, name: str, keys: Dict,
                 source_name: str = "source", dest_name: str = "dest",
                 diff_class: Type[Diff] = Diff):
        """
        Instantiate a DiffElement.

        Args:
            obj_type: Name of the object type being described, as in DiffSyncModel.get_type()
            name: Human-readable name of the object being described, as in DiffSyncModel.get_shortname()
            keys: Primary keys and values uniquely describing this object, as in DiffSyncModel.get_identifiers()
            source_name: Name of the source DiffSync object
            dest_name: Name of the destination DiffSync object
            diff_class: Diff or subclass thereof to use to calculate the diffs to use for synchronization
        """

    type: str
    name: str
    keys: Dict
    source_name: str
    dest_name: str
    source_attrs: Optional[Dict]
    dest_attrs: Optional[Dict]
    child_diff: Diff
```

```python { .api }
@property
def action(self) -> Optional[str]:
    """
    Action, if any, that should be taken to remediate the diffs described by this element.

    Returns:
        "create", "update", "delete", or None
    """
```
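
One plausible reading of how the four return values relate to `source_attrs` and `dest_attrs` is sketched below. This is an assumed rule written against plain dicts, not the library's code; `derive_action` is a hypothetical helper.

```python
from typing import Dict, Optional


def derive_action(source_attrs: Optional[Dict], dest_attrs: Optional[Dict]) -> Optional[str]:
    """Assumed rule: check presence on each side first, then compare shared attributes."""
    if source_attrs is not None and dest_attrs is None:
        return "create"  # object exists only in the source
    if source_attrs is None and dest_attrs is not None:
        return "delete"  # object exists only in the destination
    if source_attrs is not None and dest_attrs is not None:
        shared = source_attrs.keys() & dest_attrs.keys()
        if any(source_attrs[key] != dest_attrs[key] for key in shared):
            return "update"  # present on both sides with differing values
    return None  # nothing to remediate
```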

```python { .api }
def add_attrs(self, source: Optional[Dict] = None, dest: Optional[Dict] = None) -> None:
    """Set additional attributes of a source and/or destination item that may result in diffs."""
```

```python { .api }
def get_attrs_keys(self) -> Iterable[str]:
    """
    Get the list of shared attrs between source and dest, or the attrs of source or dest if only one is present.

    Returns:
        - If source_attrs is not set, return the keys of dest_attrs
        - If dest_attrs is not set, return the keys of source_attrs
        - If both are defined, return the intersection of both keys
    """
```
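
The three documented cases can be sketched with plain dicts as follows. `attrs_keys` is a hypothetical helper for illustration only, and the behavior when neither side is set is an assumption.

```python
from typing import Dict, List, Optional


def attrs_keys(source_attrs: Optional[Dict], dest_attrs: Optional[Dict]) -> List[str]:
    """Mirror the three documented cases; the both-unset case is an assumption."""
    if source_attrs is not None and dest_attrs is not None:
        # Both sides set: only keys present on both sides, in source order
        return [key for key in source_attrs if key in dest_attrs]
    if dest_attrs is not None:
        return list(dest_attrs)  # source not set
    if source_attrs is not None:
        return list(source_attrs)  # dest not set
    return []  # assumption: nothing to compare when neither side is set


shared = attrs_keys({"mtu": 1500, "speed": 10}, {"mtu": 9000, "vlan": 5})
```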

```python { .api }
def get_attrs_diffs(self) -> Dict[str, Dict[str, Any]]:
    """
    Get the dict of actual attribute diffs between source_attrs and dest_attrs.

    Returns:
        Dictionary of the form {"-": {key1: <value>, key2: ...}, "+": {key1: <value>, key2: ...}},
        where the "-" or "+" dicts may be absent
    """
```
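
A rough stdlib-only sketch of how such a `"-"`/`"+"` dict could be built is shown below. It assumes that `"+"` carries source-side values and `"-"` carries destination-side values; `attrs_diffs` is a hypothetical helper, not the library's method.

```python
from typing import Any, Dict, Optional


def attrs_diffs(source_attrs: Optional[Dict], dest_attrs: Optional[Dict]) -> Dict[str, Dict[str, Any]]:
    """Assumed convention: "+" holds source values, "-" holds destination values."""
    if source_attrs is not None and dest_attrs is not None:
        # Only shared keys whose values differ are reported
        changed = [k for k in source_attrs if k in dest_attrs and source_attrs[k] != dest_attrs[k]]
        return {
            "-": {k: dest_attrs[k] for k in changed},
            "+": {k: source_attrs[k] for k in changed},
        }
    if dest_attrs is not None:
        return {"-": dict(dest_attrs)}  # only the destination side exists
    if source_attrs is not None:
        return {"+": dict(source_attrs)}  # only the source side exists
    return {}


update = attrs_diffs({"mtu": 9000, "speed": 10}, {"mtu": 1500, "speed": 10})
```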

```python { .api }
def has_diffs(self, include_children: bool = True) -> bool:
    """
    Check whether this element (or optionally any of its children) has some diffs.

    Args:
        include_children: If True, recursively check children for diffs as well
    """
```

```python { .api }
def add_child(self, element: "DiffElement") -> None:
    """
    Attach a child object of type DiffElement.

    Children are saved in a Diff object and are organized by type and name.
    """
```

```python { .api }
def get_children(self) -> Iterator["DiffElement"]:
    """Iterate over all child DiffElements of this one."""
```

#### DiffElement Usage Example

```python
# Examine individual diff elements
for element in diff.get_children():
    print(f"Element: {element.type} - {element.name}")
    print(f"Action: {element.action}")

    if element.action == "update":
        attrs_diff = element.get_attrs_diffs()
        if "+" in attrs_diff:
            print(f"New values: {attrs_diff['+']}")
        if "-" in attrs_diff:
            print(f"Old values: {attrs_diff['-']}")

    # Check for child differences (child_diff holds only this element's children)
    if element.child_diff.has_diffs():
        print("Has child differences")
        for child in element.get_children():
            print(f"  Child: {child.type} - {child.name} ({child.action})")
```

### Diff Serialization and Display

Methods for converting diff objects to various output formats.

```python { .api }
def str(self, indent: int = 0) -> str:
    """Build a detailed string representation of this Diff and its child DiffElements."""
```

```python { .api }
def dict(self) -> Dict[str, Dict[str, Dict]]:
    """Build a dictionary representation of this Diff."""
```

#### Diff Output Example

```python
# String representation - human readable
print(diff.str())

# Dictionary representation - programmatic access
diff_data = diff.dict()
for model_type, objects in diff_data.items():
    print(f"Model type: {model_type}")
    for obj_name, changes in objects.items():
        print(f"  Object: {obj_name}")
        if "+" in changes:
            print(f"    Added: {changes['+']}")
        if "-" in changes:
            print(f"    Removed: {changes['-']}")
```

### Advanced Diff Features

#### Custom Diff Classes

```python
import json
from datetime import datetime

from diffsync.diff import Diff

class CustomDiff(Diff):
    def complete(self):
        # Save diff to file when complete (timestamp has no colons, so the name is valid on any OS)
        with open(f"diff_{datetime.now().strftime('%Y%m%dT%H%M%S')}.json", "w") as f:
            json.dump(self.dict(), f, indent=2)

    def order_children_device(self, children):
        # Custom ordering for device objects
        return sorted(children.values(), key=lambda x: x.name)

# Use custom diff class
diff = target.diff_from(source, diff_class=CustomDiff)
```

#### Progress Callbacks

```python
def progress_callback(stage, current, total):
    percentage = (current / total) * 100 if total > 0 else 0
    print(f"{stage}: {current}/{total} ({percentage:.1f}%)")

# Monitor diff calculation progress
diff = target.diff_from(source, callback=progress_callback)
```

#### Filtering with Flags

```python
from diffsync import DiffSyncFlags

# Skip objects that only exist in source
diff = target.diff_from(source, flags=DiffSyncFlags.SKIP_UNMATCHED_SRC)

# Skip objects that only exist in target
diff = target.diff_from(source, flags=DiffSyncFlags.SKIP_UNMATCHED_DST)

# Skip objects that exist in only one of source or target
diff = target.diff_from(source, flags=DiffSyncFlags.SKIP_UNMATCHED_BOTH)
```
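
The three flags read like bit flags, with `SKIP_UNMATCHED_BOTH` acting as the combination of the other two. The stdlib `enum.Flag` sketch below illustrates that idea only. `ToyFlags`, its numeric values, and `should_skip` are hypothetical and do not reflect how the library defines or applies `DiffSyncFlags`.

```python
import enum


class ToyFlags(enum.Flag):
    """Hypothetical flag set; names mirror the flags above, values are illustrative."""

    NONE = 0
    SKIP_UNMATCHED_SRC = 0b10
    SKIP_UNMATCHED_DST = 0b100
    SKIP_UNMATCHED_BOTH = SKIP_UNMATCHED_SRC | SKIP_UNMATCHED_DST


def should_skip(flags: ToyFlags, in_source: bool, in_dest: bool) -> bool:
    """Decide whether an object found on only one side is left out of the diff."""
    if in_source and not in_dest:
        return bool(flags & ToyFlags.SKIP_UNMATCHED_SRC)
    if in_dest and not in_source:
        return bool(flags & ToyFlags.SKIP_UNMATCHED_DST)
    return False  # objects present on both sides are always compared


combined = ToyFlags.SKIP_UNMATCHED_SRC | ToyFlags.SKIP_UNMATCHED_DST
```

With this layout, combining the two single-side flags with `|` is equivalent to passing the combined flag.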

## Utility Functions

Helper functions used internally by the diff calculation engine, also available for advanced usage scenarios.

```python { .api }
def intersection(lst1: List[T], lst2: List[T]) -> List[T]:
    """
    Calculate the intersection of two lists, with ordering based on the first list.

    Args:
        lst1: First list (determines ordering)
        lst2: Second list

    Returns:
        List containing elements common to both lists, in lst1 order
    """
```
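
The documented behavior (common elements, in `lst1` order) can be reproduced with a list comprehension. This is a stand-alone sketch for illustration; `intersection_sketch` is a hypothetical name, not the library function.

```python
from typing import List, TypeVar

T = TypeVar("T")


def intersection_sketch(lst1: List[T], lst2: List[T]) -> List[T]:
    """Keep the elements of lst1 that also appear in lst2, preserving lst1 order."""
    return [value for value in lst1 if value in lst2]


common = intersection_sketch(["site", "device", "vlan"], ["vlan", "site"])
```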

```python { .api }
def symmetric_difference(lst1: List[T], lst2: List[T]) -> List[T]:
    """
    Calculate the symmetric difference of two lists.

    Args:
        lst1: First list
        lst2: Second list

    Returns:
        Sorted list containing elements that exist in either list but not both
    """
```
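
A sorted symmetric difference maps directly onto Python's set operators. The sketch below is stdlib-only and assumes hashable, mutually comparable elements; `symmetric_difference_sketch` is a hypothetical name, not the library function.

```python
from typing import List, TypeVar

T = TypeVar("T")


def symmetric_difference_sketch(lst1: List[T], lst2: List[T]) -> List[T]:
    """Elements found in exactly one of the two lists, returned sorted."""
    return sorted(set(lst1) ^ set(lst2))


only_one_side = symmetric_difference_sketch(["site", "device"], ["device", "vlan"])
```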

```python { .api }
class OrderedDefaultDict(OrderedDict, Generic[K, V]):
    """A combination of collections.OrderedDict and collections.DefaultDict behavior."""

    def __init__(self, dict_type: Callable[[], V]) -> None:
        """
        Create a new OrderedDefaultDict.

        Args:
            dict_type: Factory function to create default values for missing keys
        """

    def __missing__(self, key: K) -> V:
        """When trying to access a nonexistent key, initialize the key value based on the internal factory."""
```
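
The combination of ordered storage and default values hinges on `__missing__`, which `dict` subclasses call when a key lookup fails. The sketch below is a minimal version built on the stdlib `OrderedDict`; `ToyOrderedDefaultDict` and its `factory` attribute are hypothetical names, not the library class.

```python
from collections import OrderedDict
from typing import Any, Callable


class ToyOrderedDefaultDict(OrderedDict):
    """Hypothetical sketch: an OrderedDict that creates missing values on first access."""

    def __init__(self, dict_type: Callable[[], Any]) -> None:
        super().__init__()
        self.factory = dict_type  # called with no arguments to build a default value

    def __missing__(self, key: Any) -> Any:
        # Store the new default so later lookups return the same object
        value = self[key] = self.factory()
        return value


groups = ToyOrderedDefaultDict(dict)
groups["device"]["router1"] = {"action": "create"}
empty = groups["interface"]  # first access creates an empty dict
```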

### Utility Usage Examples

```python
from diffsync.utils import intersection, symmetric_difference, OrderedDefaultDict

# adapter1 and adapter2 are assumed to be existing Adapter instances
# Find common model types between two adapters
common_types = intersection(adapter1.top_level, adapter2.top_level)
print(f"Common model types: {common_types}")

# Find model types that exist in only one adapter
unique_types = symmetric_difference(adapter1.top_level, adapter2.top_level)
print(f"Unique model types: {unique_types}")

# Create an ordered dictionary with default factory
diff_data = OrderedDefaultDict(dict)
diff_data["device"]["router1"] = {"action": "create"}
diff_data["interface"]["eth0"] = {"action": "update"}
print(diff_data)  # Maintains insertion order with auto-initialization
```

## Types

```python { .api }
from typing import Any, Callable, Dict, Generic, Iterable, Iterator, List, Optional, Type, TypeVar
from collections import OrderedDict
from diffsync.utils import OrderedDefaultDict

# Type variables for utility functions
T = TypeVar("T")
K = TypeVar("K")
V = TypeVar("V")

# Callback function type for progress monitoring
ProgressCallback = Callable[[str, int, int], None]
```