tessl/pypi-jsondiff

Diff JSON and JSON-like structures in Python with multiple syntax support and bidirectional patching

Core Operations

Primary functions for computing differences and measuring similarity between JSON structures. These form the main public API of the jsondiff library and provide convenient access to the underlying JsonDiffer functionality.

Capabilities

Difference Computation

Computes the difference between two JSON structures using a specified JsonDiffer class and returns the diff in the configured format.

def diff(a, b, fp=None, cls=JsonDiffer, **kwargs):
    """
    Computes the difference between two JSON structures.
    
    Parameters:
    - a: The original JSON structure (dict, list, set, tuple, or primitive)
    - b: The modified JSON structure (dict, list, set, tuple, or primitive)
    - fp: Optional file pointer to dump the diff to
    - cls: The JsonDiffer class or subclass to use (default: JsonDiffer)
    - **kwargs: Additional keyword arguments for JsonDiffer constructor
      - syntax: 'compact', 'symmetric', 'explicit', 'rightonly' (default: 'compact')
      - load: bool, auto-load JSON from strings/files (default: False)
      - dump: bool, auto-dump output to JSON (default: False)
      - marshal: bool, marshal diffs for safe serialization (default: False)
    
    Returns:
    The computed diff structure, usually a dict (a serialized string when dump=True)
    
    Note: For exclude_paths functionality, use JsonDiffer.diff() method directly
    """

Usage Examples:

from jsondiff import diff, JsonDiffer
from jsondiff.symbols import add, delete, discard, insert, update  # symbols that appear as keys in the results below

# Dictionary differences
original = {'name': 'Alice', 'age': 30, 'city': 'NYC'}
modified = {'name': 'Alice', 'age': 31, 'city': 'Boston', 'country': 'USA'}

result = diff(original, modified)
# Result: {'age': 31, 'city': 'Boston', 'country': 'USA'}

# With explicit syntax for clarity
result = diff(original, modified, syntax='explicit')
# Result: {insert: {'country': 'USA'}, update: {'age': 31, 'city': 'Boston'}}

# List differences
list1 = ['apple', 'banana', 'cherry']
list2 = ['apple', 'blueberry', 'cherry', 'date']

result = diff(list1, list2)
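# Result: {insert: [(1, 'blueberry'), (3, 'date')], delete: [1]}
# (unequal strings are not matched, so 'banana' -> 'blueberry' is reported as a delete plus an insert)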

# Set differences
set1 = {'red', 'green', 'blue'}
set2 = {'red', 'yellow', 'blue'}

result = diff(set1, set2)
# Result: {discard: {'green'}, add: {'yellow'}}

# Path exclusion (requires JsonDiffer class)
data1 = {'user': {'name': 'John', 'age': 25, 'temp_id': 123}}
data2 = {'user': {'name': 'John', 'age': 26, 'temp_id': 456}}

differ = JsonDiffer()
result = differ.diff(data1, data2, exclude_paths=['user.temp_id'])
# Result: {'user': {'age': 26}}

# Auto-loading from JSON strings
json1 = '{"a": 1, "b": 2}'
json2 = '{"a": 1, "b": 3, "c": 4}'

result = diff(json1, json2, load=True)
# Result: {'b': 3, 'c': 4}
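
To make the shape of the compact dictionary result above more concrete, here is a minimal pure-Python sketch of how such a diff could be assembled for flat dictionaries. It is an illustration under stated assumptions, not jsondiff's implementation: the function name `compact_dict_diff` and the string marker `"$delete"` are hypothetical stand-ins (jsondiff itself uses symbol objects such as `delete`), and nested structures are not handled.

```python
def compact_dict_diff(a, b):
    """Sketch of a compact-style diff for flat dicts (hypothetical helper)."""
    result = {}
    # Keys that are new in b, or whose value changed, carry b's value.
    for key, value in b.items():
        if key not in a or a[key] != value:
            result[key] = value
    # Keys that disappeared are listed under a marker key.
    removed = [key for key in a if key not in b]
    if removed:
        result["$delete"] = removed  # stand-in for jsondiff's delete symbol
    return result


original = {'name': 'Alice', 'age': 30, 'city': 'NYC'}
modified = {'name': 'Alice', 'age': 31, 'city': 'Boston', 'country': 'USA'}
print(compact_dict_diff(original, modified))
# {'age': 31, 'city': 'Boston', 'country': 'USA'}
```

Unchanged keys are omitted entirely, which is why the compact form stays small when most of the structure is identical.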

Patch Application

Applies a diff to a JSON structure to produce the modified structure. Note: The patch function is not exported in the main module. Use JsonDiffer.patch() method instead.

Usage Examples:

from jsondiff import diff, JsonDiffer

# Basic patching using JsonDiffer
original = {'name': 'Alice', 'age': 30}
modified = {'name': 'Alice', 'age': 31, 'city': 'NYC'}

# Create diff and apply patch
diff_result = diff(original, modified)
differ = JsonDiffer()
patched = differ.patch(original, diff_result)
print(patched)  # {'name': 'Alice', 'age': 31, 'city': 'NYC'}

# List patching
list1 = ['a', 'b', 'c']
list2 = ['a', 'x', 'c', 'd']

diff_result = diff(list1, list2)
patched = differ.patch(list1, diff_result)
print(patched)  # ['a', 'x', 'c', 'd']

# Set patching
set1 = {1, 2, 3}
set2 = {1, 3, 4, 5}

diff_result = diff(set1, set2)
patched = differ.patch(set1, diff_result)
print(patched)  # {1, 3, 4, 5}

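# Symmetric syntax with unpatch capability
original = {'x': 10, 'y': 20}
modified = {'x': 15, 'y': 20, 'z': 30}

differ = JsonDiffer(syntax='symmetric')
diff_result = differ.diff(original, modified)
patched = differ.patch(original, diff_result)
print(patched)  # {'x': 15, 'y': 20, 'z': 30}

# A symmetric diff records both old and new values, so it can be reversed
restored = differ.unpatch(patched, diff_result)
print(restored)  # {'x': 10, 'y': 20}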
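
Why a symmetric diff can be reversed is easiest to see in a small standalone sketch. The code below is not jsondiff's implementation or diff format: the helper names (`sketch_symmetric_diff`, `sketch_patch`, `sketch_unpatch`) and the `"changed"`/`"inserted"`/`"deleted"` layout are hypothetical, and only flat dictionaries are handled. The point is that storing the old value next to the new one is what makes the reverse direction possible.

```python
def sketch_symmetric_diff(a, b):
    """Record [old, new] pairs plus inserted and deleted items (hypothetical layout)."""
    d = {"changed": {}, "inserted": {}, "deleted": {}}
    for key in a:
        if key not in b:
            d["deleted"][key] = a[key]
        elif a[key] != b[key]:
            d["changed"][key] = [a[key], b[key]]
    for key in b:
        if key not in a:
            d["inserted"][key] = b[key]
    return d


def sketch_patch(a, d):
    """Apply the diff forwards: a -> b."""
    out = {k: v for k, v in a.items() if k not in d["deleted"]}
    for key, (_old, new) in d["changed"].items():
        out[key] = new
    out.update(d["inserted"])
    return out


def sketch_unpatch(b, d):
    """Apply the diff backwards: b -> a."""
    out = {k: v for k, v in b.items() if k not in d["inserted"]}
    for key, (old, _new) in d["changed"].items():
        out[key] = old
    out.update(d["deleted"])
    return out


original = {'x': 10, 'y': 20}
modified = {'x': 15, 'y': 20, 'z': 30}
d = sketch_symmetric_diff(original, modified)
assert sketch_patch(original, d) == modified
assert sketch_unpatch(modified, d) == original
```

A diff that keeps only the new values could not restore `x` to 10, because the old value would have been discarded.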

Similarity Measurement

Calculates the similarity score between two JSON structures, returning a float value between 0.0 (completely different) and 1.0 (identical).

def similarity(a, b, cls=JsonDiffer, **kwargs):
    """
    Calculates the similarity score between two JSON structures.
    
    Parameters:
    - a: The first JSON structure
    - b: The second JSON structure  
    - cls: The JsonDiffer class or subclass to use (default: JsonDiffer)
    - **kwargs: Additional keyword arguments for JsonDiffer constructor
      - load: bool, auto-load JSON from strings/files (default: False)
    
    Returns:
    float: Similarity score between 0.0 and 1.0
    """

Usage Examples:

from jsondiff import similarity

# Identical structures
data1 = {'a': 1, 'b': 2}
data2 = {'a': 1, 'b': 2}
score = similarity(data1, data2)
print(score)  # 1.0

# Completely different structures  
data1 = {'a': 1, 'b': 2}
data2 = {'x': 10, 'y': 20}
score = similarity(data1, data2)
print(score)  # 0.0

# Partially similar structures
data1 = {'name': 'Alice', 'age': 30, 'city': 'NYC'}
data2 = {'name': 'Alice', 'age': 31, 'city': 'Boston'}
score = similarity(data1, data2)
print(score)  # strictly between 0.0 and 1.0 (all keys shared; 'name' matches, 'age' and 'city' differ)

# List similarity
list1 = ['a', 'b', 'c', 'd']
list2 = ['a', 'x', 'c', 'd']
score = similarity(list1, list2)
print(score)  # strictly between 0.0 and 1.0 (three elements match; 'b' and 'x' do not)

# Set similarity
set1 = {1, 2, 3, 4, 5}
set2 = {1, 2, 3, 6, 7}
score = similarity(set1, set2)
print(score)  # strictly between 0.0 and 1.0 (3 shared elements out of 7 distinct elements in total)
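
For intuition about what a set similarity score measures, the Jaccard index (shared elements divided by distinct elements overall) is a common reference point. The sketch below uses only built-in set operations. It is an assumption for illustration, not a statement of how jsondiff scores sets, and the exact value returned by `similarity` may differ.

```python
def jaccard(a, b):
    """Jaccard index: size of the intersection divided by size of the union."""
    union = a | b
    if not union:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(union)


set1 = {1, 2, 3, 4, 5}
set2 = {1, 2, 3, 6, 7}
print(round(jaccard(set1, set2), 3))  # 0.429, i.e. 3 shared out of 7 distinct
```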

Error Handling

The core functions handle various error conditions:

from json import JSONDecodeError
from yaml import YAMLError  # not used below; only relevant when YAML input is loaded

from jsondiff import diff

# Invalid JSON when load=True
try:
    result = diff('{"invalid": json}', '{"valid": "json"}', load=True)
except JSONDecodeError as e:
    print(f"Invalid JSON: {e}")

# File not found errors are handled by underlying loaders
# ValueError raised for invalid serialization formats
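
When both inputs arrive as strings, it can help to validate each one separately before diffing, so the error message says which input was malformed. The following is a minimal sketch using only the standard library; the helper name `load_json_or_none` is hypothetical and not part of jsondiff.

```python
import json
from json import JSONDecodeError


def load_json_or_none(text, label="input"):
    """Parse a JSON string, returning None (and reporting why) if it is invalid."""
    try:
        return json.loads(text)
    except JSONDecodeError as exc:
        # lineno, colno and msg are attributes of JSONDecodeError
        print(f"Invalid JSON in {label} at line {exc.lineno}, column {exc.colno}: {exc.msg}")
        return None


left = load_json_or_none('{"invalid": json}', label="left")
right = load_json_or_none('{"valid": "json"}', label="right")
if left is None or right is None:
    print("Skipping diff: at least one input could not be parsed")
```

The parsed values can then be passed to `diff` without `load=True`.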

Performance Considerations

  • Large structures: List comparison uses dynamic programming, so time and memory grow with the product of the two list lengths
  • Deep nesting: Performance may degrade with very deeply nested structures
  • Set operations: Set diffs are computed with native set operations
  • Memory usage: Large diffs may consume significant memory; consider streaming for very large datasets
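
The cost of dynamic programming on lists can be seen in a textbook longest-common-subsequence table, sketched below. This is the classic algorithm shown for illustration, not jsondiff's exact code. It allocates an (m+1) by (n+1) table, so comparing two lists of 10,000 elements each would fill roughly 100 million cells.

```python
def lcs_length(xs, ys):
    """Length of the longest common subsequence, via an (m+1) x (n+1) table."""
    m, n = len(xs), len(ys)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if xs[i - 1] == ys[j - 1]:
                # Matching elements extend the best subsequence of both prefixes.
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # Otherwise carry over the better of dropping one element from either side.
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n]


print(lcs_length(['a', 'b', 'c', 'd'], ['a', 'x', 'c', 'd']))  # 3
```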

Install with Tessl CLI

npx tessl i tessl/pypi-jsondiff
