tessl/pypi-prov

A library for W3C Provenance Data Model supporting PROV-JSON, PROV-XML and PROV-O (RDF)

Describes: pypipkg:pypi/prov@2.1.x

To install, run

npx @tessl/cli install tessl/pypi-prov@2.1.0


PROV

A Python library for the W3C Provenance Data Model (PROV-DM). It lets developers create, manipulate, and exchange provenance information in the standard formats PROV-O (RDF), PROV-XML, PROV-JSON, and PROV-N. The library provides in-memory classes for PROV assertions, serialization and deserialization, graphical export, and conversion to NetworkX graphs for analysis.

Package Information

  • Package Name: prov
  • Language: Python
  • Installation: pip install prov
  • Optional Dependencies:
    • pip install prov[rdf] for RDF support
    • pip install prov[xml] for XML support

Core Imports

import prov
from prov.model import ProvDocument, ProvBundle, Namespace, Literal
from prov.identifier import QualifiedName, Identifier

For specific functionality:

from prov.graph import prov_to_graph, graph_to_prov
from prov.dot import prov_to_dot
from prov import serializers
from prov.constants import PROV, XSD, PROV_ENTITY, PROV_ACTIVITY, PROV_AGENT

Note: The top-level prov package exports only the Error exception class, the model module, and the read function. Most functionality is accessed through the prov.model module.

Basic Usage

import prov
from prov.constants import PROV
from prov.model import ProvDocument, Namespace

# Create a new PROV document
doc = ProvDocument()

# Add namespaces
ex = Namespace('ex', 'http://example.org/')
doc.add_namespace(ex)

# Create entities, activities, and agents
entity1 = doc.entity('ex:entity1', {'prov:label': 'My Entity'})
activity1 = doc.activity('ex:activity1', '2023-01-01T00:00:00', '2023-01-01T01:00:00')
# Pass the type as a qualified name; a plain 'prov:Person' string is stored as a string literal
agent1 = doc.agent('ex:agent1', {'prov:type': PROV['Person']})

# Create relationships
doc.wasGeneratedBy(entity1, activity1)
doc.wasAssociatedWith(activity1, agent1)

# Serialize to different formats
print(doc.get_provn())  # PROV-N format
doc.serialize('output.json', format='json')  # PROV-JSON
doc.serialize('output.xml', format='xml')    # PROV-XML (needs the optional XML support)

# Read the PROV-JSON file written above
loaded_doc = prov.read('output.json', format='json')

Architecture

The PROV library follows the W3C PROV Data Model structure:

  • ProvDocument: Root container for all provenance information and bundles
  • ProvBundle: Logical grouping of PROV statements with namespace management
  • Elements: Core provenance element types (ProvEntity, ProvActivity, ProvAgent)
  • Relations: Relationships between elements (Generation, Usage, Attribution, etc.)
  • Identifiers: Qualified names and namespaces for globally unique identification
  • Serializers: Format-specific serialization/deserialization (PROV-JSON, PROV-XML, PROV-O (RDF), PROV-N)

This structure mirrors the concepts of the W3C PROV Data Model, so PROV-DM statements map directly onto Python objects and method calls.

Capabilities

Document and Bundle Management

ProvDocument and ProvBundle create and hold PROV records, manage namespaces, and organize records into bundles. ProvDocument adds document-level operations such as serialization, and the read function loads an existing document.

class ProvDocument(ProvBundle):
    def serialize(self, destination=None, format='json', **args): ...
    @staticmethod
    def deserialize(source=None, content=None, format='json', **args): ...
    def add_bundle(self, bundle, identifier=None): ...
    def bundle(self, identifier): ...

class ProvBundle:
    def add_namespace(self, namespace_or_prefix, uri=None): ...
    def set_default_namespace(self, uri): ...
    def entity(self, identifier, other_attributes=None): ...
    def activity(self, identifier, startTime=None, endTime=None, other_attributes=None): ...
    def agent(self, identifier, other_attributes=None): ...

# Defined in the top-level prov package
def read(source, format=None): ...

See docs/document-management.md.

PROV Elements

Classes representing the core PROV elements: entities (things), activities (processes), and agents (responsible parties). Each element class provides methods that assert relations starting from that element, in addition to attribute management.

class ProvEntity(ProvElement):
    def wasGeneratedBy(self, activity, time=None, attributes=None): ...
    def wasInvalidatedBy(self, activity, time=None, attributes=None): ...
    def wasDerivedFrom(self, other_entity, activity=None, generation=None, usage=None, attributes=None): ...
    def wasAttributedTo(self, agent, attributes=None): ...

class ProvActivity(ProvElement):
    def used(self, entity, time=None, attributes=None): ...
    def wasInformedBy(self, other_activity, attributes=None): ...
    def wasAssociatedWith(self, agent, plan=None, attributes=None): ...

class ProvAgent(ProvElement):
    def actedOnBehalfOf(self, other_agent, activity=None, attributes=None): ...

See docs/prov-elements.md.

Relationships and Assertions

PROV relationship classes that connect elements together, representing the provenance graph structure. Includes generation, usage, derivation, attribution, and influence relationships.

class ProvGeneration(ProvRelation): ...
class ProvUsage(ProvRelation): ...
class ProvDerivation(ProvRelation): ...
class ProvAttribution(ProvRelation): ...
class ProvAssociation(ProvRelation): ...
class ProvDelegation(ProvRelation): ...
class ProvInfluence(ProvRelation): ...
class ProvSpecialization(ProvRelation): ...
class ProvAlternate(ProvRelation): ...
class ProvMembership(ProvRelation): ...

See docs/relationships.md.

Identifiers and Namespaces

Classes for creating and managing globally unique identifiers. A QualifiedName pairs a Namespace (a prefix plus a URI) with a local part; its full URI is the namespace URI followed by the local part.

class Identifier:
    def __init__(self, uri: str): ...
    @property
    def uri(self) -> str: ...
    def provn_representation(self) -> str: ...

class QualifiedName(Identifier):
    @property
    def namespace(self) -> Namespace: ...
    @property  
    def localpart(self) -> str: ...

class Namespace:
    def __init__(self, prefix: str, uri: str): ...
    def qname(self, identifier: str | Identifier) -> QualifiedName | None: ...  # identifier is a full URI; None if it lies outside this namespace
    def __getitem__(self, localpart: str) -> QualifiedName: ...

See docs/identifiers.md.

Serialization and Formats

Serializers for PROV-JSON, PROV-XML, PROV-O (RDF), and PROV-N, registered by format name. The read function detects the format automatically when its format argument is omitted.

class Serializer:
    def serialize(self, stream, **args): ...
    def deserialize(self, stream, **args): ...

class Registry:
    serializers: dict[str, type[Serializer]]
    @staticmethod
    def load_serializers(): ...

def get(format_name: str) -> type[Serializer]: ...

See docs/serialization.md.

Graph Analysis and Visualization

Converts PROV documents to and from NetworkX MultiDiGraph objects for graph analysis, and exports them to DOT for graphical rendering.

def prov_to_graph(prov_document: ProvDocument) -> nx.MultiDiGraph: ...
def graph_to_prov(g: nx.MultiDiGraph) -> ProvDocument: ...
def prov_to_dot(bundle: ProvBundle, show_nary: bool = True, use_labels: bool = False, 
                direction: str = "BT", show_element_attributes: bool = True,
                show_relation_attributes: bool = True): ...

See docs/visualization.md.

Command Line Tools

The prov package provides command line tools for converting and comparing PROV documents.

# Command line scripts (installed as executables)
# prov-convert: Convert between PROV formats (JSON, XML, RDF, graphical)
# prov-compare: Compare two PROV documents for equivalence

Script functionality:

  • prov-convert: Converts PROV documents between formats (JSON, XML, RDF, PROV-N) and exports to graphical formats (SVG, PDF, PNG)
  • prov-compare: Compares two PROV documents for logical equivalence across different formats

Constants and Types

# Core PROV constants (from prov.constants)
PROV_ENTITY: QualifiedName
PROV_ACTIVITY: QualifiedName  
PROV_AGENT: QualifiedName
PROV_GENERATION: QualifiedName
PROV_USAGE: QualifiedName
PROV_DERIVATION: QualifiedName
PROV_ATTRIBUTION: QualifiedName
PROV_ASSOCIATION: QualifiedName
PROV_DELEGATION: QualifiedName
PROV_INFLUENCE: QualifiedName
PROV_START: QualifiedName
PROV_END: QualifiedName
PROV_INVALIDATION: QualifiedName
PROV_COMMUNICATION: QualifiedName
PROV_SPECIALIZATION: QualifiedName
PROV_ALTERNATE: QualifiedName
PROV_MENTION: QualifiedName
PROV_MEMBERSHIP: QualifiedName
PROV_BUNDLE: QualifiedName

# Namespaces
PROV: Namespace  # W3C PROV namespace ("prov", "http://www.w3.org/ns/prov#")
XSD: Namespace   # XSD namespace ("xsd", "http://www.w3.org/2001/XMLSchema#")
XSI: Namespace   # XSI namespace ("xsi", "http://www.w3.org/2001/XMLSchema-instance")

# PROV attribute identifiers
PROV_TYPE: QualifiedName
PROV_LABEL: QualifiedName
PROV_VALUE: QualifiedName
PROV_LOCATION: QualifiedName  
PROV_ROLE: QualifiedName

# XSD data types
XSD_ANYURI: QualifiedName
XSD_QNAME: QualifiedName
XSD_DATETIME: QualifiedName
XSD_STRING: QualifiedName
XSD_BOOLEAN: QualifiedName
XSD_INTEGER: QualifiedName
XSD_DOUBLE: QualifiedName
XSD_FLOAT: QualifiedName

# Exception classes (Error is defined in prov, the others in prov.model)
class Error(Exception): ...
class ProvException(Error): ...
class ProvElementIdentifierRequired(ProvException): ...
class ProvWarning(Warning): ...
class ProvExceptionInvalidQualifiedName(ProvException): ...

# Utility classes (from prov.model)
class Literal:
    def __init__(self, value, datatype=None, langtag=None): ...
    @property
    def value(self) -> str: ...
    @property  
    def datatype(self) -> QualifiedName | None: ...
    @property
    def langtag(self) -> str | None: ...