# PROV

A comprehensive Python library for working with the W3C Provenance Data Model (PROV-DM). PROV lets developers create, manipulate, and exchange provenance information in standardized formats including PROV-O (RDF), PROV-XML, PROV-JSON, and PROV-N. The library provides in-memory classes for PROV assertions, serialization and deserialization, graphical export, and integration with NetworkX for graph analysis.

## Package Information

- **Package Name**: prov
- **Language**: Python
- **Installation**: `pip install prov`
- **Optional Dependencies**:
  - `pip install prov[rdf]` for RDF support
  - `pip install prov[xml]` for XML support

## Core Imports

```python
import prov
from prov.model import ProvDocument, ProvBundle, Namespace, Literal
from prov.identifier import QualifiedName, Identifier
```

For specific functionality:

```python
from prov.graph import prov_to_graph, graph_to_prov
from prov.dot import prov_to_dot
from prov import serializers
from prov.constants import PROV, XSD, PROV_ENTITY, PROV_ACTIVITY, PROV_AGENT
```

Note: The top-level `prov` package exports only the `Error` exception, the `model` module, and the `read` function. Most functionality is accessed through the `prov.model` module.

## Basic Usage

```python
import prov
from prov.constants import PROV
from prov.model import ProvDocument, Namespace

# Create a new PROV document
doc = ProvDocument()

# Register a namespace so that 'ex:...' identifiers resolve
ex = Namespace('ex', 'http://example.org/')
doc.add_namespace(ex)

# Create entities, activities, and agents
entity1 = doc.entity('ex:entity1', {'prov:label': 'My Entity'})
activity1 = doc.activity('ex:activity1', '2023-01-01T00:00:00', '2023-01-01T01:00:00')
# Pass the type as a qualified name; a plain 'prov:Person' string is kept as a string value
agent1 = doc.agent('ex:agent1', {'prov:type': PROV['Person']})

# Create relationships
doc.wasGeneratedBy(entity1, activity1)
doc.wasAssociatedWith(activity1, agent1)

# Serialize to different formats
print(doc.get_provn())  # PROV-N format
doc.serialize('output.json', format='json')  # PROV-JSON
doc.serialize('output.xml', format='xml')  # PROV-XML

# Read an existing PROV document (here, the file written above)
loaded_doc = prov.read('output.json', format='json')
```

## Architecture

The PROV library follows the structure of the W3C PROV Data Model:

- **ProvDocument**: Root container for all provenance information and bundles
- **ProvBundle**: Logical grouping of PROV statements with namespace management
- **Elements**: Core provenance element types (ProvEntity, ProvActivity, ProvAgent)
- **Relations**: Relationships between elements (Generation, Usage, Attribution, etc.)
- **Identifiers**: Qualified names and namespaces for globally unique identification
- **Serializers**: Format-specific serialization/deserialization (PROV-JSON, PROV-XML, PROV-O/RDF, PROV-N)

This design mirrors the W3C PROV standards while providing Python APIs for creating, manipulating, and exchanging provenance data.

## Capabilities

### Document and Bundle Management

Core functionality for creating, managing, and organizing PROV documents and bundles. Provides namespace management, record organization, and document-level operations.

```python { .api }
class ProvDocument(ProvBundle):
    def serialize(self, destination, format, **args): ...
    @staticmethod
    def deserialize(source, format, **args): ...
    def add_bundle(self, bundle): ...
    def bundle(self, identifier): ...

class ProvBundle:
    def add_namespace(self, namespace): ...
    def set_default_namespace(self, uri): ...
    def entity(self, identifier, attributes=None): ...
    def activity(self, identifier, startTime=None, endTime=None, other_attributes=None): ...
    def agent(self, identifier, other_attributes=None): ...

def read(source, format=None): ...
```

[Document Management](./document-management.md)

### PROV Elements

Classes representing the core PROV elements: entities (things), activities (processes), and agents (responsible parties). Each element supports relationships and attribute management.

```python { .api }
class ProvEntity(ProvElement):
    def wasGeneratedBy(self, activity, time=None, attributes=None): ...
    def wasInvalidatedBy(self, activity, time=None, attributes=None): ...
    def wasDerivedFrom(self, other_entity, activity=None, generation=None, usage=None, attributes=None): ...
    def wasAttributedTo(self, agent, attributes=None): ...

class ProvActivity(ProvElement):
    def used(self, entity, time=None, attributes=None): ...
    def wasInformedBy(self, other_activity, attributes=None): ...
    def wasAssociatedWith(self, agent, plan=None, attributes=None): ...

class ProvAgent(ProvElement):
    def actedOnBehalfOf(self, other_agent, activity=None, attributes=None): ...
```

[PROV Elements](./prov-elements.md)

### Relationships and Assertions

PROV relationship classes that connect elements together, representing the provenance graph structure. Includes generation, usage, derivation, attribution, and influence relationships.

```python { .api }
class ProvGeneration(ProvRelation): ...
class ProvUsage(ProvRelation): ...
class ProvDerivation(ProvRelation): ...
class ProvAttribution(ProvRelation): ...
class ProvAssociation(ProvRelation): ...
class ProvDelegation(ProvRelation): ...
class ProvInfluence(ProvRelation): ...
class ProvSpecialization(ProvRelation): ...
class ProvAlternate(ProvRelation): ...
class ProvMembership(ProvRelation): ...
```

[Relationships](./relationships.md)

### Identifiers and Namespaces

System for creating and managing globally unique identifiers using qualified names and namespaces, following URI and W3C standards.

```python { .api }
class Identifier:
    def __init__(self, uri: str): ...
    @property
    def uri(self) -> str: ...
    def provn_representation(self) -> str: ...

class QualifiedName(Identifier):
    @property
    def namespace(self) -> Namespace: ...
    @property
    def localpart(self) -> str: ...

class Namespace:
    def __init__(self, prefix: str, uri: str): ...
    def qname(self, localpart: str) -> QualifiedName: ...
    def __getitem__(self, localpart: str) -> QualifiedName: ...
```

[Identifiers](./identifiers.md)

### Serialization and Formats

Comprehensive serialization support for multiple PROV formats including PROV-JSON, PROV-XML, PROV-O (RDF), and PROV-N, with automatic format detection.

```python { .api }
class Serializer:
    def serialize(self, stream, **args): ...
    def deserialize(self, stream, **args): ...

class Registry:
    serializers: dict[str, type[Serializer]]
    @staticmethod
    def load_serializers(): ...

def get(format_name: str) -> type[Serializer]: ...
```

[Serialization](./serialization.md)

### Graph Analysis and Visualization

Integration with NetworkX for graph analysis and visualization capabilities, including conversion to/from graph formats and DOT export for graphical rendering.

```python { .api }
def prov_to_graph(prov_document: ProvDocument) -> nx.MultiDiGraph: ...
def graph_to_prov(g: nx.MultiDiGraph) -> ProvDocument: ...
def prov_to_dot(bundle: ProvBundle, show_nary: bool = True, use_labels: bool = False,
                direction: str = "BT", show_element_attributes: bool = True,
                show_relation_attributes: bool = True): ...
```

[Visualization](./visualization.md)

### Command Line Tools

The prov package provides command line tools for converting and comparing PROV documents.

```python { .api }
# Command line scripts (installed as executables)
# prov-convert: Convert between PROV formats (JSON, XML, RDF, graphical)
# prov-compare: Compare two PROV documents for equivalence
```

Script functionality:
- **prov-convert**: Converts PROV documents between formats (JSON, XML, RDF, PROV-N) and exports to graphical formats (SVG, PDF, PNG)
- **prov-compare**: Compares two PROV documents for logical equivalence across different formats

## Constants and Types

```python { .api }
# Core PROV constants (from prov.constants)
PROV_ENTITY: QualifiedName
PROV_ACTIVITY: QualifiedName
PROV_AGENT: QualifiedName
PROV_GENERATION: QualifiedName
PROV_USAGE: QualifiedName
PROV_DERIVATION: QualifiedName
PROV_ATTRIBUTION: QualifiedName
PROV_ASSOCIATION: QualifiedName
PROV_DELEGATION: QualifiedName
PROV_INFLUENCE: QualifiedName
PROV_START: QualifiedName
PROV_END: QualifiedName
PROV_INVALIDATION: QualifiedName
PROV_COMMUNICATION: QualifiedName
PROV_SPECIALIZATION: QualifiedName
PROV_ALTERNATE: QualifiedName
PROV_MENTION: QualifiedName
PROV_MEMBERSHIP: QualifiedName
PROV_BUNDLE: QualifiedName

# Namespaces
PROV: Namespace  # W3C PROV namespace ("prov", "http://www.w3.org/ns/prov#")
XSD: Namespace   # XSD namespace ("xsd", "http://www.w3.org/2001/XMLSchema#")
XSI: Namespace   # XSI namespace ("xsi", "http://www.w3.org/2001/XMLSchema-instance")

# PROV attribute identifiers
PROV_TYPE: QualifiedName
PROV_LABEL: QualifiedName
PROV_VALUE: QualifiedName
PROV_LOCATION: QualifiedName
PROV_ROLE: QualifiedName

# XSD data types
XSD_ANYURI: QualifiedName
XSD_QNAME: QualifiedName
XSD_DATETIME: QualifiedName
XSD_STRING: QualifiedName
XSD_BOOLEAN: QualifiedName
XSD_INTEGER: QualifiedName
XSD_DOUBLE: QualifiedName
XSD_FLOAT: QualifiedName

# Exception classes
class Error(Exception): ...
class ProvException(Error): ...
class ProvElementIdentifierRequired(ProvException): ...
class ProvWarning(Warning): ...
class ProvExceptionInvalidQualifiedName(ProvException): ...

# Utility classes
class Literal:
    def __init__(self, value, datatype=None, langtag=None): ...
    @property
    def value(self) -> str: ...
    @property
    def datatype(self) -> QualifiedName | None: ...
    @property
    def langtag(self) -> str | None: ...
```