tessl/pypi-prov

A library for W3C Provenance Data Model supporting PROV-JSON, PROV-XML and PROV-O (RDF)

- **Workspace**: tessl
- **Visibility**: Public
- **Describes**: pypipkg:pypi/prov@2.1.x

To install, run `npx @tessl/cli install tessl/pypi-prov@2.1.0`.

# PROV

A Python library for working with the W3C Provenance Data Model (PROV-DM). It lets developers create, manipulate, and exchange provenance information in standardized formats: PROV-O (RDF), PROV-XML, PROV-JSON, and PROV-N. The library provides in-memory classes for PROV assertions, serialization and deserialization, graphical export, and conversion to and from NetworkX graphs for analysis.

## Package Information

- **Package Name**: prov
- **Language**: Python
- **Installation**: `pip install prov`
- **Optional Dependencies**:
  - `pip install prov[rdf]` for RDF support
  - `pip install prov[xml]` for XML support

## Core Imports

```python
import prov
from prov.model import ProvDocument, ProvBundle, Namespace, Literal
from prov.identifier import QualifiedName, Identifier
```

For specific functionality:

```python
from prov.graph import prov_to_graph, graph_to_prov
from prov.dot import prov_to_dot
from prov import serializers
from prov.constants import PROV, XSD, PROV_ENTITY, PROV_ACTIVITY, PROV_AGENT
```

Note: The top-level `prov` package exports only the `Error` exception, the `model` module, and the `read` function. Most functionality is accessed through the `prov.model` module.

## Basic Usage

```python
import prov
from prov.model import ProvDocument, Namespace

# Create a new PROV document
doc = ProvDocument()

# Add namespaces
ex = Namespace('ex', 'http://example.org/')
doc.add_namespace(ex)

# Create entities, activities, and agents
entity1 = doc.entity('ex:entity1', {'prov:label': 'My Entity'})
activity1 = doc.activity('ex:activity1', '2023-01-01T00:00:00', '2023-01-01T01:00:00')
agent1 = doc.agent('ex:agent1', {'prov:type': 'prov:Person'})

# Create relationships
doc.wasGeneratedBy(entity1, activity1)
doc.wasAssociatedWith(activity1, agent1)

# Serialize to different formats
print(doc.get_provn())  # PROV-N format
doc.serialize('output.json', format='json')  # PROV-JSON
doc.serialize('output.xml', format='xml')  # PROV-XML

# Read existing PROV documents
loaded_doc = prov.read('input.json', format='json')
```

## Architecture

The PROV library follows the structure of the W3C PROV Data Model:

- **ProvDocument**: Root container for all provenance records and bundles
- **ProvBundle**: Logical grouping of PROV statements with namespace management
- **Elements**: Core provenance elements (ProvEntity, ProvActivity, ProvAgent)
- **Relations**: Relationships between elements (Generation, Usage, Attribution, etc.)
- **Identifiers**: Qualified names and namespaces for globally unique identification
- **Serializers**: Format-specific serialization/deserialization (PROV-JSON, PROV-XML, PROV-O (RDF), PROV-N)

This design mirrors the W3C PROV specifications while exposing Python APIs for creating, manipulating, and exchanging provenance data.


## Capabilities


### Document and Bundle Management

Core functionality for creating, managing, and organizing PROV documents and bundles, including namespace management, record organization, and document-level operations.

```python { .api }
class ProvDocument(ProvBundle):
    def serialize(self, destination=None, format="json", **args): ...
    @staticmethod
    def deserialize(source=None, content=None, format="json", **args): ...
    def add_bundle(self, bundle, identifier=None): ...
    def bundle(self, identifier): ...

class ProvBundle:
    def add_namespace(self, namespace_or_prefix, uri=None): ...
    def set_default_namespace(self, uri): ...
    def entity(self, identifier, other_attributes=None): ...
    def activity(self, identifier, startTime=None, endTime=None, other_attributes=None): ...
    def agent(self, identifier, other_attributes=None): ...

def read(source, format=None): ...
```

[Document Management](./document-management.md)


### PROV Elements

Classes representing the core PROV elements: entities (things), activities (processes), and agents (responsible parties). Each element supports relationships and attribute management.

```python { .api }
class ProvEntity(ProvElement):
    def wasGeneratedBy(self, activity, time=None, attributes=None): ...
    def wasInvalidatedBy(self, activity, time=None, attributes=None): ...
    def wasDerivedFrom(self, other_entity, activity=None, generation=None, usage=None, attributes=None): ...
    def wasAttributedTo(self, agent, attributes=None): ...

class ProvActivity(ProvElement):
    def used(self, entity, time=None, attributes=None): ...
    def wasInformedBy(self, other_activity, attributes=None): ...
    def wasAssociatedWith(self, agent, plan=None, attributes=None): ...

class ProvAgent(ProvElement):
    def actedOnBehalfOf(self, other_agent, activity=None, attributes=None): ...
```

[PROV Elements](./prov-elements.md)


### Relationships and Assertions

PROV relationship classes that connect elements together, representing the provenance graph structure. Includes generation, usage, derivation, attribution, and influence relationships.

```python { .api }
class ProvGeneration(ProvRelation): ...
class ProvUsage(ProvRelation): ...
class ProvDerivation(ProvRelation): ...
class ProvAttribution(ProvRelation): ...
class ProvAssociation(ProvRelation): ...
class ProvDelegation(ProvRelation): ...
class ProvInfluence(ProvRelation): ...
class ProvSpecialization(ProvRelation): ...
class ProvAlternate(ProvRelation): ...
class ProvMembership(ProvRelation): ...
```

[Relationships](./relationships.md)


### Identifiers and Namespaces

System for creating and managing globally unique identifiers using qualified names and namespaces, following URI and W3C standards.

```python { .api }
class Identifier:
    def __init__(self, uri: str): ...
    @property
    def uri(self) -> str: ...
    def provn_representation(self) -> str: ...

class QualifiedName(Identifier):
    @property
    def namespace(self) -> Namespace: ...
    @property
    def localpart(self) -> str: ...

class Namespace:
    def __init__(self, prefix: str, uri: str): ...
    # Resolves a URI that falls within this namespace; returns None otherwise
    def qname(self, identifier: str | Identifier) -> QualifiedName | None: ...
    def __getitem__(self, localpart: str) -> QualifiedName: ...
```

[Identifiers](./identifiers.md)


### Serialization and Formats

Comprehensive serialization support for multiple PROV formats including PROV-JSON, PROV-XML, PROV-O (RDF), and PROV-N, with automatic format detection.

```python { .api }
class Serializer:
    def serialize(self, stream, **args): ...
    def deserialize(self, stream, **args): ...

class Registry:
    serializers: dict[str, type[Serializer]]
    @staticmethod
    def load_serializers(): ...

def get(format_name: str) -> type[Serializer]: ...
```

[Serialization](./serialization.md)


### Graph Analysis and Visualization

Integration with NetworkX for graph analysis and visualization capabilities, including conversion to/from graph formats and DOT export for graphical rendering.

```python { .api }
import networkx as nx

def prov_to_graph(prov_document: ProvDocument) -> nx.MultiDiGraph: ...
def graph_to_prov(g: nx.MultiDiGraph) -> ProvDocument: ...
def prov_to_dot(bundle: ProvBundle, show_nary: bool = True, use_labels: bool = False,
                direction: str = "BT", show_element_attributes: bool = True,
                show_relation_attributes: bool = True): ...
```

[Visualization](./visualization.md)


### Command Line Tools

The prov package provides command line tools for converting and comparing PROV documents.

```python { .api }
# Command line scripts (installed as executables)
# prov-convert: Convert between PROV formats (JSON, XML, RDF, graphical)
# prov-compare: Compare two PROV documents for equivalence
```

Script functionality:
- **prov-convert**: Converts PROV documents between formats (JSON, XML, RDF, PROV-N) and exports to graphical formats (SVG, PDF, PNG)
- **prov-compare**: Compares two PROV documents for logical equivalence across different formats

## Constants and Types

```python { .api }
# Core PROV constants (from prov.constants)
PROV_ENTITY: QualifiedName
PROV_ACTIVITY: QualifiedName
PROV_AGENT: QualifiedName
PROV_GENERATION: QualifiedName
PROV_USAGE: QualifiedName
PROV_DERIVATION: QualifiedName
PROV_ATTRIBUTION: QualifiedName
PROV_ASSOCIATION: QualifiedName
PROV_DELEGATION: QualifiedName
PROV_INFLUENCE: QualifiedName
PROV_START: QualifiedName
PROV_END: QualifiedName
PROV_INVALIDATION: QualifiedName
PROV_COMMUNICATION: QualifiedName
PROV_SPECIALIZATION: QualifiedName
PROV_ALTERNATE: QualifiedName
PROV_MENTION: QualifiedName
PROV_MEMBERSHIP: QualifiedName
PROV_BUNDLE: QualifiedName

# Namespaces
PROV: Namespace  # W3C PROV namespace ("prov", "http://www.w3.org/ns/prov#")
XSD: Namespace   # XSD namespace ("xsd", "http://www.w3.org/2001/XMLSchema#")
XSI: Namespace   # XSI namespace ("xsi", "http://www.w3.org/2001/XMLSchema-instance")

# PROV attribute identifiers
PROV_TYPE: QualifiedName
PROV_LABEL: QualifiedName
PROV_VALUE: QualifiedName
PROV_LOCATION: QualifiedName
PROV_ROLE: QualifiedName

# XSD data types
XSD_ANYURI: QualifiedName
XSD_QNAME: QualifiedName
XSD_DATETIME: QualifiedName
XSD_STRING: QualifiedName
XSD_BOOLEAN: QualifiedName
XSD_INTEGER: QualifiedName
XSD_DOUBLE: QualifiedName
XSD_FLOAT: QualifiedName

# Exception classes
class Error(Exception): ...
class ProvException(Error): ...
class ProvElementIdentifierRequired(ProvException): ...
class ProvWarning(Warning): ...
class ProvExceptionInvalidQualifiedName(ProvException): ...

# Utility classes
class Literal:
    def __init__(self, value, datatype=None, langtag=None): ...
    @property
    def value(self) -> str: ...
    @property
    def datatype(self) -> QualifiedName | None: ...
    @property
    def langtag(self) -> str | None: ...
```