# PM4PY

PM4PY is a Python library for process mining. It reads, writes, discovers, analyzes, and visualizes process models and event logs. It supports both traditional event logs and Object-Centric Event Logs (OCEL), and exposes more than 280 API functions across multiple process mining paradigms.

## Package Information

- **Package Name**: pm4py
- **Package Type**: pypi
- **Language**: Python
- **Installation**: `pip install pm4py`
- **Documentation**: [https://pm4py.fit.fraunhofer.de/](https://pm4py.fit.fraunhofer.de/)
- **Version**: 2.7.17

## Core Imports

```python
import pm4py
```

Common pattern for accessing functionality:

```python
# Read event logs
from pm4py import read_xes, read_ocel

# Process discovery
from pm4py import discover_petri_net_inductive, discover_dfg

# Conformance checking
from pm4py import fitness_alignments, conformance_diagnostics_alignments

# Visualization
from pm4py import view_petri_net, view_dfg

# Filtering
from pm4py import filter_variants_top_k, filter_start_activities
```

## Basic Usage

```python
import pm4py
import pandas as pd

# Read event log from XES file
log = pm4py.read_xes('event_log.xes')

# Alternative: Work with DataFrame
df = pd.read_csv('event_data.csv')
log = pm4py.format_dataframe(df, case_id='case_id',
                             activity_key='activity',
                             timestamp_key='timestamp')

# Process discovery - discover process model
net, initial_marking, final_marking = pm4py.discover_petri_net_inductive(log)

# Conformance checking - measure fitness
fitness = pm4py.fitness_alignments(log, net, initial_marking, final_marking)
print(f"Fitness: {fitness['log_fitness']}")

# Visualization
pm4py.view_petri_net(net, initial_marking, final_marking)

# Filtering - keep top 10 most frequent variants
filtered_log = pm4py.filter_variants_top_k(log, 10)

# Statistics
start_activities = pm4py.get_start_activities(log)
variants = pm4py.get_variants_as_tuples(log)
```

## Architecture

PM4PY is structured around several key components:

### Data Objects

- **EventLog/DataFrame**: Traditional event logs with case-activity-timestamp structure
- **OCEL (Object-Centric Event Logs)**: Multi-dimensional event logs with objects and relationships
- **PetriNet**: Process models with places, transitions, and markings
- **ProcessTree**: Hierarchical process representations
- **BPMN**: Business Process Model and Notation objects
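
To make the PetriNet entry in the list above more concrete, here is a minimal pure-Python sketch of places, transitions, and markings. It does not use PM4PY's own classes: `Transition`, `is_enabled`, and `fire` are hypothetical names chosen for illustration, and the net is a toy example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Transition:
    """Hypothetical transition: takes one token from each input place,
    puts one token on each output place."""
    label: str
    inputs: FrozenSet[str]
    outputs: FrozenSet[str]


def is_enabled(marking: Counter, t: Transition) -> bool:
    # A transition is enabled when every input place holds at least one token.
    return all(marking[p] >= 1 for p in t.inputs)


def fire(marking: Counter, t: Transition) -> Counter:
    # Firing returns a new marking; the original marking is left unchanged.
    if not is_enabled(marking, t):
        raise ValueError(f"transition {t.label!r} is not enabled")
    new_marking = Counter(marking)
    for p in t.inputs:
        new_marking[p] -= 1
    for p in t.outputs:
        new_marking[p] += 1
    return +new_marking  # unary plus drops places that hold zero tokens


# Sequential toy net: source -> [register] -> p1 -> [approve] -> sink
register = Transition("register", frozenset({"source"}), frozenset({"p1"}))
approve = Transition("approve", frozenset({"p1"}), frozenset({"sink"}))

initial_marking = Counter({"source": 1})
final_marking = Counter({"sink": 1})

m = fire(initial_marking, register)
m = fire(m, approve)
reached_final = m == final_marking
```

A marking is modelled here as a mapping from place to token count, which is also how the `Marking` alias in the Types section describes it.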

### Processing Pipeline

1. **Data Input**: Read various formats (XES, CSV, PNML, BPMN, OCEL formats)
2. **Data Preparation**: Format, filter, and preprocess event data
3. **Process Discovery**: Extract process models from event logs
4. **Conformance Checking**: Measure model-log alignment and fitness
5. **Enhancement**: Enrich models with performance, organizational data
6. **Visualization**: Generate visual representations
7. **Export**: Write results in multiple formats

### Algorithm Categories

- **Classical Discovery**: Alpha Miner, Heuristics Miner, ILP Miner
- **Modern Discovery**: Inductive Miner, POWL, Declare discovery
- **Conformance**: Token-based replay, alignments, temporal conformance
- **Object-Centric**: OCEL-specific discovery and conformance methods

## Capabilities

### I/O Operations

Comprehensive support for reading and writing process mining data in various formats including XES, PNML, BPMN, and Object-Centric Event Log formats.

```python { .api }
def read_xes(file_path, variant=None, return_legacy_log_object=False, encoding='utf-8', **kwargs): ...
def write_xes(log, file_path, case_id_key='case:concept:name', extensions=None, encoding='utf-8', **kwargs): ...
def read_ocel(file_path, objects_path=None, encoding='utf-8'): ...
def write_ocel(ocel, file_path, objects_path=None, encoding='utf-8'): ...
```

[Reading and Writing Operations](./reading-writing.md)
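
For readers unfamiliar with XES, the following standard-library sketch shows roughly what an XES reader has to do: walk `<trace>` and `<event>` elements and flatten them into one row per event. It is not PM4PY's parser. The `parse_xes` helper and the sample document are hypothetical, and the sample omits the XML namespace and extension declarations that real XES files usually carry.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical, namespace-free XES fragment with a single trace.
XES_SAMPLE = """<log xes.version="1.0">
  <trace>
    <string key="concept:name" value="case_1"/>
    <event>
      <string key="concept:name" value="register"/>
      <date key="time:timestamp" value="2024-01-01T09:00:00+00:00"/>
    </event>
    <event>
      <string key="concept:name" value="approve"/>
      <date key="time:timestamp" value="2024-01-01T10:30:00+00:00"/>
    </event>
  </trace>
</log>"""


def parse_xes(text):
    """Flatten a namespace-free XES string into one dict per event."""
    rows = []
    for trace in ET.fromstring(text).iter("trace"):
        # Trace-level attributes are the children of <trace> that are not events.
        case_attrs = {
            "case:" + child.get("key"): child.get("value")
            for child in trace
            if child.tag != "event"
        }
        for event in trace.iter("event"):
            row = dict(case_attrs)
            for attr in event:
                value = attr.get("value")
                if attr.tag == "date":
                    value = datetime.fromisoformat(value)
                row[attr.get("key")] = value
            rows.append(row)
    return rows


rows = parse_xes(XES_SAMPLE)
```

Each row ends up with `case:concept:name`, `concept:name`, and `time:timestamp` keys, the same column names that appear as defaults in the signatures above.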

### Process Discovery

Algorithms for discovering process models from event logs, including classical miners (Alpha, Heuristics) and modern techniques (Inductive Miner, POWL).

```python { .api }
def discover_petri_net_inductive(log, noise_threshold=0.0, multi_processing=True, activity_key='concept:name', **kwargs): ...
def discover_process_tree_inductive(log, noise_threshold=0.0, multi_processing=True, **kwargs): ...
def discover_dfg(log, activity_key='concept:name', timestamp_key='time:timestamp', case_id_key='case:concept:name'): ...
def discover_heuristics_net(log, dependency_threshold=0.5, and_threshold=0.65, **kwargs): ...
```

[Process Discovery Algorithms](./process-discovery.md)
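
As a conceptual illustration of the simplest discovery output, a directly-follows graph (DFG), the sketch below counts how often one activity immediately follows another within a case. It is plain Python rather than a call to `discover_dfg`; the `directly_follows` helper and the toy traces are hypothetical.

```python
from collections import Counter

# Hypothetical toy log: one list of activity labels per case,
# assumed to be already ordered by timestamp.
traces = {
    "case_1": ["register", "check", "approve"],
    "case_2": ["register", "check", "reject"],
    "case_3": ["register", "approve"],
}


def directly_follows(traces):
    """Return (dfg, start_activities, end_activities) as Counters."""
    dfg, starts, ends = Counter(), Counter(), Counter()
    for activities in traces.values():
        if not activities:
            continue  # skip empty cases
        starts[activities[0]] += 1
        ends[activities[-1]] += 1
        # Pair each activity with its immediate successor in the same case.
        for a, b in zip(activities, activities[1:]):
            dfg[(a, b)] += 1
    return dfg, starts, ends


dfg, start_activities, end_activities = directly_follows(traces)
```

The edge dictionary has the same `(source, target) -> count` shape as the `DFG` alias in the Types section.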

### Conformance Checking

Methods for measuring how well process models align with event logs, including fitness, precision, and diagnostic capabilities.

```python { .api }
def fitness_alignments(log, petri_net, initial_marking, final_marking, multi_processing=True, **kwargs): ...
def conformance_diagnostics_alignments(log, petri_net, initial_marking, final_marking, **kwargs): ...
def fitness_token_based_replay(log, petri_net, initial_marking, final_marking, **kwargs): ...
def precision_alignments(log, petri_net, initial_marking, final_marking, **kwargs): ...
```

[Conformance Checking and Fitness](./conformance-checking.md)
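
To give a sense of what a fitness value measures, the sketch below evaluates the token-based replay fitness formula commonly given in the process mining literature, from the numbers of produced, consumed, missing, and remaining tokens. The `replay_fitness` helper and the token counts are hypothetical; this is not how `fitness_token_based_replay` is implemented, and the keys of the dictionary it returns are not shown here.

```python
def replay_fitness(produced, consumed, missing, remaining):
    """Token-based replay fitness:
    0.5 * (1 - missing / consumed) + 0.5 * (1 - remaining / produced)
    """
    if produced <= 0 or consumed <= 0:
        raise ValueError("produced and consumed token counts must be positive")
    return 0.5 * (1 - missing / consumed) + 0.5 * (1 - remaining / produced)


# A trace that replays perfectly leaves no missing or remaining tokens.
perfect = replay_fitness(produced=7, consumed=7, missing=0, remaining=0)

# One missing and one remaining token lower the fitness to 6/7.
deviating = replay_fitness(produced=7, consumed=7, missing=1, remaining=1)
```

Alignment-based fitness is computed differently (from the cost of an optimal alignment), so values from the two families of functions are not directly interchangeable.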

### Filtering Operations

Comprehensive filtering capabilities for event logs and OCEL including behavioral, temporal, organizational, and structural filters.

```python { .api }
def filter_variants_top_k(log, k, activity_key='concept:name', **kwargs): ...
def filter_start_activities(log, activities, retain=True, **kwargs): ...
def filter_time_range(log, dt1, dt2, **kwargs): ...
def filter_case_performance(log, min_performance, max_performance, **kwargs): ...
```

[Filtering Operations](./filtering.md)
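
The idea behind a top-k variant filter can be sketched in a few lines of plain Python: count how many cases share each activity sequence, then keep only the cases whose sequence is among the k most frequent. The `keep_top_k_variants` helper and the toy data are hypothetical and stand in for, rather than reproduce, `filter_variants_top_k`.

```python
from collections import Counter

# Hypothetical toy log: case ID -> variant (tuple of activities).
traces = {
    "case_1": ("register", "check", "approve"),
    "case_2": ("register", "check", "approve"),
    "case_3": ("register", "check", "reject"),
    "case_4": ("register", "approve"),
    "case_5": ("register", "check", "reject"),
    "case_6": ("register", "check", "approve"),
}


def keep_top_k_variants(traces, k):
    """Keep the cases whose variant is among the k most frequent ones."""
    variant_counts = Counter(traces.values())
    # most_common breaks ties by first occurrence; a real filter may differ.
    top = {variant for variant, _ in variant_counts.most_common(k)}
    return {case: variant for case, variant in traces.items() if variant in top}


filtered = keep_top_k_variants(traces, 2)
```

With `k=2`, the single `case_4` variant is dropped and the five cases following the two dominant variants remain.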

### Visualization

Extensive visualization capabilities for process models, statistics, and analysis results with both viewing and saving options.

```python { .api }
def view_petri_net(petri_net, initial_marking=None, final_marking=None, format='png', **kwargs): ...
def view_dfg(dfg, start_activities=None, end_activities=None, format='png', **kwargs): ...
def save_vis_process_tree(tree, file_path, **kwargs): ...
def view_dotted_chart(log, **kwargs): ...
```

[Visualization Functions](./visualization.md)
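
As a rough picture of what drawing a DFG involves, the sketch below turns an edge-count dictionary into Graphviz DOT text, which any Graphviz installation can then render to an image. The `dfg_to_dot` helper and its styling are hypothetical; the output of `view_dfg` will look different.

```python
def dfg_to_dot(dfg):
    """Render {(source, target): frequency} as Graphviz DOT text."""
    lines = ["digraph dfg {", "  rankdir=LR;", "  node [shape=box];"]
    # Sort edges so the generated text is deterministic.
    for (source, target), freq in sorted(dfg.items()):
        lines.append(f'  "{source}" -> "{target}" [label="{freq}"];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical edge counts.
dfg = {("register", "check"): 2, ("check", "approve"): 1}
dot_source = dfg_to_dot(dfg)
```

Writing `dot_source` to a file and running `dot -Tpng` on it would produce a left-to-right diagram with frequencies on the arcs.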

### Object-Centric Process Mining

Specialized operations for Object-Centric Event Logs (OCEL) including discovery, analysis, and manipulation of multi-dimensional process data.

```python { .api }
def ocel_flattening(ocel, object_type): ...
def discover_ocdfg(ocel, **kwargs): ...
def discover_oc_petri_net(ocel, **kwargs): ...
def ocel_objects_interactions_summary(ocel): ...
```

[Object-Centric Operations](./object-centric.md)
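
Flattening is the bridge between object-centric and traditional logs: one object type is chosen, and every object of that type becomes a case made of the events related to it. The plain-Python sketch below illustrates the idea on hypothetical data; `flatten`, the integer timestamps, and the tuple layout of `relations` are assumptions of the sketch, not the structure `ocel_flattening` works on.

```python
# Hypothetical OCEL fragments: events plus event-to-object relations.
events = {
    "e1": {"activity": "place order", "timestamp": 1},
    "e2": {"activity": "pick item", "timestamp": 2},
    "e3": {"activity": "pick item", "timestamp": 3},
    "e4": {"activity": "ship order", "timestamp": 4},
}
relations = [  # (event ID, object ID, object type)
    ("e1", "o1", "order"), ("e1", "i1", "item"), ("e1", "i2", "item"),
    ("e2", "i1", "item"),
    ("e3", "i2", "item"),
    ("e4", "o1", "order"), ("e4", "i1", "item"), ("e4", "i2", "item"),
]


def flatten(events, relations, object_type):
    """Build one trace per object of the chosen type."""
    cases = {}
    for event_id, object_id, otype in relations:
        if otype == object_type:
            cases.setdefault(object_id, []).append(event_id)
    return {
        obj: [
            events[e]["activity"]
            for e in sorted(ids, key=lambda e: events[e]["timestamp"])
        ]
        for obj, ids in cases.items()
    }


by_order = flatten(events, relations, "order")
by_item = flatten(events, relations, "item")
```

Note that `place order` and `ship order` appear once in the order view but twice in the item view, once per item. This duplication is the reason statistics computed on a flattened log depend on the chosen object type.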

### Statistics and Analysis

Statistical analysis functions for process behavior, performance metrics, and advanced analytical operations.

```python { .api }
def get_variants_as_tuples(log, activity_key='concept:name', **kwargs): ...
def get_case_duration(log, timestamp_key='time:timestamp', case_id_key='case:concept:name'): ...
def get_start_activities(log, **kwargs): ...
def check_soundness(petri_net, initial_marking, final_marking): ...
```

[Statistics and Analysis](./statistics-analysis.md)
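
Two of the statistics named above, case duration and start activities, are easy to state precisely in plain Python. The sketch below computes both from a hypothetical list of event dictionaries; `summarize` and the key names `case`, `activity`, and `time` are illustrative and not part of PM4PY.

```python
from collections import Counter
from datetime import datetime

# Hypothetical event records, deliberately not grouped by case.
events = [
    {"case": "c1", "activity": "register", "time": datetime(2024, 1, 1, 9, 0)},
    {"case": "c2", "activity": "register", "time": datetime(2024, 1, 2, 9, 0)},
    {"case": "c1", "activity": "approve", "time": datetime(2024, 1, 1, 11, 0)},
    {"case": "c2", "activity": "reject", "time": datetime(2024, 1, 2, 9, 30)},
]


def summarize(events):
    """Return (duration in seconds per case, start-activity counts)."""
    by_case = {}
    for event in sorted(events, key=lambda e: e["time"]):
        by_case.setdefault(event["case"], []).append(event)
    # Duration is the time between the first and last event of a case.
    durations = {
        case: (evs[-1]["time"] - evs[0]["time"]).total_seconds()
        for case, evs in by_case.items()
    }
    starts = Counter(evs[0]["activity"] for evs in by_case.values())
    return durations, starts


durations, starts = summarize(events)
```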

### Utilities and Conversion

Utility functions for data manipulation, format conversion, and model transformation between different representations.

```python { .api }
def format_dataframe(df, case_id='case:concept:name', activity_key='concept:name', **kwargs): ...
def convert_to_petri_net(*args, **kwargs): ...
def convert_to_process_tree(*args, **kwargs): ...
def serialize(obj, file_path): ...
```

[Utilities and Conversion](./utilities-conversion.md)
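
The purpose of a formatting step such as `format_dataframe` is to map arbitrary column names onto the standard ones used throughout the API (`case:concept:name`, `concept:name`, `time:timestamp`). The sketch below shows that mapping on plain dictionaries. The `standardize` helper is hypothetical, and its string-casting of case IDs and sorting by case and timestamp are assumptions of the sketch rather than documented behavior.

```python
from datetime import datetime


def standardize(rows, case_id, activity_key, timestamp_key):
    """Rename the three core columns to their standard names."""
    renames = {
        case_id: "case:concept:name",
        activity_key: "concept:name",
        timestamp_key: "time:timestamp",
    }
    out = []
    for row in rows:
        new_row = {renames.get(key, key): value for key, value in row.items()}
        # Assumption: timestamps arrive as ISO 8601 strings.
        new_row["time:timestamp"] = datetime.fromisoformat(new_row["time:timestamp"])
        new_row["case:concept:name"] = str(new_row["case:concept:name"])
        out.append(new_row)
    out.sort(key=lambda r: (r["case:concept:name"], r["time:timestamp"]))
    return out


# Hypothetical raw rows, as they might come out of a CSV file.
raw = [
    {"case_id": 2, "activity": "register", "timestamp": "2024-01-02T09:00:00"},
    {"case_id": 1, "activity": "approve", "timestamp": "2024-01-01T11:00:00"},
    {"case_id": 1, "activity": "register", "timestamp": "2024-01-01T09:00:00"},
]
log = standardize(raw, "case_id", "activity", "timestamp")
```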

### Machine Learning and Organizational Mining

Machine learning features for predictive process analytics and organizational mining for resource and social network analysis.

```python { .api }
def extract_features_dataframe(log, **kwargs): ...
def split_train_test(log, train_percentage=0.8, **kwargs): ...
def discover_handover_of_work_network(log, beta=0, **kwargs): ...
def discover_organizational_roles(log, **kwargs): ...
```

[Machine Learning and Organizational Mining](./ml-organizational.md)
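
A handover-of-work network records how often work passes directly from one resource to another within a case. The plain-Python sketch below counts such handovers on hypothetical data. `handover_of_work` is an illustrative helper, skipping self-handovers is a simplification made here, and the `beta` parameter from the signature above is not modelled.

```python
from collections import Counter

# Hypothetical toy log: case ID -> resources in order of execution.
traces = {
    "c1": ["alice", "bob", "carol"],
    "c2": ["alice", "bob", "bob"],
    "c3": ["bob", "carol"],
}


def handover_of_work(traces):
    """Count direct handovers (giver, receiver) across all cases."""
    handovers = Counter()
    for resources in traces.values():
        for giver, receiver in zip(resources, resources[1:]):
            if giver != receiver:  # simplification: ignore self-handovers
                handovers[(giver, receiver)] += 1
    return handovers


handovers = handover_of_work(traces)
```

The resulting counts can be read as weighted edges of a directed social network between resources.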

## Types

Complete type definitions for PM4PY objects referenced in the API.

```python { .api }
# Core Data Types
from typing import Dict, List, Tuple, Optional, Union, Any
import pandas as pd

# Event Log Types
EventLog = List[Dict[str, Any]]  # Collection of events with attributes
EventStream = List[Dict[str, Any]]  # Ordered sequence of events

# Process Model Types
class PetriNet:
    """Petri net with places, transitions, and arcs."""
    places: List[Any]
    transitions: List[Any]
    arcs: List[Any]

class ProcessTree:
    """Hierarchical process tree representation."""
    operator: str
    children: List['ProcessTree']
    label: Optional[str]

class BPMN:
    """Business Process Model and Notation object."""
    nodes: List[Any]
    flows: List[Any]

class HeuristicsNet:
    """Heuristics net representation."""
    activities: List[str]
    dependencies: Dict[Tuple[str, str], float]

# Discovery Types
DFG = Dict[Tuple[str, str], int]  # Directly-Follows Graph
PerformanceDFG = Dict[Tuple[str, str], float]  # Performance-annotated DFG

# OCEL Types
class OCEL:
    """Object-Centric Event Log."""
    events: pd.DataFrame
    objects: pd.DataFrame
    relations: pd.DataFrame

class OCDFG:
    """Object-Centric Directly-Follows Graph."""
    activities: List[str]
    objects: List[str]
    edges: Dict[Tuple[str, str], int]

# Conformance Types
AlignmentResult = Dict[str, Any]  # Alignment computation results
FitnessResult = Dict[str, float]  # Fitness measurement results
ReplayResult = Dict[str, Any]  # Token-based replay results

# Marking Types
Marking = Dict[Any, int]  # Petri net marking (place -> tokens)

# Analysis Types
VariantDict = Dict[Tuple[str, ...], int]  # Process variants with frequencies
CaseDuration = Dict[str, float]  # Case durations by case ID
ActivityStats = Dict[str, Any]  # Activity statistics
```