# Papermill

A Python library for parameterizing, executing, and analyzing Jupyter notebooks at scale. Papermill lets data scientists and engineers build reusable, parameterized notebook workflows: it injects parameters into designated cells, executes notebooks through either a Python API or a command-line interface, and reads and writes notebooks on several storage backends, including the local filesystem, AWS S3, Azure Blob and Data Lake Store, and Google Cloud Storage.

## Package Information

- **Package Name**: papermill
- **Language**: Python
- **Installation**: `pip install papermill`

For additional storage backends and features (the extras are quoted so that shells such as zsh do not expand the brackets):

```bash
pip install "papermill[all]"    # All optional dependencies
pip install "papermill[s3]"     # Amazon S3 support
pip install "papermill[azure]"  # Azure storage support
pip install "papermill[gcs]"    # Google Cloud Storage support
```

## Core Imports

```python
import papermill as pm
```

For specific functionality:

```python
from papermill import execute_notebook, inspect_notebook
from papermill import PapermillException, PapermillExecutionError
from papermill.exceptions import (
    PapermillMissingParameterException,
    PapermillOptionalDependencyException,
    PapermillRateLimitException,
)
from papermill.iorw import load_notebook_node, write_ipynb, list_notebook_files
```

## Basic Usage

```python
import papermill as pm

# Execute a notebook with parameters
pm.execute_notebook(
    'input_notebook.ipynb',
    'output_notebook.ipynb',
    parameters={
        'alpha': 0.6,
        'ratio': 0.1,
        'iterations': 100
    }
)

# Inspect notebook parameters before execution
params = pm.inspect_notebook('input_notebook.ipynb')
print(f"Found parameters: {list(params.keys())}")

# Execute with additional options
pm.execute_notebook(
    'analysis.ipynb',
    'results.ipynb',
    parameters={'data_path': '/path/to/data.csv'},
    kernel_name='python3',
    progress_bar=True,
    log_output=True,
    cwd='/working/directory'
)
```

## Architecture

Papermill is built from modular components designed for scalable notebook execution:

- **Execution Engine**: Runs notebooks through nbclient, with support for alternative execution engines
- **Parameter Injection**: Injects the supplied parameter values into the notebook as a new cell
- **I/O Handlers**: Pluggable storage backends for local, cloud, and remote storage systems
- **Language Translators**: Translate parameter values into the notebook's language, including Python, R, Scala, Julia, MATLAB, and more
- **Command-Line Interface**: Command-line tool for batch processing and automation workflows

This design lets papermill fit into production data pipelines, supporting automated reporting, batch processing, and reproducible data analysis across different storage systems and execution environments.
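
To make the parameter injection step concrete, here is a minimal sketch of the kind of notebook it operates on. It assumes papermill's convention of a code cell tagged `parameters` that holds the defaults, after which papermill adds a cell with the overriding values. The helpers `build_parameterized_notebook` and `run_with_overrides`, the cell contents, and the file names are illustrative and not part of papermill.

```python
import json
import tempfile
from pathlib import Path


def build_parameterized_notebook(defaults: dict) -> dict:
    """Return a minimal nbformat-4 notebook whose first cell is tagged 'parameters'."""
    source = "\n".join(f"{name} = {value!r}" for name, value in defaults.items())
    parameters_cell = {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {"tags": ["parameters"]},  # the tag papermill looks for
        "outputs": [],
        "source": source,
    }
    report_cell = {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": "print(alpha * iterations)",
    }
    return {
        "cells": [parameters_cell, report_cell],
        "metadata": {
            "kernelspec": {
                "name": "python3",
                "display_name": "Python 3",
                "language": "python",
            }
        },
        "nbformat": 4,
        "nbformat_minor": 4,
    }


def run_with_overrides(input_path: str, output_path: str):
    """Execute the notebook with one override (needs papermill and a python3 kernel)."""
    import papermill as pm

    # Only 'alpha' is overridden; 'iterations' keeps the default from the tagged cell.
    return pm.execute_notebook(input_path, output_path, parameters={"alpha": 0.9})


# Write the illustrative notebook to a temporary directory.
notebook = build_parameterized_notebook({"alpha": 0.6, "iterations": 100})
input_path = Path(tempfile.mkdtemp()) / "input_notebook.ipynb"
input_path.write_text(json.dumps(notebook, indent=1), encoding="utf-8")
```

Calling `run_with_overrides(str(input_path), 'output_notebook.ipynb')` would then execute the notebook with `alpha` overridden while `iterations` keeps its default.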

## Capabilities

### Notebook Execution

Executes Jupyter notebooks with parameter injection, with options for engine and kernel selection, progress tracking, output logging, and error handling.

```python { .api }
def execute_notebook(
    input_path: str | Path | nbformat.NotebookNode,
    output_path: str | Path | None,
    parameters: dict | None = None,
    engine_name: str | None = None,
    request_save_on_cell_execute: bool = True,
    prepare_only: bool = False,
    kernel_name: str | None = None,
    language: str | None = None,
    progress_bar: bool = True,
    log_output: bool = False,
    stdout_file = None,
    stderr_file = None,
    start_timeout: int = 60,
    report_mode: bool = False,
    cwd: str | Path | None = None,
    **engine_kwargs
) -> nbformat.NotebookNode: ...
```

[Notebook Execution](./execution.md)

### Parameter Inspection

Inspects the parameters a notebook declares before it is executed, which enables validation and dynamic parameter discovery.

```python { .api }
def inspect_notebook(
    notebook_path: str | Path,
    parameters: dict | None = None
) -> dict[str, dict]: ...
```
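
One way to use the inspection result for validation is to compare the declared names with the values about to be passed. The sketch below assumes the returned mapping is keyed by parameter name, with each value carrying the `Parameter` fields listed under Types. The helper `find_unknown_parameters` is hypothetical, and `declared` is a hand-written stand-in for a real `inspect_notebook` result.

```python
def find_unknown_parameters(declared: dict, supplied: dict) -> list:
    """Return the supplied names that the notebook does not declare, sorted."""
    return sorted(name for name in supplied if name not in declared)


# Hand-written stand-in for pm.inspect_notebook('input_notebook.ipynb').
declared = {
    "alpha": {
        "name": "alpha",
        "inferred_type_name": "float",
        "default": "0.1",
        "help": "Illustrative help text",
    },
    "ratio": {
        "name": "ratio",
        "inferred_type_name": "float",
        "default": "0.5",
        "help": "Illustrative help text",
    },
}

# 'ratoi' is a deliberate typo to show what the check catches.
supplied = {"alpha": 0.6, "ratoi": 0.1}

unknown = find_unknown_parameters(declared, supplied)
if unknown:
    print(f"Not declared in the notebook: {unknown}")
```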

[Parameter Inspection](./inspection.md)

### Storage Backends

Reads and writes notebooks on multiple storage systems: the local filesystem, cloud storage (S3, Azure, GCS), distributed filesystems (HDFS), and remote repositories (GitHub).

```python { .api }
def load_notebook_node(notebook_path: str) -> nbformat.NotebookNode: ...
def write_ipynb(nb: nbformat.NotebookNode, path: str) -> None: ...
def list_notebook_files(path: str) -> list[str]: ...
```
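
As a rough sketch of how these functions combine, the example below copies a notebook from one location to another using the two signatures above. The helper names `describe_backend` and `copy_notebook` are hypothetical, and the prefix table is an assumption about how paths are commonly written for each backend; it is not papermill's own dispatch logic, so check the storage documentation for the exact schemes.

```python
# Assumed path prefixes, for illustration only.
ASSUMED_PREFIXES = {
    "s3://": "Amazon S3",
    "gs://": "Google Cloud Storage",
    "adl://": "Azure Data Lake Store",
    "abs://": "Azure Blob Storage",
    "hdfs://": "HDFS",
}


def describe_backend(path: str) -> str:
    """Name the storage system a path appears to point at, judging by its prefix."""
    for prefix, label in ASSUMED_PREFIXES.items():
        if path.startswith(prefix):
            return label
    return "local filesystem"


def copy_notebook(source: str, destination: str) -> None:
    """Load a notebook from one location and write it to another (needs papermill)."""
    from papermill.iorw import load_notebook_node, write_ipynb

    print(f"Copying from {describe_backend(source)} to {describe_backend(destination)}")
    nb = load_notebook_node(source)
    write_ipynb(nb, destination)


# Example call (not run here; the bucket name is made up):
# copy_notebook("s3://example-bucket/input.ipynb", "local_copy.ipynb")
```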

[Storage Backends](./storage.md)

### Command Line Interface

A CLI for batch notebook execution and automation workflows. Parameters can be passed individually with `-p`, loaded from a YAML file with `-f`, or supplied in other input formats.

```bash
papermill input.ipynb output.ipynb -p param1 value1 -p param2 value2
papermill input.ipynb output.ipynb -f parameters.yaml
```
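
When the CLI is driven from a script, the `-p` pairs can be assembled from a dictionary. This is a minimal sketch using only the `-p` form shown above; the helper `build_papermill_command` is hypothetical, and values are passed as plain strings.

```python
import shlex


def build_papermill_command(input_path: str, output_path: str, parameters: dict) -> list:
    """Build an argument list for the papermill CLI with one -p pair per parameter."""
    command = ["papermill", input_path, output_path]
    for name, value in parameters.items():
        command += ["-p", name, str(value)]
    return command


command = build_papermill_command(
    "input.ipynb",
    "output.ipynb",
    {"param1": "value1", "param2": "value2"},
)

# Show the command as it would be typed in a shell.
print(shlex.join(command))

# To actually run it (requires papermill on PATH):
# import subprocess
# subprocess.run(command, check=True)
```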

[Command Line Interface](./cli.md)

### Exception Handling

An exception hierarchy rooted at `PapermillException` covers execution errors, missing parameters, storage issues, and optional dependency problems.

```python { .api }
class PapermillException(Exception): ...
class PapermillExecutionError(PapermillException): ...
class PapermillMissingParameterException(PapermillException): ...
```

[Exception Handling](./exceptions.md)

## Types

```python { .api }
from collections import namedtuple
from pathlib import Path
from typing import Any, Dict, List, Optional, Union

import nbformat

# Parameter representation
Parameter = namedtuple('Parameter', [
    'name',                # str: Parameter name
    'inferred_type_name',  # str: String representation of inferred type
    'default',             # str: String representation of default value
    'help'                 # str: Help text/description
])

# Common type aliases
NotebookPath = Union[str, Path, nbformat.NotebookNode]
Parameters = Optional[Dict[str, Any]]
```