tessl/pypi-papermill

Parameterize and run Jupyter and nteract Notebooks

Workspace: tessl
Visibility: Public
Describes: pypipkg:pypi/papermill@2.6.x

To install, run

npx @tessl/cli install tessl/pypi-papermill@2.6.0


Papermill

Papermill is a Python library for parameterizing, executing, and analyzing Jupyter notebooks. It injects parameters into a designated cell, runs notebooks through a Python API or a command-line interface, and reads and writes notebooks on several storage backends, including the local filesystem, AWS S3, Azure Blob and Data Lake Store, and Google Cloud Storage.

Package Information

  • Package Name: papermill
  • Language: Python
  • Installation: pip install papermill

For additional storage backends and features:

pip install papermill[all]  # All optional dependencies
pip install papermill[s3]   # Amazon S3 support
pip install papermill[azure] # Azure storage support
pip install papermill[gcs]  # Google Cloud Storage support

Core Imports

import papermill as pm

For specific functionality:

from papermill import execute_notebook, inspect_notebook
from papermill import PapermillException, PapermillExecutionError
from papermill.exceptions import PapermillMissingParameterException, PapermillRateLimitException, PapermillOptionalDependencyException
from papermill.iorw import load_notebook_node, write_ipynb, list_notebook_files

Basic Usage

import papermill as pm

# Execute a notebook with parameters
pm.execute_notebook(
    'input_notebook.ipynb',
    'output_notebook.ipynb',
    parameters={
        'alpha': 0.6,
        'ratio': 0.1,
        'iterations': 100
    }
)

# Inspect notebook parameters before execution
params = pm.inspect_notebook('input_notebook.ipynb')
print(f"Found parameters: {list(params.keys())}")

# Execute with additional options
pm.execute_notebook(
    'analysis.ipynb',
    'results.ipynb',
    parameters={'data_path': '/path/to/data.csv'},
    kernel_name='python3',
    progress_bar=True,
    log_output=True,
    cwd='/working/directory'
)

Architecture

Papermill follows a modular architecture designed for scalable notebook execution:

  • Execution Engine: Core notebook execution using nbclient with support for multiple execution engines
  • Parameter Injection: Automatic parameter cell injection and notebook parameterization
  • I/O Handlers: Pluggable storage backends supporting local, cloud, and remote storage systems
  • Language Translators: Multi-language parameter translation for Python, R, Scala, Julia, MATLAB, and more
  • CLI: Command-line tool for batch processing and automation workflows

This design enables papermill to integrate with production data pipeline environments, supporting automated reporting, batch processing, and reproducible data analysis workflows across different storage systems and execution environments.
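The parameter-injection component can be pictured with a small, library-free sketch. It assumes the usual papermill convention that the cell holding defaults carries the tag `parameters` and the generated cell carries the tag `injected-parameters`. The function name `inject_parameters` and the dict-based notebook are illustrative only and are not papermill's internal code.

```python
import copy


def inject_parameters(notebook: dict, parameters: dict) -> dict:
    """Return a copy of a notebook dict with an injected-parameters cell.

    Simplified illustration: values are rendered with repr(), which is only
    adequate for simple Python literals.
    """
    nb = copy.deepcopy(notebook)
    lines = [f"{name} = {value!r}" for name, value in parameters.items()]
    new_cell = {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {"tags": ["injected-parameters"]},
        "outputs": [],
        "source": "# Parameters\n" + "\n".join(lines) + "\n",
    }
    # Find the first cell tagged "parameters"; -1 means "not found".
    index = next(
        (
            i
            for i, cell in enumerate(nb["cells"])
            if "parameters" in cell.get("metadata", {}).get("tags", [])
        ),
        -1,
    )
    # Insert right after the defaults cell, or at the top if there is none.
    nb["cells"].insert(index + 1, new_cell)
    return nb


# Hypothetical two-cell notebook used only for this illustration.
notebook = {
    "cells": [
        {"cell_type": "code", "metadata": {"tags": ["parameters"]}, "source": "alpha = 0.1\n"},
        {"cell_type": "code", "metadata": {}, "source": "print(alpha)\n"},
    ]
}
result = inject_parameters(notebook, {"alpha": 0.6, "ratio": 0.1})
```

Because the injected cell runs after the defaults cell, its assignments override the defaults while the original cell stays in place for readers of the notebook.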

Capabilities

Notebook Execution

Core functionality for executing Jupyter notebooks with parameter injection, supporting various execution options, progress tracking, and error handling.

def execute_notebook(
    input_path: str | Path | nbformat.NotebookNode,
    output_path: str | Path | None,
    parameters: dict = None,
    engine_name: str = None,
    request_save_on_cell_execute: bool = True,
    prepare_only: bool = False,
    kernel_name: str = None,
    language: str = None,
    progress_bar: bool = True,
    log_output: bool = False,
    stdout_file = None,
    stderr_file = None,
    start_timeout: int = 60,
    report_mode: bool = False,
    cwd: str | Path = None,
    **engine_kwargs
) -> nbformat.NotebookNode: ...
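A common use of `execute_notebook` is a parameter sweep. The sketch below assumes a hypothetical input notebook `analysis.ipynb` and output directory `runs`; `plan_sweep` and `run_sweep` are illustrative helper names, not part of papermill.

```python
from itertools import product
from pathlib import Path


def plan_sweep(input_path, output_dir, grid):
    """Expand a grid of parameter values into (input, output, parameters) jobs."""
    names = sorted(grid)
    jobs = []
    for values in product(*(grid[name] for name in names)):
        parameters = dict(zip(names, values))
        suffix = "_".join(f"{name}-{value}" for name, value in parameters.items())
        output_path = Path(output_dir) / f"{Path(input_path).stem}_{suffix}.ipynb"
        jobs.append((str(input_path), str(output_path), parameters))
    return jobs


def run_sweep(jobs):
    """Execute every planned job; needs papermill and a Jupyter kernel."""
    import papermill as pm  # imported lazily so planning works without papermill

    for input_path, output_path, parameters in jobs:
        pm.execute_notebook(
            input_path,
            output_path,
            parameters=parameters,
            progress_bar=False,
        )


jobs = plan_sweep("analysis.ipynb", "runs", {"alpha": [0.1, 0.6], "iterations": [100]})
```

Keeping planning separate from execution makes it easy to review the output paths first, or to hand the job list to a scheduler instead of calling `run_sweep(jobs)` directly.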


Parameter Inspection

Tools for analyzing and inspecting notebook parameters before execution, enabling validation and dynamic parameter discovery.

def inspect_notebook(
    notebook_path: str | Path,
    parameters: dict = None
) -> dict[str, dict]: ...
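One way to use the inspection result is to validate a parameter set before launching a run. The sketch assumes the return shape given in the signature (a dict keyed by parameter name); `check_parameters` and `validate_before_run` are hypothetical helpers, and the `declared` dict is a hand-written stand-in rather than real output.

```python
def check_parameters(declared: dict, supplied: dict) -> list:
    """Return the supplied parameter names that the notebook does not declare."""
    return sorted(name for name in supplied if name not in declared)


def validate_before_run(notebook_path, supplied):
    """Inspect the notebook and raise if any supplied name is unknown."""
    import papermill as pm  # imported lazily; needs papermill installed

    declared = pm.inspect_notebook(notebook_path)
    unknown = check_parameters(declared, supplied)
    if unknown:
        raise ValueError(f"Parameters not declared in {notebook_path}: {unknown}")
    return declared


# Hand-written stand-in with the same shape as an inspection result.
declared = {
    "alpha": {
        "name": "alpha",
        "inferred_type_name": "float",
        "default": "0.1",
        "help": "Learning rate",
    },
}
unknown = check_parameters(declared, {"alpha": 0.6, "ratoi": 0.1})
```

Here the misspelled `ratoi` is reported before any kernel is started, which is cheaper than discovering the typo after a long run.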


Storage Backends

Support for multiple storage systems including local filesystem, cloud storage (S3, Azure, GCS), distributed filesystems (HDFS), and remote repositories (GitHub).

def load_notebook_node(notebook_path: str) -> nbformat.NotebookNode: ...
def write_ipynb(nb: nbformat.NotebookNode, path: str) -> None: ...
def list_notebook_files(path: str) -> list[str]: ...
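The idea of pluggable backends can be sketched as a registry that picks a handler by path prefix. This is a simplified illustration of the pattern, not papermill's actual registry; `HandlerRegistry`, `MemoryHandler`, and the `mem://` scheme are invented for the example.

```python
class MemoryHandler:
    """Toy backend that keeps notebook text in a dict."""

    def __init__(self):
        self.files = {}

    def read(self, path):
        return self.files[path]

    def write(self, text, path):
        self.files[path] = text


class HandlerRegistry:
    """Map path prefixes to handlers, falling back to a default handler."""

    def __init__(self, default):
        self.default = default
        self.handlers = []  # list of (prefix, handler) pairs

    def register(self, prefix, handler):
        self.handlers.append((prefix, handler))

    def get_handler(self, path):
        # Prefer the longest matching prefix so more specific schemes win.
        matches = [(p, h) for p, h in self.handlers if path.startswith(p)]
        if not matches:
            return self.default
        return max(matches, key=lambda item: len(item[0]))[1]

    def read(self, path):
        return self.get_handler(path).read(path)

    def write(self, text, path):
        self.get_handler(path).write(text, path)


local = MemoryHandler()
remote = MemoryHandler()
registry = HandlerRegistry(default=local)
registry.register("mem://", remote)  # hypothetical scheme for the example

registry.write("{}", "mem://bucket/output.ipynb")
registry.write("{}", "output.ipynb")
```

With this pattern, callers pass only a path string and never name a backend, which is how a single function can serve local and cloud locations alike.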


Command Line Interface

Command-line interface for batch notebook execution and automation workflows. Parameters can be passed individually with -p or loaded from a YAML file with -f.

papermill input.ipynb output.ipynb -p param1 value1 -p param2 value2
papermill input.ipynb output.ipynb -f parameters.yaml
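When the CLI is driven from a Python script, the argument list can be assembled programmatically. The sketch below uses only the `-p` and `-f` options shown above; `build_command` and `run_command` are hypothetical helpers, and values are converted to strings because command-line arguments are always text.

```python
import subprocess


def build_command(input_path, output_path, parameters=None, parameters_file=None):
    """Build a papermill argv list using the -p and -f options."""
    command = ["papermill", str(input_path), str(output_path)]
    for name, value in (parameters or {}).items():
        command += ["-p", str(name), str(value)]
    if parameters_file is not None:
        command += ["-f", str(parameters_file)]
    return command


def run_command(command):
    """Run the command and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(command, check=True)


command = build_command(
    "input.ipynb",
    "output.ipynb",
    {"param1": "value1", "param2": 2},
)
```

Passing the command as a list rather than a single shell string avoids quoting problems when a parameter value contains spaces.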


Exception Handling

Exception hierarchy rooted at PapermillException, covering execution errors, missing parameters, storage issues, and missing optional dependencies.

class PapermillException(Exception): ...
class PapermillExecutionError(PapermillException): ...
class PapermillMissingParameterException(PapermillException): ...
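Because the specific exceptions derive from `PapermillException`, callers can catch the narrow case first and fall back to the base class. The sketch below assumes that `PapermillExecutionError` exposes `cell_index`, `ename`, and `evalue` attributes; `summarize_failure`, `run_or_report`, and `FakeError` are illustrative names, and `FakeError` exists only to exercise the formatting without a kernel.

```python
def summarize_failure(error):
    """Build a one-line summary from an execution error.

    Assumes the error exposes `cell_index`, `ename`, and `evalue`;
    getattr() keeps the helper usable if an attribute is missing.
    """
    cell = getattr(error, "cell_index", "?")
    name = getattr(error, "ename", type(error).__name__)
    value = getattr(error, "evalue", str(error))
    return f"cell {cell}: {name}: {value}"


def run_or_report(input_path, output_path, parameters=None):
    """Execute a notebook and return (ok, message) instead of raising."""
    import papermill as pm  # imported lazily; needs papermill installed
    from papermill import PapermillException, PapermillExecutionError

    try:
        pm.execute_notebook(input_path, output_path, parameters=parameters)
    except PapermillExecutionError as error:
        # A notebook cell raised; catch the specific subclass first.
        return False, summarize_failure(error)
    except PapermillException as error:
        # Any other papermill-level problem.
        return False, f"papermill error: {error}"
    return True, "ok"


class FakeError(Exception):
    """Stand-in used to exercise summarize_failure without running a kernel."""

    cell_index = 3
    ename = "ZeroDivisionError"
    evalue = "division by zero"


summary = summarize_failure(FakeError())
```

Returning a status tuple instead of raising suits batch jobs where one failed notebook should not stop the remaining runs.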


Types

from collections import namedtuple
from pathlib import Path
from typing import Any, Dict, Optional, Union

import nbformat

# Parameter representation
Parameter = namedtuple('Parameter', [
    'name',                # str: Parameter name
    'inferred_type_name',  # str: String representation of inferred type
    'default',             # str: String representation of default value
    'help',                # str: Help text/description
])

# Common type aliases
NotebookPath = Union[str, Path, nbformat.NotebookNode]
Parameters = Optional[Dict[str, Any]]
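The `Parameter` tuple and the dict-of-dicts shape returned by inspection line up naturally. The sketch below redefines the namedtuple locally so that it runs without papermill installed; the sample values are invented for illustration.

```python
from collections import namedtuple

# Local copy of the four-field tuple described above.
Parameter = namedtuple("Parameter", ["name", "inferred_type_name", "default", "help"])

alpha = Parameter(
    name="alpha",
    inferred_type_name="float",
    default="0.1",  # defaults are kept as strings, not evaluated values
    help="Learning rate",
)

# _asdict() yields a mapping with the four field names as keys.
entry = alpha._asdict()

# Keyed by parameter name, this mirrors the dict[str, dict] inspection shape.
inspection_like = {alpha.name: entry}
```

Since `default` is a string representation, code that needs the actual value has to parse it explicitly rather than use it as-is.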