tessl/pypi-papermill

Parameterize and run Jupyter and nteract Notebooks


Command Line Interface

Command-line interface for batch notebook execution and parameter passing, with support for YAML parameter files and several parameter input formats (typed key-value pairs, raw strings, inline YAML, and Base64-encoded YAML). The papermill command is the entry point for integrating notebooks into data pipelines and automation systems.

Capabilities

Main Command

Primary CLI entry point for executing notebooks with parameters.

papermill [OPTIONS] NOTEBOOK_PATH [OUTPUT_PATH]

Options:
  --help-notebook           Display parameters information for the given notebook path
  -p, --parameters TEXT     Parameters to pass to the parameters cell (key value pairs)
  -r, --parameters_raw TEXT Parameters to be read as raw string (key value pairs)
  -f, --parameters_file TEXT Path to YAML file containing parameters
  -y, --parameters_yaml TEXT YAML string to be used as parameters
  -b, --parameters_base64 TEXT Base64 encoded YAML string as parameters
  --inject-input-path       Insert the path of the input notebook as PAPERMILL_INPUT_PATH as a notebook parameter
  --inject-output-path      Insert the path of the output notebook as PAPERMILL_OUTPUT_PATH as a notebook parameter
  --inject-paths            Insert the paths of input/output notebooks as PAPERMILL_INPUT_PATH/PAPERMILL_OUTPUT_PATH as notebook parameters
  --engine TEXT             The execution engine name to use in evaluating the notebook
  --request-save-on-cell-execute / --no-request-save-on-cell-execute Request save notebook after each cell execution
  --autosave-cell-every INTEGER How often in seconds to autosave the notebook during long cell executions (0 to disable)
  --prepare-only / --prepare-execute Flag for outputting the notebook without execution, but with parameters applied
  -k, --kernel TEXT         Name of kernel to run. Ignores kernel name in the notebook document metadata
  -l, --language TEXT       Language for notebook execution. Ignores language in the notebook document metadata
  --cwd TEXT                Working directory to run notebook in
  --progress-bar / --no-progress-bar Flag for turning on the progress bar
  --log-output / --no-log-output Flag for writing notebook output to the configured logger
  --stdout-file FILE        File to write notebook stdout output to
  --stderr-file FILE        File to write notebook stderr output to
  --log-level [NOTSET|DEBUG|INFO|WARNING|ERROR|CRITICAL] Set log level
  --start-timeout INTEGER   Time in seconds to wait for kernel to start
  --execution-timeout INTEGER Time in seconds to wait for each cell before failing execution (default: forever)
  --report-mode / --no-report-mode Flag for hiding input
  --version                 Show version and exit
  -h, --help                Show this message and exit

CLI Function Interface

Python interface to the CLI command.

def papermill(
    ctx: click.Context,
    notebook_path: str,
    output_path: str = "",
    help_notebook: bool = False,
    parameters: tuple = (),
    parameters_raw: tuple = (),
    parameters_file: tuple = (),
    parameters_yaml: tuple = (),
    parameters_base64: tuple = (),
    inject_input_path: bool = False,
    inject_output_path: bool = False,
    inject_paths: bool = False,
    engine: str = None,
    prepare_only: bool = False,
    kernel: str = None,
    language: str = None,
    log_output: bool = False,
    stdout_file = None,
    stderr_file = None,
    no_progress_bar: bool = False,
    autosave_cell_every: int = None,
    start_timeout: int = 60,
    execution_timeout: int = None,
    report_mode: bool = False,
    cwd: str = None,
    version: bool = False
) -> None:
    """
    Main CLI interface for papermill notebook execution.
    
    This is the Click command function that handles all CLI operations
    including parameter parsing, validation, and execution coordination.
    """

Usage Examples

Basic Execution

# Execute notebook with output
papermill input.ipynb output.ipynb

# Without OUTPUT_PATH, the executed notebook is written to stdout (pipe or redirect it)
papermill input.ipynb > /dev/null

Parameter Passing

# Single parameters
papermill analysis.ipynb results.ipynb -p alpha 0.6 -p iterations 100

# Raw string parameters (no type conversion)
papermill notebook.ipynb output.ipynb -r config_file "/path/with spaces/config.json"

# Multiple parameter methods
papermill notebook.ipynb output.ipynb \
  -p threshold 0.8 \
  -r data_path "/complex path/data.csv" \
  -f config.yaml
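
When the same invocation is assembled from Python rather than typed into a shell, building an argument list sidesteps quoting problems for values such as `/complex path/data.csv`. The sketch below is a minimal illustration that uses only the `-p`, `-r`, and `-f` flags shown above; the helper name `build_papermill_command` is hypothetical and not part of papermill.

```python
import shlex


def build_papermill_command(notebook, output, typed=None, raw=None, files=()):
    """Return an argv list for the papermill CLI (hypothetical helper)."""
    argv = ["papermill", notebook, output]
    for name, value in (typed or {}).items():
        argv += ["-p", name, str(value)]  # typed key-value parameters
    for name, value in (raw or {}).items():
        argv += ["-r", name, str(value)]  # raw string parameters
    for path in files:
        argv += ["-f", path]  # YAML parameter files
    return argv


command = build_papermill_command(
    "notebook.ipynb",
    "output.ipynb",
    typed={"threshold": 0.8},
    raw={"data_path": "/complex path/data.csv"},
    files=["config.yaml"],
)

# shlex.join renders the list as a shell-safe string for logging.
print(shlex.join(command))
```

The resulting list can be handed to `subprocess.run(command)` directly, so no shell quoting is involved at all.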

Parameter Files

# YAML parameter file
papermill experiment.ipynb result.ipynb -f parameters.yaml

# Inline YAML parameters
papermill notebook.ipynb output.ipynb -y "{alpha: 0.6, beta: 0.1}"

# Base64 encoded YAML (for complex parameters)
papermill notebook.ipynb output.ipynb -b "YWxwaGE6IDAuNgpiZXRhOiAwLjE="
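
The value passed to `-b` is the YAML text encoded with standard Base64. As a small sketch using only the Python standard library, the string in the example above can be reproduced, and decoded again for inspection, like this (the variable names are illustrative):

```python
import base64

# YAML document with two parameters, one per line.
yaml_text = "alpha: 0.6\nbeta: 0.1"

encoded = base64.b64encode(yaml_text.encode("utf-8")).decode("ascii")
print(encoded)  # YWxwaGE6IDAuNgpiZXRhOiAwLjE=

# Round-trip to confirm what the encoded string contains.
decoded = base64.b64decode(encoded).decode("utf-8")
print(decoded)
```

Encoding is mainly useful when the YAML contains newlines or characters that are awkward to pass through a scheduler or shell.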

Path Injection

# Inject input path as parameter
papermill template.ipynb result.ipynb --inject-input-path

# Inject output path as parameter  
papermill template.ipynb result.ipynb --inject-output-path

# Inject both paths
papermill template.ipynb result.ipynb --inject-paths
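
Inside the notebook, the injected values arrive as ordinary variables named `PAPERMILL_INPUT_PATH` and `PAPERMILL_OUTPUT_PATH`. The following sketch shows one way a notebook might use them, assuming the parameters cell declares `None` defaults so the notebook still runs interactively; the helper `sibling_artifact` and the `.csv` naming are hypothetical.

```python
import posixpath

# Parameters cell: defaults used when the notebook is run interactively.
PAPERMILL_INPUT_PATH = None
PAPERMILL_OUTPUT_PATH = None


def sibling_artifact(output_path, suffix=".csv"):
    """Derive an artifact path next to the output notebook (hypothetical helper)."""
    if output_path is None:
        return "artifact" + suffix  # fallback for interactive runs
    root, _ext = posixpath.splitext(output_path)
    return root + suffix


# With --inject-paths, papermill would override the defaults above, for example:
PAPERMILL_OUTPUT_PATH = "results/result.ipynb"
print(sibling_artifact(PAPERMILL_OUTPUT_PATH))
```

String-based splitting is used instead of `pathlib` so that URL-style paths such as `s3://bucket/result.ipynb` keep their scheme intact.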

Execution Options

# Specify kernel
papermill notebook.ipynb output.ipynb --kernel python3

# Use specific engine
papermill notebook.ipynb output.ipynb --engine nbclient

# Set working directory
papermill notebook.ipynb output.ipynb --cwd /data/workspace

# Extended timeout for long-running notebooks
papermill analysis.ipynb results.ipynb --start-timeout 300 --execution-timeout 3600

Output Control

# Hide progress bar
papermill notebook.ipynb output.ipynb --no-progress-bar

# Log output to console
papermill notebook.ipynb output.ipynb --log-output

# Redirect stdout and stderr
papermill notebook.ipynb output.ipynb \
  --stdout-file execution.log \
  --stderr-file errors.log

# Report mode (hide input cells)
papermill analysis.ipynb clean_report.ipynb --report-mode

Validation and Preparation

# Prepare notebook without executing (validation)
papermill template.ipynb prepared.ipynb --prepare-only -p param1 value1

# Get help about notebook parameters
papermill analysis.ipynb --help-notebook

Cloud Storage

# Execute with S3 paths
papermill s3://bucket/input.ipynb s3://bucket/output.ipynb -p dataset "s3://bucket/data.csv"

# Mixed local and cloud
papermill local_template.ipynb s3://bucket/results/output.ipynb

# Azure Blob Storage
papermill abs://account.blob.core.windows.net/container/input.ipynb output.ipynb

# Google Cloud Storage
papermill gs://bucket/notebook.ipynb gs://bucket/result.ipynb

Advanced CLI Usage

Automation Scripts

#!/bin/bash
# Batch processing script

# Define parameters
NOTEBOOKS=(
  "daily_report.ipynb"
  "weekly_analysis.ipynb" 
  "monthly_summary.ipynb"
)

DATE=$(date +%Y-%m-%d)
OUTPUT_DIR="s3://reports-bucket/${DATE}"

# Process each notebook
for notebook in "${NOTEBOOKS[@]}"; do
  echo "Processing ${notebook}..."
  
  papermill "templates/${notebook}" "${OUTPUT_DIR}/${notebook}" \
    -p execution_date "${DATE}" \
    -p output_bucket "reports-bucket" \
    --log-output \
    --report-mode
    
  if [ $? -eq 0 ]; then
    echo "✓ ${notebook} completed successfully"
  else
    echo "✗ ${notebook} failed"
    exit 1
  fi
done

echo "All notebooks processed successfully!"

Parameter File Examples

parameters.yaml:

# Data configuration
data_source: "s3://data-bucket/sales_2024.csv"
output_path: "s3://results-bucket/analysis"

# Analysis parameters
confidence_level: 0.95
sample_size: 1000
remove_outliers: true

# Visualization settings
plot_style: "seaborn"
figure_size: [12, 8]
color_palette: "viridis"

# Complex nested parameters
model_config:
  algorithm: "random_forest"
  hyperparameters:
    n_estimators: 100
    max_depth: 10
    random_state: 42

Using parameter file:

papermill analysis.ipynb results.ipynb -f parameters.yaml
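
Nested YAML mappings such as `model_config` reach the notebook as plain Python dictionaries, and sequences such as `figure_size` as Python lists. The sketch below approximates what the notebook would see for a few of the keys above; it illustrates the resulting values and is not papermill's literal injected cell.

```python
# Approximation of the values injected from parameters.yaml (subset of keys).
confidence_level = 0.95
remove_outliers = True
figure_size = [12, 8]
model_config = {
    "algorithm": "random_forest",
    "hyperparameters": {"n_estimators": 100, "max_depth": 10, "random_state": 42},
}

# Downstream notebook code can read nested settings directly.
hyper = model_config["hyperparameters"]
width, height = figure_size
print(model_config["algorithm"], hyper["n_estimators"], width, height)
```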

CI/CD Integration

# GitLab CI example
script:
  - pip install papermill[all]
  - papermill notebooks/test_suite.ipynb results/test_results.ipynb
    -p test_environment "ci"
    -p commit_sha "${CI_COMMIT_SHA}"
    --log-output
  - papermill notebooks/performance_test.ipynb results/perf_results.ipynb
    --execution-timeout 1800
    --no-progress-bar

# GitHub Actions example
- name: Execute notebooks
  run: |
    papermill experiments/model_training.ipynb artifacts/training_results.ipynb \
      -p model_version "${GITHUB_SHA:0:8}" \
      -p dataset_version "v2.1" \
      --log-output

Error Handling in Scripts

#!/bin/bash
set -e  # Exit on any error

# Function to handle errors
handle_error() {
  echo "ERROR: Notebook execution failed at line $1"
  echo "Command: $2"
  exit 1
}

trap 'handle_error $LINENO "$BASH_COMMAND"' ERR

# Execute with error handling
echo "Starting notebook execution..."

papermill input.ipynb output.ipynb \
  -p environment "production" \
  -p debug False \
  --log-output \
  --stdout-file execution.log \
  --stderr-file errors.log

echo "Notebook execution completed successfully!"
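
The same fail-fast behaviour is available when notebooks are launched from Python. The sketch below wraps `subprocess.run` and raises on a non-zero exit status; `run_checked` is a hypothetical helper, and the demonstration uses the Python interpreter as a stand-in command so that the snippet runs without papermill installed.

```python
import subprocess
import sys


def run_checked(argv):
    """Run a command and raise RuntimeError if it exits non-zero (hypothetical helper)."""
    result = subprocess.run(argv, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"command failed with exit status {result.returncode}: {argv!r}\n"
            f"{result.stderr}"
        )
    return result


# Stand-in commands; in practice argv would look like
# ["papermill", "input.ipynb", "output.ipynb", "--log-output"].
ok = run_checked([sys.executable, "-c", "print('done')"])
print(ok.stdout.strip())

try:
    run_checked([sys.executable, "-c", "import sys; sys.exit(3)"])
except RuntimeError as exc:
    failure_message = str(exc)
    print(failure_message.splitlines()[0])
```

Capturing stderr in the exception message plays the role that `--stderr-file` plays in the shell script: the failure details stay attached to the run that produced them.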

Utility Functions

Internal CLI utility functions for parameter processing.

def _resolve_type(value: str):
    """
    Resolves string values to appropriate Python types.
    
    Parameters:
    - value: String value to resolve
    
    Returns:
    Appropriately typed value (int, float, bool, str)
    """

def _is_int(value: str) -> bool:
    """
    Checks if string represents an integer.
    
    Parameters:
    - value: String to check
    
    Returns:
    bool: True if string is an integer
    """

def _is_float(value: str) -> bool:
    """
    Checks if string represents a float.
    
    Parameters:
    - value: String to check
    
    Returns:
    bool: True if string is a float
    """

def print_papermill_version(
    ctx: click.Context,
    param: click.Parameter,
    value: bool
) -> None:
    """
    Prints papermill version information and exits.
    
    Parameters:
    - ctx: Click context
    - param: Click parameter
    - value: Whether to print version
    """

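To make the type resolution described above concrete, here is a minimal re-implementation sketch based only on those docstrings. It is not papermill's source: the function names are hypothetical, and it assumes that only the capitalised literals `True` and `False` are treated as booleans.

```python
def is_int(value: str) -> bool:
    """Return True if the string parses as an integer."""
    try:
        int(value)
    except ValueError:
        return False
    return True


def is_float(value: str) -> bool:
    """Return True if the string parses as a float."""
    try:
        float(value)
    except ValueError:
        return False
    return True


def resolve_type_sketch(value: str):
    """Convert a CLI string to bool, int, or float, falling back to str."""
    if value == "True":
        return True
    if value == "False":
        return False
    if is_int(value):  # checked before float so "100" stays an int
        return int(value)
    if is_float(value):
        return float(value)
    return value


for text in ["100", "0.6", "True", "/path/with spaces/config.json"]:
    print(repr(text), "->", repr(resolve_type_sketch(text)))
```

Values that must stay strings even though they look numeric, such as an identifier like `007`, are the use case for `-r`, which skips this conversion.
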
Install with Tessl CLI

npx tessl i tessl/pypi-papermill