Parameterize and run Jupyter and nteract Notebooks
Command-line interface for batch notebook execution, parameter passing, and automation workflows. Parameters can be supplied as key-value pairs, YAML files, inline YAML strings, or base64-encoded YAML strings, so the papermill command can be called directly from data pipelines and automation systems.
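As a rough illustration of calling the CLI from a pipeline, the sketch below assembles a papermill command line from a Python dictionary, using the -p option listed in the options for this command. The helper name build_papermill_command is hypothetical and not part of papermill; the sketch only builds the argument list and does not run anything.

```python
import shlex


def build_papermill_command(notebook_path, output_path, parameters):
    """Hypothetical helper: build an argv list for the papermill CLI."""
    argv = ["papermill", notebook_path, output_path]
    for name, value in parameters.items():
        # -p takes the parameter name and its value as two separate arguments
        argv += ["-p", name, str(value)]
    return argv


cmd = build_papermill_command(
    "analysis.ipynb",
    "results.ipynb",
    {"alpha": 0.6, "iterations": 100},
)

# Show the command as it would be typed in a shell
print(shlex.join(cmd))
# prints: papermill analysis.ipynb results.ipynb -p alpha 0.6 -p iterations 100
```

With papermill installed, a list like this could be handed to subprocess.run(cmd, check=True), so that a failing notebook raises an exception in the calling script.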
Primary CLI entry point for executing notebooks with parameters.
papermill [OPTIONS] NOTEBOOK_PATH [OUTPUT_PATH]
Options:
  --help-notebook                 Display parameters information for the given notebook path
  -p, --parameters TEXT           Parameters to pass to the parameters cell (key value pairs)
  -r, --parameters_raw TEXT       Parameters to be read as raw string (key value pairs)
  -f, --parameters_file TEXT      Path to YAML file containing parameters
  -y, --parameters_yaml TEXT      YAML string to be used as parameters
  -b, --parameters_base64 TEXT    Base64 encoded YAML string as parameters
  --inject-input-path             Insert the path of the input notebook as PAPERMILL_INPUT_PATH as a notebook parameter
  --inject-output-path            Insert the path of the output notebook as PAPERMILL_OUTPUT_PATH as a notebook parameter
  --inject-paths                  Insert the paths of input/output notebooks as PAPERMILL_INPUT_PATH/PAPERMILL_OUTPUT_PATH as notebook parameters
  --engine TEXT                   The execution engine name to use in evaluating the notebook
  --request-save-on-cell-execute / --no-request-save-on-cell-execute
                                  Request save notebook after each cell execution
  --autosave-cell-every INTEGER   How often in seconds to autosave the notebook during long cell executions (0 to disable)
  --prepare-only / --prepare-execute
                                  Flag for outputting the notebook without execution, but with parameters applied
  -k, --kernel TEXT               Name of kernel to run. Ignores kernel name in the notebook document metadata
  -l, --language TEXT             Language for notebook execution. Ignores language in the notebook document metadata
  --cwd TEXT                      Working directory to run notebook in
  --progress-bar / --no-progress-bar
                                  Flag for turning on the progress bar
  --log-output / --no-log-output  Flag for writing notebook output to the configured logger
  --stdout-file FILE              File to write notebook stdout output to
  --stderr-file FILE              File to write notebook stderr output to
  --log-level [NOTSET|DEBUG|INFO|WARNING|ERROR|CRITICAL]
                                  Set log level
  --start-timeout INTEGER         Time in seconds to wait for kernel to start
  --execution-timeout INTEGER     Time in seconds to wait for each cell before failing execution (default: forever)
  --report-mode / --no-report-mode
                                  Flag for hiding input
  --version                       Show version and exit
  -h, --help                      Show this message and exit

Python interface to the CLI command.
def papermill(
    ctx: click.Context,
    notebook_path: str,
    output_path: str = "",
    help_notebook: bool = False,
    parameters: tuple = (),
    parameters_raw: tuple = (),
    parameters_file: tuple = (),
    parameters_yaml: tuple = (),
    parameters_base64: tuple = (),
    inject_input_path: bool = False,
    inject_output_path: bool = False,
    inject_paths: bool = False,
    engine: str = None,
    prepare_only: bool = False,
    kernel: str = None,
    language: str = None,
    log_output: bool = False,
    stdout_file = None,
    stderr_file = None,
    no_progress_bar: bool = False,
    autosave_cell_every: int = None,
    start_timeout: int = 60,
    execution_timeout: int = None,
    report_mode: bool = False,
    cwd: str = None,
    version: bool = False
) -> None:
    """
    Main CLI interface for papermill notebook execution.

    This is the Click command function that handles all CLI operations
    including parameter parsing, validation, and execution coordination.
    """

# Execute notebook with output
papermill input.ipynb output.ipynb
# Execute without saving output
papermill input.ipynb

# Single parameters
papermill analysis.ipynb results.ipynb -p alpha 0.6 -p iterations 100
# Raw string parameters (no type conversion)
papermill notebook.ipynb output.ipynb -r config_file "/path/with spaces/config.json"
# Multiple parameter methods
papermill notebook.ipynb output.ipynb \
  -p threshold 0.8 \
  -r data_path "/complex path/data.csv" \
  -f config.yaml

# YAML parameter file
papermill experiment.ipynb result.ipynb -f parameters.yaml
# Inline YAML parameters
papermill notebook.ipynb output.ipynb -y "{alpha: 0.6, beta: 0.1}"
# Base64 encoded YAML (for complex parameters)
papermill notebook.ipynb output.ipynb -b "YWxwaGE6IDAuNgpiZXRhOiAwLjE="

# Inject input path as parameter
papermill template.ipynb result.ipynb --inject-input-path
# Inject output path as parameter
papermill template.ipynb result.ipynb --inject-output-path
# Inject both paths
papermill template.ipynb result.ipynb --inject-paths

# Specify kernel
papermill notebook.ipynb output.ipynb --kernel python3
# Use specific engine
papermill notebook.ipynb output.ipynb --engine nbclient
# Set working directory
papermill notebook.ipynb output.ipynb --cwd /data/workspace
# Extended timeout for long-running notebooks
papermill analysis.ipynb results.ipynb --start-timeout 300 --execution-timeout 3600

# Hide progress bar
papermill notebook.ipynb output.ipynb --no-progress-bar
# Log output to console
papermill notebook.ipynb output.ipynb --log-output
# Redirect stdout and stderr
papermill notebook.ipynb output.ipynb \
  --stdout-file execution.log \
  --stderr-file errors.log
# Report mode (hide input cells)
papermill analysis.ipynb clean_report.ipynb --report-mode

# Prepare notebook without executing (validation)
papermill template.ipynb prepared.ipynb --prepare-only -p param1 value1
# Get help about notebook parameters
papermill analysis.ipynb --help-notebook

# Execute with S3 paths
papermill s3://bucket/input.ipynb s3://bucket/output.ipynb -p dataset "s3://bucket/data.csv"
# Mixed local and cloud
papermill local_template.ipynb s3://bucket/results/output.ipynb
# Azure Blob Storage
papermill abs://account.blob.core.windows.net/container/input.ipynb output.ipynb
# Google Cloud Storage
papermill gs://bucket/notebook.ipynb gs://bucket/result.ipynb

#!/bin/bash
# Batch processing script

# Define parameters
NOTEBOOKS=(
  "daily_report.ipynb"
  "weekly_analysis.ipynb"
  "monthly_summary.ipynb"
)
DATE=$(date +%Y-%m-%d)
OUTPUT_DIR="s3://reports-bucket/${DATE}"

# Process each notebook
for notebook in "${NOTEBOOKS[@]}"; do
  echo "Processing ${notebook}..."
  papermill "templates/${notebook}" "${OUTPUT_DIR}/${notebook}" \
    -p execution_date "${DATE}" \
    -p output_bucket "reports-bucket" \
    --log-output \
    --report-mode
  if [ $? -eq 0 ]; then
    echo "✓ ${notebook} completed successfully"
  else
    echo "✗ ${notebook} failed"
    exit 1
  fi
done

echo "All notebooks processed successfully!"

parameters.yaml:
# Data configuration
data_source: "s3://data-bucket/sales_2024.csv"
output_path: "s3://results-bucket/analysis"
# Analysis parameters
confidence_level: 0.95
sample_size: 1000
remove_outliers: true
# Visualization settings
plot_style: "seaborn"
figure_size: [12, 8]
color_palette: "viridis"
# Complex nested parameters
model_config:
  algorithm: "random_forest"
  hyperparameters:
    n_estimators: 100
    max_depth: 10
    random_state: 42

Using parameter file:
papermill analysis.ipynb results.ipynb -f parameters.yaml

# GitLab CI example
script:
  - pip install papermill[all]
  - papermill notebooks/test_suite.ipynb results/test_results.ipynb
      -p test_environment "ci"
      -p commit_sha "${CI_COMMIT_SHA}"
      --log-output
  - papermill notebooks/performance_test.ipynb results/perf_results.ipynb
      --execution-timeout 1800
      --no-progress-bar

# GitHub Actions example
- name: Execute notebooks
  run: |
    papermill experiments/model_training.ipynb artifacts/training_results.ipynb \
      -p model_version "${GITHUB_SHA:0:8}" \
      -p dataset_version "v2.1" \
      --log-output

#!/bin/bash
set -e  # Exit on any error

# Function to handle errors
handle_error() {
  echo "ERROR: Notebook execution failed at line $1"
  echo "Command: $2"
  exit 1
}
trap 'handle_error $LINENO "$BASH_COMMAND"' ERR

# Execute with error handling
echo "Starting notebook execution..."
papermill input.ipynb output.ipynb \
  -p environment "production" \
  -p debug False \
  --log-output \
  --stdout-file execution.log \
  --stderr-file errors.log
echo "Notebook execution completed successfully!"

Internal CLI utility functions for parameter processing.
def _resolve_type(value: str):
    """
    Resolves string values to appropriate Python types.

    Parameters:
    - value: String value to resolve

    Returns:
    Appropriately typed value (int, float, bool, str)
    """

def _is_int(value: str) -> bool:
    """
    Checks if string represents an integer.

    Parameters:
    - value: String to check

    Returns:
    bool: True if string is an integer
    """

def _is_float(value: str) -> bool:
    """
    Checks if string represents a float.

    Parameters:
    - value: String to check

    Returns:
    bool: True if string is a float
    """
def print_papermill_version(
    ctx: click.Context,
    param: click.Parameter,
    value: bool
) -> None:
    """
    Prints papermill version information and exits.

    Parameters:
    - ctx: Click context
    - param: Click parameter
    - value: Whether to print version
    """

Install with Tessl CLI
npx tessl i tessl/pypi-papermill