tessl/pypi-shap

A unified approach to explain the output of any machine learning model.

docs/visualization.md

Visualization and Plotting

SHAP provides a comprehensive set of visualization functions for understanding and communicating model explanations. Most plotting functions accept Explanation objects directly and support both interactive (JavaScript) and static (matplotlib) output formats.

Capabilities

Summary Visualizations

High-level overview plots showing feature importance and explanation patterns across datasets.

def bar(shap_values, max_display=10, order=Explanation.abs, clustering=None, 
        clustering_cutoff=0.5, show_data="auto", ax=None, show=True):
    """
    Bar plot showing feature importance with SHAP values.
    
    Parameters:
    - shap_values: Explanation object, Cohorts, or dictionary of explanations
    - max_display: Number of top features to display (default: 10)
    - order: Function for sorting features (default: absolute values)
    - clustering: Partition tree for hierarchical clustering
    - show_data: Whether to show feature values in y-axis labels ("auto", True, False)
    - ax: Matplotlib axes object (optional)
    - show: Display plot immediately (bool)
    """

def beeswarm(shap_values, max_display=10, order=Explanation.abs.mean(0), 
             clustering=None, cluster_threshold=0.5, color=None, alpha=1.0, 
             ax=None, show=True, log_scale=False, s=16, plot_size="auto"):
    """
    Beeswarm plot showing distribution of feature impacts across samples.
    
    Each dot represents one sample's SHAP value for a feature, with position
    indicating impact magnitude and color showing feature value.
    
    Parameters:
    - shap_values: Explanation object with matrix of SHAP values
    - color: Colormap or color scheme for points
    - s: Size of scatter plot markers (default: 16)
    - plot_size: Control plot dimensions ("auto", float, or (width, height))
    - log_scale: Use logarithmic scale for SHAP values (bool)
    """

def violin(shap_values, features=None, feature_names=None, max_display=None,
           plot_type="violin", color=None, show=True, sort=True, color_bar=True,
           plot_size="auto", cmap=None):
    """
    Violin plot summary of SHAP value distributions across dataset.
    
    Shows density distribution of SHAP values for each feature,
    revealing patterns in feature importance across samples.
    
    Parameters:
    - plot_type: Type of plot ("violin", "layered_violin", "compact_dot")
    - sort: Sort features by importance (bool)
    - color_bar: Show color bar for feature values (bool)
    """

Usage Example:

import shap

# Create explanations
explainer = shap.TreeExplainer(model)
shap_values = explainer(X)

# Summary visualizations
shap.plots.bar(shap_values)                    # Feature importance bar chart
shap.plots.beeswarm(shap_values)              # Feature impact distribution  
shap.plots.violin(shap_values, max_display=8) # Density distributions

Individual Prediction Analysis

Detailed visualizations for understanding specific predictions and feature contributions.

def waterfall(shap_values, max_display=10, show=True):
    """
    Waterfall chart showing additive feature contributions for single prediction.
    
    Shows how each feature pushes the prediction above or below the base value,
    with cumulative effect building to final prediction.
    
    Parameters:
    - shap_values: Single-row Explanation object only
    - max_display: Maximum features to show (default: 10)
    - show: Display plot immediately (bool)
    
    Note: Only accepts single prediction (one row of SHAP values)
    """

def force(base_value, shap_values=None, features=None, feature_names=None, 
          out_names=None, link="identity", plot_cmap="RdBu", matplotlib=False,
          show=True, figsize=(20, 3), contribution_threshold=0.05):
    """
    Interactive additive force layout visualization.
    
    Shows how features push prediction above/below baseline with
    proportional visual representation. Supports interactive JavaScript
    or static matplotlib output.
    
    Parameters:
    - base_value: Reference value or Explanation object
    - shap_values: SHAP values (optional if base_value is Explanation)
    - features: Feature values for display
    - link: Output transformation ("identity" or "logit")
    - matplotlib: Use matplotlib instead of JavaScript (default: False)
    - contribution_threshold: Threshold for displaying feature names (0-1)
    - figsize: Figure size for matplotlib output
    """

def decision(base_value, shap_values, features=None, feature_names=None,
             feature_order="importance", feature_display_range=None, highlight=None,
             link="identity", plot_color=None, axis_color="#333333", show=True):
    """
    Decision plot showing cumulative SHAP values.
    
    Traces path from base value to final prediction through
    cumulative feature contributions, useful for understanding
    prediction reasoning.
    
    Parameters:
    - feature_order: Order for displaying features
    - feature_display_range: Range of features to display
    - highlight: Highlight specific samples or features
    - plot_color: Colors for prediction paths
    """

Feature Interaction Analysis

Visualizations for understanding feature relationships and dependencies.

def scatter(shap_values, color="#1E88E5", hist=True, axis_color="#333333",
            cmap=None, dot_size=16, x_jitter="auto", alpha=1.0,
            title=None, xmin=None, xmax=None, ymin=None, ymax=None,
            overlay=None, ax=None, ylabel="SHAP value", show=True):
    """
    SHAP dependence scatter plot showing feature interactions.
    
    Plots SHAP values against feature values to show how feature
    impacts change across different feature values. Color can represent
    interaction effects with other features.
    
    Parameters:
    - color: Fixed color or Explanation object for coloring points
    - hist: Show histogram along x-axis (default: True)
    - x_jitter: Add random jitter for discrete features ("auto" or float)
    - dot_size: Size of scatter points
    - overlay: Additional data to overlay on plot
    """

def partial_dependence(ind, model, data, xmin="percentile(0)", 
                       xmax="percentile(100)", npoints=None, ice=True,
                       model_expected_value=False, feature_expected_value=False,
                       shap_values=None, ax=None, show=True):
    """
    Partial dependence plot with Individual Conditional Expectation (ICE) curves.
    
    Shows how model output changes as target feature varies, with
    individual sample trajectories and average trend.
    
    Parameters:
    - ind: Feature index or name to analyze
    - model: Model function for predictions
    - data: Background dataset for partial dependence
    - ice: Show individual conditional expectation curves (bool)
    - npoints: Number of points along feature range
    - model_expected_value: Show model's expected output
    - feature_expected_value: Show feature's expected value
    """

def heatmap(shap_values, instance_order=Explanation.hclust(), 
            feature_values=Explanation.abs.mean(0), feature_order=None,
            max_display=10, cmap=None, show=True, plot_width=8, ax=None):
    """
    Heatmap showing explanation patterns across instances and features.
    
    Uses supervised clustering to reveal population substructure
    and feature importance patterns.
    
    Parameters:
    - instance_order: Function for sample ordering (default: hierarchical clustering)
    - feature_values: Global summary values for features  
    - feature_order: Custom feature ordering (optional)
    - cmap: Colormap for heatmap
    - plot_width: Width of plot in inches
    """

Specialized Data Type Visualizations

Visualizations optimized for specific input types like images, text, and embeddings.

def image(shap_values, pixel_values=None, labels=None, true_labels=None,
          width=20, aspect=0.2, hspace=0.2, cmap=None, vmax=None, show=True):
    """
    Visualize SHAP values for image inputs with pixel-level attributions.
    
    Overlays SHAP importance on original images using color intensity
    to show which pixels contribute most to predictions.
    
    Parameters:
    - pixel_values: Original image pixel values
    - labels: Predicted class labels
    - true_labels: Ground truth labels (optional)
    - width: Width of visualization in inches
    - aspect: Aspect ratio for subplots
    - hspace: Vertical spacing between subplots
    - vmax: Maximum value for color scale normalization
    """

def text(shap_values, num_starting_labels=0, grouping_threshold=0.01,
         separator="", xmin=None, xmax=None, cmax=None, display=True):
    """
    Interactive text explanations with token-level coloring.
    
    Colors text tokens based on SHAP importance, with intensity
    indicating contribution magnitude and color indicating direction.
    
    Parameters:
    - num_starting_labels: Number of output labels to show initially
    - grouping_threshold: Threshold for grouping adjacent tokens
    - separator: Separator between tokens in display
    - cmax: Maximum color intensity
    - display: Show interactive HTML output (bool)
    """

def embedding(ind, shap_values, feature_names=None, method="pca", 
              alpha=1.0, show=True):
    """
    2D embedding visualization of SHAP values.
    
    Projects high-dimensional SHAP values to 2D space for
    visualization of explanation patterns and clustering.
    
    Parameters:
    - ind: Feature index (or "sum()") whose SHAP values color the embedding points
    - method: Dimensionality reduction method ("pca", "tsne")
    - alpha: Transparency of points (0-1)
    """

Monitoring and Comparison

Visualizations for model monitoring and comparative analysis across groups or time periods.

def monitoring(ind, shap_values, features, feature_names=None, 
               show=True):
    """
    Monitor a feature's SHAP values over time or across batches.
    
    Plots one feature's SHAP values in sample order and flags segments
    where their distribution shifts, helping detect drift in model
    behavior across data slices or time periods.
    
    Parameters:
    - ind: Index of the feature to monitor
    - shap_values: Matrix of SHAP values ordered by time or batch
    - features: Feature values for trend analysis
    """

def group_difference(shap_values, group_mask, feature_names=None, 
                     show=True):
    """
    Compare SHAP values between different groups or populations.
    
    Shows difference in feature importance patterns between
    demographic groups, time periods, or other categorical divisions.
    
    Parameters:
    - group_mask: Boolean mask defining group membership
    - feature_names: Names for features in comparison
    """

def benchmark(benchmark_result, show=True):
    """
    Visualize benchmark results comparing different explanation methods.
    
    Shows performance metrics and quality comparisons across
    different explainer algorithms.
    
    Parameters:
    - benchmark_result: BenchmarkResult object with comparison data
    """

JavaScript Integration and Export

Functions for web integration and interactive visualization export.

def initjs():
    """Initialize JavaScript environment for interactive plots in Jupyter notebooks."""

def getjs():
    """Get JavaScript code for SHAP visualizations (for web embedding)."""

def save_html(out_file, plot_html):
    """
    Save interactive plot HTML to file.
    
    Parameters:  
    - out_file: Output file path
    - plot_html: HTML content from interactive plot
    """

Legacy Plotting Functions

SHAP maintains backward compatibility with legacy plotting function names:

# Legacy names (still supported)
def bar_plot(*args, **kwargs): ...      # Use shap.plots.bar
def summary_plot(*args, **kwargs): ...  # Use shap.plots.beeswarm  
def dependence_plot(*args, **kwargs): ... # Use shap.plots.scatter
def force_plot(*args, **kwargs): ...    # Use shap.plots.force
def waterfall_plot(*args, **kwargs): ... # Use shap.plots.waterfall
# ... and others

Usage Patterns

Basic Visualization Workflow

import shap

# Generate explanations
explainer = shap.TreeExplainer(model)
shap_values = explainer(X)

# Overview visualizations
shap.plots.bar(shap_values, max_display=10)       # Top features
shap.plots.beeswarm(shap_values, max_display=15)  # Feature distributions

# Individual prediction analysis  
shap.plots.waterfall(shap_values[0])              # Single prediction
shap.plots.force(shap_values[0])                  # Interactive force plot

# Feature interactions
shap.plots.scatter(shap_values[:, "feature_name"]) # Dependence plot
shap.plots.heatmap(shap_values[:100])             # Pattern clustering

Customization and Styling

All plotting functions support matplotlib customization:

import matplotlib.pyplot as plt

# Custom styling
plt.style.use('seaborn-v0_8')
fig, ax = plt.subplots(figsize=(12, 8))

# Plot with custom axes
shap.plots.bar(shap_values, ax=ax, show=False)
ax.set_title("Custom SHAP Feature Importance")
plt.tight_layout()
plt.show()

Interactive vs Static Output

Control output format based on environment:

# Interactive (Jupyter notebook)
shap.plots.force(shap_values[0])  # JavaScript interactive

# Static (matplotlib) 
shap.plots.force(shap_values[0], matplotlib=True)  # PNG/SVG output

# For publication
shap.plots.waterfall(shap_values[0], show=False)
plt.savefig('explanation.pdf', dpi=300, bbox_inches='tight')

Error Handling

Common visualization errors and their causes:

  • ValueError: Empty SHAP values, or a multi-row Explanation passed to a single-prediction plot such as waterfall
  • DimensionError: Incompatible shapes between SHAP values and feature data
  • ImportError: Missing matplotlib for static plots or IPython/JavaScript dependencies for interactive output
  • TypeError: Incorrect data types (e.g., multi-output data passed to a single-output plot)

Install with Tessl CLI

npx tessl i tessl/pypi-shap
