tessl/pypi-mapie

A scikit-learn-compatible module for estimating prediction intervals using conformal prediction methods.

docs/calibration.md

Calibration Methods

Probability calibration methods for improving the reliability of probabilistic predictions, particularly for multi-class classification problems. MAPIE provides top-label calibration techniques that ensure predicted probabilities accurately reflect the true likelihood of predictions.

Capabilities

Top-Label Calibrator

Implements top-label calibration for multi-class classification, focusing on calibrating the probability of the most likely predicted class. This approach is particularly effective for scenarios where the confidence in the top prediction is most important.

class TopLabelCalibrator:
    """
    Top-label calibration for multi-class classification.

    Parameters:
    - estimator: ClassifierMixin, base multi-class classifier
    - calibrator: Union[str, RegressorMixin], calibration method ("sigmoid", "isotonic") or custom regressor
    - cv: str, cross-validation strategy ("split", "prefit") (default: "split")
    """
    def __init__(self, estimator=None, calibrator=None, cv="split"): ...

    def fit(self, X, y, sample_weight=None, calib_size=0.33, random_state=None, shuffle=True, stratify=None, **fit_params):
        """
        Fit the classifier and calibrator.

        Parameters:
        - X: ArrayLike, input features
        - y: ArrayLike, class labels
        - sample_weight: Optional[ArrayLike], sample weights
        - calib_size: float, fraction of data for calibration when cv="split" (default: 0.33)
        - random_state: Optional[int], random seed for data splitting
        - shuffle: bool, whether to shuffle data before splitting (default: True)
        - stratify: Optional[ArrayLike], stratification labels (default: None, uses y)
        - **fit_params: additional parameters passed to estimator.fit()

        Returns:
        Self
        """

    def predict_proba(self, X):
        """
        Predict calibrated class probabilities.

        Parameters:
        - X: ArrayLike, test features

        Returns:
        NDArray: calibrated probabilities (shape: n_samples x n_classes)
        """

    def predict(self, X):
        """
        Predict class labels using calibrated probabilities.

        Parameters:
        - X: ArrayLike, test features

        Returns:
        NDArray: predicted class labels
        """

    # Key attributes after fitting
    classes_: NDArray  # Array with class names
    n_classes_: int  # Number of classes
    single_estimator_: ClassifierMixin  # Fitted base classifier
    calibrators: Dict[Union[int, str], RegressorMixin]  # Fitted calibrators per class
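
After fitting, these attributes are available for inspection. A quick hypothetical sketch, assuming a calibrator fitted as in the usage examples below:

# Hypothetical inspection; `calibrated_clf` is fitted as in the examples below.
print(calibrated_clf.classes_)     # class labels seen during fit
print(calibrated_clf.n_classes_)   # number of classes
print(calibrated_clf.calibrators)  # fitted calibrator keyed by top label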

Usage Examples

Basic Top-Label Calibration

from mapie.calibration import TopLabelCalibrator
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
import numpy as np

# Prepare multi-class data (synthetic stand-in; substitute your own dataset)
X, y = make_classification(
    n_samples=2000, n_classes=3, n_informative=5, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Create calibrated classifier
calibrated_clf = TopLabelCalibrator(
    estimator=RandomForestClassifier(n_estimators=100, random_state=42),
    calibrator="sigmoid",  # Platt scaling
    cv="split"
)

# Fit with automatic calibration split
calibrated_clf.fit(X_train, y_train, calib_size=0.3, random_state=42)

# Get calibrated probabilities
y_proba_calibrated = calibrated_clf.predict_proba(X_test)
y_pred_calibrated = calibrated_clf.predict(X_test)

Isotonic Calibration

from sklearn.linear_model import LogisticRegression

# Use isotonic regression for non-parametric calibration
calibrated_clf = TopLabelCalibrator(
    estimator=LogisticRegression(max_iter=1000),
    calibrator="isotonic",  # Non-parametric calibration
    cv="split"
)

# Fit with stratified splitting
calibrated_clf.fit(
    X_train, y_train,
    calib_size=0.4,
    stratify=y_train,  # Ensure balanced calibration set
    random_state=42
)

# Compare raw vs calibrated probabilities
raw_clf = LogisticRegression(max_iter=1000)
raw_clf.fit(X_train, y_train)

y_proba_raw = raw_clf.predict_proba(X_test)
y_proba_calibrated = calibrated_clf.predict_proba(X_test)

Pre-fitted Estimator Calibration

# Hold out a separate calibration set, then fit the base estimator
X_fit, X_calib, y_fit, y_calib = train_test_split(
    X_train, y_train, test_size=0.3, stratify=y_train, random_state=42
)

base_clf = RandomForestClassifier(n_estimators=50, random_state=42)
base_clf.fit(X_fit, y_fit)

# Calibrate the pre-fitted estimator
calibrated_clf = TopLabelCalibrator(
    estimator=base_clf,
    calibrator="sigmoid",
    cv="prefit"  # Use pre-fitted estimator
)

# Fit the calibrator only (the estimator is already fitted)
calibrated_clf.fit(X_calib, y_calib)

Custom Calibration Regressor

from sklearn.linear_model import Ridge

# Use custom regressor for calibration
custom_calibrator = Ridge(alpha=1.0)

calibrated_clf = TopLabelCalibrator(
    estimator=RandomForestClassifier(),
    calibrator=custom_calibrator,
    cv="split"
)

calibrated_clf.fit(X_train, y_train)

Calibration Methods

Sigmoid Calibration (Platt Scaling)

Fits a sigmoid function to map predicted probabilities to calibrated probabilities. Assumes calibration curve has sigmoid shape.

calibrator="sigmoid"

Mathematical Form:

P_calibrated = 1 / (1 + exp(A * P_raw + B))

Advantages:

  • Parametric method with interpretable parameters
  • Works well when calibration curve is sigmoid-shaped
  • Requires relatively few calibration samples

Best for:

  • Small calibration datasets
  • When the miscalibration follows a sigmoid pattern
  • Naive Bayes and SVM classifiers
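
For intuition, the sigmoid fit can be reproduced with scikit-learn's LogisticRegression on top-label scores. This is an illustrative sketch on synthetic, deliberately overconfident toy data, not MAPIE's internal implementation:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data: raw top-label scores and whether the top label was correct.
# The scores are deliberately overconfident (correct with prob. score**2).
rng = np.random.default_rng(0)
top_scores = rng.uniform(0.4, 1.0, size=500)
correct = (rng.uniform(size=500) < top_scores ** 2).astype(int)

# Fitting a logistic regression on the scores is exactly a sigmoid fit:
# P_calibrated = 1 / (1 + exp(-(w * P_raw + b)))
platt = LogisticRegression()
platt.fit(top_scores.reshape(-1, 1), correct)
p_calibrated = platt.predict_proba(top_scores.reshape(-1, 1))[:, 1]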

Isotonic Calibration

Non-parametric method that fits a monotonic function to map probabilities to calibrated values.

calibrator="isotonic"

Advantages:

  • Non-parametric, no assumptions about calibration curve shape
  • More flexible than sigmoid calibration
  • Can handle complex calibration patterns

Best for:

  • Larger calibration datasets
  • Tree-based models (Random Forest, Gradient Boosting)
  • When calibration curve is not sigmoid-shaped
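
The isotonic variant can be sketched the same way with scikit-learn's IsotonicRegression, reusing the toy scores from the sigmoid sketch above (again an illustration, not MAPIE's internals):

from sklearn.isotonic import IsotonicRegression

# Monotone, non-parametric map from raw top-label scores to calibrated
# probabilities, clipped to [0, 1]; no assumption about the curve's shape.
iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
iso.fit(top_scores, correct)
p_calibrated = iso.predict(top_scores)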

Advanced Usage

Analyzing Calibration Quality

from sklearn.calibration import calibration_curve
import matplotlib.pyplot as plt

# calibration_curve expects binary labels; use an indicator for class 1
y_test_bin = (y_test == 1).astype(int)

# Compute calibration curves (fraction of positives, mean predicted probability)
fraction_pos_raw, mean_pred_raw = calibration_curve(
    y_test_bin, y_proba_raw[:, 1], n_bins=10
)

fraction_pos_cal, mean_pred_cal = calibration_curve(
    y_test_bin, y_proba_calibrated[:, 1], n_bins=10
)

# Plot calibration curves
plt.figure(figsize=(10, 6))
plt.plot([0, 1], [0, 1], 'k--', label='Perfect calibration')
plt.plot(mean_pred_raw, fraction_pos_raw, 's-', label='Raw probabilities')
plt.plot(mean_pred_cal, fraction_pos_cal, 's-', label='Calibrated probabilities')
plt.xlabel('Mean predicted probability')
plt.ylabel('Fraction of positives')
plt.legend()
plt.title('Calibration Plot')
plt.show()

Using Calibration Metrics

from mapie.metrics.calibration import (
    expected_calibration_error,
    top_label_ece,
    kolmogorov_smirnov_statistic
)

# Expected Calibration Error (ECE) on class-1 probabilities
ece_raw = expected_calibration_error(y_test_bin, y_proba_raw[:, 1])
ece_calibrated = expected_calibration_error(y_test_bin, y_proba_calibrated[:, 1])

print(f"ECE before calibration: {ece_raw:.4f}")
print(f"ECE after calibration: {ece_calibrated:.4f}")

# Top-label ECE for multi-class
top_ece_raw = top_label_ece(y_test, y_proba_raw)
top_ece_calibrated = top_label_ece(y_test, y_proba_calibrated)

print(f"Top-label ECE before: {top_ece_raw:.4f}")
print(f"Top-label ECE after: {top_ece_calibrated:.4f}")

# Kolmogorov-Smirnov statistic (binary indicator again)
ks_stat_raw = kolmogorov_smirnov_statistic(y_test_bin, y_proba_raw[:, 1])
ks_stat_calibrated = kolmogorov_smirnov_statistic(y_test_bin, y_proba_calibrated[:, 1])
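
For intuition about what ECE measures, it can be computed by hand: bin predictions by confidence and average the per-bin gap between accuracy and confidence. A minimal sketch with equal-width bins (MAPIE's implementation may differ in binning details):

import numpy as np

def ece_by_hand(y_true_bin, y_score, n_bins=10):
    """Equal-width-bin ECE: weighted mean of |accuracy - confidence| per bin."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (y_score > lo) & (y_score <= hi)
        if in_bin.any():
            accuracy = y_true_bin[in_bin].mean()
            confidence = y_score[in_bin].mean()
            ece += in_bin.mean() * abs(accuracy - confidence)
    return ece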

Multi-Class Calibration Analysis

# Analyze per-class calibration
n_classes = len(calibrated_clf.classes_)

plt.figure(figsize=(15, 5))
for i in range(n_classes):
    plt.subplot(1, n_classes, i+1)

    # Binary indicator for class i
    y_binary = (y_test == calibrated_clf.classes_[i]).astype(int)

    # Calibration curve for class i
    fraction_pos, mean_pred = calibration_curve(
        y_binary, y_proba_calibrated[:, i], n_bins=10
    )

    plt.plot([0, 1], [0, 1], 'k--', alpha=0.5)
    plt.plot(mean_pred, fraction_pos, 's-')
    plt.xlabel(f'Mean predicted prob (Class {calibrated_clf.classes_[i]})')
    plt.ylabel('Fraction of positives')
    plt.title(f'Calibration - Class {calibrated_clf.classes_[i]}')

plt.tight_layout()
plt.show()

Sample Weight Support

from sklearn.utils.class_weight import compute_sample_weight

# Use sample weights during fitting
sample_weights = compute_sample_weight('balanced', y_train)

calibrated_clf = TopLabelCalibrator(
    estimator=RandomForestClassifier(),
    calibrator="sigmoid"
)

# Pass sample weights to fit
calibrated_clf.fit(
    X_train, y_train,
    sample_weight=sample_weights,
    calib_size=0.3
)

Best Practices

Choosing Calibration Method

  • Use sigmoid when:
    • Small calibration dataset (< 1000 samples)
    • Base classifier is Naive Bayes or SVM
    • Calibration curve appears sigmoid-shaped
  • Use isotonic when:
    • Larger calibration dataset (> 1000 samples)
    • Base classifier is tree-based (Random Forest, XGBoost)
    • Calibration curve has complex shape
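
When the choice is unclear, a quick empirical comparison usually settles it. A minimal sketch, assuming the train/test split, the binary indicator y_test_bin, and expected_calibration_error from the earlier examples:

# Compare both calibration methods on held-out data.
for method in ("sigmoid", "isotonic"):
    clf = TopLabelCalibrator(
        estimator=RandomForestClassifier(random_state=42),
        calibrator=method,
        cv="split",
    )
    clf.fit(X_train, y_train, calib_size=0.3, random_state=42)
    y_proba = clf.predict_proba(X_test)
    ece = expected_calibration_error(y_test_bin, y_proba[:, 1])
    print(f"{method}: ECE = {ece:.4f}")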

Calibration Set Size

# Rule of thumb: 20-40% for calibration
calib_sizes = [0.1, 0.2, 0.3, 0.4, 0.5]
ece_scores = []

for calib_size in calib_sizes:
    clf = TopLabelCalibrator(estimator=RandomForestClassifier(random_state=42))
    clf.fit(X_train, y_train, calib_size=calib_size, random_state=42)
    y_proba = clf.predict_proba(X_test)
    ece = expected_calibration_error(y_test_bin, y_proba[:, 1])
    ece_scores.append(ece)

# Find optimal calibration set size
optimal_size = calib_sizes[np.argmin(ece_scores)]
print(f"Optimal calibration size: {optimal_size}")

Cross-Validation for Calibration

from sklearn.model_selection import cross_val_score

# Evaluate calibration with cross-validation
def calibration_score(estimator, X, y):
    """Custom scorer: negative ECE on class-1 probabilities (higher is better)."""
    y_proba = estimator.predict_proba(X)
    y_bin = (np.asarray(y) == 1).astype(int)  # binary indicator for class 1
    return -expected_calibration_error(y_bin, y_proba[:, 1])

calibrated_clf = TopLabelCalibrator(
    estimator=RandomForestClassifier(),
    calibrator="isotonic"
)

scores = cross_val_score(
    calibrated_clf, X, y,
    cv=5,
    scoring=calibration_score
)

print(f"Average calibration score: {np.mean(scores):.4f} ± {np.std(scores):.4f}")

Install with Tessl CLI

npx tessl i tessl/pypi-mapie
