tessl/pypi-scores

Mathematical functions for verification, evaluation and optimization of forecasts, predictions or models


Continuous Scores

Metrics for evaluating single-valued continuous forecasts including error measures, bias calculations, efficiency indices, and correlation coefficients. All functions support flexible dimension handling, optional weighting, and angular data processing.

Capabilities

Mean Squared Error (MSE)

Calculates the mean squared error between forecast and observations, the most fundamental error metric in forecast verification.

def mse(
    fcst: FlexibleArrayType,
    obs: FlexibleArrayType,
    *,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[xr.DataArray] = None,
    is_angular: Optional[bool] = False,
) -> XarrayLike:
    """
    Calculate Mean Squared Error.

    Args:
        fcst: Forecast data
        obs: Observation data
        reduce_dims: Dimensions to reduce (collapse)
        preserve_dims: Dimensions to preserve (all others reduced)
        weights: Optional weights for weighted MSE calculation
        is_angular: Handle circular data (e.g. wind direction)

    Returns:
        Mean squared error values

    Formula:
        MSE = (1/n) * Σ(forecast_i - observed_i)²
    """

Usage Example:

from scores.continuous import mse
import xarray as xr

# Basic MSE calculation
mse_value = mse(forecast, observations)

# MSE with dimension reduction
temporal_mse = mse(forecast, observations, reduce_dims="time")

# Weighted MSE (e.g. area weighting)
weighted_mse = mse(forecast, observations, weights=area_weights)

# Angular MSE for directional data
wind_dir_mse = mse(forecast_dir, observed_dir, is_angular=True)

Root Mean Squared Error (RMSE)

Square root of MSE, providing error values in the same units as the original data.

def rmse(
    fcst: FlexibleArrayType,
    obs: FlexibleArrayType,
    *,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[xr.DataArray] = None,
    is_angular: Optional[bool] = False,
) -> FlexibleArrayType:
    """
    Calculate Root Mean Squared Error.

    Args:
        fcst: Forecast data
        obs: Observation data
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve
        weights: Optional weights
        is_angular: Handle circular data

    Returns:
        Root mean squared error values

    Formula:
        RMSE = √[(1/n) * Σ(forecast_i - observed_i)²]
    """

Mean Absolute Error (MAE)

Mean absolute deviation between forecast and observations, less sensitive to outliers than MSE.

def mae(
    fcst: FlexibleArrayType,
    obs: FlexibleArrayType,
    *,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[xr.DataArray] = None,
    is_angular: Optional[bool] = False,
) -> FlexibleArrayType:
    """
    Calculate Mean Absolute Error.

    Args:
        fcst: Forecast data
        obs: Observation data
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve
        weights: Optional weights
        is_angular: Handle circular data

    Returns:
        Mean absolute error values

    Formula:
        MAE = (1/n) * Σ|forecast_i - observed_i|
    """

Bias Metrics

Additive Bias (Mean Error)

Average difference between forecast and observations, indicating systematic over- or under-prediction.

def additive_bias(
    fcst: XarrayLike,
    obs: XarrayLike,
    *,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[XarrayLike] = None,
    is_angular: Optional[bool] = False,
) -> XarrayLike:
    """
    Calculate additive bias (mean error).

    Args:
        fcst: Forecast data
        obs: Observation data
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve
        weights: Optional weights
        is_angular: Handle circular data

    Returns:
        Additive bias values

    Formula:
        Bias = (1/n) * Σ(forecast_i - observed_i)
    """

# Alias for additive_bias
def mean_error(
    fcst: XarrayLike,
    obs: XarrayLike,
    *,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[XarrayLike] = None,
    is_angular: Optional[bool] = False,
) -> XarrayLike:
    """Alias for additive_bias."""

Multiplicative Bias

Ratio-based bias measure useful for positive-valued variables.

def multiplicative_bias(
    fcst: XarrayLike,
    obs: XarrayLike,
    *,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[XarrayLike] = None,
) -> XarrayLike:
    """
    Calculate multiplicative bias.

    Args:
        fcst: Forecast data (positive values)
        obs: Observation data (positive values)
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve
        weights: Optional weights

    Returns:
        Multiplicative bias values

    Formula:
        Multiplicative Bias = mean(forecast) / mean(observation)

    Notes:
        - Values > 1 indicate forecast over-prediction
        - Values < 1 indicate forecast under-prediction
        - Perfect forecast has multiplicative bias = 1
    """

Percent Bias (PBIAS)

Percentage-based bias calculation commonly used in hydrology.

def pbias(
    fcst: XarrayLike,
    obs: XarrayLike,
    *,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[XarrayLike] = None,
) -> XarrayLike:
    """
    Calculate percent bias.

    Args:
        fcst: Forecast data
        obs: Observation data
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve
        weights: Optional weights

    Returns:
        Percent bias values

    Formula:
        PBIAS = 100 * Σ(forecast_i - observed_i) / Σ(observed_i)

    Notes:
        - Positive values indicate model overestimation bias
        - Negative values indicate model underestimation bias
        - Perfect forecast has PBIAS = 0
    """

Efficiency Indices

Kling-Gupta Efficiency (KGE)

Decomposed performance metric combining correlation, variability ratio, and bias ratio.

def kge(
    fcst: xr.DataArray,
    obs: xr.DataArray,
    *,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    scaling_factors: Optional[Union[list[float], np.ndarray]] = None,
    include_components: Optional[bool] = False,
) -> XarrayLike:
    """
    Calculate Kling-Gupta Efficiency.

    Args:
        fcst: Forecast data
        obs: Observation data
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve
        scaling_factors: Component scaling factors [ρ, α, β]
        include_components: Return individual components

    Returns:
        KGE efficiency values (and components if requested)

    Formula:
        KGE = 1 - √[(s_ρ*(ρ-1))² + (s_α*(α-1))² + (s_β*(β-1))²]

    Components:
        - ρ: Correlation coefficient
        - α: Variability ratio (σ_forecast/σ_observation)
        - β: Bias ratio (μ_forecast/μ_observation)
        - s_ρ, s_α, s_β: Scaling factors (default: [1,1,1])

    Notes:
        - Perfect forecast has KGE = 1
        - Using the observation mean as the forecast gives KGE = 1 - √2 ≈ -0.41
    """

Nash-Sutcliffe Efficiency (NSE)

Normalized measure of model performance relative to observation mean.

def nse(
    fcst: XarrayLike,
    obs: XarrayLike,
    *,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[XarrayLike] = None,
    is_angular: Optional[bool] = False,
) -> XarrayLike:
    """
    Calculate Nash-Sutcliffe Efficiency.

    Args:
        fcst: Forecast data
        obs: Observation data
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve
        weights: Optional weights
        is_angular: Handle circular data

    Returns:
        Nash-Sutcliffe efficiency values

    Formula:
        NSE = 1 - Σ(observed_i - forecast_i)² / Σ(observed_i - observed_mean)²

    Notes:
        - Perfect forecast has NSE = 1
        - Forecast as good as observation mean has NSE = 0
        - NSE can be negative (worse than observation mean)
        - Range: (-∞, 1]
    """

Correlation Coefficients

Pearson Correlation

Linear correlation coefficient measuring strength of linear relationship.

def pearsonr(
    fcst: XarrayLike,
    obs: XarrayLike,
    *,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
) -> XarrayLike:
    """
    Calculate Pearson correlation coefficient.

    Args:
        fcst: Forecast data
        obs: Observation data
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve

    Returns:
        Pearson correlation coefficients

    Notes:
        - Range: [-1, 1]
        - Measures linear association strength
        - Perfect positive correlation: r = 1
        - Perfect negative correlation: r = -1
        - No linear relationship: r = 0
        - This function doesn't support weights
    """

Spearman Rank Correlation

Non-parametric correlation based on rank ordering.

def spearmanr(
    fcst: XarrayLike,
    obs: XarrayLike,
    *,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
) -> XarrayLike:
    """
    Calculate Spearman rank correlation coefficient.

    Args:
        fcst: Forecast data
        obs: Observation data
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve

    Returns:
        Spearman rank correlation coefficients

    Notes:
        - Range: [-1, 1]
        - Measures monotonic association strength
        - Less sensitive to outliers than Pearson
        - This function doesn't support weights
        - Based on rank ordering rather than actual values
    """

Advanced Scoring Functions

Quantile Score

Asymmetric loss function for evaluating quantile forecasts.

def quantile_score(
    fcst: FlexibleArrayType,
    obs: FlexibleArrayType,
    alpha: float,
    *,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
    weights: Optional[xr.DataArray] = None,
) -> FlexibleArrayType:
    """
    Calculate quantile score for quantile forecasts.

    Args:
        fcst: Quantile forecast values
        obs: Observation values
        alpha: Quantile level (e.g., 0.1 for 10th percentile)
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve
        weights: Optional weights

    Returns:
        Quantile scores

    Formula:
        QS = Σ[(α - I(obs_i < fcst_i)) * (obs_i - fcst_i)]

    Notes:
        - Proper scoring rule for quantile forecasts
        - Alpha should match the quantile level being forecast
        - Lower scores indicate better performance
    """

Flip-Flop Index

Measures forecast consistency by detecting oscillatory behavior.

def flip_flop_index(
    data: xr.DataArray,
    sampling_dim: str,
    *,
    is_angular: bool = False,
) -> xr.DataArray:
    """
    Calculate the Flip-Flop Index for forecast consistency.

    Args:
        data: Forecast data (e.g. successive forecasts for the same valid time)
        sampling_dim: Dimension along which oscillation is measured
        is_angular: Handle circular data

    Returns:
        Flip-flop index values

    Notes:
        - Computed from the forecast sequence alone; no observations are required
        - Higher values indicate more inconsistent (oscillating) forecasts
        - Requires a sequence of forecasts, e.g. across lead times
        - Used to detect forecast instability
    """

def flip_flop_index_proportion_exceeding(
    data: xr.DataArray,
    sampling_dim: str,
    thresholds: Iterable,
    *,
    is_angular: bool = False,
    reduce_dims: Optional[FlexibleDimensionTypes] = None,
    preserve_dims: Optional[FlexibleDimensionTypes] = None,
) -> xr.DataArray:
    """
    Calculate the proportion of flip-flop indices exceeding given thresholds.

    Args:
        data: Forecast data
        sampling_dim: Dimension along which oscillation is measured
        thresholds: Threshold value(s) for comparison
        is_angular: Handle circular data
        reduce_dims: Dimensions to reduce
        preserve_dims: Dimensions to preserve

    Returns:
        Proportion values [0, 1]
    """

Usage Patterns

Standard Workflow

from scores.continuous import mse, rmse, mae, kge, pearsonr
from scores.sample_data import continuous_forecast, continuous_observations

# Load data
forecast = continuous_forecast()
observations = continuous_observations()

# Calculate multiple metrics
scores_dict = {
    'MSE': mse(forecast, observations),
    'RMSE': rmse(forecast, observations),
    'MAE': mae(forecast, observations),
    'KGE': kge(forecast, observations),
    'Correlation': pearsonr(forecast, observations)
}

# Display results
for metric, value in scores_dict.items():
    print(f"{metric}: {float(value):.4f}")

Dimension-aware Scoring

# Multi-dimensional data (time, lat, lon)
forecast_3d = forecast.expand_dims({"lat": 10, "lon": 15})
observations_3d = observations.expand_dims({"lat": 10, "lon": 15})

# Compute temporal scores at each grid point
spatial_temporal_mse = mse(forecast_3d, observations_3d, reduce_dims="time")

# Compute spatial scores for each time step
temporal_spatial_mse = mse(forecast_3d, observations_3d, reduce_dims=["lat", "lon"])

# Overall score (all dimensions)
overall_mse = mse(forecast_3d, observations_3d)

Weighted Scoring

from scores.functions import create_latitude_weights
import numpy as np
import xarray as xr

# Build the latitude values as a DataArray so the resulting weights
# align on the "lat" dimension of the forecast
lats = xr.DataArray(np.linspace(-90, 90, forecast_3d.sizes["lat"]), dims="lat")
area_weights = create_latitude_weights(lats)

# Calculate area-weighted scores
weighted_mse = mse(forecast_3d, observations_3d,
                   reduce_dims=["time", "lat", "lon"],
                   weights=area_weights)

Install with Tessl CLI

npx tessl i tessl/pypi-scores
