tessl/pypi-pymc3

Probabilistic Programming in Python: Bayesian Modeling and Probabilistic Machine Learning with Theano

Agent success rate with this tile: 68%

Baseline (agent success rate without this tile): 72%

Improvement (success rate with this tile relative to baseline): 0.94x

Prior Predictive and Posterior Forecasting for Daily Counts

Name: tessl/pypi-pymc3
Author: tessl

Design a module that models daily count data with a single latent rate and exposes helpers to compare prior and posterior predictive simulations for both observed and held-out days.

Capabilities

Build model for count data

Creates a probabilistic model with one latent log-rate parameter (log_rate) and a positive rate derived from it, binds the observed counts (observed_counts) to a named day dimension whose length matches the provided sequence, and uses a count-family likelihood. @test
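
One way to write this model down as a generative story is sketched below. The Normal prior, the exponential transform, and the Poisson likelihood are assumptions made for illustration; the description above fixes only a latent log-rate, a positive rate transform, and a count-family likelihood. Here D is the length of the provided count sequence, and the hyperparameters are left symbolic.

```latex
\begin{aligned}
\texttt{log\_rate} &\sim \mathcal{N}(\mu_0, \sigma_0^2) && \text{(prior family is an assumption)} \\
\lambda &= \exp(\texttt{log\_rate}) && \text{(positive rate transform)} \\
y_d \mid \lambda &\sim \operatorname{Poisson}(\lambda), \quad d = 1, \dots, D && \text{(one observation per day, shared rate)}
\end{aligned}
```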

Prior predictive simulation

With a fixed random seed and 200 draws, prior predictive simulation returns draws for both log_rate and observed_counts, each shaped (200, num_days), reproducible when reseeded, and keyed exactly by those variable names. @test
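
The shape and reproducibility requirements can be illustrated without the modeling library. The following is a minimal NumPy sketch, not the library's prior predictive routine: the function name `simulate_prior_sketch`, the Normal(0, 1) prior, and the Poisson likelihood are all assumptions. The single latent draw is repeated along the day axis only so that both arrays have the (draws, num_days) shape stated above.

```python
import numpy as np


def simulate_prior_sketch(num_days, draws=200, seed=None):
    """Hand-rolled prior predictive draws (hypothetical helper, NumPy only)."""
    rng = np.random.default_rng(seed)
    # Assumed prior: log_rate ~ Normal(0, 1), one value per draw.
    log_rate = rng.normal(loc=0.0, scale=1.0, size=draws)
    # Positive rate transform.
    rate = np.exp(log_rate)
    # Assumed Poisson likelihood; every day in a draw shares that draw's rate.
    counts = rng.poisson(rate[:, None], size=(draws, num_days))
    return {
        # Repeat the scalar latent along the day axis to match (draws, num_days).
        "log_rate": np.repeat(log_rate[:, None], num_days, axis=1),
        "observed_counts": counts,
    }
```

Calling the function twice with the same seed yields identical arrays, which is the property the seeded test relies on.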

Posterior predictive with held-out days

After fitting to a 7-day observed series, posterior predictive simulation can generate draws for both the original days and two held-out future days, returning arrays keyed as observed_counts with shape (posterior_draws, 7) and forecast_counts with shape (posterior_draws, 2) respectively. @test
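
To make the output contract concrete, here is a rough NumPy sketch of in-sample and held-out simulation. It deliberately swaps in a conjugate Gamma-Poisson update for the posterior over the rate instead of running the library's sampler on log_rate, so it illustrates only the keys and shapes, not the fitting method. The name `simulate_posterior_sketch` and the Gamma(1, 1) prior are assumptions.

```python
import numpy as np


def simulate_posterior_sketch(counts, new_days=None, draws=1000, seed=None,
                              prior_shape=1.0, prior_rate=1.0):
    """Posterior predictive draws via a conjugate stand-in (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts)
    # Conjugate update: Gamma(a, b) prior with Poisson data gives
    # Gamma(a + sum(counts), b + n) for the rate.
    post_shape = prior_shape + counts.sum()
    post_rate = prior_rate + counts.size
    rate = rng.gamma(shape=post_shape, scale=1.0 / post_rate, size=draws)
    # In-sample predictions: one column per observed day, shared rate per draw.
    out = {"observed_counts": rng.poisson(rate[:, None], size=(draws, counts.size))}
    if new_days:
        # Held-out days reuse the same posterior rate draws.
        out["forecast_counts"] = rng.poisson(rate[:, None], size=(draws, new_days))
    return out
```

Because the forecast reuses the same rate draws as the in-sample predictions, the two arrays share their first dimension, matching the (posterior_draws, 7) and (posterior_draws, 2) shapes described above.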

Workflow comparison summary

A helper computes the median of the prior predictive counts and of the posterior predictive counts for the observed days and returns both as a (prior_median, posterior_median) tuple. For the dataset [3, 4, 0, 2, 1, 5, 3] with fixed seeds, the posterior median exceeds the prior median. @test
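
A possible shape for such a helper is sketched below, assuming the draws arrive as mappings from variable name to array and that the median is taken over all draws and days at once. The name `compare_medians_sketch` and the `var` parameter are hypothetical.

```python
import numpy as np


def compare_medians_sketch(prior_draws, posterior_draws, var="observed_counts"):
    """Return (prior_median, posterior_median) for one variable (hypothetical helper)."""
    # Flattened median over every draw and every observed day.
    prior_median = float(np.median(prior_draws[var]))
    posterior_median = float(np.median(posterior_draws[var]))
    return prior_median, posterior_median
```

Taking a per-day median first and then summarizing would be an equally defensible reading; the text above does not say which is intended.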

Implementation

@generates

API

from __future__ import annotations  # lets `int | None` annotations parse on Python < 3.10

from typing import Any, Sequence

def build_hit_model(counts: Sequence[int]) -> Any:
    """
    Create a probabilistic model for daily count data with a single latent rate parameter
    and observed counts bound to a named dimension. Latent and observed variables should be
    named `log_rate` and `observed_counts`.
    """

def simulate_prior(model: Any, draws: int = 500, seed: int | None = None) -> dict[str, Any]:
    """
    Run prior predictive simulation for `log_rate` and `observed_counts`.
    Returns a mapping keyed by those variable names; sampled arrays must include the observation dimension.
    """

def fit_posterior(model: Any, draws: int = 1000, tune: int = 1000, seed: int | None = None) -> Any:
    """
    Run posterior inference for the model and return the trace/result object required by the library's posterior predictive routine.
    """

def simulate_posterior(model: Any, posterior: Any, new_days: int | None = None, seed: int | None = None) -> dict[str, Any]:
    """
    Run posterior predictive simulation using the provided posterior samples, optionally forecasting `new_days`
    additional observations with the same likelihood and a shared rate parameter. Outputs must be keyed as
    `observed_counts` for in-sample predictions and `forecast_counts` when forecasting.
    """

def compare_medians(prior_draws: dict[str, Any], posterior_draws: dict[str, Any]) -> tuple[float, float]:
    """
    Return (prior_median, posterior_median) for the observation variable based on provided draws.
    """

Dependencies { .dependencies }

pymc { .dependency }

Probabilistic programming library used for model construction, inference, and predictive simulation.