Workspace: tessl
Visibility: Public
Describes: pkg:pypi/spreg@1.8.x

tessl/pypi-spreg

tessl install tessl/pypi-spreg@1.8.0

Spatial econometric regression models for analyzing geographically-related data interactions.

Agent Success

  • Agent success rate when using this tile: 87%
  • Baseline (agent success rate without this tile): 92%
  • Improvement (success rate with this tile compared to baseline): 0.95x

evals/scenario-8/task.md

Model Comparison Tool

Build a model comparison tool that fits multiple regression models to housing price data and identifies the best model using information criteria.

Background

You have been provided with a dataset containing housing prices and various features (square footage, number of bedrooms, age of house, distance to city center). Your task is to compare different model specifications and select the best one using statistical information criteria.
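
For reference, the two criteria that appear in the example output further down are conventionally defined from a model's maximized log-likelihood, its number of estimated parameters k (intercept included), and the sample size n. The sketch below states those textbook definitions; the function names `aic` and `bic` are illustrative and not part of the required API.

```python
import math


def aic(log_likelihood: float, k: int) -> float:
    """Akaike information criterion: -2 ln L + 2k."""
    return -2.0 * log_likelihood + 2.0 * k


def bic(log_likelihood: float, k: int, n: int) -> float:
    """Bayesian (Schwarz) information criterion: -2 ln L + k ln n."""
    return -2.0 * log_likelihood + k * math.log(n)
```

Both criteria penalize extra parameters. The BIC penalty per parameter is ln n, which exceeds the AIC penalty of 2 once n is 8 or more, so BIC tends to favor smaller models on realistic sample sizes.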

Requirements

Create a Python script that:

  1. Loads housing data from a CSV file named housing_data.csv with columns: price, sqft, bedrooms, age, distance
  2. Fits three different regression models:
    • Model 1: price as a function of sqft only
    • Model 2: price as a function of sqft and bedrooms
    • Model 3: price as a function of sqft, bedrooms, age, and distance
  3. Extracts and displays the information criteria values (AIC and BIC) for each model
  4. Determines which model is best according to each criterion (lower values are better)
  5. Outputs a summary table showing model specifications and their criterion values
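
To make steps 2 and 3 concrete for the simplest specification, the following dependency-free sketch fits Model 1 (price on sqft with an intercept) by closed-form least squares and derives AIC and BIC from the Gaussian log-likelihood. It is a stand-in for illustration, not the spreg estimator: the function name `simple_ols_criteria` is hypothetical, the data rows are made up, and whether a given library reports numerically identical values depends on its log-likelihood convention.

```python
import math


def simple_ols_criteria(y: list[float], x: list[float]) -> dict[str, float]:
    """Fit y = a + b*x by least squares and return its AIC and BIC."""
    n = len(y)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    # Residual sum of squares and the maximized Gaussian log-likelihood.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    log_lik = -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)

    k = 2  # estimated coefficients: intercept and slope
    return {
        "aic": -2 * log_lik + 2 * k,
        "bic": -2 * log_lik + k * math.log(n),
    }


# Illustrative rows only; these values are invented for the example.
prices = [250000.0, 320000.0, 275000.0, 410000.0, 198000.0]
sqft = [1500.0, 1800.0, 1600.0, 2400.0, 1200.0]
print(simple_ols_criteria(prices, sqft))
```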

Input

The input file housing_data.csv will have the following structure:

price,sqft,bedrooms,age,distance
250000,1500,3,10,5.2
320000,1800,4,5,3.1
...
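
One way to read this layout and group the columns per model is sketched below using only the standard library `csv` module; `pandas.read_csv` would serve equally well. The shape of `X_dict` (model name mapped to a list of feature rows) and the `MODEL_FEATURES` constant are assumptions, since the specification leaves them open.

```python
import csv

# Assumed mapping from model name to the feature columns it uses.
MODEL_FEATURES = {
    "Model 1": ["sqft"],
    "Model 2": ["sqft", "bedrooms"],
    "Model 3": ["sqft", "bedrooms", "age", "distance"],
}


def load_data(filename: str) -> tuple:
    """Return (y, X_dict): prices and per-model feature rows as floats."""
    with open(filename, newline="") as handle:
        rows = list(csv.DictReader(handle))

    y = [float(row["price"]) for row in rows]
    X_dict = {
        name: [[float(row[col]) for col in cols] for row in rows]
        for name, cols in MODEL_FEATURES.items()
    }
    return y, X_dict
```

Array-based estimators typically expect NumPy arrays, so these lists would be converted (for example with `numpy.asarray`) before fitting.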

Output

Your program should output:

  1. Information criteria values for each model
  2. A clear indication of which model is selected by each criterion

Example output format:

Model Comparison Results
========================

Model 1: price ~ sqft
  AIC: 2451.23
  BIC: 2458.67

Model 2: price ~ sqft + bedrooms
  AIC: 2398.45
  BIC: 2409.34

Model 3: price ~ sqft + bedrooms + age + distance
  AIC: 2405.78
  BIC: 2423.12

Best Model by AIC: Model 2
Best Model by BIC: Model 2
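
The layout above can be produced with plain string formatting. In the sketch below, the `MODEL_SPECS` label mapping and the helper name `format_results` are illustrative assumptions; the required `display_results` function could simply print the returned string.

```python
MODEL_SPECS = {
    "Model 1": "price ~ sqft",
    "Model 2": "price ~ sqft + bedrooms",
    "Model 3": "price ~ sqft + bedrooms + age + distance",
}


def format_results(criteria: dict, best_models: dict) -> str:
    """Render criteria and selections in the example output layout."""
    title = "Model Comparison Results"
    lines = [title, "=" * len(title), ""]
    for name, values in criteria.items():
        lines.append(f"{name}: {MODEL_SPECS.get(name, '')}")
        for criterion, value in values.items():
            lines.append(f"  {criterion.upper()}: {value:.2f}")
        lines.append("")  # blank line between model blocks
    for criterion, name in best_models.items():
        lines.append(f"Best Model by {criterion.upper()}: {name}")
    return "\n".join(lines)


criteria = {
    "Model 1": {"aic": 2451.23, "bic": 2458.67},
    "Model 2": {"aic": 2398.45, "bic": 2409.34},
    "Model 3": {"aic": 2405.78, "bic": 2423.12},
}
print(format_results(criteria, {"aic": "Model 2", "bic": "Model 2"}))
```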

Implementation

@generates

Tests

  • Given a sample dataset with 100 observations, Model 1 should produce finite, numeric information criteria values @test
  • The program correctly identifies the model with the lowest criterion value as the best model @test
  • When models fit the data about equally well, the criteria should prefer the simpler model @test
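
The selection test reduces to taking a minimum over the criteria dictionary. A minimal sketch of that selection follows, using the nested dictionary layout shown in the `extract_criteria` docstring; breaking ties in favor of the first model listed is an assumption, as the specification does not address ties.

```python
def select_best_model(criteria: dict, criterion_name: str) -> str:
    """Return the name of the model with the lowest value of the criterion."""
    # min() returns the first minimal key, so ties go to the earliest entry.
    return min(criteria, key=lambda name: criteria[name][criterion_name])


criteria = {
    "Model 1": {"aic": 2451.23, "bic": 2458.67},
    "Model 2": {"aic": 2398.45, "bic": 2409.34},
    "Model 3": {"aic": 2405.78, "bic": 2423.12},
}
best = {name: select_best_model(criteria, name) for name in ("aic", "bic")}
print(best)  # {'aic': 'Model 2', 'bic': 'Model 2'}
```

If models are inserted from simplest to most complex, this tie rule also resolves exact ties toward the simpler specification.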

API

def load_data(filename: str) -> tuple:
    """
    Load housing data from CSV file.

    Args:
        filename: Path to CSV file

    Returns:
        Tuple of (y, X_dict) where y is the dependent variable and
        X_dict contains different feature combinations
    """
    pass

def fit_models(y, X_dict) -> dict:
    """
    Fit multiple regression models with different specifications.

    Args:
        y: Dependent variable array
        X_dict: Dictionary mapping model names to feature matrices

    Returns:
        Dictionary mapping model names to fitted model objects
    """
    pass

def extract_criteria(models: dict) -> dict:
    """
    Extract information criteria from fitted models.

    Args:
        models: Dictionary of fitted model objects

    Returns:
        Dictionary with model names as keys and dict of criteria as values
        Example: {'Model 1': {'aic': 2451.23, 'bic': 2458.67}, ...}
    """
    pass

def select_best_model(criteria: dict, criterion_name: str) -> str:
    """
    Select the best model based on a specific criterion.

    Args:
        criteria: Dictionary of criteria values per model
        criterion_name: Name of criterion to use ('aic' or 'bic')

    Returns:
        Name of the best model
    """
    pass

def display_results(criteria: dict, best_models: dict) -> None:
    """
    Display model comparison results.

    Args:
        criteria: Dictionary of criteria values per model
        best_models: Dictionary mapping criterion names to best model names
    """
    pass
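
As one possible reading of `extract_criteria`, the sketch below assumes each fitted model object exposes its Akaike and Schwarz (BIC) values as `aic` and `schwarz` attributes. That naming is believed to match spreg's OLS results but should be verified against the installed version. `SimpleNamespace` objects stand in for fitted models so the example runs without spreg.

```python
from types import SimpleNamespace


def extract_criteria(models: dict) -> dict:
    """Map each model name to its {'aic': ..., 'bic': ...} values."""
    return {
        # Assumed attribute names: `aic` and `schwarz` on the fitted object.
        name: {"aic": float(model.aic), "bic": float(model.schwarz)}
        for name, model in models.items()
    }


# Stand-ins for fitted models; real objects would come from fit_models().
models = {
    "Model 1": SimpleNamespace(aic=2451.23, schwarz=2458.67),
    "Model 2": SimpleNamespace(aic=2398.45, schwarz=2409.34),
}
print(extract_criteria(models))
```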

Dependencies { .dependencies }

spreg { .dependency }

Provides spatial econometric regression capabilities and model fit statistics.

pandas { .dependency }

Provides data loading and manipulation support.

numpy { .dependency }

Provides numerical array operations.