
tessl/pypi-pysd

System Dynamics modeling library for Python that integrates with data science tools


External Data Integration

PySD's external data system lets models read time series, lookup tables, constants, and subscripts from external files. Supported formats include Excel, CSV, and netCDF, with file caching and encoding handling applied automatically.

Capabilities

Base External Data Class

Foundation class for all external data components with common functionality for file handling and data management.

class External:
    """
    Base class for external data objects.
    
    Provides common functionality for loading, caching, and accessing
    external data sources. Handles file path resolution, encoding detection,
    and error management.
    
    Methods:
    - __init__(file_name, root, sheet=None, time_row_or_col=None, cell=None)
    - initialize() - Load and prepare external data
    - __call__(time) - Get data value at specified time
    """

Time Series Data

Handle time-varying data from external files with interpolation and extrapolation capabilities.

class ExtData(External):
    """
    Time series data from external files.
    
    Loads time series data from CSV, Excel, or other supported formats.
    Supports interpolation, extrapolation, and missing value handling.
    
    Parameters:
    - file_name: str - Path to data file
    - root: str - Root directory for relative paths
    - sheet: str or int or None - Excel sheet name/index
    - time_row_or_col: str or int - Time column/row identifier
    - cell: str or tuple - Specific cell range for data
    - interp: str - Interpolation method ('linear', 'nearest', 'cubic')
    - py_name: str - Python variable name
    
    Methods:
    - __call__(time) - Get interpolated value at specified time
    - get_series_data() - Get original pandas Series
    """

Usage Examples

from pysd.py_backend.external import ExtData

# Load time series from CSV
population_data = ExtData(
    file_name='demographics.csv',
    root='/data',
    time_row_or_col='year',
    py_name='historical_population'
)

# Load from Excel with specific sheet
economic_data = ExtData(
    file_name='economic_indicators.xlsx', 
    root='/data',
    sheet='GDP_Data',
    time_row_or_col='time',
    interp='linear'
)

# Access data during simulation
pop_at_time_15 = population_data(15.0)
gdp_at_time_20 = economic_data(20.0)

# Get original data series
original_pop_data = population_data.get_series_data()
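
To make the interpolation and extrapolation behaviour concrete, here is a minimal standard-library sketch of linear interpolation over a time series. It is not PySD's implementation: the function name `interpolate_series` is hypothetical, and holding the first and last values outside the data range is an assumption chosen for the illustration.

```python
from bisect import bisect_right


def interpolate_series(times, values, t):
    """Linearly interpolate `values` over sorted `times`.

    Outside the data range the nearest end value is held
    (an assumption for this sketch, not confirmed PySD behaviour).
    """
    if not times or len(times) != len(values):
        raise ValueError("times and values must be non-empty and equal in length")
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    hi = bisect_right(times, t)  # index of the first time strictly greater than t
    lo = hi - 1
    frac = (t - times[lo]) / (times[hi] - times[lo])
    return values[lo] + frac * (values[hi] - values[lo])


years = [0, 10, 20]
population = [1000.0, 1500.0, 1800.0]

print(interpolate_series(years, population, 15.0))  # halfway between 1500 and 1800
print(interpolate_series(years, population, 25.0))  # beyond the data: last value held
```

A call such as `population_data(15.0)` in the example above can be pictured as doing something similar against the series loaded from the file.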

Lookup Tables

Access lookup tables and reference data from external files with support for multi-dimensional lookups.

class ExtLookup(External):
    """
    Lookup tables from external files.
    
    Loads lookup tables for interpolation-based relationships between variables.
    Supports 1D and multi-dimensional lookups with various interpolation methods.
    
    Parameters:
    - file_name: str - Path to lookup file
    - root: str - Root directory
    - sheet: str or int or None - Excel sheet
    - x_row_or_col: str or int - X-axis data column/row
    - cell: str or tuple - Data cell range
    - interp: str - Interpolation method
    - py_name: str - Variable name
    
    Methods:
    - __call__(x_value) - Get interpolated lookup value
    - get_series_data() - Get original lookup table
    """

Usage Examples

from pysd.py_backend.external import ExtLookup

# Load price-demand lookup table
price_lookup = ExtLookup(
    file_name='market_data.xlsx',
    root='/data', 
    sheet='price_elasticity',
    x_row_or_col='price',
    py_name='demand_lookup'
)

# Load multi-dimensional efficiency table
efficiency_lookup = ExtLookup(
    file_name='efficiency_curves.csv',
    root='/data',
    x_row_or_col='temperature',
    interp='cubic'
)

# Use during simulation
demand_for_price_50 = price_lookup(50.0)
efficiency_at_temp_25 = efficiency_lookup(25.0)
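
One way to picture a lookup with several dependent columns is a table that shares one x axis and returns a value per column. The sketch below uses only the standard library; `TableLookup` and the column names are hypothetical and are not part of PySD.

```python
from bisect import bisect_right


class TableLookup:
    """Hypothetical lookup table: one x column, several y columns."""

    def __init__(self, x, columns):
        if list(x) != sorted(x):
            raise ValueError("x values must be sorted in increasing order")
        if any(len(y) != len(x) for y in columns.values()):
            raise ValueError("every column needs one value per x point")
        self.x = list(x)
        self.columns = {name: list(y) for name, y in columns.items()}

    def _interp(self, y, x_value):
        # Hold the end values outside the table (assumption for this sketch)
        if x_value <= self.x[0]:
            return y[0]
        if x_value >= self.x[-1]:
            return y[-1]
        hi = bisect_right(self.x, x_value)
        lo = hi - 1
        frac = (x_value - self.x[lo]) / (self.x[hi] - self.x[lo])
        return y[lo] + frac * (y[hi] - y[lo])

    def __call__(self, x_value):
        # One interpolated value per column, keyed by column name
        return {name: self._interp(y, x_value) for name, y in self.columns.items()}


efficiency = TableLookup(
    x=[0, 20, 40],  # temperature
    columns={"pump_a": [50.0, 80.0, 70.0], "pump_b": [40.0, 60.0, 100.0]},
)
print(efficiency(30))
```

Making the object callable mirrors the `price_lookup(50.0)` usage shown above, so a lookup can be dropped in wherever a function of x is expected.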

External Constants

Load constant values from external files for model parameterization.

class ExtConstant(External):
    """
    Constants from external files.
    
    Loads scalar constant values from external data sources.
    Useful for model parameterization and configuration management.
    
    Parameters:
    - file_name: str - Path to constants file
    - root: str - Root directory
    - sheet: str or int or None - Excel sheet
    - cell: str or tuple - Specific cell containing constant
    - py_name: str - Variable name
    
    Methods:
    - __call__() - Get constant value
    - get_constant_value() - Get the stored constant
    """

Usage Examples

from pysd.py_backend.external import ExtConstant

# Load model parameters from configuration file
birth_rate_constant = ExtConstant(
    file_name='model_config.xlsx',
    root='/config',
    sheet='parameters',
    cell='B5',  # Specific cell
    py_name='base_birth_rate'
)

# Load from CSV
area_constant = ExtConstant(
    file_name='geographic_data.csv',
    root='/data',
    cell='total_area',
    py_name='country_area'
)

# Access constant values
birth_rate = birth_rate_constant()
total_area = area_constant()

External Subscripts

Load subscript definitions and ranges from external files for multi-dimensional variables.

class ExtSubscript(External):
    """
    Subscripts from external files.
    
    Loads subscript definitions (dimension ranges) from external sources.
    Enables dynamic model structure based on external configuration.
    
    Parameters:
    - file_name: str - Path to subscript definition file
    - root: str - Root directory
    - sheet: str or int or None - Excel sheet
    - py_name: str - Subscript name
    
    Methods:
    - __call__() - Get subscript range/definition
    - get_subscript_elements() - Get list of subscript elements
    """

Usage Examples

from pysd.py_backend.external import ExtSubscript

# Load region definitions
regions_subscript = ExtSubscript(
    file_name='geographic_structure.xlsx',
    root='/config',
    sheet='regions',
    py_name='model_regions'
)

# Load age group definitions
age_groups_subscript = ExtSubscript(
    file_name='demographic_structure.csv',
    root='/config', 
    py_name='age_categories'
)

# Get subscript elements
available_regions = regions_subscript.get_subscript_elements()
age_categories = age_groups_subscript.get_subscript_elements()

Excel File Caching

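Excel-specific pieces: the ExtSubscript constructor as used for Vensim's GET XLS SUBSCRIPT and GET DIRECT SUBSCRIPT functions, and the Excels utility class, which caches loaded Excel files so that several external objects can share them.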

class ExtSubscript(External):
    """
    External subscript data from Excel files implementing Vensim's GET XLS SUBSCRIPT and GET DIRECT SUBSCRIPT functions.
    
    Loads subscript values from Excel files to define model dimensions and array indices.
    Supports cell ranges and named ranges with optional prefix for subscript names.
    
    Methods:
    - __init__(file_name, tab, firstcell, lastcell, prefix, root) - Initialize subscript data source
    - get_subscripts_cell(col, row, lastcell) - Extract subscripts from cell range
    - get_subscripts_name(name) - Extract subscripts from named range
    """

class Excels:
    """
    Excel file caching utility.
    
    Manages Excel file loading and caching for efficient access to multiple
    sheets and ranges within the same file. Prevents repeated file loading.
    
    Methods:
    - __init__() - Initialize cache
    - get_sheet(file_path, sheet_name) - Get cached Excel sheet
    - clear_cache() - Clear all cached Excel data
    - get_file_info(file_path) - Get file metadata
    """

Usage Examples

from pysd.py_backend.external import Excels, ExtData

# Create Excel cache manager
excel_cache = Excels()

# Multiple ExtData objects that read the same Excel file benefit from caching.
# The `...` stands for the remaining constructor arguments, omitted here.
data1 = ExtData('large_dataset.xlsx', sheet='Sheet1', ...)
data2 = ExtData('large_dataset.xlsx', sheet='Sheet2', ...)
data3 = ExtData('large_dataset.xlsx', sheet='Sheet3', ...)

# The file is loaded only once and cached for reuse.
# Clear the cache when memory needs to be released.
excel_cache.clear_cache()
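
The caching idea itself can be sketched without Excel at all. In the illustration below, `SheetCache` and `fake_loader` are hypothetical names and do not reflect how the Excels class is written; the point is only that a class-level dictionary keyed by file and sheet avoids repeated loads.

```python
class SheetCache:
    """Hypothetical cache keyed by (file path, sheet name)."""

    _sheets = {}    # shared by every user of the class
    load_count = 0  # counts real loads, for demonstration only

    @classmethod
    def get(cls, file_path, sheet_name, loader):
        key = (file_path, sheet_name)
        if key not in cls._sheets:
            # Only the first request for a given sheet reaches the loader
            cls._sheets[key] = loader(file_path, sheet_name)
            cls.load_count += 1
        return cls._sheets[key]

    @classmethod
    def clear(cls):
        cls._sheets.clear()


def fake_loader(file_path, sheet_name):
    # Stands in for an expensive read of a real workbook
    return [[f"{sheet_name}!A1", f"{sheet_name}!B1"]]


first = SheetCache.get("large_dataset.xlsx", "Sheet1", fake_loader)
second = SheetCache.get("large_dataset.xlsx", "Sheet1", fake_loader)  # served from cache
other = SheetCache.get("large_dataset.xlsx", "Sheet2", fake_loader)

print(SheetCache.load_count)
```

Keeping the cache on the class rather than on an instance is what lets independent data objects share it without being handed a cache explicitly.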

Data File Format Support

The external data system reads the following file formats:

CSV Files

# CSV with time column
time,population,gdp
0,1000,5000
1,1050,5250
2,1100,5500
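
As a rough illustration of how a file in this layout maps onto time-indexed series, the sketch below parses the sample with Python's standard `csv` module. The helper `read_time_columns` is hypothetical, and PySD's own reader may work differently.

```python
import csv
import io

CSV_TEXT = """time,population,gdp
0,1000,5000
1,1050,5250
2,1100,5500
"""


def read_time_columns(text, time_column="time"):
    """Return (times, {column name: values}) from CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(text))
    times, columns = [], {}
    for row in reader:
        times.append(float(row[time_column]))
        for name, cell in row.items():
            if name != time_column:
                columns.setdefault(name, []).append(float(cell))
    return times, columns


times, columns = read_time_columns(CSV_TEXT)
print(times)
print(columns["population"])
```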

Excel Files

# Multiple sheets supported
# Sheet names or indices can be specified
# Cell ranges: 'A1:C10' or (1,1,3,10)
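
The two range notations above appear to describe the same rectangle. The sketch below converts the string form into the tuple form, assuming the tuple order is (first column, first row, last column, last row) with 1-based indices. That order is inferred from the single example pair given here, and `parse_range` is a hypothetical helper, not a PySD function.

```python
import re


def column_to_number(letters):
    """Convert Excel column letters to a 1-based number: A -> 1, Z -> 26, AA -> 27."""
    number = 0
    for char in letters.upper():
        number = number * 26 + (ord(char) - ord("A") + 1)
    return number


def parse_range(cell_range):
    """Parse 'A1:C10' into (first_col, first_row, last_col, last_row), all 1-based."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+):([A-Za-z]+)(\d+)", cell_range)
    if match is None:
        raise ValueError(f"not a cell range: {cell_range!r}")
    col1, row1, col2, row2 = match.groups()
    return (column_to_number(col1), int(row1), column_to_number(col2), int(row2))


print(parse_range("A1:C10"))
```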

NetCDF Files

# For large datasets and model output
# Supports multi-dimensional arrays
# Automatic coordinate handling

Integration with Model Loading

External data is typically integrated during model loading:

import pysd

# Load model with external data files
model = pysd.read_vensim(
    'population_model.mdl',
    data_files={
        'demographics.csv': ['birth_rate', 'death_rate'],
        'economic.xlsx': ['gdp_growth', 'unemployment']
    },
    data_files_encoding='utf-8'
)

# External data automatically available in model
results = model.run()

Advanced Data Handling

Missing Value Strategies

# Configure missing value handling during model loading
model = pysd.read_vensim(
    'model.mdl',
    data_files=['incomplete_data.csv'],
    missing_values='warning'  # alternatives: 'error', 'ignore', 'keep'
)
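
The four option names suggest the following behaviour, sketched here on plain Python lists: 'warning' fills gaps and emits a warning, 'error' raises, 'ignore' fills gaps silently, and 'keep' leaves the gaps in place. This mapping is an assumption made for the illustration, and `handle_missing` is a hypothetical helper rather than PySD code.

```python
import math
import warnings


def handle_missing(times, values, strategy="warning"):
    """Apply 'warning', 'error', 'ignore', or 'keep' to NaN entries in `values`."""
    missing = [i for i, v in enumerate(values) if math.isnan(v)]
    if not missing or strategy == "keep":
        return list(values)
    if strategy == "error":
        raise ValueError(f"missing values at times {[times[i] for i in missing]}")
    if strategy == "warning":
        warnings.warn(f"interpolating {len(missing)} missing value(s)")
    elif strategy != "ignore":
        raise ValueError(f"unknown strategy: {strategy!r}")

    known = [(times[i], v) for i, v in enumerate(values) if not math.isnan(v)]
    if not known:
        raise ValueError("no valid values to interpolate from")
    filled = list(values)
    for i in missing:
        t = times[i]
        before = [(tk, vk) for tk, vk in known if tk < t]
        after = [(tk, vk) for tk, vk in known if tk > t]
        if before and after:
            (t0, v0), (t1, v1) = before[-1], after[0]
            filled[i] = v0 + (v1 - v0) * (t - t0) / (t1 - t0)
        else:
            # At the edges, reuse the nearest known value
            filled[i] = (before[-1] if before else after[0])[1]
    return filled


times = [0, 1, 2, 3]
values = [10.0, float("nan"), 30.0, float("nan")]
filled = handle_missing(times, values, strategy="ignore")
print(filled)
```

The 'keep' option is mainly useful for inspecting data quality, since NaN values passed on to a simulation will usually propagate into its results.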

Encoding Management

# Handle different file encodings
model = pysd.read_vensim(
    'model.mdl',
    data_files=['international_data.csv'],
    data_files_encoding={
        'international_data.csv': 'utf-8'
    }
)
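
When the encoding of a file is not known in advance, one common approach is to try a short list of candidates before passing the chosen name on. The sketch below shows that idea with the standard library only. `decode_with_fallback` is a hypothetical helper, and the candidate list is an assumption to adjust for the data at hand.

```python
def decode_with_fallback(raw, encodings=("utf-8", "cp1252")):
    """Return (text, encoding) using the first encoding that decodes `raw` cleanly."""
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError(f"none of {encodings} could decode the data")


# Header bytes as a Windows tool might have written them
raw = "región,población\n".encode("cp1252")
text, used = decode_with_fallback(raw)
print(used)
print(text)
```

Order matters here: single-byte encodings such as cp1252 accept almost any byte sequence, so stricter encodings like UTF-8 should be tried first.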

Data Serialization

Export external data to netCDF format for efficient storage and access:

# Export model's external data
model.serialize_externals(
    export_path='model_externals.nc',
    time_coords={'time': range(0, 101)},
    compression_level=4
)

# Load model with serialized externals
model_with_nc = pysd.load(
    'model.py',
    data_files='model_externals.nc'
)

Error Handling

External data components report problems through exceptions:

  • FileNotFoundError: Missing data files
  • KeyError: Missing columns or sheets
  • ValueError: Invalid data formats or ranges
  • UnicodeDecodeError: Encoding issues
  • InterpolationError: Problems with data interpolation

from pysd.py_backend.external import ExtData

time_point = 15.0  # example time at which to read the series

try:
    data = ExtData('missing_file.csv', root='/data')
    data.initialize()
except FileNotFoundError:
    data = None
    print("Data file not found, using default values")

# Only query the series if it was actually loaded
if data is not None:
    try:
        value = data(time_point)
    except ValueError as e:
        print(f"Interpolation error: {e}")

Performance Optimization

For efficient external data usage:

  • Cache frequently accessed files using the Excels class
  • Use appropriate interpolation methods for data characteristics
  • Consider data preprocessing for very large datasets
  • Utilize netCDF format for complex multi-dimensional data

from pysd.py_backend.external import Excels, ExtData

# Efficient pattern for multiple data sources
excel_manager = Excels()

# All data objects share the cached Excel file
population_data = ExtData('master_data.xlsx', sheet='population')
economic_data = ExtData('master_data.xlsx', sheet='economy')
social_data = ExtData('master_data.xlsx', sheet='social')

Install with Tessl CLI

npx tessl i tessl/pypi-pysd
