or run

npx @tessl/cli init
Log in

Version

Tile

Overview

Evals

Files

docs

index.md
tile.json

tessl/pypi-thepassiveinvestor

ETF analysis and reporting tool for passive investing with comprehensive data collection and Excel report generation.

Workspace
tessl
Visibility
Public
Created
Last updated
Describes
pypipkg:pypi/thepassiveinvestor@1.2.x

To install, run

npx @tessl/cli install tessl/pypi-thepassiveinvestor@1.2.0

index.mddocs/

The Passive Investor

A comprehensive Python library for ETF (Exchange-Traded Fund) analysis and reporting. Enables passive investors to collect, analyze, and compare ETF data to make informed investment decisions with automated Excel report generation.

Package Information

  • Package Name: thepassiveinvestor
  • Version: 1.2.2
  • Language: Python
  • Installation: pip install thepassiveinvestor
  • Requirements: Python >=3.10, <3.13

Core Imports

import thepassiveinvestor as pi

Individual imports:

from thepassiveinvestor import collect_data, create_ETF_report

Basic Usage

import thepassiveinvestor as pi

# Collect data from a single ETF
vanguard_sp500 = pi.collect_data('VOO')
print(vanguard_sp500['long_name'])  # "Vanguard 500 Index Fund"

# Create an Excel report for multiple ETFs
pi.create_ETF_report(['VOO', 'QQQ', 'ARKG'], 'my_etf_analysis.xlsx')

# Compare multiple ETFs in a DataFrame
comparison_df = pi.collect_data(['VOO', 'QQQ', 'VTI'], comparison=True)
print(comparison_df.head())

Capabilities

ETF Data Collection

Collects comprehensive ETF data from Yahoo Finance including sector holdings, company holdings, annual returns, risk statistics, and key characteristics.

def collect_data(tickers, comparison=False, surpress_print=False):
    """
    Collect comprehensive ETF data from Yahoo Finance.

    Parameters:
    - tickers (str or list): Single ticker string or list of ETF ticker symbols (e.g., 'VOO' or ['VOO', 'QQQ'])
    - comparison (bool, default False): If True, returns a comparison DataFrame instead of dictionary
    - surpress_print (bool, default False): If True, suppresses error messages for failed tickers

    Returns:
    - dict: Single ETF data dictionary (when single ticker and comparison=False)
    - dict: Dictionary of ticker->data mappings (when multiple tickers and comparison=False)  
    - pandas.DataFrame: Multi-index comparison DataFrame (when comparison=True)

    Collected Data Fields:
    - long_name: ETF full name
    - summary: Fund description and strategy
    - image_URL: Style box image URL
    - sector_holdings: Sector allocation percentages
    - company_holdings: Top company holdings with percentages
    - annual_returns: Historical annual returns (last 6 years)
    - risk_data: Risk metrics by time period (3y, 5y, 10y)
    - key_characteristics: Fund inception date, category, total assets, NAV, currency, previous close
    """

Excel Report Generation

Creates comprehensive Excel reports with multiple sheets containing ETF data, formatted tables, charts, and images.

def create_ETF_report(tickers, filename, folder=None):
    """
    Generate comprehensive Excel report with ETF analysis data.

    Parameters:  
    - tickers (str or list): Single ticker string or list of ETF ticker symbols
    - filename (str): Output filename (automatically appends .xlsx if missing)
    - folder (str, optional): Directory path for output file

    Returns:
    - None: Creates Excel file at specified location

    Report Features:
    - Individual sheet per ETF with formatted data tables
    - Stock price history chart for each ETF
    - Sector holdings with percentage formatting
    - Top company holdings tables
    - Risk statistics comparison (3y, 5y, 10y)
    - Annual returns history
    - ETF style box images
    - Hidden stock data sheet for chart generation
    """

Configuration Constants

Predefined configuration constants for customizing data collection and display formatting.

DEFAULT_KEY_STATISTICS_CHOICES: dict
# Maps Yahoo Finance key statistics field names to display names
# Fields: fundInceptionDate -> "Fund Inception Date", category -> "Category", totalAssets -> "Total Assets"

DEFAULT_SUMMARY_DETAIL_CHOICES: dict  
# Maps Yahoo Finance summary detail field names to display names
# Fields: currency -> "Currency", navPrice -> "Net Asset Value", previousClose -> "Previous Close"

EMPTY_RISK_STATISTICS: dict
# Default risk statistics template used when historical data unavailable
# Fields: Years, Alpha, Beta, Mean Annual Return, R-squared, Standard Deviation, Sharpe Ratio, Treynor Ratio

Utility Functions

Internal utility functions for Excel manipulation and formatting (exported but primarily for internal use).

def data_placer(data, sheet, starting_row, column, column_key, column_value, 
                horizontal_alignment_key=False, horizontal_alignment_value=False,
                change_key_dimensions=True, change_value_dimensions=True, 
                key_number=None, value_percentage=None):
    """
    Place dictionary data in Excel sheet with formatting options.
    
    Parameters:
    - data (dict): Data to place in sheet
    - sheet (openpyxl.Worksheet): Target Excel worksheet
    - starting_row (int): First row for data placement
    - column (int): Column number for key placement
    - column_key (str): Column letter for keys (e.g., 'B')
    - column_value (str): Column letter for values (e.g., 'C')
    - horizontal_alignment_key (str or False): Horizontal alignment for keys
    - horizontal_alignment_value (str or False): Horizontal alignment for values  
    - change_key_dimensions (bool): Auto-adjust key column width
    - change_value_dimensions (bool): Auto-adjust value column width
    - key_number (bool or None): Convert keys to numbers if possible
    - value_percentage (bool or None): Format values as percentages
    """

def image_placer(image_url, sheet, location):
    """
    Download and place image in Excel sheet.
    
    Parameters:
    - image_url (str): URL of image to download
    - sheet (openpyxl.Worksheet): Target Excel worksheet  
    - location (str): Cell location for image (e.g., 'L12')
    """

def graph_placer(stock_sheet, stock_data, sheet, min_col, min_row, max_col, location):
    """
    Create and place stock price line chart in Excel sheet.
    
    Parameters:
    - stock_sheet (openpyxl.Worksheet): Sheet containing stock data
    - stock_data (pandas.DataFrame): Stock price data
    - sheet (openpyxl.Worksheet): Target sheet for chart
    - min_col (int): Minimum column for data range
    - min_row (int): Minimum row for data range
    - max_col (int): Maximum column for data range
    - location (str): Cell location for chart placement
    """

Data Structures

ETF Data Dictionary

The collect_data() function returns a dictionary with the following structure:

# ETF Data Dictionary Structure
{
    'long_name': str,              # ETF full name (e.g., "Vanguard 500 Index Fund")
    'summary': str,                # Fund description and investment strategy  
    'image_URL': str,              # URL to ETF style box image
    'sector_holdings': {           # Sector allocation percentages
        'technology': '23.65%',
        'healthcare': '15.27%',
        'financial_services': '13.7%',
        # ... other sectors
    },
    'company_holdings': {          # Top company holdings with percentages
        'Apple Inc': '5.92%',
        'Microsoft Corp': '5.62%', 
        # ... other holdings
    },
    'annual_returns': {            # Historical annual returns (last 6 years)
        '2022': '-18.15%',
        '2021': '28.66%',
        # ... other years
    },
    'risk_data': {                 # Risk metrics by time period
        '3y': {
            'year': '3y',
            'alpha': -0.04,
            'beta': 1.0,
            'meanAnnualReturn': 0.8,
            'rSquared': 100,
            'stdDev': 21.17,
            'sharpeRatio': -0.55,
            'treynorRatio': 6.76
        },
        '5y': { /* similar structure */ },
        '10y': { /* similar structure */ }
    },
    'key_characteristics': {       # Fund characteristics and current data
        'fundInceptionDate': datetime.date(2010, 9, 7),
        'category': 'Large Blend',
        'totalAssets': 744769716224,
        'currency': 'USD',
        'navPrice': 366.24,
        'previousClose': 365.67
    }
}

Comparison DataFrame Structure

When comparison=True, returns a pandas DataFrame with:

# MultiIndex DataFrame structure
# Index: (category, field) tuples like ('sector_holdings', 'technology')
# Columns: ETF ticker symbols
# Values: Corresponding data values (percentages, numbers, dates)

# Example rows:
# ('sector_holdings', 'technology')     | VOO: '23.65%' | QQQ: '47.62%'
# ('annual_returns', '2022')           | VOO: '-18.15%'| QQQ: '-32.49%'  
# ('key_characteristics', 'category')  | VOO: 'Large Blend' | QQQ: 'Large Growth'

Error Handling

The library implements graceful error handling for common issues:

  • Invalid tickers: Failed tickers are skipped with optional warning messages
  • Missing data: Uses default values and empty structures when specific data unavailable
  • Network issues: Handles Yahoo Finance API failures gracefully
  • Image download failures: Falls back to text placeholder when style box images unavailable
  • Excel generation errors: Continues processing other ETFs when individual ticker fails

Use surpress_print=True in collect_data() to disable error messages for production use.

Dependencies

Core dependencies automatically installed:

  • pandas (with excel, computation, performance extras): Data manipulation and analysis
  • yahooquery: Yahoo Finance data access
  • urllib3: HTTP client for image downloads
  • openpyxl (via pandas): Excel file generation and manipulation