ETF analysis and reporting tool for passive investing with comprehensive data collection and Excel report generation.
npx @tessl/cli install tessl/pypi-thepassiveinvestor@1.2.0A comprehensive Python library for ETF (Exchange-Traded Fund) analysis and reporting. Enables passive investors to collect, analyze, and compare ETF data to make informed investment decisions with automated Excel report generation.
pip install thepassiveinvestorimport thepassiveinvestor as piIndividual imports:
from thepassiveinvestor import collect_data, create_ETF_reportimport thepassiveinvestor as pi
# Collect data from a single ETF
vanguard_sp500 = pi.collect_data('VOO')
print(vanguard_sp500['long_name']) # "Vanguard 500 Index Fund"
# Create an Excel report for multiple ETFs
pi.create_ETF_report(['VOO', 'QQQ', 'ARKG'], 'my_etf_analysis.xlsx')
# Compare multiple ETFs in a DataFrame
comparison_df = pi.collect_data(['VOO', 'QQQ', 'VTI'], comparison=True)
print(comparison_df.head())Collects comprehensive ETF data from Yahoo Finance including sector holdings, company holdings, annual returns, risk statistics, and key characteristics.
def collect_data(tickers, comparison=False, surpress_print=False):
"""
Collect comprehensive ETF data from Yahoo Finance.
Parameters:
- tickers (str or list): Single ticker string or list of ETF ticker symbols (e.g., 'VOO' or ['VOO', 'QQQ'])
- comparison (bool, default False): If True, returns a comparison DataFrame instead of dictionary
- surpress_print (bool, default False): If True, suppresses error messages for failed tickers
Returns:
- dict: Single ETF data dictionary (when single ticker and comparison=False)
- dict: Dictionary of ticker->data mappings (when multiple tickers and comparison=False)
- pandas.DataFrame: Multi-index comparison DataFrame (when comparison=True)
Collected Data Fields:
- long_name: ETF full name
- summary: Fund description and strategy
- image_URL: Style box image URL
- sector_holdings: Sector allocation percentages
- company_holdings: Top company holdings with percentages
- annual_returns: Historical annual returns (last 6 years)
- risk_data: Risk metrics by time period (3y, 5y, 10y)
- key_characteristics: Fund inception date, category, total assets, NAV, currency, previous close
"""Creates comprehensive Excel reports with multiple sheets containing ETF data, formatted tables, charts, and images.
def create_ETF_report(tickers, filename, folder=None):
"""
Generate comprehensive Excel report with ETF analysis data.
Parameters:
- tickers (str or list): Single ticker string or list of ETF ticker symbols
- filename (str): Output filename (automatically appends .xlsx if missing)
- folder (str, optional): Directory path for output file
Returns:
- None: Creates Excel file at specified location
Report Features:
- Individual sheet per ETF with formatted data tables
- Stock price history chart for each ETF
- Sector holdings with percentage formatting
- Top company holdings tables
- Risk statistics comparison (3y, 5y, 10y)
- Annual returns history
- ETF style box images
- Hidden stock data sheet for chart generation
"""Predefined configuration constants for customizing data collection and display formatting.
DEFAULT_KEY_STATISTICS_CHOICES: dict
# Maps Yahoo Finance key statistics field names to display names
# Fields: fundInceptionDate -> "Fund Inception Date", category -> "Category", totalAssets -> "Total Assets"
DEFAULT_SUMMARY_DETAIL_CHOICES: dict
# Maps Yahoo Finance summary detail field names to display names
# Fields: currency -> "Currency", navPrice -> "Net Asset Value", previousClose -> "Previous Close"
EMPTY_RISK_STATISTICS: dict
# Default risk statistics template used when historical data unavailable
# Fields: Years, Alpha, Beta, Mean Annual Return, R-squared, Standard Deviation, Sharpe Ratio, Treynor RatioInternal utility functions for Excel manipulation and formatting (exported but primarily for internal use).
def data_placer(data, sheet, starting_row, column, column_key, column_value,
horizontal_alignment_key=False, horizontal_alignment_value=False,
change_key_dimensions=True, change_value_dimensions=True,
key_number=None, value_percentage=None):
"""
Place dictionary data in Excel sheet with formatting options.
Parameters:
- data (dict): Data to place in sheet
- sheet (openpyxl.Worksheet): Target Excel worksheet
- starting_row (int): First row for data placement
- column (int): Column number for key placement
- column_key (str): Column letter for keys (e.g., 'B')
- column_value (str): Column letter for values (e.g., 'C')
- horizontal_alignment_key (str or False): Horizontal alignment for keys
- horizontal_alignment_value (str or False): Horizontal alignment for values
- change_key_dimensions (bool): Auto-adjust key column width
- change_value_dimensions (bool): Auto-adjust value column width
- key_number (bool or None): Convert keys to numbers if possible
- value_percentage (bool or None): Format values as percentages
"""
def image_placer(image_url, sheet, location):
"""
Download and place image in Excel sheet.
Parameters:
- image_url (str): URL of image to download
- sheet (openpyxl.Worksheet): Target Excel worksheet
- location (str): Cell location for image (e.g., 'L12')
"""
def graph_placer(stock_sheet, stock_data, sheet, min_col, min_row, max_col, location):
"""
Create and place stock price line chart in Excel sheet.
Parameters:
- stock_sheet (openpyxl.Worksheet): Sheet containing stock data
- stock_data (pandas.DataFrame): Stock price data
- sheet (openpyxl.Worksheet): Target sheet for chart
- min_col (int): Minimum column for data range
- min_row (int): Minimum row for data range
- max_col (int): Maximum column for data range
- location (str): Cell location for chart placement
"""The collect_data() function returns a dictionary with the following structure:
# ETF Data Dictionary Structure
{
'long_name': str, # ETF full name (e.g., "Vanguard 500 Index Fund")
'summary': str, # Fund description and investment strategy
'image_URL': str, # URL to ETF style box image
'sector_holdings': { # Sector allocation percentages
'technology': '23.65%',
'healthcare': '15.27%',
'financial_services': '13.7%',
# ... other sectors
},
'company_holdings': { # Top company holdings with percentages
'Apple Inc': '5.92%',
'Microsoft Corp': '5.62%',
# ... other holdings
},
'annual_returns': { # Historical annual returns (last 6 years)
'2022': '-18.15%',
'2021': '28.66%',
# ... other years
},
'risk_data': { # Risk metrics by time period
'3y': {
'year': '3y',
'alpha': -0.04,
'beta': 1.0,
'meanAnnualReturn': 0.8,
'rSquared': 100,
'stdDev': 21.17,
'sharpeRatio': -0.55,
'treynorRatio': 6.76
},
'5y': { /* similar structure */ },
'10y': { /* similar structure */ }
},
'key_characteristics': { # Fund characteristics and current data
'fundInceptionDate': datetime.date(2010, 9, 7),
'category': 'Large Blend',
'totalAssets': 744769716224,
'currency': 'USD',
'navPrice': 366.24,
'previousClose': 365.67
}
}When comparison=True, returns a pandas DataFrame with:
# MultiIndex DataFrame structure
# Index: (category, field) tuples like ('sector_holdings', 'technology')
# Columns: ETF ticker symbols
# Values: Corresponding data values (percentages, numbers, dates)
# Example rows:
# ('sector_holdings', 'technology') | VOO: '23.65%' | QQQ: '47.62%'
# ('annual_returns', '2022') | VOO: '-18.15%'| QQQ: '-32.49%'
# ('key_characteristics', 'category') | VOO: 'Large Blend' | QQQ: 'Large Growth'The library implements graceful error handling for common issues:
Use surpress_print=True in collect_data() to disable error messages for production use.
Core dependencies automatically installed: