ETF analysis and reporting tool for passive investing with comprehensive data collection and Excel report generation.
npx @tessl/cli install tessl/pypi-thepassiveinvestor@1.2.00
# The Passive Investor
1
2
A comprehensive Python library for ETF (Exchange-Traded Fund) analysis and reporting. Enables passive investors to collect, analyze, and compare ETF data to make informed investment decisions with automated Excel report generation.
3
4
## Package Information
5
6
- **Package Name**: thepassiveinvestor
7
- **Version**: 1.2.2
8
- **Language**: Python
9
- **Installation**: `pip install thepassiveinvestor`
10
- **Requirements**: Python >=3.10, <3.13
11
12
## Core Imports
13
14
```python
15
import thepassiveinvestor as pi
16
```
17
18
Individual imports:
19
20
```python
21
from thepassiveinvestor import collect_data, create_ETF_report
22
```
23
24
## Basic Usage
25
26
```python
27
import thepassiveinvestor as pi
28
29
# Collect data from a single ETF
30
vanguard_sp500 = pi.collect_data('VOO')
31
print(vanguard_sp500['long_name']) # "Vanguard 500 Index Fund"
32
33
# Create an Excel report for multiple ETFs
34
pi.create_ETF_report(['VOO', 'QQQ', 'ARKG'], 'my_etf_analysis.xlsx')
35
36
# Compare multiple ETFs in a DataFrame
37
comparison_df = pi.collect_data(['VOO', 'QQQ', 'VTI'], comparison=True)
38
print(comparison_df.head())
39
```
40
41
## Capabilities
42
43
### ETF Data Collection
44
45
Collects comprehensive ETF data from Yahoo Finance including sector holdings, company holdings, annual returns, risk statistics, and key characteristics.
46
47
```python { .api }
48
def collect_data(tickers, comparison=False, surpress_print=False):
49
"""
50
Collect comprehensive ETF data from Yahoo Finance.
51
52
Parameters:
53
- tickers (str or list): Single ticker string or list of ETF ticker symbols (e.g., 'VOO' or ['VOO', 'QQQ'])
54
- comparison (bool, default False): If True, returns a comparison DataFrame instead of dictionary
55
- surpress_print (bool, default False): If True, suppresses error messages for failed tickers
56
57
Returns:
58
- dict: Single ETF data dictionary (when single ticker and comparison=False)
59
- dict: Dictionary of ticker->data mappings (when multiple tickers and comparison=False)
60
- pandas.DataFrame: Multi-index comparison DataFrame (when comparison=True)
61
62
Collected Data Fields:
63
- long_name: ETF full name
64
- summary: Fund description and strategy
65
- image_URL: Style box image URL
66
- sector_holdings: Sector allocation percentages
67
- company_holdings: Top company holdings with percentages
68
- annual_returns: Historical annual returns (last 6 years)
69
- risk_data: Risk metrics by time period (3y, 5y, 10y)
70
- key_characteristics: Fund inception date, category, total assets, NAV, currency, previous close
71
"""
72
```
73
74
### Excel Report Generation
75
76
Creates comprehensive Excel reports with multiple sheets containing ETF data, formatted tables, charts, and images.
77
78
```python { .api }
79
def create_ETF_report(tickers, filename, folder=None):
80
"""
81
Generate comprehensive Excel report with ETF analysis data.
82
83
Parameters:
84
- tickers (str or list): Single ticker string or list of ETF ticker symbols
85
- filename (str): Output filename (automatically appends .xlsx if missing)
86
- folder (str, optional): Directory path for output file
87
88
Returns:
89
- None: Creates Excel file at specified location
90
91
Report Features:
92
- Individual sheet per ETF with formatted data tables
93
- Stock price history chart for each ETF
94
- Sector holdings with percentage formatting
95
- Top company holdings tables
96
- Risk statistics comparison (3y, 5y, 10y)
97
- Annual returns history
98
- ETF style box images
99
- Hidden stock data sheet for chart generation
100
"""
101
```
102
103
### Configuration Constants
104
105
Predefined configuration constants for customizing data collection and display formatting.
106
107
```python { .api }
108
DEFAULT_KEY_STATISTICS_CHOICES: dict
109
# Maps Yahoo Finance key statistics field names to display names
110
# Fields: fundInceptionDate -> "Fund Inception Date", category -> "Category", totalAssets -> "Total Assets"
111
112
DEFAULT_SUMMARY_DETAIL_CHOICES: dict
113
# Maps Yahoo Finance summary detail field names to display names
114
# Fields: currency -> "Currency", navPrice -> "Net Asset Value", previousClose -> "Previous Close"
115
116
EMPTY_RISK_STATISTICS: dict
117
# Default risk statistics template used when historical data unavailable
118
# Fields: Years, Alpha, Beta, Mean Annual Return, R-squared, Standard Deviation, Sharpe Ratio, Treynor Ratio
119
```
120
121
### Utility Functions
122
123
Internal utility functions for Excel manipulation and formatting (exported but primarily for internal use).
124
125
```python { .api }
126
def data_placer(data, sheet, starting_row, column, column_key, column_value,
127
horizontal_alignment_key=False, horizontal_alignment_value=False,
128
change_key_dimensions=True, change_value_dimensions=True,
129
key_number=None, value_percentage=None):
130
"""
131
Place dictionary data in Excel sheet with formatting options.
132
133
Parameters:
134
- data (dict): Data to place in sheet
135
- sheet (openpyxl.Worksheet): Target Excel worksheet
136
- starting_row (int): First row for data placement
137
- column (int): Column number for key placement
138
- column_key (str): Column letter for keys (e.g., 'B')
139
- column_value (str): Column letter for values (e.g., 'C')
140
- horizontal_alignment_key (str or False): Horizontal alignment for keys
141
- horizontal_alignment_value (str or False): Horizontal alignment for values
142
- change_key_dimensions (bool): Auto-adjust key column width
143
- change_value_dimensions (bool): Auto-adjust value column width
144
- key_number (bool or None): Convert keys to numbers if possible
145
- value_percentage (bool or None): Format values as percentages
146
"""
147
148
def image_placer(image_url, sheet, location):
149
"""
150
Download and place image in Excel sheet.
151
152
Parameters:
153
- image_url (str): URL of image to download
154
- sheet (openpyxl.Worksheet): Target Excel worksheet
155
- location (str): Cell location for image (e.g., 'L12')
156
"""
157
158
def graph_placer(stock_sheet, stock_data, sheet, min_col, min_row, max_col, location):
159
"""
160
Create and place stock price line chart in Excel sheet.
161
162
Parameters:
163
- stock_sheet (openpyxl.Worksheet): Sheet containing stock data
164
- stock_data (pandas.DataFrame): Stock price data
165
- sheet (openpyxl.Worksheet): Target sheet for chart
166
- min_col (int): Minimum column for data range
167
- min_row (int): Minimum row for data range
168
- max_col (int): Maximum column for data range
169
- location (str): Cell location for chart placement
170
"""
171
```
172
173
## Data Structures
174
175
### ETF Data Dictionary
176
177
The `collect_data()` function returns a dictionary with the following structure:
178
179
```python { .api }
180
# ETF Data Dictionary Structure
181
{
182
'long_name': str, # ETF full name (e.g., "Vanguard 500 Index Fund")
183
'summary': str, # Fund description and investment strategy
184
'image_URL': str, # URL to ETF style box image
185
'sector_holdings': { # Sector allocation percentages
186
'technology': '23.65%',
187
'healthcare': '15.27%',
188
'financial_services': '13.7%',
189
# ... other sectors
190
},
191
'company_holdings': { # Top company holdings with percentages
192
'Apple Inc': '5.92%',
193
'Microsoft Corp': '5.62%',
194
# ... other holdings
195
},
196
'annual_returns': { # Historical annual returns (last 6 years)
197
'2022': '-18.15%',
198
'2021': '28.66%',
199
# ... other years
200
},
201
'risk_data': { # Risk metrics by time period
202
'3y': {
203
'year': '3y',
204
'alpha': -0.04,
205
'beta': 1.0,
206
'meanAnnualReturn': 0.8,
207
'rSquared': 100,
208
'stdDev': 21.17,
209
'sharpeRatio': -0.55,
210
'treynorRatio': 6.76
211
},
212
'5y': { /* similar structure */ },
213
'10y': { /* similar structure */ }
214
},
215
'key_characteristics': { # Fund characteristics and current data
216
'fundInceptionDate': datetime.date(2010, 9, 7),
217
'category': 'Large Blend',
218
'totalAssets': 744769716224,
219
'currency': 'USD',
220
'navPrice': 366.24,
221
'previousClose': 365.67
222
}
223
}
224
```
225
226
### Comparison DataFrame Structure
227
228
When `comparison=True`, returns a pandas DataFrame with:
229
230
```python { .api }
231
# MultiIndex DataFrame structure
232
# Index: (category, field) tuples like ('sector_holdings', 'technology')
233
# Columns: ETF ticker symbols
234
# Values: Corresponding data values (percentages, numbers, dates)
235
236
# Example rows:
237
# ('sector_holdings', 'technology') | VOO: '23.65%' | QQQ: '47.62%'
238
# ('annual_returns', '2022') | VOO: '-18.15%'| QQQ: '-32.49%'
239
# ('key_characteristics', 'category') | VOO: 'Large Blend' | QQQ: 'Large Growth'
240
```
241
242
## Error Handling
243
244
The library implements graceful error handling for common issues:
245
246
- **Invalid tickers**: Failed tickers are skipped with optional warning messages
247
- **Missing data**: Uses default values and empty structures when specific data unavailable
248
- **Network issues**: Handles Yahoo Finance API failures gracefully
249
- **Image download failures**: Falls back to text placeholder when style box images unavailable
250
- **Excel generation errors**: Continues processing other ETFs when individual ticker fails
251
252
Use `surpress_print=True` in `collect_data()` to disable error messages for production use.
253
254
## Dependencies
255
256
Core dependencies automatically installed:
257
258
- **pandas** (with excel, computation, performance extras): Data manipulation and analysis
259
- **yahooquery**: Yahoo Finance data access
260
- **urllib3**: HTTP client for image downloads
261
- **openpyxl** (via pandas): Excel file generation and manipulation