Tessl Tile for pypi/vectorbt@0.28.0

# Label Generation for Machine Learning

Look-ahead analysis tools for generating labels from future price movements, enabling machine learning model training on financial time series data. The labels module provides various methods to create target variables for supervised learning applications in quantitative finance.

## Capabilities

### Future Statistical Measures

Generators for statistical measures computed over future time windows, commonly used for regression and forecasting tasks.

```python { .api }
class FMEAN:
    """
    Future mean label generator.

    Calculates the mean of future values over a specified window,
    useful for predicting future average prices or returns.
    """

    @classmethod
    def run(cls, close, window, **kwargs):
        """
        Calculate future mean labels.

        Parameters:
        - close: pd.Series or pd.DataFrame, price data
        - window: int, forward-looking window size
        - pct_change: bool, use percentage change (default: False)

        Returns:
        FMEAN: Label generator with fmean attribute
        """

class FSTD:
    """
    Future standard deviation label generator.

    Calculates the standard deviation of future values over a window,
    useful for volatility prediction and risk modeling.
    """

    @classmethod
    def run(cls, close, window, **kwargs):
        """
        Calculate future standard deviation labels.

        Parameters:
        - close: pd.Series or pd.DataFrame, price data
        - window: int, forward-looking window size
        - pct_change: bool, use percentage change (default: False)
        - ddof: int, degrees of freedom (default: 1)

        Returns:
        FSTD: Label generator with fstd attribute
        """

class FMIN:
    """
    Future minimum label generator.

    Finds the minimum value over future time windows,
    useful for support level prediction and drawdown analysis.
    """

    @classmethod
    def run(cls, close, window, **kwargs):
        """
        Calculate future minimum labels.

        Parameters:
        - close: pd.Series or pd.DataFrame, price data
        - window: int, forward-looking window size
        - pct_change: bool, use percentage change from current (default: False)

        Returns:
        FMIN: Label generator with fmin attribute
        """

class FMAX:
    """
    Future maximum label generator.

    Finds the maximum value over future time windows,
    useful for resistance level prediction and profit target analysis.
    """

    @classmethod
    def run(cls, close, window, **kwargs):
        """
        Calculate future maximum labels.

        Parameters:
        - close: pd.Series or pd.DataFrame, price data
        - window: int, forward-looking window size
        - pct_change: bool, use percentage change from current (default: False)

        Returns:
        FMAX: Label generator with fmax attribute
        """
```
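
To make the look-ahead windows concrete, here is a dependency-free sketch of what a future mean, minimum, or maximum means for a short price list. The helper `future_window_stat` is hypothetical and is not vectorbt's implementation. It assumes the window covers the `window` values that follow the current position, and it returns `None` where fewer than `window` future values remain.

```python
from statistics import mean

def future_window_stat(values, window, stat):
    """Apply `stat` to the `window` values that follow each position.

    Hypothetical helper for illustration only; positions without a full
    future window get None, mirroring the undefined tail of look-ahead labels.
    """
    out = []
    for i in range(len(values)):
        future = values[i + 1 : i + 1 + window]
        out.append(stat(future) if len(future) == window else None)
    return out

prices = [10.0, 11.0, 12.0, 11.0, 13.0]
fmean = future_window_stat(prices, 2, mean)  # [11.5, 11.5, 12.0, None, None]
fmin = future_window_stat(prices, 2, min)    # [11.0, 11.0, 11.0, None, None]
fmax = future_window_stat(prices, 2, max)    # [12.0, 12.0, 13.0, None, None]
```

The last `window` labels are undefined in this sketch, so the corresponding rows at the end of a series should be dropped before training.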

### Fixed and Mean-Based Labels

Simple labeling methods for basic classification and regression tasks.

```python { .api }
class FIXLB:
    """
    Fixed label generator.

    Generates constant labels across all time periods,
    useful for baseline models and control experiments.
    """

    @classmethod
    def run(cls, shape, value=1, **kwargs):
        """
        Generate fixed labels.

        Parameters:
        - shape: tuple, output shape (n_rows, n_cols)
        - value: scalar, fixed label value
        - dtype: data type for labels

        Returns:
        FIXLB: Label generator with fixed labels
        """

class MEANLB:
    """
    Mean-based label generator.

    Generates labels based on deviations from mean values,
    useful for mean reversion strategies and anomaly detection.
    """

    @classmethod
    def run(cls, close, window, threshold=0, **kwargs):
        """
        Generate mean-based labels.

        Parameters:
        - close: pd.Series or pd.DataFrame, price data
        - window: int, rolling window for mean calculation
        - threshold: float, threshold for label generation
        - above: bool, label when above mean (default: True)

        Returns:
        MEANLB: Label generator with mean-based labels
        """
```

### Lexicographic and Ranking Labels

Advanced labeling methods for ranking and relative performance analysis.

```python { .api }
class LEXLB:
    """
    Lexicographic label generator.

    Generates labels based on lexicographic ordering of multiple criteria,
    useful for multi-objective optimization and ranking problems.
    """

    @classmethod
    def run(cls, *args, **kwargs):
        """
        Generate lexicographic labels.

        Parameters:
        - args: sequence of arrays for lexicographic comparison
        - descending: bool, use descending order (default: False)

        Returns:
        LEXLB: Label generator with lexicographic rankings
        """
```

### Trend-Based Labels

Sophisticated trend analysis and classification for directional predictions.

```python { .api }
class TRENDLB:
    """
    Trend-based label generator.

    Analyzes price trends over various time horizons and generates
    labels for trend direction, strength, and continuation patterns.
    """

    @classmethod
    def run(cls, close, window=20, mode='binary', **kwargs):
        """
        Generate trend-based labels.

        Parameters:
        - close: pd.Series or pd.DataFrame, price data
        - window: int, trend analysis window
        - mode: str, trend mode (see TrendMode enum)
        - min_pct_change: float, minimum change for trend (default: 0.01)
        - smooth_window: int, smoothing window for trend (default: None)

        Returns:
        TRENDLB: Label generator with trend labels
        """

class TrendMode(IntEnum):
    """
    Trend calculation modes for TRENDLB.

    Defines different methods for calculating and categorizing trends
    in financial time series data.
    """
    Binary = 0          # Simple up/down binary classification
    BinaryCont = 1      # Binary with continuation signals
    BinaryContSat = 2   # Binary with continuation and saturation
    PctChange = 3       # Percentage change-based trends
    PctChangeNorm = 4   # Normalized percentage change trends
```
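
Because `TrendMode` is an `IntEnum`, its members can be looked up by name or by integer value. The sketch below re-declares the enum locally with the values listed above so that it runs without vectorbt installed. How `TRENDLB.run` itself converts a `mode` argument is not shown here.

```python
from enum import IntEnum

class TrendMode(IntEnum):
    # Local mirror of the values listed above, for illustration only
    Binary = 0
    BinaryCont = 1
    BinaryContSat = 2
    PctChange = 3
    PctChangeNorm = 4

by_name = TrendMode["PctChange"]          # lookup by member name
by_value = TrendMode(1)                   # lookup by integer value
mode_names = [m.name for m in TrendMode]  # members in definition order
```

Since the members are integers, `by_name == 3` holds, which is convenient when a mode has to be stored alongside numeric parameters.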

### Binary Outcome Labels

Specialized generators for binary classification tasks in trading applications.

```python { .api }
class BOLB:
    """
    Binary outcome label generator.

    Generates binary labels for classification tasks such as
    profitable/unprofitable trades or directional movements.
    """

    @classmethod
    def run(cls, close, window, threshold=0, **kwargs):
        """
        Generate binary outcome labels.

        Parameters:
        - close: pd.Series or pd.DataFrame, price data
        - window: int, forward-looking window for outcome
        - threshold: float, threshold for binary classification
        - return_type: str, type of return calculation ('simple', 'log')
        - min_periods: int, minimum periods for valid calculation

        Returns:
        BOLB: Label generator with binary outcome labels
        """
```
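
The `threshold` and `return_type` parameters described above can be illustrated with a small dependency-free sketch. The function `binary_outcome_labels` is hypothetical and only mirrors the description in this section: a label of 1 when the return over the next `window` steps exceeds `threshold`. It is not vectorbt's implementation.

```python
import math

def binary_outcome_labels(prices, window, threshold=0.0, return_type="simple"):
    """Label 1 if the forward return over `window` steps exceeds `threshold`.

    Hypothetical helper: positions without a price `window` steps ahead get None.
    """
    labels = []
    for i, price in enumerate(prices):
        if i + window >= len(prices):
            labels.append(None)  # not enough future data
            continue
        ratio = prices[i + window] / price
        ret = math.log(ratio) if return_type == "log" else ratio - 1.0
        labels.append(1 if ret > threshold else 0)
    return labels

prices = [100.0, 102.0, 105.1, 104.0, 110.0]
simple_labels = binary_outcome_labels(prices, window=2, threshold=0.05)
log_labels = binary_outcome_labels(prices, window=2, threshold=0.05, return_type="log")
```

The first position shows why the return convention matters near the threshold: a move from 100.0 to 105.1 is a 5.1% simple return but a log return of about 4.97%, so it is labeled 1 under `'simple'` and 0 under `'log'`.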

## Usage Examples

### Basic Future Labels

```python
import vectorbt as vbt
import pandas as pd

# Download data
data = vbt.YFData.download("AAPL", start="2020-01-01", end="2023-01-01")
close = data.get("Close")

# Generate future statistical labels
future_mean = vbt.FMEAN.run(close, window=5)
future_std = vbt.FSTD.run(close, window=10)
future_min = vbt.FMIN.run(close, window=20, pct_change=True)
future_max = vbt.FMAX.run(close, window=20, pct_change=True)

# Access label values
mean_labels = future_mean.fmean
std_labels = future_std.fstd
min_labels = future_min.fmin  # Future minimum % change
max_labels = future_max.fmax  # Future maximum % change
```

### Trend Analysis Labels

```python
# Generate trend-based labels with different modes
trend_binary = vbt.TRENDLB.run(
    close,
    window=20,
    mode='binary'
)

trend_pct = vbt.TRENDLB.run(
    close,
    window=20,
    mode='pct_change',
    min_pct_change=0.02  # 2% minimum change
)

trend_smooth = vbt.TRENDLB.run(
    close,
    window=20,
    mode='binary_cont',
    smooth_window=5
)

# Access trend labels (exposed through the `labels` attribute,
# as with the other label generators in the examples below)
binary_trends = trend_binary.labels
pct_trends = trend_pct.labels
smooth_trends = trend_smooth.labels
```

### Classification Labels for ML

```python
# Binary outcome labels for profitable trades
profitable_trades = vbt.BOLB.run(
    close,
    window=10,  # 10-day forward window
    threshold=0.05,  # 5% profit threshold
    return_type='simple'
)

# Mean reversion labels
mean_reversion = vbt.MEANLB.run(
    close,
    window=20,  # 20-day rolling mean
    threshold=0.02,  # 2% deviation threshold
    above=True  # Label when above mean
)

# Access binary labels
profit_labels = profitable_trades.labels  # True for profitable periods
reversion_labels = mean_reversion.labels  # True when above mean
```

### Multi-Asset Label Generation

```python
# Download multiple assets
symbols = ["AAPL", "GOOGL", "MSFT", "TSLA"]
data = vbt.YFData.download(symbols, start="2020-01-01", end="2023-01-01")
close = data.get("Close")  # DataFrame with one column per symbol

# Generate labels for all assets
future_returns = {}
trend_labels = {}

for symbol in symbols:
    # Future return labels
    future_returns[symbol] = vbt.FMEAN.run(
        close[symbol],
        window=5,
        pct_change=True
    ).fmean

    # Trend labels
    trend_labels[symbol] = vbt.TRENDLB.run(
        close[symbol],
        window=20,
        mode='binary'
    ).labels

# Combine into DataFrames
future_returns_df = pd.DataFrame(future_returns)
trend_labels_df = pd.DataFrame(trend_labels)
```

### Labels for Strategy Development

```python
# `close` is a single-asset pd.Series here, as in the first example
# Generate labels for different time horizons
short_term = vbt.FMAX.run(close, window=5, pct_change=True)   # 5-day max return
medium_term = vbt.FMAX.run(close, window=20, pct_change=True) # 20-day max return
long_term = vbt.FMAX.run(close, window=60, pct_change=True)   # 60-day max return

# Create multi-horizon labels
horizon_labels = pd.DataFrame({
    'short_max': short_term.fmax,
    'medium_max': medium_term.fmax,
    'long_max': long_term.fmax
})

# Classification thresholds
horizon_labels['short_profitable'] = horizon_labels['short_max'] > 0.03
horizon_labels['medium_profitable'] = horizon_labels['medium_max'] > 0.10
horizon_labels['long_profitable'] = horizon_labels['long_max'] > 0.25
```
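
Fixed thresholds such as 3%, 10%, and 25% can produce very unbalanced classes, so it is worth checking the share of positive labels per horizon before training. A minimal, dependency-free sketch follows; the helper `positive_rate` is hypothetical, and `None` stands in for rows without a full future window.

```python
def positive_rate(labels):
    """Share of positive labels, ignoring undefined (None) entries."""
    valid = [bool(x) for x in labels if x is not None]
    return sum(valid) / len(valid) if valid else float("nan")

# Illustrative labels: two positives among five defined rows
example_labels = [True, False, False, None, True, False]
rate = positive_rate(example_labels)  # 0.4
```

In pandas, a comparison such as `NaN > 0.03` evaluates to `False`, so rows at the end of the series without a full future window are counted as negatives unless they are dropped first.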

### Advanced ML Pipeline

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

# `close` is a single-asset pd.Series here, as in the first example

# Generate features (indicators)
ma_20 = vbt.MA.run(close, 20).ma
ma_50 = vbt.MA.run(close, 50).ma
rsi = vbt.RSI.run(close, 14).rsi
macd = vbt.MACD.run(close)

# Create feature matrix
features = pd.DataFrame({
    'ma_ratio': ma_20 / ma_50,
    'rsi': rsi,
    'macd': macd.macd,
    'macd_signal': macd.signal,
    'returns_5d': close.pct_change(5),
    'volatility': close.rolling(20).std()
})

# Generate labels
target = vbt.BOLB.run(
    close,
    window=10,
    threshold=0.05,  # 5% profit in next 10 days
    return_type='simple'
).labels

# Prepare data for ML
X = features.dropna()
y = target.reindex(X.index).dropna()

# Align X and y
common_index = X.index.intersection(y.index)
X = X.loc[common_index]
y = y.loc[common_index]

# Train-test split in chronological order: shuffling would leak
# future information through the overlapping look-ahead labels
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

# Train model
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Evaluate
train_score = model.score(X_train, y_train)
test_score = model.score(X_test, y_test)
print(f"Train Score: {train_score:.3f}")
print(f"Test Score: {test_score:.3f}")
```
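
Because each label looks `window` steps into the future, labels near a train/test boundary overlap in time with the test rows. One common way to reduce this leakage is a chronological split with a gap between the two sets. The sketch below is dependency-free; `chronological_split` and its `gap` parameter are hypothetical names used for illustration and are not part of vectorbt or scikit-learn.

```python
def chronological_split(n_samples, test_size=0.2, gap=0):
    """Return (train_idx, test_idx) positions for an ordered dataset.

    The last `test_size` fraction becomes the test set, and `gap` rows
    immediately before it are dropped from training so that look-ahead
    labels in the training set do not reach into the test period.
    """
    n_test = int(round(n_samples * test_size))
    test_start = n_samples - n_test
    train_end = max(test_start - gap, 0)
    return list(range(train_end)), list(range(test_start, n_samples))

# 100 ordered rows, 20% test, 10-row gap matching a 10-day label window
train_idx, test_idx = chronological_split(100, test_size=0.2, gap=10)
```

With pandas objects, the positions can be applied through `X.iloc[train_idx]` and `y.iloc[train_idx]`, and likewise for the test set.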

### Custom Label Generators

```python
class CustomVolatilityLabel:
    """Custom label for volatility regime classification."""

    @classmethod
    def run(cls, close, short_window=5, long_window=20, threshold=1.5):
        # Calculate short and long-term volatility
        short_vol = close.rolling(short_window).std()
        long_vol = close.rolling(long_window).std()

        # Volatility ratio
        vol_ratio = short_vol / long_vol

        # Classify regime
        labels = pd.Series(0, index=close.index)  # Low volatility
        labels[vol_ratio > threshold] = 1  # High volatility
        labels[vol_ratio > threshold * 1.5] = 2  # Very high volatility

        # Warm-up rows have no valid ratio; mark them as missing
        # instead of silently labeling them low volatility
        labels = labels.where(vol_ratio.notna())

        return labels

# Use custom label generator
vol_labels = CustomVolatilityLabel.run(close)
```
