Tessl Tile for pypi/flaml@2.3.0

# Online Learning

The online learning module automates online learning on top of Vowpal Wabbit (VW). It manages multiple models at once, allocates resources adaptively, and selects the best-performing model in real time. It is designed for streaming scenarios in which models must continuously adapt to new data.

## Capabilities

### AutoVW Class

The main class for automated online learning with Vowpal Wabbit. It maintains several models simultaneously and dynamically selects the best performer.

```python { .api }
class AutoVW:
    def __init__(self, max_live_model_num, search_space, init_config={},
                 min_resource_lease="auto", automl_runner_args={}, scheduler_args={},
                 model_select_policy="threshold_loss_ucb", metric="mae_clipped",
                 random_seed=None, model_selection_mode="min", cb_coef=None):
        """
        Initialize AutoVW for automated online learning.

        Args:
            max_live_model_num (int): Maximum number of 'live' models to maintain
            search_space (dict): Hyperparameter search space including both tunable
                                and fixed hyperparameters
            init_config (dict): Initial partial or full configuration
            min_resource_lease (str or float): Minimum resource lease for models ('auto' or float)
            automl_runner_args (dict): Configuration for OnlineTrialRunner
            scheduler_args (dict): Configuration for scheduler
            model_select_policy (str): Model selection policy ('threshold_loss_ucb', etc.)
            metric (str): Loss function metric ('mae_clipped', 'mae', 'mse', 'absolute_loss')
            random_seed (int): Random seed for reproducibility
            model_selection_mode (str): Optimization mode ('min' or 'max')
            cb_coef (float): Sample complexity bound coefficient
        """

    def predict(self, data_sample):
        """
        Make prediction on a data sample.

        Args:
            data_sample: Input data sample in VW format or structured format

        Returns:
            Prediction value from the selected model
        """

    def learn(self, data_sample):
        """
        Update models with new data sample.

        Args:
            data_sample: Training data sample with features and label
        """
```
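Calling `predict` before `learn` on each sample yields a progressive validation loss: every sample is scored before the model has trained on it. The sketch below illustrates that ordering with a hypothetical stand-in model. `RunningMeanModel` and `progressive_validation_mae` are illustrative names, not part of FLAML, and the code assumes only a `predict`/`learn` interface and that the label is the leading token of a VW-format line.

```python
class RunningMeanModel:
    """Hypothetical stand-in with predict/learn; predicts the mean label seen so far."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def predict(self, data_sample):
        return self.total / self.count if self.count else 0.0

    def learn(self, data_sample):
        # The label is everything before the first "|" in a VW text line.
        label = float(data_sample.split("|", 1)[0])
        self.total += label
        self.count += 1


def progressive_validation_mae(model, stream):
    """Score each sample before the model trains on it; return the mean absolute error."""
    abs_error_sum = 0.0
    n = 0
    for sample in stream:
        label = float(sample.split("|", 1)[0])
        abs_error_sum += abs(model.predict(sample) - label)  # predict first
        model.learn(sample)  # then update
        n += 1
    return abs_error_sum / n if n else 0.0


stream = ["1.0 |a x:0.5", "3.0 |a x:0.7", "2.0 |a x:0.6"]
mae = progressive_validation_mae(RunningMeanModel(), stream)
```

Swapping the stand-in for any object with the same two methods leaves the loop unchanged.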

### Class Constants

```python { .api }
class AutoVW:
    WARMSTART_NUM = 100  # Number of warmstart samples
    AUTOMATIC = "_auto"  # Automatic configuration identifier
    VW_INTERACTION_ARG_NAME = "interactions"  # VW interactions argument name
```

### Supporting Classes

#### VowpalWabbitTrial

Individual Vowpal Wabbit trial representing a single model configuration.

```python { .api }
class VowpalWabbitTrial:
    """
    Individual VW model trial in online learning system.
    Manages a single VW model instance with specific hyperparameters.
    """
```

#### OnlineTrialRunner

Manages execution and coordination of multiple online learning trials.

```python { .api }
class OnlineTrialRunner:
    """
    Manages execution of online learning trials.
    Coordinates multiple VW models and handles resource allocation.
    """
```

### Utility Functions

```python { .api }
def get_ns_feature_dim_from_vw_example(vw_example):
    """
    Extract namespace feature dimensions from VW example.

    Args:
        vw_example (str): Vowpal Wabbit format example string

    Returns:
        dict: Dictionary mapping namespace to feature dimensions
    """
```
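To make the idea of per-namespace feature dimensions concrete, here is a minimal pure-Python sketch that counts features per namespace in a single-line VW text example. It is not FLAML's implementation: `count_features_per_namespace` is a hypothetical name, and the parsing assumes the common `label |namespace feature:value ...` layout, where features after a bare `| ` fall into the default (unnamed) namespace.

```python
def count_features_per_namespace(vw_line):
    """Count features per namespace in a single-line VW text example (illustrative)."""
    dims = {}
    # Everything before the first "|" is the label section; each later
    # "|"-delimited section holds one namespace and its features.
    for section in vw_line.split("|")[1:]:
        tokens = section.split()
        if not tokens:
            continue
        if section[0].isspace():
            # "| x:1 y:2": features belong to the default (unnamed) namespace.
            namespace, features = "", tokens
        else:
            # "|ns:0.5 x:1": drop the optional namespace scaling value.
            namespace, features = tokens[0].split(":")[0], tokens[1:]
        dims[namespace] = dims.get(namespace, 0) + len(features)
    return dims


dims = count_features_per_namespace("1 |a x:0.5 y:1 |b z:2")
```

Here `dims` maps namespace `a` to two features and namespace `b` to one.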

### Usage Examples

#### Basic Online Learning Setup
```python
from flaml import AutoVW

# Define search space for hyperparameters
search_space = {
    "learning_rate": {"_type": "loguniform", "_value": [0.001, 1.0]},
    "l1": {"_type": "loguniform", "_value": [1e-10, 1.0]},
    "l2": {"_type": "loguniform", "_value": [1e-10, 1.0]},
    "interactions": {"_type": "choice", "_value": [set(), {"ab"}, {"ac"}, {"ab", "ac"}]}
}

# Initialize AutoVW
autovw = AutoVW(
    max_live_model_num=5,
    search_space=search_space,
    init_config={"learning_rate": 0.1},
    metric="mae_clipped",
    random_seed=42
)

# Placeholder stream of VW-format examples; replace with your own data source.
# Namespaces a, b, and c match the interaction terms in the search space.
streaming_data = [
    "1.0 |a x:0.5 |b y:1.2 |c z:0.3",
    "0.0 |a x:0.1 |b y:0.7 |c z:0.9",
]

# Simulate streaming data
for i, data_sample in enumerate(streaming_data):
    # Make prediction
    prediction = autovw.predict(data_sample)

    # Update models with new sample
    autovw.learn(data_sample)

    if i % 1000 == 0:
        print(f"Processed {i} samples, latest prediction: {prediction}")
```

#### Advanced Configuration with Custom Policies
```python
from flaml import AutoVW

# Advanced search space with multiple hyperparameters
search_space = {
    "learning_rate": {"_type": "loguniform", "_value": [0.0001, 1.0]},
    "power_t": {"_type": "uniform", "_value": [0.0, 1.0]},
    "l1": {"_type": "loguniform", "_value": [1e-10, 1.0]},
    "l2": {"_type": "loguniform", "_value": [1e-10, 1.0]},
    "interactions": {"_type": "choice", "_value": [
        set(), {"ab"}, {"ac"}, {"bc"}, {"ab", "ac"}, {"ab", "bc"}, {"ac", "bc"}
    ]},
    "bit_precision": {"_type": "choice", "_value": [18, 20, 22, 24]}
}

# Custom runner and scheduler arguments
automl_runner_args = {
    "champion_test_policy": "loss_ucb",
    "remove_worse": True
}

scheduler_args = {
    "resource_dimension": "sample_size",
    "max_resource": 10000,
    "reduction_factor": 2
}

# Initialize with advanced configuration
autovw = AutoVW(
    max_live_model_num=10,
    search_space=search_space,
    init_config={"learning_rate": 0.05, "l1": 1e-6},
    min_resource_lease=100,
    automl_runner_args=automl_runner_args,
    scheduler_args=scheduler_args,
    model_select_policy="threshold_loss_ucb",
    metric="mae",  # Mean absolute error
    cb_coef=0.1,  # Confidence bound coefficient
    random_seed=123
)
```

#### Integration with Data Streams
```python
import pandas as pd
from flaml import AutoVW

# Search space for regression task
search_space = {
    "learning_rate": {"_type": "loguniform", "_value": [0.001, 0.5]},
    "l1": {"_type": "loguniform", "_value": [1e-8, 0.1]},
    "l2": {"_type": "loguniform", "_value": [1e-8, 0.1]}
}

autovw = AutoVW(
    max_live_model_num=3,
    search_space=search_space,
    metric="mse",
    model_selection_mode="min"
)

# Process streaming CSV data
def process_csv_stream(csv_file):
    for chunk in pd.read_csv(csv_file, chunksize=1000):
        for _, row in chunk.iterrows():
            # Convert to VW format: label |features feature1:value1 feature2:value2
            vw_sample = f"{row['target']} |features "
            vw_sample += " ".join([f"{col}:{row[col]}" for col in chunk.columns if col != 'target'])

            # Get prediction before updating
            pred = autovw.predict(vw_sample)

            # Update model
            autovw.learn(vw_sample)

            yield pred, row['target']

# Use with streaming data
predictions_and_actuals = list(process_csv_stream("streaming_data.csv"))
```
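The CSV conversion above interpolates column names directly into the VW line. VW's text format uses spaces, `:`, and `|` as delimiters, so a column name containing any of them would be parsed incorrectly. The sketch below shows one way to guard against that. The helper names `sanitize_feature_name` and `to_vw_line` are hypothetical, and replacing the delimiters with underscores is an assumed convention, not something FLAML or VW requires.

```python
def sanitize_feature_name(name):
    """Replace characters that VW's text format treats as delimiters."""
    for ch in (" ", ":", "|"):
        name = name.replace(ch, "_")
    return name


def to_vw_line(label, features, namespace="features"):
    """Build 'label |namespace name:value ...' from a mapping of numeric features."""
    parts = [
        f"{sanitize_feature_name(str(name))}:{value}"
        for name, value in features.items()
    ]
    return f"{label} |{namespace} " + " ".join(parts)


line = to_vw_line(3.5, {"sq ft": 1200, "zip:code": 98101})
```

In the CSV loop, the row could be passed as a dict of its non-target columns, keeping the rest of the example unchanged.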

#### Multi-Class Classification Online Learning
```python
from flaml import AutoVW

# Search space for multi-class classification
search_space = {
    "learning_rate": {"_type": "loguniform", "_value": [0.01, 1.0]},
    "oaa": {"_type": "choice", "_value": [3, 5, 10]},  # One-Against-All classes
    "loss_function": {"_type": "choice", "_value": ["logistic", "hinge"]}
}

# Initialize for classification
autovw_classifier = AutoVW(
    max_live_model_num=4,
    search_space=search_space,
    init_config={"oaa": 3},
    metric="absolute_loss",
    random_seed=456
)

# Example with categorical features
def create_vw_multiclass_sample(features, label):
    """Convert features to VW multi-class format."""
    vw_line = f"{label} |features "

    for key, value in features.items():
        if isinstance(value, str):
            # Categorical feature
            vw_line += f"{key}_{value}:1 "
        else:
            # Numerical feature
            vw_line += f"{key}:{value} "

    return vw_line.strip()

# Process multi-class data
sample_features = {"age": 25, "category": "A", "score": 0.8}
sample_label = 2  # Class label

vw_sample = create_vw_multiclass_sample(sample_features, sample_label)
prediction = autovw_classifier.predict(vw_sample)
autovw_classifier.learn(vw_sample)
```

#### Contextual Bandit Learning
```python
from flaml import AutoVW

# Search space for contextual bandits
search_space = {
    "learning_rate": {"_type": "loguniform", "_value": [0.001, 0.1]},
    "cb_explore_adf": {"_type": "choice", "_value": [True]},
    "epsilon": {"_type": "uniform", "_value": [0.01, 0.3]}
}

# Initialize for contextual bandit
autovw_cb = AutoVW(
    max_live_model_num=5,
    search_space=search_space,
    metric="cb_loss",
    model_selection_mode="min"
)

def create_cb_sample(context, action, cost, probability):
    """Create contextual bandit VW format sample."""
    # VW contextual bandit label format: action:cost:probability |context features
    vw_line = f"{action}:{cost}:{probability} |context "
    vw_line += " ".join([f"{k}:{v}" for k, v in context.items()])
    return vw_line

# Example contextual bandit interaction
context = {"user_age": 30, "day_of_week": 2, "weather": 1}
action = 1  # Action taken
cost = 0.5  # Cost observed (lower is better)
probability = 0.2  # Probability of taking this action

cb_sample = create_cb_sample(context, action, cost, probability)
autovw_cb.learn(cb_sample)

# For prediction, provide the context only (no action:cost:probability label)
prediction_context = "|context user_age:25 day_of_week:3 weather:0"
predicted_action = autovw_cb.predict(prediction_context)
```

## Model Selection Policies

### Available Policies
- **threshold_loss_ucb**: Threshold-based selection with upper confidence bounds
- **loss_ucb**: Loss-based selection with confidence bounds
- **min_loss**: Select model with minimum observed loss
- **random**: Random model selection (baseline)
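As a rough illustration of confidence-bound selection, the sketch below picks the model whose upper confidence bound on loss is lowest, so a model with little evidence is not preferred over a well-tested one just because its observed loss is slightly lower. This is a generic rule, not FLAML's formula: the function name `select_by_loss_ucb` is hypothetical, and the bound width `cb_coef / sqrt(n)` is an assumption chosen for simplicity.

```python
import math


def select_by_loss_ucb(stats, cb_coef=0.1):
    """Return the model name with the lowest upper confidence bound on loss.

    stats maps a model name to (mean_loss, n_samples).
    """

    def upper_bound(name):
        mean_loss, n_samples = stats[name]
        if n_samples == 0:
            return math.inf  # no evidence yet, so never preferred
        # Assumed bound: the margin shrinks as more samples are observed.
        return mean_loss + cb_coef / math.sqrt(n_samples)

    return min(stats, key=upper_bound)


stats = {"model_a": (0.30, 1000), "model_b": (0.25, 4)}
cautious = select_by_loss_ucb(stats, cb_coef=1.0)
greedy = select_by_loss_ucb(stats, cb_coef=0.0)
```

With a large coefficient the well-tested `model_a` wins; with a coefficient of zero the rule reduces to picking the minimum observed loss, which selects `model_b`.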

### Metrics
- **mae_clipped**: Mean absolute error with clipping
- **mae**: Mean absolute error
- **mse**: Mean squared error
- **absolute_loss**: Absolute loss (for classification)
- **squared_loss**: Squared loss
- **cb_loss**: Contextual bandit loss
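The difference between a plain and a clipped mean absolute error can be sketched as follows. This is one plausible reading of "with clipping", stated as an assumption: each prediction is clamped to the range of labels observed so far before the error is computed, which limits the damage from wildly out-of-range early predictions. `ClippedMAE` is a hypothetical helper, and FLAML's exact clipping rule may differ.

```python
class ClippedMAE:
    """Running mean absolute error with predictions clamped to the observed label range."""

    def __init__(self):
        self.low = None
        self.high = None
        self.abs_error_sum = 0.0
        self.count = 0

    def update(self, prediction, label):
        # Clamp using labels seen on earlier samples only; the first
        # prediction is left unclipped because no range is known yet.
        if self.count:
            prediction = min(max(prediction, self.low), self.high)
        self.abs_error_sum += abs(prediction - label)
        self.count += 1
        self.low = label if self.low is None else min(self.low, label)
        self.high = label if self.high is None else max(self.high, label)
        return self.abs_error_sum / self.count


metric = ClippedMAE()
history = [metric.update(p, y) for p, y in [(5.0, 1.0), (10.0, 3.0), (2.0, 2.0)]]
```

The second prediction of 10.0 is clamped to 1.0, the only label seen at that point, so its error is 2.0 rather than 7.0.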

### Advanced Trial Management

Lower-level components for managing individual Vowpal Wabbit trials and online trial execution.

```python { .api }
class VowpalWabbitTrial:
    """Individual Vowpal Wabbit trial with specific hyperparameters."""

    def __init__(self, config, trial_id=None):
        """
        Initialize VW trial.

        Args:
            config (dict): VW hyperparameter configuration
            trial_id (str): Unique trial identifier
        """

    def train_eval(self, data_sample, eval_only=False):
        """
        Train and/or evaluate on data sample.

        Args:
            data_sample (str): VW-formatted data sample
            eval_only (bool): Only evaluate without training

        Returns:
            dict: Performance metrics
        """

    def predict(self, data_sample):
        """Make prediction on data sample."""

    @property
    def config(self):
        """dict: Trial configuration"""

    @property
    def trial_id(self):
        """str: Trial identifier"""

class OnlineTrialRunner:
    """Manager for running multiple online learning trials."""

    def __init__(self, search_space, max_live_model_num=5, **kwargs):
        """
        Initialize online trial runner.

        Args:
            search_space (dict): Hyperparameter search space
            max_live_model_num (int): Maximum concurrent models
            **kwargs: Additional configuration
        """

    def step(self, data_sample):
        """
        Process one data sample across all active trials.

        Args:
            data_sample (str): VW-formatted data sample

        Returns:
            dict: Aggregated results from all trials
        """

    def get_best_trial(self):
        """Get currently best performing trial."""

    def suggest_trial(self):
        """Suggest new trial configuration."""

    def remove_trial(self, trial_id):
        """Remove trial from active set."""
```
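The coordination pattern described here (every live trial sees each sample, while predictions come from the current best trial) can be sketched with stand-in classes. `ConstantTrial` and `LivePool` are hypothetical and far simpler than the real trial and runner classes: there is no searcher, scheduler, or resource lease, and the best trial is simply the one with the lowest running mean absolute error.

```python
class ConstantTrial:
    """Hypothetical trial that always predicts a fixed value and tracks its own loss."""

    def __init__(self, trial_id, value):
        self.trial_id = trial_id
        self.value = value
        self.abs_error_sum = 0.0
        self.count = 0

    def predict(self, data_sample):
        return self.value

    def train_eval(self, data_sample):
        # A constant model has nothing to train; only the loss is updated.
        label = float(data_sample.split("|", 1)[0])
        self.abs_error_sum += abs(self.value - label)
        self.count += 1

    @property
    def mean_loss(self):
        return self.abs_error_sum / self.count if self.count else float("inf")


class LivePool:
    """Keeps at most max_live_model_num trials and predicts with the best one."""

    def __init__(self, trials, max_live_model_num):
        self.trials = list(trials)[:max_live_model_num]

    def best_trial(self):
        return min(self.trials, key=lambda trial: trial.mean_loss)

    def step(self, data_sample):
        # Predict with the current best trial, then update every live trial.
        prediction = self.best_trial().predict(data_sample)
        for trial in self.trials:
            trial.train_eval(data_sample)
        return prediction


pool = LivePool(
    [ConstantTrial("low", 0.0), ConstantTrial("high", 2.0)],
    max_live_model_num=2,
)
predictions = [pool.step(s) for s in ["2.0 |a x:1", "2.0 |a x:2", "1.5 |a x:3"]]
```

After the first sample, the pool switches from the `low` trial to the `high` trial because its running loss is lower.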

## Integration Features

- **Vowpal Wabbit Backend**: Leverages VW's efficient online learning algorithms
- **Multi-Model Management**: Maintains multiple models with different hyperparameters
- **Adaptive Selection**: Dynamic model selection based on performance
- **Resource Management**: Intelligent allocation of computational resources
- **Streaming Data Support**: Designed for continuous data streams
- **Multiple Task Support**: Regression, classification, contextual bandits
- **Hyperparameter Optimization**: Automated search over hyperparameter space
