A fast library for automated machine learning and tuning
npx @tessl/cli install tessl/pypi-flaml@2.3.00
# FLAML

FLAML (Fast Library for Automated Machine Learning and Tuning) is a lightweight Python library that automates machine learning and AI operations while optimizing their performance. It enables building next-generation GPT applications based on multi-agent conversations, provides fast and economical automatic tuning, and quickly finds quality models for common machine learning tasks with minimal effort.

## Package Information

- **Package Name**: FLAML
- **Language**: Python
- **Installation**: `pip install FLAML`

For full AutoML functionality:
```bash
pip install FLAML[automl]
```

For multi-agent conversations:
```bash
pip install FLAML[autogen]
```

## Core Imports

```python
from flaml import AutoML, AutoVW
```

For hyperparameter tuning:
```python
from flaml.tune import run
from flaml.tune.searcher import BlendSearch, CFO, FLOW2
```

For multi-agent conversations:
```python
from flaml.autogen import AssistantAgent, UserProxyAgent, GroupChat
```

For configuration constants:
```python
from flaml.config import N_SPLITS, RANDOM_SEED, MEM_THRES
```

For enhanced estimators:
```python
from flaml.default import LGBMRegressor, XGBClassifier, suggest_hyperparams
```

## Basic Usage

### Automated Machine Learning
```python
from flaml import AutoML
from sklearn.model_selection import train_test_split

# Load your data
X, y = load_data()  # your dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and configure AutoML
automl = AutoML()
automl_settings = {
    "time_budget": 60,  # seconds
    "metric": "accuracy",
    "task": "classification",
    "verbose": 0
}

# Train the model
automl.fit(X_train, y_train, **automl_settings)

# Make predictions
predictions = automl.predict(X_test)
probabilities = automl.predict_proba(X_test)

print(f"Best model: {automl.best_estimator}")
print(f"Best config: {automl.best_config}")
print(f"Accuracy: {automl.score(X_test, y_test)}")
```

### Hyperparameter Tuning
```python
from flaml.tune import run, loguniform, randint
from flaml.tune.searcher import BlendSearch

def train_model(config):
    # Your training function
    model = SomeModel(**config)
    score = model.train_and_evaluate()
    return {"score": score}

# Define search space using flaml.tune domain functions
search_space = {
    "learning_rate": loguniform(0.001, 0.1),
    "n_estimators": randint(10, 100)
}

# Run hyperparameter optimization
analysis = run(
    train_model,
    search_space,
    searcher=BlendSearch(metric="score", mode="max"),
    time_budget_s=300
)

best_config = analysis.best_config
```

### Multi-Agent Conversations
```python
from flaml.autogen import AssistantAgent, UserProxyAgent

# Create agents
assistant = AssistantAgent(
    name="assistant",
    llm_config={"model": "gpt-4", "api_key": "your-api-key"}
)

user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config={"work_dir": "coding"}
)

# Start conversation
user_proxy.initiate_chat(
    assistant,
    message="Help me create a Python function to calculate Fibonacci numbers."
)
```

## Architecture

FLAML consists of four main components:

- **AutoML Engine**: Automated machine learning with intelligent model selection and hyperparameter optimization
- **Hyperparameter Tuning Framework**: Advanced search algorithms (BlendSearch, FLOW2, CFO) for efficient optimization
- **Multi-Agent Framework**: Conversational AI agents for collaborative problem-solving and code generation
- **Online Learning System**: Continuous learning with AutoVW for streaming data scenarios

These components work independently or together, enabling flexible integration into various machine learning workflows from research prototypes to production systems.

## Capabilities

### Automated Machine Learning

Complete automated machine learning pipeline supporting classification, regression, forecasting, ranking, and NLP tasks with intelligent model selection, hyperparameter optimization, and ensemble methods.

```python { .api }
class AutoML:
    def fit(self, X_train, y_train, task="classification", time_budget=60, **kwargs): ...
    def predict(self, X, **kwargs): ...
    def predict_proba(self, X, **kwargs): ...
    def score(self, X, y, **kwargs): ...

    @property
    def best_estimator(self): ...
    @property
    def best_config(self): ...
    @property
    def best_loss(self): ...
```

[Automated Machine Learning](./automl.md)
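
Beyond the classification workflow shown under Basic Usage, the same `fit`/`predict` interface covers the other task types listed above. A minimal sketch for a regression task with ensembling; the `metric="r2"` and `ensemble=True` arguments are assumptions based on FLAML's documented `fit` keywords and may need adjusting for your version:

```python
from flaml import AutoML
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

automl = AutoML()
automl.fit(
    X_train,
    y_train,
    task="regression",   # switch task; the rest of the interface is unchanged
    metric="r2",         # assumed built-in metric name
    time_budget=60,      # seconds
    ensemble=True,       # assumed flag enabling the ensemble methods mentioned above
)

print(f"Best model: {automl.best_estimator}")
print(f"R^2 on held-out data: {automl.score(X_test, y_test)}")
```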

### Hyperparameter Tuning

Advanced hyperparameter optimization with multiple search algorithms, search space definitions, and integration with popular ML frameworks, including Ray Tune compatibility.

```python { .api }
def run(trainable, search_space, searcher=None, time_budget_s=None, **kwargs): ...

class BlendSearch:
    def __init__(self, metric, mode, space=None, **kwargs): ...
    def suggest(self, trial_id): ...
    def on_trial_result(self, trial_id, result): ...

# Search space functions
def uniform(low, high): ...
def loguniform(low, high): ...
def randint(low, high): ...
def choice(categories): ...
```

[Hyperparameter Tuning](./tuning.md)
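
A short sketch that combines the search-space functions and the `searcher` argument declared above. The `CFO` constructor is assumed to mirror `BlendSearch`'s `(metric, mode, ...)` signature, and the objective is a stand-in for your own training code:

```python
from flaml.tune import run, uniform, loguniform, randint, choice
from flaml.tune.searcher import CFO

def evaluate(config):
    # Stand-in objective: replace with real training and evaluation
    score = 1.0 - (config["learning_rate"] - 0.01) ** 2 - 0.001 * config["n_estimators"]
    return {"score": score}

search_space = {
    "learning_rate": loguniform(1e-4, 1e-1),
    "subsample": uniform(0.5, 1.0),
    "n_estimators": randint(10, 200),
    "booster": choice(["gbtree", "dart"]),
}

analysis = run(
    evaluate,
    search_space,
    searcher=CFO(metric="score", mode="max"),
    time_budget_s=60,
)
print(analysis.best_config)
```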

### Multi-Agent Conversations

Framework for building conversational AI applications with multiple agents, supporting code execution, human interaction, group conversations, and integration with various language models.

```python { .api }
class ConversableAgent:
    def __init__(self, name, system_message=None, llm_config=None, **kwargs): ...
    def send(self, message, recipient, request_reply=True): ...
    def receive(self, message, sender, request_reply=None): ...
    def register_reply(self, trigger, reply_func, **kwargs): ...

class AssistantAgent(ConversableAgent): ...
class UserProxyAgent(ConversableAgent): ...

class GroupChat:
    def __init__(self, agents, messages=[], max_round=10): ...
```

[Multi-Agent Conversations](./autogen.md)
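
A sketch of a group conversation using the `GroupChat` class above. It assumes a `GroupChatManager` class to drive the rounds (present in `flaml.autogen` but not listed in this API block), so treat the manager-related lines as illustrative:

```python
from flaml.autogen import AssistantAgent, UserProxyAgent, GroupChat
from flaml.autogen import GroupChatManager  # assumed export alongside GroupChat

llm_config = {"model": "gpt-4", "api_key": "your-api-key"}

coder = AssistantAgent(name="coder", llm_config=llm_config)
reviewer = AssistantAgent(
    name="reviewer",
    system_message="Review the coder's output and suggest improvements.",
    llm_config=llm_config,
)
user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config={"work_dir": "coding"},
)

# Group the agents and let the (assumed) manager orchestrate up to 8 rounds
group_chat = GroupChat(agents=[user_proxy, coder, reviewer], messages=[], max_round=8)
manager = GroupChatManager(groupchat=group_chat, llm_config=llm_config)

user_proxy.initiate_chat(manager, message="Write and review a CSV-parsing utility.")
```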

### Online Learning

Automated online learning system using Vowpal Wabbit with multiple model management, adaptive resource allocation, and real-time model selection for streaming data scenarios.

```python { .api }
class AutoVW:
    def __init__(self, max_live_model_num, search_space, **kwargs): ...
    def predict(self, data_sample): ...
    def learn(self, data_sample): ...
```

[Online Learning](./online-learning.md)
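
A minimal sketch of the predict-then-learn loop implied by the interface above. The `AutoVW.AUTOMATIC` search space and the Vowpal Wabbit text-format samples are assumptions drawn from FLAML's online-learning examples; consult the linked guide for the exact formats (the `vowpalwabbit` package must be installed):

```python
from flaml import AutoVW

# Assumed search space: ask FLAML to explore feature-interaction namespaces automatically
autovw = AutoVW(max_live_model_num=5, search_space={"interactions": AutoVW.AUTOMATIC})

# Toy stream of VW text-format samples: "<label> | <namespace features>"
data_stream = [
    "0.5 |a x1:0.3 x2:0.7",
    "1.2 |a x1:0.9 x2:0.1",
    "0.8 |a x1:0.5 x2:0.4",
]

for data_sample in data_stream:
    prediction = autovw.predict(data_sample)  # predict before consuming the label
    autovw.learn(data_sample)                 # then update the pool of live models
    print(prediction)
```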

### Default Estimators and Hyperparameter Suggestions

Enhanced versions of popular machine learning estimators with optimized hyperparameters and intelligent hyperparameter suggestion functions based on dataset characteristics.

```python { .api }
# Enhanced estimators with optimized hyperparameters
class LGBMClassifier: ...
class LGBMRegressor: ...
class XGBClassifier: ...
class XGBRegressor: ...
class RandomForestClassifier: ...
class RandomForestRegressor: ...
class ExtraTreesClassifier: ...
class ExtraTreesRegressor: ...

# Hyperparameter suggestion functions
def suggest_hyperparams(estimator_name, X, y, task="classification"): ...
def suggest_learner(X, y, task="classification"): ...
def suggest_config(estimator_name, X, y, task="classification", time_budget=60): ...
def flamlize_estimator(estimator_class, task="classification", **kwargs): ...
```

[Default Estimators](./default-estimators.md)
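
A sketch of the two usage patterns: a drop-in estimator that picks data-aware defaults at `fit` time, and a hyperparameter query following the `suggest_hyperparams` signature declared above (argument names and return format may differ by version, so treat the call as illustrative):

```python
from flaml.default import LGBMRegressor, suggest_hyperparams
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Drop-in replacement: same scikit-learn interface as lightgbm's LGBMRegressor,
# but with hyperparameter defaults chosen from the training data's characteristics
model = LGBMRegressor()
model.fit(X_train, y_train)
print(model.score(X_test, y_test))

# Inspect the suggested configuration without training a model
# (call shape follows the API block above)
hyperparams = suggest_hyperparams("lgbm", X_train, y_train, task="regression")
print(hyperparams)
```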

## Search Algorithms

```python { .api }
class BlendSearch:
    """Blended search combining local and global search strategies"""

class CFO:
    """Cost-Frugal Optimization for efficient hyperparameter tuning"""

class FLOW2:
    """Fast local search algorithm with adaptive step sizes"""

class RandomSearch:
    """Random sampling baseline for hyperparameter optimization"""
```
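
The algorithms differ mainly in how they trade search cost against coverage: CFO and FLOW2 search locally from inexpensive configurations, BlendSearch layers a global strategy on top, and RandomSearch is the baseline. A hedged sketch of seeding BlendSearch with a cheap starting configuration; the `low_cost_partial_config` keyword is an assumption drawn from FLAML's tuning documentation rather than the API block above:

```python
from flaml.tune import run, loguniform, randint
from flaml.tune.searcher import BlendSearch

search_space = {
    "learning_rate": loguniform(1e-4, 1e-1),
    "num_leaves": randint(4, 256),
}

searcher = BlendSearch(
    metric="score",
    mode="max",
    space=search_space,
    low_cost_partial_config={"num_leaves": 4},  # assumed keyword: start cheap, expand outward
)

def evaluate(config):
    # Stand-in objective; replace with real training and evaluation
    return {"score": config["learning_rate"] * 10 - abs(config["num_leaves"] - 64) / 256}

analysis = run(evaluate, search_space, searcher=searcher, time_budget_s=30)
print(analysis.best_config)
```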

## Configuration Constants

Default configuration values used throughout FLAML for consistent behavior across different components.

```python { .api }
# Cross-validation and data splitting
N_SPLITS = 5                   # Default number of cross-validation folds
SPLIT_RATIO = 0.1              # Default validation split ratio
CV_HOLDOUT_THRESHOLD = 100000  # Threshold for switching from CV to holdout

# Memory and performance thresholds
MEM_THRES = 4 * (1024**3)      # Memory threshold (4 GB)
SMALL_LARGE_THRES = 10000000   # Threshold for small vs. large datasets
MIN_SAMPLE_TRAIN = 10000       # Minimum samples for training

# Optimization parameters
RANDOM_SEED = 1                # Default random seed
SAMPLE_MULTIPLY_FACTOR = 4     # Sample multiplication factor
SEARCH_THREAD_EPS = 1.0        # Search thread epsilon
PENALTY = 1e10                 # Penalty term for constraints
```
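
These constants can be imported directly (as shown in Core Imports) and reused to keep your own preprocessing consistent with FLAML's defaults. A small sketch using two of them with scikit-learn:

```python
from flaml.config import N_SPLITS, RANDOM_SEED
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, cross_val_score

X, y = load_iris(return_X_y=True)

# Mirror FLAML's default fold count and seed in a manual cross-validation run
cv = KFold(n_splits=N_SPLITS, shuffle=True, random_state=RANDOM_SEED)
scores = cross_val_score(RandomForestClassifier(random_state=RANDOM_SEED), X, y, cv=cv)
print(scores.mean())
```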

## Utility Functions

Additional utility functions available in FLAML modules.

```python { .api }
# AutoML utilities
def size(X):
    """
    Calculate memory size of dataset.

    Args:
        X: Dataset or array-like object

    Returns:
        int: Memory size in bytes
    """

# Tune utilities
INCUMBENT_RESULT = "INCUMBENT_RESULT"  # Constant for incumbent results

class Trial:
    """Trial management class for hyperparameter tuning experiments."""
    def __init__(self, config, trial_id=None): ...
    @property
    def config(self): ...
    @property
    def trial_id(self): ...

# AutoGen model constants
DEFAULT_MODEL = "gpt-4"        # Default language model
FAST_MODEL = "gpt-3.5-turbo"   # Fast language model
```
## Integration Features
315
316
FLAML integrates seamlessly with the Python machine learning ecosystem:
317
318
- **scikit-learn**: Compatible estimator interface with fit/predict methods
319
- **XGBoost & LightGBM**: Enhanced versions with optimized hyperparameters
320
- **Ray Tune**: Distributed hyperparameter tuning support
321
- **MLflow**: Experiment tracking and model logging
322
- **Spark**: Distributed training for large datasets
323
- **OpenAI**: Language model integration for conversational agents
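
As an example of the scikit-learn compatibility, `AutoML` can be used as a step inside a standard `Pipeline`. This is a minimal sketch; the `automl__`-prefixed fit parameters rely on scikit-learn's standard routing of fit parameters to named steps, and the exact settings accepted by `fit` are documented in the AutoML guide above:

```python
from flaml import AutoML
from sklearn.datasets import load_iris
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("automl", AutoML()),
])

# Fit parameters prefixed with "automl__" are routed to the AutoML step
pipeline.fit(X, y, automl__time_budget=30, automl__task="classification", automl__metric="accuracy")
print(pipeline.predict(X[:5]))
```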