tessl/pypi-flaml

A fast library for automated machine learning and tuning

Workspace: tessl
Visibility: Public
Describes: pkg:pypi/flaml@2.3.x

To install, run

npx @tessl/cli install tessl/pypi-flaml@2.3.0

FLAML

FLAML (Fast Library for Automated Machine Learning and Tuning) is a lightweight Python library that automates machine learning and AI operations while optimizing their performance. It enables building next-generation GPT applications based on multi-agent conversations, provides fast and economical automatic tuning, and quickly finds quality models for common machine learning tasks with minimal effort.

Package Information

  • Package Name: FLAML
  • Language: Python
  • Installation: pip install FLAML

For full AutoML functionality:

pip install "flaml[automl]"

For multi-agent conversations:

pip install "flaml[autogen]"

Core Imports

from flaml import AutoML, AutoVW

For hyperparameter tuning:

from flaml.tune import run
from flaml.tune.searcher import BlendSearch, CFO, FLOW2

For multi-agent conversations:

from flaml.autogen import AssistantAgent, UserProxyAgent, GroupChat

For configuration constants:

from flaml.config import N_SPLITS, RANDOM_SEED, MEM_THRES

For enhanced estimators:

from flaml.default import LGBMRegressor, XGBClassifier, suggest_hyperparams

Basic Usage

Automated Machine Learning

from flaml import AutoML
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load a sample dataset (replace with your own features X and labels y)
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and configure AutoML
automl = AutoML()
automl_settings = {
    "time_budget": 60,  # seconds
    "metric": "accuracy",
    "task": "classification",
    "verbose": 0
}

# Train the model
automl.fit(X_train, y_train, **automl_settings)

# Make predictions
predictions = automl.predict(X_test)
probabilities = automl.predict_proba(X_test)

print(f"Best model: {automl.best_estimator}")
print(f"Best config: {automl.best_config}")
print(f"Accuracy: {automl.score(X_test, y_test)}")

Hyperparameter Tuning

from flaml import tune
from flaml.tune.searcher import BlendSearch

def train_model(config):
    # Your training function; SomeModel is a placeholder for your own model class
    model = SomeModel(**config)
    score = model.train_and_evaluate()
    return {"score": score}

# Define search space with FLAML's sampling functions
search_space = {
    "learning_rate": tune.loguniform(0.001, 0.1),
    "n_estimators": tune.randint(10, 100),
}

# Run hyperparameter optimization
# BlendSearch needs optuna: pip install "flaml[blendsearch]"
analysis = tune.run(
    train_model,
    search_alg=BlendSearch(metric="score", mode="max", space=search_space),
    time_budget_s=300,
    num_samples=-1,  # no cap on the number of trials; stop on the time budget
)

best_config = analysis.best_config

Multi-Agent Conversations

from flaml.autogen import AssistantAgent, UserProxyAgent

# Create agents
assistant = AssistantAgent(
    name="assistant",
    llm_config={"model": "gpt-4", "api_key": "your-api-key"}
)

user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config={"work_dir": "coding"}
)

# Start conversation
user_proxy.initiate_chat(
    assistant,
    message="Help me create a Python function to calculate fibonacci numbers."
)

Architecture

FLAML consists of four main components:

  • AutoML Engine: Automated machine learning with intelligent model selection and hyperparameter optimization
  • Hyperparameter Tuning Framework: Advanced search algorithms (BlendSearch, FLOW2, CFO) for efficient optimization
  • Multi-Agent Framework: Conversational AI agents for collaborative problem-solving and code generation
  • Online Learning System: Continuous learning with AutoVW for streaming data scenarios

These components work independently or together, enabling flexible integration into various machine learning workflows from research prototypes to production systems.

Capabilities

Automated Machine Learning

Complete automated machine learning pipeline supporting classification, regression, forecasting, ranking, and NLP tasks with intelligent model selection, hyperparameter optimization, and ensemble methods.

class AutoML:
    # Abridged; arguments left as None fall back to the settings given to AutoML()
    def fit(self, X_train=None, y_train=None, task=None, time_budget=None, **kwargs): ...
    def predict(self, X, **kwargs): ...
    def predict_proba(self, X, **kwargs): ...
    def score(self, X, y, **kwargs): ...
    
    @property
    def best_estimator(self): ...
    @property  
    def best_config(self): ...
    @property
    def best_loss(self): ...

Hyperparameter Tuning

Hyperparameter optimization with multiple search algorithms, search space definitions, and an API compatible with Ray Tune.

# Abridged signature; pass everything after evaluation_function by keyword
def run(evaluation_function, config=None, metric=None, mode=None,
        time_budget_s=None, search_alg=None, num_samples=1, **ray_args): ...

class BlendSearch:
    def __init__(self, metric=None, mode=None, space=None, **kwargs): ...
    def suggest(self, trial_id): ...
    def on_trial_result(self, trial_id, result): ...

# Search space functions
def uniform(lower, upper): ...
def loguniform(lower, upper, base=10): ...
def randint(lower, upper): ...
def choice(categories): ...
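
To make the difference between uniform and log-uniform sampling concrete, here is a minimal standard-library sketch of the idea. The helpers `sample_uniform` and `sample_loguniform` are hypothetical names used only for illustration; they are not FLAML functions, and FLAML's own search space functions describe a domain to sample from rather than returning a number directly.

```python
import math
import random

def sample_uniform(rng, low, high):
    """Draw a value with equal density everywhere in [low, high]."""
    return rng.uniform(low, high)

def sample_loguniform(rng, low, high):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))

rng = random.Random(0)
uniform_draws = [sample_uniform(rng, 0.001, 0.1) for _ in range(10_000)]
log_draws = [sample_loguniform(rng, 0.001, 0.1) for _ in range(10_000)]

# Fraction of draws below 0.01, the geometric midpoint of [0.001, 0.1]
frac_uniform = sum(v < 0.01 for v in uniform_draws) / len(uniform_draws)
frac_log = sum(v < 0.01 for v in log_draws) / len(log_draws)
print(f"uniform: {frac_uniform:.2f} below 0.01, loguniform: {frac_log:.2f} below 0.01")
```

Roughly half of the log-uniform draws land below 0.01, while only about a tenth of the uniform draws do. That is why a log scale is the usual choice for parameters such as a learning rate, where each order of magnitude deserves similar attention.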

Multi-Agent Conversations

Framework for building conversational AI applications with multiple agents, supporting code execution, human interaction, group conversations, and integration with various language models.

class ConversableAgent:
    def __init__(self, name, system_message=None, llm_config=None, **kwargs): ...
    def send(self, message, recipient, request_reply=None): ...
    def receive(self, message, sender, request_reply=None): ...
    def register_reply(self, trigger, reply_func, **kwargs): ...

class AssistantAgent(ConversableAgent): ...
class UserProxyAgent(ConversableAgent): ...

class GroupChat:
    # messages is the shared message list; pass a fresh list for each chat
    def __init__(self, agents, messages, max_round=10): ...
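
The send/receive/register_reply pattern outlined above can be illustrated with a toy, standard-library-only sketch. `ToyAgent` is a hypothetical class written for this example; it is not `flaml.autogen` code, it calls no language model, and its reply functions take only the message text, which is a simplification of the real interface.

```python
class ToyAgent:
    """Minimal stand-in for a conversable agent (illustration only)."""

    def __init__(self, name, max_turns=3):
        self.name = name
        self.max_turns = max_turns
        self.history = []        # (sender_name, message) pairs received
        self._reply_funcs = []   # (trigger, reply_func) pairs

    def register_reply(self, trigger, reply_func):
        """Use reply_func for messages whose sender satisfies trigger(sender)."""
        self._reply_funcs.append((trigger, reply_func))

    def send(self, message, recipient, request_reply=True):
        """Deliver a message to another agent."""
        recipient.receive(message, self, request_reply)

    def receive(self, message, sender, request_reply=True):
        """Record the message and, if allowed, answer the sender."""
        self.history.append((sender.name, message))
        if not request_reply or len(self.history) >= self.max_turns:
            return  # stop the exchange
        for trigger, reply_func in self._reply_funcs:
            if trigger(sender):
                self.send(reply_func(message), sender)
                return

assistant = ToyAgent("assistant")
user_proxy = ToyAgent("user_proxy")
assistant.register_reply(lambda sender: True, lambda msg: f"echo: {msg}")
user_proxy.register_reply(lambda sender: True, lambda msg: "thanks")

user_proxy.send("hello", assistant)
print(assistant.history)
print(user_proxy.history)
```

The exchange ends when one agent has received `max_turns` messages, which plays a role loosely similar to a round limit in a real conversation framework.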

Online Learning

Automated online learning built on Vowpal Wabbit. AutoVW keeps several models live at once, allocates resources among them adaptively, and selects the serving model in real time for streaming data.

class AutoVW:
    def __init__(self, max_live_model_num, search_space, **kwargs): ...
    def predict(self, data_sample): ...
    def learn(self, data_sample): ...
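
The predict-then-learn loop and the idea of keeping several live models can be sketched without any dependencies. The example below is an assumption-laden illustration, not AutoVW or Vowpal Wabbit: `OnlineLinearModel` is a hypothetical one-feature SGD regressor, the stream is synthetic, and "model selection" is simply picking the lowest cumulative squared error.

```python
import random

class OnlineLinearModel:
    """Tiny SGD regressor for y = w*x + b, updated one sample at a time."""

    def __init__(self, learning_rate):
        self.learning_rate = learning_rate
        self.w = 0.0
        self.b = 0.0
        self.cumulative_loss = 0.0

    def predict(self, x):
        return self.w * x + self.b

    def learn(self, x, y):
        error = self.predict(x) - y
        self.cumulative_loss += error ** 2  # scored before the update
        self.w -= self.learning_rate * error * x
        self.b -= self.learning_rate * error

rng = random.Random(0)
# Several live models that differ only in one hyperparameter
live_models = [OnlineLinearModel(lr) for lr in (0.001, 0.01, 0.1)]

for _ in range(2000):
    x = rng.uniform(-1.0, 1.0)
    y = 2.0 * x + 0.5 + rng.gauss(0.0, 0.05)  # synthetic stream
    # The best model so far makes the prediction that would be served
    champion = min(live_models, key=lambda m: m.cumulative_loss)
    served_prediction = champion.predict(x)
    # Every live model then learns from the revealed label
    for model in live_models:
        model.learn(x, y)

best = min(live_models, key=lambda m: m.cumulative_loss)
print(f"selected learning rate: {best.learning_rate}, w={best.w:.2f}, b={best.b:.2f}")
```

Each sample is scored before the model updates on it, so the cumulative loss is an honest estimate of streaming performance and can be used to choose among the live models at any time.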

Default Estimators and Hyperparameter Suggestions

Enhanced versions of popular machine learning estimators with optimized hyperparameters and intelligent hyperparameter suggestion functions based on dataset characteristics.

# Enhanced estimators with optimized hyperparameters
class LGBMClassifier: ...
class LGBMRegressor: ...
class XGBClassifier: ...
class XGBRegressor: ...
class RandomForestClassifier: ...
class RandomForestRegressor: ...
class ExtraTreesClassifier: ...
class ExtraTreesRegressor: ...

# Hyperparameter suggestion functions
def suggest_hyperparams(task, X, y, estimator_or_predictor, location=None): ...
def suggest_learner(task, X, y, estimator_or_predictor="all", estimator_list=None, location=None): ...
def suggest_config(task, X, y, estimator_or_predictor, location=None, k=None): ...
def flamlize_estimator(super_class, name, task, alternatives=None): ...

Search Algorithms

class BlendSearch:
    """Blended search combining local and global search strategies"""

class CFO:
    """Cost-Frugal Optimization for efficient hyperparameter tuning"""

class FLOW2:
    """Fast local search algorithm with adaptive step sizes"""

class RandomSearch:
    """Random sampling baseline for hyperparameter optimization"""
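
To give a feel for the difference between a random-sampling baseline and a local search that adapts its step size, here is a simplified standard-library sketch. It is not FLAML's implementation of any of the classes above: `objective`, `random_search`, and `local_search` are hypothetical names, and the shrink-on-repeated-failure rule is an assumption chosen for brevity.

```python
import math
import random

def objective(config):
    """Toy loss to minimize; its minimum (0.0) is at x=3, y=-1."""
    return (config["x"] - 3.0) ** 2 + (config["y"] + 1.0) ** 2

def random_search(rng, n_trials, low=-10.0, high=10.0):
    """Baseline: sample every config independently and keep the best."""
    best_config, best_loss = None, math.inf
    for _ in range(n_trials):
        config = {"x": rng.uniform(low, high), "y": rng.uniform(low, high)}
        loss = objective(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss

def local_search(rng, n_trials, start, step=2.0, shrink=0.5, patience=4):
    """Move along random directions; shrink the step after repeated failures."""
    best_config, best_loss = dict(start), objective(start)
    failures = 0
    for _ in range(n_trials - 1):  # one evaluation was spent on the start point
        angle = rng.uniform(0.0, 2.0 * math.pi)
        candidate = {
            "x": best_config["x"] + step * math.cos(angle),
            "y": best_config["y"] + step * math.sin(angle),
        }
        loss = objective(candidate)
        if loss < best_loss:
            best_config, best_loss, failures = candidate, loss, 0
        else:
            failures += 1
            if failures >= patience:
                step, failures = step * shrink, 0
    return best_config, best_loss

rng = random.Random(0)
start_point = {"x": -8.0, "y": 8.0}
_, random_loss = random_search(rng, n_trials=200)
_, local_loss = local_search(rng, n_trials=200, start=start_point)
print(f"random search loss: {random_loss:.4f}, local search loss: {local_loss:.6f}")
```

Random search never uses what earlier trials revealed, whereas the local search spends every trial near the best point found so far and narrows its steps as improvements become harder to find.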

Configuration Constants

Default configuration values used throughout FLAML for consistent behavior across different components.

# Cross-validation and data splitting
N_SPLITS = 5                    # Default number of cross-validation folds
SPLIT_RATIO = 0.1              # Default validation split ratio
CV_HOLDOUT_THRESHOLD = 100000   # Threshold for switching from CV to holdout

# Memory and performance thresholds  
MEM_THRES = 4 * (1024**3)      # Memory threshold (4GB)
SMALL_LARGE_THRES = 10000000   # Threshold for small vs large datasets
MIN_SAMPLE_TRAIN = 10000       # Minimum samples for training

# Optimization parameters
RANDOM_SEED = 1                # Default random seed
SAMPLE_MULTIPLY_FACTOR = 4     # Sample multiplication factor
SEARCH_THREAD_EPS = 1.0        # Search thread epsilon
PENALTY = 1e10                 # Penalty term for constraints
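
As a rough illustration of how thresholds like these can drive decisions, the sketch below copies four of the constants listed above and applies them in a hypothetical rule. `choose_eval_method` and `validation_rows` are made-up helpers for this example; FLAML's actual choice between cross-validation and holdout may depend on additional inputs such as the time budget.

```python
# Values copied from the constants listed above
N_SPLITS = 5
SPLIT_RATIO = 0.1
CV_HOLDOUT_THRESHOLD = 100000
MEM_THRES = 4 * (1024**3)

def choose_eval_method(n_rows):
    """Hypothetical rule: cross-validate small data, hold out on large data."""
    return "cv" if n_rows < CV_HOLDOUT_THRESHOLD else "holdout"

def validation_rows(n_rows):
    """Rows used for validation in a single evaluation under each method."""
    if choose_eval_method(n_rows) == "cv":
        return n_rows // N_SPLITS        # one fold out of N_SPLITS
    return int(n_rows * SPLIT_RATIO)     # single holdout split

for n in (5_000, 2_000_000):
    print(n, choose_eval_method(n), validation_rows(n))

print(f"MEM_THRES = {MEM_THRES} bytes = {MEM_THRES / 1024**3:.0f} GiB")
```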

Utility Functions

Additional utility functions available in FLAML modules.

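# AutoML utilities
def size(learner_classes, config):
    """
    Estimate the memory size of the model a configuration would produce.
    
    Args:
        learner_classes: Dict mapping learner names to estimator classes
        config: Configuration dict that includes the learner name
        
    Returns:
        float: Estimated memory size in bytes
    """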

# Tune utilities  
INCUMBENT_RESULT = "__incumbent_result__"  # Key under which the incumbent result is exposed to the evaluation function

class Trial:
    """Trial management class for hyperparameter tuning experiments."""
    def __init__(self, config, trial_id=None): ...
    @property
    def config(self): ...
    @property 
    def trial_id(self): ...

# AutoGen model constants
DEFAULT_MODEL = "gpt-4"         # Default language model
FAST_MODEL = "gpt-3.5-turbo"   # Fast language model

Integration Features

FLAML integrates with the following parts of the Python machine learning ecosystem:

  • scikit-learn: Compatible estimator interface with fit/predict methods
  • XGBoost & LightGBM: Enhanced versions with optimized hyperparameters
  • Ray Tune: Distributed hyperparameter tuning support
  • MLflow: Experiment tracking and model logging
  • Spark: Distributed training for large datasets
  • OpenAI: Language model integration for conversational agents