Files

docs/
  core-training.md
  distributed-computing.md
  index.md
  sklearn-interface.md
  training-callbacks.md
  visualization.md
tile.json

tessl/pypi-lightgbm


Workspace: tessl
Visibility: Public
Describes: pkg:pypi/lightgbm@4.6.x (PyPI)

To install, run

npx @tessl/cli install tessl/pypi-lightgbm@4.6.0


LightGBM

LightGBM is a gradient boosting framework that uses tree-based learning algorithms. It is designed to be distributed and efficient, offering faster training, lower memory usage, and better accuracy, with support for parallel, distributed, and GPU learning. The package provides a comprehensive machine learning library for gradient boosting on large-scale data, featuring a scikit-learn compatible API, support for input formats including pandas DataFrames and NumPy arrays, integration with hyperparameter tuning tools, and cross-platform compatibility.

Package Information

  • Package Name: lightgbm
  • Language: Python
  • Installation: pip install lightgbm
  • Optional Dependencies:
    • Dask: pip install lightgbm[dask]
    • Pandas: pip install lightgbm[pandas]
    • Scikit-learn: pip install lightgbm[scikit-learn]
    • Arrow: pip install lightgbm[arrow]

Core Imports

import lightgbm as lgb

Import specific components:

from lightgbm import (
    LGBMRegressor, LGBMClassifier, LGBMRanker,  # Scikit-learn interface
    Booster, Dataset,  # Core components
    train, cv,  # Training functions
    plot_importance, plot_tree  # Visualization
)

Basic Usage

import lightgbm as lgb
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Load data
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42
)

# Method 1: Using scikit-learn interface (recommended for most users)
model = lgb.LGBMClassifier(
    objective='binary',
    num_leaves=31,
    learning_rate=0.05,
    colsample_bytree=0.9  # sklearn-style alias for feature_fraction
)
model.fit(X_train, y_train)
predictions = model.predict(X_test)
probabilities = model.predict_proba(X_test)

# Method 2: Using native LightGBM interface (for advanced control)
train_data = lgb.Dataset(X_train, label=y_train)
params = {
    'objective': 'binary',
    'metric': 'binary_logloss',
    'boosting_type': 'gbdt',
    'num_leaves': 31,
    'learning_rate': 0.05,
    'feature_fraction': 0.9
}
model = lgb.train(params, train_data, num_boost_round=100)
predictions = model.predict(X_test)

Architecture

LightGBM's architecture provides flexibility through multiple interfaces:

  • Core Components: Booster and Dataset provide low-level model control and efficient data handling
  • Scikit-learn Interface: LGBMRegressor, LGBMClassifier, LGBMRanker offer familiar sklearn-compatible APIs
  • Training Functions: train() and cv() enable direct model training and cross-validation
  • Distributed Computing: Dask integration enables scalable training across multiple machines
  • Visualization: Built-in plotting functions for model interpretation and analysis
  • Callbacks: Extensible training control with early stopping, logging, and custom callbacks

This design enables LightGBM to serve both as a high-performance gradient boosting engine and a comprehensive machine learning framework suitable for production environments.
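
For example, a model trained through the sklearn wrapper exposes its underlying Booster, which can be saved and reloaded independently of the wrapper. A minimal sketch (the file name model.txt and the synthetic data are illustrative only):

import lightgbm as lgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=500, n_features=10, random_state=42)

# Train through the high-level sklearn interface
model = lgb.LGBMRegressor(n_estimators=50).fit(X, y)

# Drop down to the underlying Booster for low-level control
booster = model.booster_
booster.save_model('model.txt')

# Reload as a standalone Booster, independent of the sklearn wrapper
reloaded = lgb.Booster(model_file='model.txt')
predictions = reloaded.predict(X)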

Capabilities

Scikit-learn Compatible Models

High-level, sklearn-compatible interface for regression, classification, and ranking tasks. Provides familiar .fit(), .predict(), and .score() methods with automatic hyperparameter handling and feature processing.

class LGBMRegressor:
    def fit(self, X, y, **kwargs): ...
    def predict(self, X, **kwargs): ...
    def score(self, X, y, **kwargs): ...

class LGBMClassifier:
    def fit(self, X, y, **kwargs): ...
    def predict(self, X, **kwargs): ...
    def predict_proba(self, X, **kwargs): ...
    def score(self, X, y, **kwargs): ...

class LGBMRanker:
    def fit(self, X, y, **kwargs): ...
    def predict(self, X, **kwargs): ...
    def score(self, X, y, **kwargs): ...
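
A short sketch of the sklearn interface with a validation set and early stopping; the dataset and hyperparameter values are placeholders, not recommendations:

import lightgbm as lgb
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, n_features=20, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

model = lgb.LGBMRegressor(n_estimators=500, learning_rate=0.05)
model.fit(
    X_train, y_train,
    eval_set=[(X_val, y_val)],
    callbacks=[lgb.early_stopping(stopping_rounds=20)],
)

print(model.best_iteration_)      # iteration selected by early stopping
print(model.score(X_val, y_val))  # R^2 score, as in sklearn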

Scikit-learn Interface

Core Model Training

Low-level LightGBM interface providing direct access to the gradient boosting engine. Enables advanced model control, custom objectives, evaluation functions, and fine-tuned training procedures.

class Booster:
    def __init__(self, params, train_set, **kwargs): ...
    def predict(self, data, **kwargs): ...
    def update(self, train_set=None, fobj=None): ...
    def feature_importance(self, importance_type='split'): ...
    def save_model(self, filename): ...

class Dataset:
    def __init__(self, data, label=None, **kwargs): ...
    def construct(self): ...
    def create_valid(self, data, **kwargs): ...
    def set_field(self, field_name, data): ...

def train(params, train_set, **kwargs): ...
def cv(params, train_set, **kwargs): ...
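
The sketch below shows the native workflow end to end: building a Dataset, deriving a validation set with create_valid(), training with early stopping, and running cross-validation. The synthetic data and parameter values are placeholders, and cv() result key names vary slightly across versions (e.g. 'valid binary_logloss-mean' in LightGBM 4.x):

import numpy as np
import lightgbm as lgb

rng = np.random.default_rng(42)
X = rng.random((1000, 10))
y = (X[:, 0] > 0.5).astype(int)

train_data = lgb.Dataset(X[:800], label=y[:800])
valid_data = train_data.create_valid(X[800:], label=y[800:])

params = {'objective': 'binary', 'metric': 'binary_logloss'}
booster = lgb.train(
    params, train_data,
    num_boost_round=200,
    valid_sets=[valid_data],
    callbacks=[lgb.early_stopping(stopping_rounds=10)],
)

# 5-fold cross-validation on the same Dataset
cv_results = lgb.cv(params, train_data, num_boost_round=100, nfold=5, seed=42)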

Core Training

Distributed Computing

Distributed training and prediction using Dask for scalable machine learning across multiple machines. Provides all the functionality of standard LightGBM models with automatic data distribution and parallel processing.

class DaskLGBMRegressor:
    def fit(self, X, y, **kwargs): ...
    def predict(self, X, **kwargs): ...

class DaskLGBMClassifier:
    def fit(self, X, y, **kwargs): ...
    def predict(self, X, **kwargs): ...
    def predict_proba(self, X, **kwargs): ...

class DaskLGBMRanker:
    def fit(self, X, y, **kwargs): ...
    def predict(self, X, **kwargs): ...
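
A hedged sketch of Dask-based training, using a LocalCluster as a stand-in for a real multi-machine deployment (worker counts and chunk sizes are arbitrary; requires pip install lightgbm[dask]):

import dask.array as da
import lightgbm as lgb
from dask.distributed import Client, LocalCluster

# A local cluster stands in for a real multi-machine deployment
cluster = LocalCluster(n_workers=2)
client = Client(cluster)

# Chunked arrays are partitioned across workers automatically
X = da.random.random((100_000, 20), chunks=(10_000, 20))
y = (da.random.random(100_000, chunks=10_000) > 0.5).astype(int)

model = lgb.DaskLGBMClassifier(n_estimators=50)
model.fit(X, y)

preds = model.predict(X)     # lazily evaluated Dask array
print(preds[:10].compute())  # materialize a slice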

Distributed Computing

Visualization and Model Interpretation

Built-in plotting functions for model interpretation, feature importance analysis, training progress monitoring, and tree structure visualization. Supports both matplotlib and graphviz backends.

def plot_importance(booster, **kwargs): ...
def plot_metric(eval_result, **kwargs): ...
def plot_tree(booster, **kwargs): ...
def plot_split_value_histogram(booster, **kwargs): ...
def create_tree_digraph(booster, **kwargs): ...
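
For instance, assuming a trained model from the earlier examples, feature importance and a single tree can be rendered as follows (matplotlib is needed for the plot, the graphviz package for the digraph; the output name tree0 is arbitrary):

import matplotlib.pyplot as plt
import lightgbm as lgb

# `model` is any trained Booster or sklearn-interface model
ax = lgb.plot_importance(model, max_num_features=10, importance_type='gain')
plt.tight_layout()
plt.show()

# Export one tree as a graphviz Digraph and render it to disk
graph = lgb.create_tree_digraph(model, tree_index=0)
graph.render('tree0', format='png')  # writes tree0.png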

Visualization

Training Control and Callbacks

Flexible training control through callback functions enabling early stopping, evaluation logging, parameter adjustment, and custom training behaviors. Supports both built-in and custom callback implementations.

def early_stopping(stopping_rounds, **kwargs): ...
def log_evaluation(period=1, **kwargs): ...
def record_evaluation(eval_result): ...
def reset_parameter(**kwargs): ...

class EarlyStopException(Exception): ...
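
A sketch combining several built-in callbacks in one training run, reusing params, train_data, and valid_data from the core-training example above (round counts are illustrative):

eval_history = {}

booster = lgb.train(
    params, train_data,
    num_boost_round=500,
    valid_sets=[valid_data],
    valid_names=['valid'],
    callbacks=[
        lgb.early_stopping(stopping_rounds=20),  # stop once 'valid' stops improving
        lgb.log_evaluation(period=50),           # print metrics every 50 rounds
        lgb.record_evaluation(eval_history),     # capture metrics into the dict
    ],
)

# eval_history now maps 'valid' -> {'binary_logloss': [per-iteration scores]}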

Training Callbacks