Tessl Tile for pypi/lightgbm@4.6.0

# LightGBM

LightGBM is a gradient boosting framework that uses tree-based learning algorithms. It is designed to be distributed and efficient, with an emphasis on fast training, low memory usage, accuracy, and support for parallel, distributed, and GPU learning. The Python package handles large-scale data and offers a scikit-learn compatible API, support for data formats including pandas DataFrames and NumPy arrays, integration with hyperparameter tuning tools, and cross-platform compatibility.

## Package Information

- **Package Name**: lightgbm
- **Language**: Python
- **Installation**: `pip install lightgbm`
- **Optional Dependencies**:
  - Dask: `pip install lightgbm[dask]`
  - Pandas: `pip install lightgbm[pandas]`
  - Scikit-learn: `pip install lightgbm[scikit-learn]`
  - Arrow: `pip install lightgbm[arrow]`

## Core Imports

```python
import lightgbm as lgb
```

Import specific components:

```python
from lightgbm import (
    LGBMRegressor, LGBMClassifier, LGBMRanker,  # Scikit-learn interface
    Booster, Dataset,  # Core components
    train, cv,  # Training functions
    plot_importance, plot_tree  # Visualization
)
```

## Basic Usage

```python
import lightgbm as lgb
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Load data
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42
)

# Method 1: Using scikit-learn interface (recommended for most users)
model = lgb.LGBMClassifier(
    objective='binary',
    num_leaves=31,
    learning_rate=0.05,
    colsample_bytree=0.9  # scikit-learn-style name for feature_fraction
)
model.fit(X_train, y_train)
predictions = model.predict(X_test)          # class labels
probabilities = model.predict_proba(X_test)  # per-class probabilities

# Method 2: Using native LightGBM interface (for advanced control)
train_data = lgb.Dataset(X_train, label=y_train)
params = {
    'objective': 'binary',
    'metric': 'binary_logloss',
    'boosting_type': 'gbdt',
    'num_leaves': 31,
    'learning_rate': 0.05,
    'feature_fraction': 0.9
}
booster = lgb.train(params, train_data, num_boost_round=100)
# With a binary objective, Booster.predict returns positive-class probabilities
booster_probabilities = booster.predict(X_test)
```

## Architecture

LightGBM's architecture provides flexibility through multiple interfaces:

- **Core Components**: `Booster` and `Dataset` provide low-level model control and efficient data handling
- **Scikit-learn Interface**: `LGBMRegressor`, `LGBMClassifier`, `LGBMRanker` offer familiar sklearn-compatible APIs
- **Training Functions**: `train()` and `cv()` enable direct model training and cross-validation
- **Distributed Computing**: Dask integration enables scalable training across multiple machines
- **Visualization**: Built-in plotting functions for model interpretation and analysis
- **Callbacks**: Extensible training control with early stopping, logging, and custom callbacks

With this layering, the same gradient boosting engine can be driven either through the low-level core API or through the higher-level scikit-learn and Dask estimators, depending on how much control a project needs.

## Capabilities

### Scikit-learn Compatible Models

High-level, scikit-learn compatible estimators for regression, classification, and ranking tasks. They provide the familiar `.fit()` and `.predict()` methods, plus `.score()` on the regressor and classifier. Hyperparameters are set through constructor arguments, following scikit-learn conventions.

```python { .api }
class LGBMRegressor:
    def fit(self, X, y, **kwargs): ...
    def predict(self, X, **kwargs): ...
    def score(self, X, y, **kwargs): ...

class LGBMClassifier:
    def fit(self, X, y, **kwargs): ...
    def predict(self, X, **kwargs): ...
    def predict_proba(self, X, **kwargs): ...
    def score(self, X, y, **kwargs): ...

class LGBMRanker:
    # Ranking needs per-query group sizes in addition to X and y
    def fit(self, X, y, group=None, **kwargs): ...
    def predict(self, X, **kwargs): ...
```

[Scikit-learn Interface](./sklearn-interface.md)

### Core Model Training

Low-level LightGBM interface providing direct access to the gradient boosting engine. Enables advanced model control, custom objectives, custom evaluation functions, and fine-tuned training procedures.

```python { .api }
class Booster:
    def __init__(self, params=None, train_set=None, **kwargs): ...
    def predict(self, data, **kwargs): ...
    def update(self, train_set=None, fobj=None): ...
    def feature_importance(self, importance_type='split'): ...
    def save_model(self, filename): ...

class Dataset:
    def __init__(self, data, label=None, **kwargs): ...
    def construct(self): ...
    def create_valid(self, data, **kwargs): ...
    def set_field(self, field_name, data): ...

def train(params, train_set, **kwargs): ...
def cv(params, train_set, **kwargs): ...
```

[Core Training](./core-training.md)

### Distributed Computing

Distributed training and prediction using Dask, for scaling across multiple machines. The Dask estimators mirror the scikit-learn interface and operate on Dask collections, distributing data and training across workers.

```python { .api }
class DaskLGBMRegressor:
    def fit(self, X, y, **kwargs): ...
    def predict(self, X, **kwargs): ...

class DaskLGBMClassifier:
    def fit(self, X, y, **kwargs): ...
    def predict(self, X, **kwargs): ...
    def predict_proba(self, X, **kwargs): ...

class DaskLGBMRanker:
    def fit(self, X, y, **kwargs): ...
    def predict(self, X, **kwargs): ...
```

[Distributed Computing](./distributed-computing.md)

### Visualization and Model Interpretation

Built-in plotting functions for model interpretation, feature importance analysis, training progress monitoring, and tree structure visualization. Supports both matplotlib and graphviz backends.

```python { .api }
def plot_importance(booster, **kwargs): ...
def plot_metric(eval_result, **kwargs): ...
def plot_tree(booster, **kwargs): ...
def plot_split_value_histogram(booster, **kwargs): ...
def create_tree_digraph(booster, **kwargs): ...
```

[Visualization](./visualization.md)

### Training Control and Callbacks

Flexible training control through callback functions enabling early stopping, evaluation logging, parameter adjustment, and custom training behaviors. Supports both built-in and custom callback implementations.

```python { .api }
def early_stopping(stopping_rounds, **kwargs): ...
def log_evaluation(period=1, **kwargs): ...
def record_evaluation(eval_result): ...
def reset_parameter(**kwargs): ...

class EarlyStopException(Exception): ...
```

[Training Callbacks](./training-callbacks.md)
