Tessl Tile for pypi/xgboost@3.0.0

tessl/pypi-xgboost

XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible, and portable

Workspace: tessl
Visibility: Public
Created: 2 months ago
Last updated: 2 months ago
Describes: pkg:pypi/xgboost@3.0.x

To install, run

npx @tessl/cli install tessl/pypi-xgboost@3.0.0

# XGBoost

XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible, and portable. It implements machine learning algorithms under the Gradient Boosting framework and provides parallel tree boosting (also known as GBDT or GBM) that solves many data science problems quickly and accurately. The library runs on major distributed environments and can handle problems beyond billions of examples.

## Package Information

- **Package Name**: xgboost
- **Language**: Python
- **Installation**: `pip install xgboost`

## Core Imports

```python
import xgboost as xgb
```

For scikit-learn compatible estimators:

```python
from xgboost import XGBClassifier, XGBRegressor, XGBRanker
```

For core functionality:

```python
from xgboost import DMatrix, Booster, train, cv
```

## Basic Usage

```python
import xgboost as xgb
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

# Load sample data (load_boston was removed in scikit-learn 1.2,
# so a bundled regression dataset is used instead)
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Method 1: Using XGBoost native API
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)

# The number of trees is set by num_boost_round below;
# 'n_estimators' belongs to the scikit-learn API only
params = {
    'objective': 'reg:squarederror',
    'max_depth': 3,
    'learning_rate': 0.1,
}

model = xgb.train(params, dtrain, num_boost_round=100)
predictions = model.predict(dtest)

# Method 2: Using scikit-learn API
model = xgb.XGBRegressor(max_depth=3, learning_rate=0.1, n_estimators=100)
model.fit(X_train, y_train)
predictions = model.predict(X_test)
```

## Architecture

XGBoost provides multiple interfaces for different use cases:

- **Core API**: Native XGBoost interface with DMatrix for data and Booster for models
- **Scikit-Learn API**: Drop-in replacement estimators compatible with sklearn pipelines
- **Distributed Computing**: Integration with Dask, Spark, and collective communication
- **Specialized Features**: Quantile regression, ranking, federated learning

The library is built around efficient gradient boosting with optimizations for speed, memory usage, and scalability across different computing environments.

## Capabilities

### Core Data Structures and Training

Fundamental XGBoost data structures and training functions that form the core of the library. Includes DMatrix for efficient data handling and training functions for model creation.

```python { .api }
class DMatrix:
    def __init__(self, data, label=None, **kwargs): ...

class Booster:
    def predict(self, data, **kwargs): ...
    def save_model(self, fname): ...

def train(params, dtrain, num_boost_round=10, **kwargs): ...
def cv(params, dtrain, num_boost_round=10, **kwargs): ...
```

[Core API](./core-api.md)

### Scikit-Learn Compatible Estimators

Drop-in replacement estimators that follow scikit-learn conventions for seamless integration with existing ML pipelines. Includes classifiers, regressors, and rankers.

```python { .api }
class XGBRegressor:
    def fit(self, X, y, **kwargs): ...
    def predict(self, X): ...

class XGBClassifier:
    def fit(self, X, y, **kwargs): ...
    def predict(self, X): ...
    def predict_proba(self, X): ...

class XGBRanker:
    def fit(self, X, y, **kwargs): ...
    def predict(self, X): ...
```

[Scikit-Learn Interface](./sklearn-interface.md)

### Distributed Computing

Distributed training and prediction capabilities for large-scale machine learning across multiple workers and computing environments.

```python { .api }
# Dask integration
from xgboost.dask import DaskXGBRegressor, DaskXGBClassifier

# Spark integration
from xgboost.spark import SparkXGBRegressor, SparkXGBClassifier

# Collective communication
import xgboost.collective as collective
```

[Distributed Computing](./distributed-computing.md)

### Visualization and Model Interpretation

Tools for visualizing model structure, feature importance, and decision trees to understand and interpret XGBoost models.

```python { .api }
def plot_importance(booster, **kwargs): ...
def plot_tree(booster, **kwargs): ...
def to_graphviz(booster, **kwargs): ...
```

[Visualization](./visualization.md)

### Training Callbacks

Comprehensive callback system for monitoring and controlling the training process, including early stopping, learning rate scheduling, and model checkpointing.

```python { .api }
from xgboost.callback import (
    TrainingCallback,
    EarlyStopping,
    LearningRateScheduler,
    EvaluationMonitor,
    TrainingCheckPoint
)
```

[Callbacks](./callbacks.md)

### Configuration and Utilities

Global configuration management, build information, and utility functions for customizing XGBoost behavior and accessing system information.

```python { .api }
def set_config(**kwargs): ...
def get_config(): ...
def config_context(**kwargs): ...
def build_info(): ...
```

[Configuration](./configuration.md)

## Types

### Core Types

```python { .api }
from typing import Dict, List, Optional, Union, Any
import numpy as np

# Data types
ArrayLike = Union[np.ndarray, List, tuple, 'pd.DataFrame', 'scipy.sparse.spmatrix']
FeatureNames = Optional[Union[str, List[str]]]
FeatureTypes = Optional[List[str]]

# Parameter types
BoosterParam = Dict[str, Any]
```