# MLflow

MLflow is an open-source developer platform for building AI/LLM applications and models. It manages the complete machine learning lifecycle: experiment tracking, model management, deployment, and observability. It offers specialized features both for LLM/GenAI developers (tracing and observability, LLM evaluation, prompt management, version tracking) and for traditional data scientists (experiment tracking, model registry, deployment tools).

## Package Information

- **Package Name**: mlflow
- **Language**: Python
- **Installation**: `pip install mlflow`
- **Documentation**: https://mlflow.org/docs/latest/index.html

## Core Imports

```python
import mlflow
```

For client API access:

```python
from mlflow import MlflowClient
client = MlflowClient()
```

For specific modules:

```python
import mlflow.tracking
import mlflow.models
import mlflow.data
import mlflow.tracing
```

## Basic Usage

```python
import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
import numpy as np

# Set tracking URI and experiment
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("my-experiment")

# Generate sample data
X = np.random.rand(100, 4)
y = np.random.rand(100)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Start MLflow run with context manager
with mlflow.start_run():
    # Enable autologging for sklearn
    mlflow.sklearn.autolog()

    # Train model
    model = RandomForestRegressor(n_estimators=50, random_state=42)
    model.fit(X_train, y_train)

    # Log custom parameters and metrics
    mlflow.log_param("custom_param", "value")
    mlflow.log_metric("custom_metric", 0.85)

    # Log model manually (optional with autolog)
    mlflow.sklearn.log_model(model, "model")

    # Get run info
    run = mlflow.active_run()
    print(f"Run ID: {run.info.run_id}")
```

## Architecture

MLflow's modular architecture supports the complete ML lifecycle:

- **Tracking**: Experiment and run management with parameter, metric, and artifact logging
- **Models**: Universal model format with deployment capabilities across platforms
- **Model Registry**: Centralized model store with versioning and stage transitions
- **Projects**: Reproducible ML code packaging with dependency management
- **Tracing**: Distributed tracing for LLM/GenAI applications with observability
- **Data**: Dataset tracking and lineage with multiple format support
- **Evaluation**: Model performance assessment with built-in metrics and custom evaluators

The platform integrates natively with 25+ ML frameworks and 15+ LLM/GenAI libraries, providing automatic logging capabilities and standardized model formats for seamless deployment across infrastructures.

## Capabilities

### Tracking and Experiment Management

Core functionality for tracking experiments, runs, parameters, metrics, and artifacts. Provides both a fluent API for interactive use and a client API for programmatic access.

```python { .api }
def start_run(run_id=None, experiment_id=None, run_name=None, nested=False, tags=None, description=None): ...
def end_run(status=None): ...
def log_param(key, value): ...
def log_metric(key, value, step=None, timestamp=None, synchronous=None): ...
def log_artifact(local_path, artifact_path=None, synchronous=None): ...
def log_outputs(outputs, artifact_path=None): ...
def log_assessment(assessment, request_id=None, run_id=None, timestamp_ms=None): ...
def log_feedback(feedback, request_id=None, run_id=None): ...
def create_experiment(name, artifact_location=None, tags=None): ...
def set_experiment(experiment_name=None, experiment_id=None): ...
```

[Tracking and Experiments](./tracking.md)

### Client API

Lower-level programmatic interface providing direct access to MLflow's REST API with comprehensive methods for managing experiments, runs, models, and artifacts.

```python { .api }
class MlflowClient:
    def __init__(self, tracking_uri=None, registry_uri=None): ...
    def create_experiment(self, name, artifact_location=None, tags=None): ...
    def get_run(self, run_id): ...
    def search_runs(self, experiment_ids, filter_string="", run_view_type=ViewType.ACTIVE_ONLY, max_results=SEARCH_MAX_RESULTS_DEFAULT, order_by=None, page_token=None): ...
    def log_batch(self, run_id, metrics=None, params=None, tags=None): ...
```

[Client API](./client.md)

### Model Management

Comprehensive model lifecycle management including logging, loading, evaluation, and deployment with support for multiple ML frameworks and custom models.

```python { .api }
def log_model(model, artifact_path, **kwargs): ...
def load_model(model_uri, dst_path=None, **kwargs): ...
def evaluate(model=None, data=None, targets=None, model_type=None, evaluators=None, evaluator_config=None, **kwargs): ...
def register_model(model_uri, name, await_registration_for=DEFAULT_AWAIT_MAX_SLEEP_SECONDS, tags=None, **kwargs): ...
```

[Models](./models.md)

### Data Management

Dataset tracking and lineage capabilities supporting multiple data formats including pandas, numpy, Spark, Delta, and HuggingFace datasets with comprehensive metadata management.

```python { .api }
def from_pandas(df, source=None, targets=None, name=None, digest=None, predictions=None): ...
def from_numpy(features, source=None, targets=None, name=None, digest=None, predictions=None): ...
def from_spark(df, source=None, targets=None, name=None, digest=None, predictions=None): ...
def log_input(dataset, context=None, tags=None): ...
def log_inputs(datasets, tags=None): ...
```

[Data Management](./data.md)

### Tracing and Observability

Distributed tracing system for LLM/GenAI applications providing observability, debugging, and performance monitoring with span management and assessment capabilities.

```python { .api }
def trace(name=None, span_type=None, inputs=None, attributes=None): ...
def start_span(name, span_type=None, inputs=None, parent_id=None, attributes=None): ...
def get_trace(request_id): ...
def search_traces(experiment_ids=None, filter_string="", max_results=None, order_by=None, run_id=None): ...
def log_assessment(assessment, request_id=None, run_id=None, timestamp_ms=None): ...
```

[Tracing and Observability](./tracing.md)

### Configuration and System Management

System configuration including tracking URIs, model registry settings, system metrics, and authentication management for MLflow deployments.

```python { .api }
def set_tracking_uri(uri): ...
def get_tracking_uri(): ...
def set_registry_uri(uri): ...
def enable_system_metrics_logging(): ...
def disable_system_metrics_logging(): ...
def login(): ...
```

[Configuration](./configuration.md)

### GenAI and LLM Integration

Specialized capabilities for LLM/GenAI workflows including prompt management, LLM evaluation, and integration with popular AI frameworks and libraries.

```python { .api }
def load_prompt(model_name, model_version=None, model_alias=None): ...
def register_prompt(prompt, name, version=None, tags=None, description=None, metadata=None): ...
def search_prompts(filter_string=None, max_results=None, order_by=None, page_token=None): ...
```

[GenAI and LLM](./genai.md)

### ML Framework Integrations

Native integrations with 25+ ML frameworks providing automatic logging, model serialization, and deployment capabilities with framework-specific optimizations.

```python { .api }
# Popular integrations (via lazy loading)
import mlflow.sklearn
import mlflow.pytorch
import mlflow.tensorflow
import mlflow.keras
import mlflow.xgboost
import mlflow.lightgbm
import mlflow.transformers
```

[Framework Integrations](./frameworks.md)

### MLflow Projects

Reproducible ML project execution with environment management, parameter validation, and multi-backend support for local, cloud, and containerized workflows.

```python { .api }
import mlflow.projects

def run(uri, entry_point="main", version=None, parameters=None, backend="local", backend_config=None, synchronous=True, **kwargs): ...

class SubmittedRun:
    run_id: str
    def wait(self) -> bool: ...
    def get_status(self) -> str: ...
    def cancel(self): ...
```

[MLflow Projects](./projects.md)

## Types

```python { .api }
from typing import Dict

from mlflow.entities import Experiment, Run, RunInfo, RunData, Metric, Param, RegisteredModel, ModelVersion
from mlflow.tracking.fluent import ActiveRun
from mlflow.client import MlflowClient
from mlflow.exceptions import MlflowException

class Experiment:
    experiment_id: str
    name: str
    artifact_location: str
    lifecycle_stage: str
    tags: Dict[str, str]

class Run:
    info: RunInfo
    data: RunData

class RunInfo:
    run_id: str
    experiment_id: str
    status: str
    start_time: int
    end_time: int
    artifact_uri: str

class ActiveRun:
    info: RunInfo
    data: RunData

class MlflowException(Exception):
    error_code: str
    message: str

```