# MLflow Projects

MLflow Projects provide a standard format for packaging data science code in a reusable and reproducible way. Projects enable running ML workflows locally or on remote compute platforms with automatic environment management, parameter validation, and dependency tracking. Each project defines entry points, parameters, and environment specifications that can be executed across different backends.

## Capabilities

### Project Execution

Execute MLflow projects from local directories or remote Git repositories with comprehensive parameter validation and environment management.

```python { .api }
def run(uri, entry_point="main", version=None, parameters=None, docker_args=None, experiment_name=None, experiment_id=None, backend="local", backend_config=None, storage_dir=None, synchronous=True, run_id=None, run_name=None, env_manager=None, build_image=False, docker_auth=None):
    """
    Run an MLflow project from a local or remote URI.

    Parameters:
    - uri: str - Project URI (local path or Git repository)
    - entry_point: str - Entry point to run (default: "main")
    - version: str, optional - Git commit hash or branch name
    - parameters: dict, optional - Parameters for the entry point command
    - docker_args: dict, optional - Docker execution arguments
    - experiment_name: str, optional - MLflow experiment name
    - experiment_id: str, optional - MLflow experiment ID
    - backend: str - Execution backend ("local", "databricks", "kubernetes")
    - backend_config: dict or str, optional - Backend configuration
    - storage_dir: str, optional - Directory for remote URI downloads
    - synchronous: bool - Wait for run completion (default: True)
    - run_id: str, optional - Specific MLflow run ID to use
    - run_name: str, optional - Name for the MLflow run
    - env_manager: str, optional - Environment manager ("local", "virtualenv", "uv", "conda")
    - build_image: bool - Build a new Docker image (default: False)
    - docker_auth: dict, optional - Docker registry authentication

    Returns:
        SubmittedRun object with run information and control methods
    """
```

### Run Management

Control and monitor submitted project runs with status tracking and cancellation capabilities.

```python { .api }
class SubmittedRun:
    """
    Represents a submitted MLflow project run.
    """

    @property
    def run_id(self) -> str:
        """MLflow run ID of the submitted project run."""

    def wait(self) -> bool:
        """
        Wait for run completion.

        Returns:
            bool - True if run completed successfully, False otherwise
        """

    def get_status(self) -> str:
        """
        Get current run status.

        Returns:
            str - Current status ("RUNNING", "FINISHED", "FAILED", "KILLED")
        """

    def cancel(self):
        """
        Cancel the running project and wait for termination.
        """
```

## Project Configuration Format

MLflow projects are defined using an `MLproject` file in YAML format that specifies entry points, parameters, and environment requirements.

### Basic Project Structure

```yaml
name: my_project

entry_points:
  main:
    parameters:
      data_path:
        type: path
        default: data/input.csv
      learning_rate:
        type: float
        default: 0.01
      max_epochs:
        type: int
        default: 100
      model_name:
        type: string
        default: my_model
    command: "python train.py --data {data_path} --lr {learning_rate} --epochs {max_epochs} --name {model_name}"

  evaluate:
    parameters:
      model_uri:
        type: uri
      test_data:
        type: path
    command: "python evaluate.py --model {model_uri} --data {test_data}"

# Environment specification (choose one)
conda_env: conda.yaml
```
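
At execution time, MLflow validates user-supplied parameters against these declarations, merges in defaults, and substitutes the resulting values into the `command` template. A simplified sketch of that substitution step (not MLflow's actual implementation; `render_command` is a hypothetical helper):

```python
# Simplified sketch: merge declared defaults with user parameters,
# then substitute into the entry-point command template.
def render_command(template, declared, user_params):
    values = {name: spec.get("default") for name, spec in declared.items()}
    values.update(user_params or {})
    missing = [name for name, value in values.items() if value is None]
    if missing:
        raise ValueError(f"Missing required parameters: {missing}")
    return template.format(**values)

declared = {
    "data_path": {"type": "path", "default": "data/input.csv"},
    "learning_rate": {"type": "float", "default": 0.01},
}
cmd = render_command(
    "python train.py --data {data_path} --lr {learning_rate}",
    declared,
    {"learning_rate": 0.1},
)
# cmd == "python train.py --data data/input.csv --lr 0.1"
```

Parameters without a `default` (like `model_uri` above) become required, which is why the sketch raises when a value is still missing after merging.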

### Environment Types

**Conda Environment**:
```yaml
conda_env: conda.yaml  # Points to conda environment file
```

**Python Environment**:
```yaml
python_env: python_env.yaml  # Points to python environment file
```

**Docker Environment**:
```yaml
docker_env:
  image: "tensorflow/tensorflow:2.8.0"
  volumes: ["/host/data:/container/data"]
  environment:
    - ["CUDA_VISIBLE_DEVICES", "0"]
    - "PATH"  # Copy from host
```

### Parameter Types

**Supported parameter types with validation**:
- `string` - Basic string parameter
- `float` - Floating-point numeric parameter
- `int` - Integer numeric parameter
- `path` - File or directory path (downloads remote URIs)
- `uri` - URI parameter with validation

**Parameter Definition**:
```yaml
parameters:
  param_name:
    type: string|float|int|path|uri
    default: default_value  # Optional
```
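
The declared type drives validation before the command is built: a value supplied as the string `"0.05"`, for instance, can satisfy a `float` parameter. A rough, illustrative sketch of this kind of coercion (the `coerce` helper and `CASTS` table are ours, not MLflow internals):

```python
# Illustrative type coercion for declared parameter types.
# path/uri are treated as plain strings here; real validation
# would also resolve paths and download remote URIs.
CASTS = {"string": str, "float": float, "int": int, "path": str, "uri": str}

def coerce(value, declared_type):
    try:
        return CASTS[declared_type](value)
    except (KeyError, ValueError) as exc:
        raise ValueError(f"Cannot convert {value!r} to {declared_type}") from exc

coerce("0.05", "float")  # 0.05
coerce("100", "int")     # 100
```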

## Execution Backends

### Local Backend

Execute projects on the local machine with environment isolation and dependency management.

```python
import mlflow.projects

# Run with local backend (default)
run = mlflow.projects.run(
    uri=".",
    entry_point="train",
    parameters={"learning_rate": 0.001},
    backend="local",
    env_manager="conda",  # or "virtualenv", "uv", "local"
)
```

### Databricks Backend

Execute projects on Databricks clusters with automatic cluster management and artifact storage.

```python
# Databricks backend configuration
backend_config = {
    "cluster_spec": {
        "spark_version": "7.3.x-scala2.12",
        "node_type_id": "i3.xlarge",
        "num_workers": 2,
    }
}

run = mlflow.projects.run(
    uri="git+https://github.com/user/ml-project.git",
    backend="databricks",
    backend_config=backend_config,
)
```

### Kubernetes Backend

Execute projects as Kubernetes jobs with container orchestration and resource management.

```python
# Kubernetes backend with job template
backend_config = {
    "kube-job-template-path": "k8s-job-template.yaml",
    "kube-context": "my-k8s-context",
}

run = mlflow.projects.run(
    uri=".",
    backend="kubernetes",
    backend_config=backend_config,
    docker_args={"image": "my-project:latest"},
)
```
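
The file named by `kube-job-template-path` is an ordinary Kubernetes Job spec; at submission time MLflow substitutes the project name, container image, and entry-point command into it. A minimal illustrative template (the `{replaced-with-...}` strings mark fields MLflow fills in; the namespace and other values are assumptions):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: "{replaced-with-project-name}"
  namespace: mlflow
spec:
  ttlSecondsAfterFinished: 100
  backoffLimit: 0
  template:
    spec:
      containers:
        - name: "{replaced-with-project-name}"
          image: "{replaced-with-image-uri}"
          command: ["{replaced-with-entry-point-command}"]
      restartPolicy: Never
```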

## Usage Examples

### Local Project Execution

```python
import mlflow.projects

# Simple local execution
run = mlflow.projects.run(
    uri=".",
    entry_point="main",
    parameters={"alpha": 0.5, "l1_ratio": 0.1},
)

# Wait for completion and check status
success = run.wait()
print(f"Run {'succeeded' if success else 'failed'}")
```

### Remote Git Repository

```python
# Run from a Git repository at a specific version
run = mlflow.projects.run(
    uri="https://github.com/mlflow/mlflow-example.git",
    version="main",
    entry_point="main",
    parameters={"alpha": 0.3},
    experiment_name="remote-experiment",
)
```

### Docker Environment

```python
# Run with a Docker environment
run = mlflow.projects.run(
    uri=".",
    entry_point="train",
    backend="local",
    docker_args={
        "image": "tensorflow/tensorflow:2.8.0-gpu",
        "volumes": {"/data": "/workspace/data"},
    },
)
```

### Asynchronous Execution

```python
import time

# Non-blocking execution with status monitoring
run = mlflow.projects.run(
    uri=".",
    synchronous=False,
)

# Monitor status
while run.get_status() == "RUNNING":
    time.sleep(10)
    print("Still running...")

if run.get_status() == "FINISHED":
    print(f"Completed successfully. Run ID: {run.run_id}")
else:
    print("Run failed or was cancelled")
```
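
A polling loop like this can spin forever if a run hangs. A small helper that bounds the wait and cancels on timeout works against any object exposing the `SubmittedRun` interface (`wait_with_timeout` is our own sketch, not an MLflow API):

```python
import time

def wait_with_timeout(run, timeout_s=600.0, poll_s=5.0):
    """Poll a SubmittedRun-like object; cancel it if the timeout expires.

    Returns True only if the run finished successfully.
    """
    deadline = time.monotonic() + timeout_s
    while run.get_status() == "RUNNING":
        if time.monotonic() >= deadline:
            run.cancel()
            return False
        time.sleep(poll_s)
    return run.get_status() == "FINISHED"
```

Using `run.wait()` is simpler when an unbounded wait is acceptable; this helper is for cases where the caller needs a hard upper limit.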

## Environment Management

MLflow Projects support multiple environment managers for dependency isolation:

### Conda Environment
- Requires a `conda.yaml` file specifying dependencies
- Automatic environment creation and activation
- Full package and version management
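
A typical `conda.yaml` referenced by `conda_env` might look like the following (package choices are illustrative):

```yaml
name: my-project-env
channels:
  - conda-forge
dependencies:
  - python=3.9
  - pip
  - pip:
      - mlflow
      - scikit-learn
```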

### Python Virtual Environment
- Uses `python_env.yaml` with pip requirements
- Lightweight alternative to conda
- Supports uv for faster package installation
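
An illustrative `python_env.yaml` (the `python`/`build_dependencies`/`dependencies` layout follows MLflow's documented format; package choices are assumptions):

```yaml
python: "3.9"
build_dependencies:
  - pip
dependencies:
  - mlflow
  - scikit-learn
```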

### Docker Environment
- Container-based execution with full isolation
- Custom base images with pre-installed dependencies
- Volume mounting for data access

### Local Environment
- Executes in the current Python environment
- No isolation, but the fastest startup
- Suitable for development and testing

## Project Templates

### Basic ML Training Project
```
my-ml-project/
├── MLproject
├── conda.yaml
├── train.py
├── evaluate.py
└── data/
    └── train.csv
```

### Multi-Step Pipeline Project
```
ml-pipeline/
├── MLproject
├── conda.yaml
├── steps/
│   ├── data_prep.py
│   ├── train.py
│   └── evaluate.py
└── configs/
    └── model_config.yaml
```
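
The multi-step layout pairs naturally with one entry point per step in `MLproject`, so each stage can be run and tracked independently. An illustrative fragment (script names match the tree above; parameter names are assumptions):

```yaml
entry_points:
  data_prep:
    parameters:
      raw_data: path
    command: "python steps/data_prep.py --input {raw_data}"
  train:
    parameters:
      prepared_data: path
    command: "python steps/train.py --data {prepared_data}"
  evaluate:
    parameters:
      model_uri: uri
    command: "python steps/evaluate.py --model {model_uri}"
```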

## Types

```python { .api }
from typing import Any, Dict, List, Literal, Union

from mlflow.projects import SubmittedRun
from mlflow.projects.submitted_run import LocalSubmittedRun
from mlflow.exceptions import ExecutionException

class SubmittedRun:
    """Abstract base class for submitted project runs."""
    run_id: str

    def wait(self) -> bool: ...
    def get_status(self) -> str: ...
    def cancel(self) -> None: ...

class LocalSubmittedRun(SubmittedRun):
    """Local backend implementation of SubmittedRun."""
    pass

class ExecutionException(Exception):
    """Exception raised when project execution fails."""
    pass

# Backend types
Backend = Literal["local", "databricks", "kubernetes"]
EnvironmentManager = Literal["local", "virtualenv", "uv", "conda"]
RunStatus = Literal["RUNNING", "FINISHED", "FAILED", "KILLED"]

# Parameter types
ProjectParameter = Dict[str, Union[str, float, int]]
BackendConfig = Union[Dict[str, Any], str]  # Dict or path to a JSON file
DockerArgs = Dict[str, Union[str, Dict[str, str], List[str]]]
```