# ZenML

ZenML is a unified MLOps framework that extends battle-tested machine learning operations principles to support the entire AI stack, from classical machine learning models to advanced AI agents. The framework provides comprehensive pipeline orchestration, experiment tracking, model versioning, and reproducibility management with integration capabilities across major cloud platforms and ML tools.

## Package Information

- **Package Name**: zenml
- **Language**: Python
- **Installation**: `pip install zenml`
- **Version**: 0.90.0
- **Documentation**: https://docs.zenml.io

## Core Imports

```python
import zenml
from zenml import pipeline, step
from zenml import Model, ArtifactConfig
from zenml.client import Client
```

## Basic Usage

```python
from zenml import pipeline, step, Model
from zenml.client import Client

# Define a step
@step
def load_data() -> dict:
    """Load training data."""
    return {"train": [1, 2, 3], "test": [4, 5, 6]}

@step
def train_model(data: dict) -> float:
    """Train a model and return accuracy."""
    # Training logic here
    return 0.95

# Define a pipeline
@pipeline(
    name="ml_pipeline",
    enable_cache=True,
    model=Model(name="my_model", version="1.0.0")
)
def ml_pipeline():
    """ML training pipeline."""
    data = load_data()
    accuracy = train_model(data)
    return accuracy

# Run the pipeline
if __name__ == "__main__":
    ml_pipeline()

    # Access results via Client
    client = Client()
    latest_run = client.get_pipeline("ml_pipeline").runs[0]
    print(f"Pipeline run: {latest_run.id}")
```

## Architecture

ZenML's architecture is built on several key abstractions:

- **Pipelines**: Directed acyclic graphs (DAGs) of steps that define ML workflows
- **Steps**: Individual processing units that perform specific tasks (data loading, training, evaluation)
- **Stacks**: Complete infrastructure configurations combining orchestrators, artifact stores, and other components
- **Artifacts**: Data objects produced and consumed by steps, with automatic versioning and lineage tracking
- **Model Control Plane**: Centralized model namespace for grouping artifacts, metadata, and versions
- **Client**: Programmatic interface for managing all ZenML resources
- **Materializers**: Pluggable serialization system for converting Python objects to/from storage
- **Integrations**: 67+ integrations with cloud providers, ML frameworks, and MLOps tools
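
As a rough illustration of the first few abstractions, the toy sketch below wires two steps into a DAG and caches step outputs between runs. It uses only the standard library; `run_dag`, `StepSpec`, and the cache-key scheme are hypothetical names and simplifications made up for this example, not ZenML's implementation, which also handles materialization, versioning, and stack dispatch.

```python
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, Tuple

# Hypothetical toy type: a step is (function, names of upstream steps).
StepSpec = Tuple[Callable[..., Any], Tuple[str, ...]]


def run_dag(steps: Dict[str, StepSpec], cache: Dict[str, Any]) -> Dict[str, Any]:
    """Run steps in dependency order, reusing cached outputs when available."""
    order = TopologicalSorter({name: deps for name, (_, deps) in steps.items()})
    outputs: Dict[str, Any] = {}
    for name in order.static_order():
        func, deps = steps[name]
        inputs = tuple(outputs[dep] for dep in deps)
        key = f"{name}:{inputs!r}"  # naive cache key: step name plus its inputs
        if key not in cache:
            cache[key] = func(*inputs)
        outputs[name] = cache[key]
    return outputs


calls = []  # records which step functions actually executed


def load_data():
    calls.append("load_data")
    return [1, 2, 3]


def train_model(data):
    calls.append("train_model")
    return sum(data) / len(data)


steps = {
    "load_data": (load_data, ()),
    "train_model": (train_model, ("load_data",)),
}
cache: Dict[str, Any] = {}
first = run_dag(steps, cache)
second = run_dag(steps, cache)  # second run is served entirely from the cache
```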

## Capabilities

### Pipeline and Step Decorators

Core decorators for defining ML workflows and their constituent steps, with support for configuration, caching, hooks, and execution contexts.

```python { .api }
def pipeline(
    _func=None,
    *,
    name: str = None,
    enable_cache: bool = None,
    enable_artifact_metadata: bool = None,
    enable_step_logs: bool = None,
    environment: dict = None,
    secrets: list = None,
    enable_pipeline_logs: bool = None,
    settings: dict = None,
    tags: list = None,
    extra: dict = None,
    on_failure=None,
    on_success=None,
    on_init=None,
    on_init_kwargs: dict = None,
    on_cleanup=None,
    model: Model = None,
    retry=None,
    substitutions: dict = None,
    execution_mode=None,
    cache_policy=None
):
    """
    Decorator to define a ZenML pipeline.

    Parameters:
    - name: Pipeline name (defaults to function name)
    - enable_cache: Enable step caching
    - enable_artifact_metadata: Enable artifact metadata logging
    - enable_step_logs: Enable step logging
    - environment: Environment variables to set when running this pipeline
    - secrets: Secrets to set as environment variables (list of UUIDs or names)
    - enable_pipeline_logs: Enable pipeline logs
    - settings: Stack component settings
    - tags: Tags to apply to runs of the pipeline
    - extra: Extra pipeline metadata
    - on_failure: Failure hook callable
    - on_success: Success hook callable
    - on_init: Callback function to run on initialization of the pipeline
    - on_init_kwargs: Arguments for the init hook
    - on_cleanup: Callback function to run on cleanup of the pipeline
    - model: Model configuration for Model Control Plane
    - retry: Retry configuration for the pipeline steps
    - substitutions: Extra placeholders to use in the name templates
    - execution_mode: The execution mode to use for the pipeline
    - cache_policy: Cache policy for this pipeline

    Returns:
        Pipeline decorator function
    """

def step(
    _func=None,
    *,
    name: str = None,
    enable_cache: bool = None,
    enable_artifact_metadata: bool = None,
    enable_artifact_visualization: bool = None,
    enable_step_logs: bool = None,
    experiment_tracker: bool | str = None,
    step_operator: bool | str = None,
    output_materializers=None,
    environment: dict = None,
    secrets: list = None,
    settings: dict = None,
    extra: dict = None,
    on_failure=None,
    on_success=None,
    model: Model = None,
    retry=None,
    substitutions: dict = None,
    cache_policy=None
):
    """
    Decorator to define a ZenML step.

    Parameters:
    - name: Step name (defaults to function name)
    - enable_cache: Enable caching for this step
    - enable_artifact_metadata: Enable artifact metadata for this step
    - enable_artifact_visualization: Enable artifact visualization for this step
    - enable_step_logs: Enable step logs for this step
    - experiment_tracker: Experiment tracker to use
    - step_operator: Step operator for remote execution
    - output_materializers: Custom materializers for outputs
    - environment: Environment variables to set when running this step
    - secrets: Secrets to set as environment variables (list of UUIDs or names)
    - settings: Stack component settings
    - extra: Extra step metadata
    - on_failure: Failure hook callable
    - on_success: Success hook callable
    - model: Model configuration
    - retry: Retry configuration in case of step failure
    - substitutions: Extra placeholders for the step name
    - cache_policy: Cache policy for this step

    Returns:
        Step decorator function
    """

def get_pipeline_context():
    """
    Get the current pipeline execution context.

    Returns:
        PipelineContext: Context object with pipeline metadata

    Raises:
        RuntimeError: If called outside pipeline execution
    """

def get_step_context():
    """
    Get the current step execution context.

    Returns:
        StepContext: Context object with step metadata and utilities

    Raises:
        RuntimeError: If called outside step execution
    """
```
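
Both signatures start with `_func=None, *`, a common Python pattern that lets a decorator be applied bare (`@step`) or with keyword arguments (`@step(name=...)`). The sketch below shows that pattern in isolation with a hypothetical `toy_step` decorator; it only records its configuration on the wrapped function and is not how ZenML builds steps internally.

```python
import functools
from typing import Any, Callable, Optional


def toy_step(
    _func: Optional[Callable[..., Any]] = None,
    *,
    name: Optional[str] = None,
    enable_cache: Optional[bool] = None,
):
    """Hypothetical decorator usable as `@toy_step` or `@toy_step(name=...)`."""

    def decorate(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            return func(*args, **kwargs)

        # Record the configuration; the name falls back to the function name.
        wrapper.step_name = name or func.__name__
        wrapper.enable_cache = enable_cache
        return wrapper

    if _func is None:
        # Used with arguments: return the real decorator.
        return decorate
    # Used bare: the decorated function arrived as the first positional argument.
    return decorate(_func)


@toy_step
def load_data() -> dict:
    return {"train": [1, 2, 3]}


@toy_step(name="trainer", enable_cache=False)
def train_model(data: dict) -> float:
    return 0.95
```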

[Pipelines and Steps](./pipelines-and-steps.md)

### Artifact Management

Functions for saving, loading, registering artifacts, and logging metadata outside the standard step output flow.

```python { .api }
def save_artifact(
    data,
    name: str,
    version: str = None,
    artifact_type=None,
    tags: list = None,
    extract_metadata: bool = True,
    include_visualizations: bool = True,
    user_metadata: dict = None,
    materializer: type = None,
    uri: str = None
):
    """
    Save an artifact to the artifact store.

    Parameters:
    - data: Data to save
    - name: Artifact name
    - version: Artifact version (auto-generated if None)
    - artifact_type: Type of artifact (e.g., ArtifactType.MODEL, ArtifactType.DATA)
    - tags: List of tags
    - extract_metadata: Extract metadata automatically
    - include_visualizations: Generate visualizations
    - user_metadata: Custom metadata dict
    - materializer: Custom materializer class
    - uri: Optional URI to use for the artifact (advanced usage)

    Returns:
        ArtifactVersionResponse: Created artifact version
    """

def load_artifact(
    name_or_id: str,
    version: str = None
):
    """
    Load an artifact from the artifact store.

    Parameters:
    - name_or_id: Artifact name or UUID
    - version: Artifact version (loads latest if None)

    Returns:
        Data object loaded from artifact store
    """

def register_artifact(
    uri: str,
    name: str,
    version: str = None,
    tags: list = None,
    user_metadata: dict = None,
    artifact_type=None,
    materializer: type = None
):
    """
    Register an existing artifact from a URI.

    Parameters:
    - uri: URI of existing artifact
    - name: Artifact name
    - version: Artifact version
    - tags: List of tags
    - user_metadata: Custom metadata
    - artifact_type: Type of artifact
    - materializer: Custom materializer

    Returns:
        ArtifactVersionResponse: Registered artifact version
    """

def log_artifact_metadata(
    metadata: dict,
    artifact_name: str = None,
    artifact_version: str = None
):
    """
    Log metadata for an artifact.

    Parameters:
    - metadata: Metadata dict to log
    - artifact_name: Artifact name (uses current context if None)
    - artifact_version: Artifact version
    """
```

[Artifacts](./artifacts.md)

### Model Control Plane

The Model class and related functions for organizing artifacts, metadata, and versions in a centralized model namespace.

```python { .api }
class Model:
    """
    Model configuration for grouping artifacts and metadata.

    Attributes:
    - name: Model name
    - version: Model version or stage
    - license: Model license
    - description: Model description
    - audience: Target audience
    - use_cases: Use cases
    - limitations: Known limitations
    - trade_offs: Trade-offs made
    - ethics: Ethical considerations
    - tags: List of tags
    - save_models_to_registry: Auto-save to model registry
    - suppress_class_validation_warnings: Suppress warnings
    """

    def __init__(
        self,
        name: str,
        version: str = None,
        license: str = None,
        description: str = None,
        audience: str = None,
        use_cases: str = None,
        limitations: str = None,
        trade_offs: str = None,
        ethics: str = None,
        tags: list = None,
        save_models_to_registry: bool = True,
        suppress_class_validation_warnings: bool = False
    ):
        """Initialize Model configuration."""

def log_model_metadata(
    metadata: dict,
    model_name: str = None,
    model_version: str = None
):
    """
    Log metadata for a model version.

    Parameters:
    - metadata: Metadata dict to log
    - model_name: Model name (uses current context if None)
    - model_version: Model version
    """

def link_artifact_to_model(
    artifact_version,
    model=None
):
    """
    Link an artifact to a model version.

    Parameters:
    - artifact_version: ArtifactVersionResponse object to link
    - model: Model object to link to (uses current context if None)

    Raises:
        RuntimeError: If called without model parameter and no model context exists
    """
```

[Models](./models.md)

### Client API

The Client class provides programmatic access to all ZenML resources including stacks, pipelines, artifacts, models, secrets, and more.

```python { .api }
class Client:
    """
    Main interface for interacting with ZenML programmatically.

    Singleton instance access:
        client = Client()
    """

    @staticmethod
    def get_instance():
        """Get singleton client instance."""

    @property
    def active_stack(self):
        """Get the active stack object."""

    @property
    def active_stack_model(self):
        """Get the active stack model."""

    @property
    def active_project(self):
        """Get the active project."""

    @property
    def active_user(self):
        """Get the active user."""
```

[Client](./client.md)

### Stack Management

Stack and stack component classes for configuring ML infrastructure.

```python { .api }
class Stack:
    """Complete ZenML stack configuration."""

class StackComponent:
    """Base class for stack components."""

class StackComponentConfig:
    """Base configuration for stack components."""

class Flavor:
    """Flavor of a stack component."""
```

[Stacks](./stacks.md)

### Configuration

Configuration classes for Docker, resources, scheduling, caching, and more.

```python { .api }
class DockerSettings:
    """Configuration for Docker containerization."""

class ResourceSettings:
    """Resource allocation settings for steps."""

class Schedule:
    """Schedule configuration for pipeline runs."""

class StepRetryConfig:
    """Configuration for step retry behavior."""

class CachePolicy:
    """Configuration for step caching behavior."""
```

[Configuration](./config.md)

### Materializers

Built-in materializers for serializing and deserializing Python objects.

```python { .api }
class BuiltInMaterializer:
    """Materializer for built-in Python types."""

class BuiltInContainerMaterializer:
    """Materializer for container types (list, dict, tuple, set)."""

class CloudpickleMaterializer:
    """Materializer using cloudpickle for serialization."""

class PydanticMaterializer:
    """Materializer for Pydantic models."""
```
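
To illustrate what serializing and deserializing Python objects involves, here is a toy JSON materializer built on the standard library. `ToyJSONMaterializer`, its `uri` attribute, and its `save`/`load` methods are hypothetical and only loosely modeled on the idea of a materializer; a real custom materializer would subclass ZenML's base materializer class and follow its interface.

```python
import json
import os
import tempfile
from typing import Any, Type


class ToyJSONMaterializer:
    """Hypothetical stand-in: writes JSON-compatible objects under a URI."""

    # Types this toy class claims to handle (an assumption for the example).
    ASSOCIATED_TYPES = (dict, list)

    def __init__(self, uri: str) -> None:
        self.uri = uri  # directory that holds the serialized artifact

    def _path(self) -> str:
        return os.path.join(self.uri, "data.json")

    def save(self, data: Any) -> None:
        """Serialize the object to `<uri>/data.json`."""
        os.makedirs(self.uri, exist_ok=True)
        with open(self._path(), "w", encoding="utf-8") as handle:
            json.dump(data, handle)

    def load(self, data_type: Type[Any]) -> Any:
        """Read the file back and rebuild an object of the requested type."""
        with open(self._path(), "r", encoding="utf-8") as handle:
            return data_type(json.load(handle))


# Round trip through a temporary directory standing in for an artifact store.
with tempfile.TemporaryDirectory() as tmp:
    materializer = ToyJSONMaterializer(os.path.join(tmp, "artifact"))
    materializer.save({"train": [1, 2, 3]})
    restored = materializer.load(dict)
```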

[Materializers](./materializers.md)

### Stack Components

Base classes and implementations for stack components including orchestrators, artifact stores, container registries, and more.

[Stack Components](./stack-components.md)

### Integrations

ZenML includes 67 integrations with cloud providers, ML frameworks, orchestrators, experiment trackers, and MLOps tools.

[Integrations](./integrations.md)

### Metadata and Tags

Functions for logging metadata and managing tags across resources.

```python { .api }
def log_metadata(
    metadata: dict,
    infer_resource: bool = True
):
    """
    Generic function to log metadata.

    Parameters:
    - metadata: Metadata dict to log
    - infer_resource: Infer resource from context
    """

def log_step_metadata(
    metadata: dict,
    step_name: str = None
):
    """
    Log metadata for a step.

    Parameters:
    - metadata: Metadata dict to log
    - step_name: Step name (uses current context if None)
    """

def add_tags(
    tags: list,
    *,
    pipeline: str = None,
    run: str = None,
    artifact: str = None,
    # ... additional resource type parameters
):
    """
    Add tags to various resource types.

    Parameters:
    - tags: List of tag names to add
    - pipeline: ID or name of pipeline to tag
    - run: ID or name of pipeline run to tag
    - artifact: ID or name of artifact to tag
    - (additional parameters for other resource types)
    """

def remove_tags(
    tags: list,
    *,
    pipeline: str = None,
    run: str = None,
    artifact: str = None,
    # ... additional resource type parameters
):
    """
    Remove tags from various resource types.

    Parameters:
    - tags: List of tag names to remove
    - pipeline: ID or name of pipeline
    - run: ID or name of pipeline run
    - artifact: ID or name of artifact
    - (additional parameters for other resource types)
    """

class Tag:
    """Tag model for categorizing resources."""
```

[Metadata and Tags](./metadata-tags.md)

### Hooks

Pre-built hooks for alerting and custom hook utilities.

```python { .api }
def alerter_success_hook() -> None:
    """
    Standard success hook that executes after step finishes successfully.

    This hook uses any `BaseAlerter` configured in the active stack
    to post a success notification message.
    """

def alerter_failure_hook(exception: BaseException) -> None:
    """
    Standard failure hook that executes after step fails.

    This hook uses any `BaseAlerter` configured in the active stack
    to post a failure notification message with exception details.

    Parameters:
    - exception: Original exception that led to step failing
    """

def resolve_and_validate_hook(hook):
    """
    Utility to resolve and validate custom hooks.

    Parameters:
    - hook: Hook specification

    Returns:
        Resolved hook callable
    """
```
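
The signatures above imply a simple contract: a success hook takes no arguments, and a failure hook receives the exception that caused the failure. The toy runner below sketches that contract with hypothetical functions (`run_with_hooks`, `notify_success`, `notify_failure`); in ZenML the hooks are passed through the `on_success` and `on_failure` decorator parameters and invoked by the framework, not by user code.

```python
from typing import Any, Callable, List, Optional

events: List[str] = []  # stands in for an alerter channel


def notify_success() -> None:
    """Hypothetical success hook: takes no arguments."""
    events.append("ok")


def notify_failure(exception: BaseException) -> None:
    """Hypothetical failure hook: receives the original exception."""
    events.append(f"failed: {exception}")


def run_with_hooks(
    func: Callable[[], Any],
    on_success: Optional[Callable[[], None]] = None,
    on_failure: Optional[Callable[[BaseException], None]] = None,
) -> Any:
    """Toy runner: call the matching hook, then return or re-raise."""
    try:
        result = func()
    except Exception as exc:
        if on_failure is not None:
            on_failure(exc)
        raise
    if on_success is not None:
        on_success()
    return result


def good_step() -> float:
    return 0.95


def bad_step() -> float:
    raise ValueError("bad data")


result = run_with_hooks(good_step, notify_success, notify_failure)
try:
    run_with_hooks(bad_step, notify_success, notify_failure)
except ValueError:
    pass  # the failure hook has already recorded the error
```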

[Hooks](./hooks.md)

### Exceptions

Exception classes for error handling.

```python { .api }
class ZenMLBaseException(Exception):
    """Base exception for all ZenML exceptions."""

class AuthorizationException(ZenMLBaseException):
    """Authorization/access errors."""

class DoesNotExistException(ZenMLBaseException):
    """Entity not found errors."""

class ValidationError(ZenMLBaseException):
    """Model/data validation errors."""

class EntityExistsError(ZenMLBaseException):
    """Entity already exists errors."""
```
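
Because every class above derives from `ZenMLBaseException`, callers can handle a specific error first and fall back to the base class. The sketch below demonstrates that ordering with local stand-in classes that mirror the listed hierarchy, so it runs without ZenML installed; `fetch` and `describe` are hypothetical helpers. In real code the classes would be imported from `zenml.exceptions`.

```python
from typing import Any, Dict


# Local stand-ins mirroring the hierarchy listed above (not the real classes).
class ZenMLBaseException(Exception):
    """Base exception for all ZenML exceptions."""


class DoesNotExistException(ZenMLBaseException):
    """Entity not found errors."""


class EntityExistsError(ZenMLBaseException):
    """Entity already exists errors."""


def fetch(registry: Dict[str, Any], name: str) -> Any:
    """Hypothetical lookup that raises when the entity is missing."""
    if name not in registry:
        raise DoesNotExistException(f"No entity named '{name}'")
    return registry[name]


def describe(registry: Dict[str, Any], name: str) -> str:
    """Handle the specific error first, then any other ZenML error."""
    try:
        return f"found: {fetch(registry, name)}"
    except DoesNotExistException as exc:
        return f"missing: {exc}"
    except ZenMLBaseException as exc:
        return f"other ZenML error: {exc}"


found = describe({"my_model": 1}, "my_model")
missing = describe({}, "my_model")
```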

[Exceptions](./exceptions.md)

### Enums

Important enumerations used throughout the API.

```python { .api }
class ExecutionStatus(str, Enum):
    """Pipeline/step execution status."""
    INITIALIZING = "initializing"
    PROVISIONING = "provisioning"
    FAILED = "failed"
    COMPLETED = "completed"
    RUNNING = "running"
    CACHED = "cached"

class StackComponentType(str, Enum):
    """Stack component types."""
    ORCHESTRATOR = "orchestrator"
    ARTIFACT_STORE = "artifact_store"
    CONTAINER_REGISTRY = "container_registry"
    # ... and more

class ModelStages(str, Enum):
    """Model lifecycle stages."""
    STAGING = "staging"
    PRODUCTION = "production"
    ARCHIVED = "archived"
    LATEST = "latest"
```
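
These enums mix in `str`, so members compare equal to their plain string values and can be looked up by value. The minimal sketch below uses a local copy of the `ExecutionStatus` values listed above so it runs without ZenML installed (in real code, import the enum from `zenml.enums`); the `is_terminal` helper and its choice of terminal states are assumptions made for illustration.

```python
from enum import Enum
from typing import Union


class ExecutionStatus(str, Enum):
    """Local copy of the values documented above."""

    INITIALIZING = "initializing"
    PROVISIONING = "provisioning"
    FAILED = "failed"
    COMPLETED = "completed"
    RUNNING = "running"
    CACHED = "cached"


# Assumption for this sketch: these three states mean the run will not change.
_TERMINAL = {
    ExecutionStatus.FAILED,
    ExecutionStatus.COMPLETED,
    ExecutionStatus.CACHED,
}


def is_terminal(status: Union[str, ExecutionStatus]) -> bool:
    """Hypothetical helper: accept a member or its raw string value."""
    return ExecutionStatus(status) in _TERMINAL


looked_up = ExecutionStatus("failed")  # lookup by value returns the member
equals_string = ExecutionStatus.CACHED == "cached"  # str mixin allows this
```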

[Enums](./enums.md)

### Artifact Configuration

Artifact configuration classes for controlling step outputs.

```python { .api }
class ArtifactConfig:
    """
    Configuration for artifacts produced by steps.

    Attributes:
    - name: Artifact name
    - version: Artifact version strategy
    - tags: List of tags
    - run_metadata: Metadata to attach
    - artifact_type: Optional type of the artifact
    """

    def __init__(
        self,
        name: str = None,
        version: str = None,
        tags: list = None,
        run_metadata: dict = None,
        artifact_type=None
    ):
        """Initialize artifact configuration."""

class ExternalArtifact:
    """
    External artifacts provide values as input to ZenML steps.

    Can be used to provide any value as input to a step without writing
    an additional step that returns this value.

    Attributes:
    - value: The artifact value (any Python object)
    - materializer: Materializer to use for saving the artifact value
    - store_artifact_metadata: Whether to store metadata
    - store_artifact_visualizations: Whether to store visualizations
    """

    def __init__(
        self,
        value=None,
        materializer: type = None,
        store_artifact_metadata: bool = True,
        store_artifact_visualizations: bool = True
    ):
        """Initialize external artifact with a value to upload."""
```

[Artifact Configuration](./artifact-config.md)

### Utilities

Utility functions for dashboard, environment, and other operations.

```python { .api }
def show(port: int = None):
    """
    Opens the ZenML dashboard in a browser.

    Parameters:
    - port: Port number (optional)
    """
```

[Utilities](./utilities.md)

### Services

Service abstractions for long-running processes like model servers and deployments.

```python { .api }
class BaseService:
    """Abstract base class for services."""

class ServiceConfig:
    """Configuration for services."""

class ServiceStatus:
    """Status tracking for services."""
```

[Services](./services.md)

### Pydantic Models

200+ Pydantic model classes for API request/response objects.

[Pydantic Models](./pydantic-models.md)

### Types

Custom type definitions for specialized content.

```python { .api }
class HTMLString(str):
    """String subclass for HTML content."""

class MarkdownString(str):
    """String subclass for Markdown content."""

class CSVString(str):
    """String subclass for CSV content."""

class JSONString(str):
    """String subclass for JSON content."""
```
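
One way to read these types: each is an ordinary `str` whose class tags the content format, so downstream code can choose a renderer with `isinstance` checks. The sketch below uses local stand-ins for two of the classes and a hypothetical `pick_renderer` function; how ZenML itself consumes these types is not shown here.

```python
# Local stand-ins for two of the documented types (not the real classes).
class HTMLString(str):
    """String subclass for HTML content."""


class MarkdownString(str):
    """String subclass for Markdown content."""


def pick_renderer(content: str) -> str:
    """Hypothetical dispatch on the string subclass."""
    if isinstance(content, HTMLString):
        return "html"
    if isinstance(content, MarkdownString):
        return "markdown"
    return "plain"


report = HTMLString("<h1>Accuracy: 0.95</h1>")
notes = MarkdownString("# Accuracy\n\n0.95")

# The tagged values still behave like normal strings.
upper_report = report.upper()
```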

[Types](./types.md)