Docker integration provider for Apache Airflow workflows, enabling containerized task execution and Docker Swarm orchestration.
npx @tessl/cli install tessl/pypi-apache-airflow-providers-docker@4.4.0
# Apache Airflow Providers Docker
Docker integration provider for Apache Airflow that enables containerized task execution and orchestration. This provider allows you to run tasks in Docker containers, manage Docker Swarm services, and integrate Docker daemon operations into your Airflow workflows with comprehensive configuration options for networking, volumes, security, and resource management.
## Package Information
- **Package Name**: apache-airflow-providers-docker
- **Package Type**: pypi
- **Language**: Python
- **Installation**: `pip install apache-airflow-providers-docker`
- **Requires**: Apache Airflow >=2.10.0, docker >=7.1.0
## Core Imports

```python
from airflow.providers.docker.operators.docker import DockerOperator
from airflow.providers.docker.operators.docker_swarm import DockerSwarmOperator
from airflow.providers.docker.hooks.docker import DockerHook
from airflow.providers.docker.decorators.docker import docker_task
```
Exception handling:

```python
from airflow.providers.docker.exceptions import (
    DockerContainerFailedException,
    DockerContainerFailedSkipException,
)
```
## Basic Usage

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.docker.operators.docker import DockerOperator
from airflow.providers.docker.decorators.docker import docker_task
from docker.types import Mount

# Define DAG
dag = DAG(
    'docker_example',
    start_date=datetime(2024, 1, 1),
    schedule='@daily',
)

# Basic container execution
run_container = DockerOperator(
    task_id='run_python_container',
    image='python:3.9-slim',
    command=['python', '-c', 'print("Hello from Docker!")'],
    dag=dag,
)

# Container with volume mounting and environment variables.
# DockerOperator expects docker.types.Mount objects, not plain dicts.
process_data = DockerOperator(
    task_id='process_data',
    image='python:3.9',
    command=['python', '/app/process.py'],
    mounts=[
        Mount(source='/host/data', target='/app/data', type='bind'),
    ],
    environment={
        'ENV': 'production',
        'LOG_LEVEL': 'info',
    },
    dag=dag,
)

# Using the docker_task decorator (the call must happen in a DAG context)
@docker_task(image='python:3.9')
def containerized_function():
    import json
    result = {"message": "Processing complete", "status": "success"}
    return json.dumps(result)

with dag:
    containerized_task = containerized_function()
```
## Architecture

The provider follows Apache Airflow's standard architecture patterns:

- **Operators**: High-level task definitions that extend BaseOperator for DAG integration
- **Hooks**: Low-level API clients that handle Docker daemon connections and operations
- **Decorators**: Python function wrappers that transform regular functions into containerized tasks
- **Connection Types**: Airflow connection definitions for Docker daemon configuration
- **Exceptions**: Custom error handling for container execution failures

The provider supports both single-container execution (DockerOperator) and distributed service orchestration (DockerSwarmOperator), with comprehensive configuration options for production deployments, including TLS security, resource constraints, networking, and volume management.
## Capabilities
### Docker Container Execution
Execute commands and run tasks inside Docker containers with full control over container configuration, resource allocation, networking, and volume mounting. Supports both simple command execution and complex containerized workflows.
```python { .api }
class DockerOperator(BaseOperator):
    def __init__(
        self,
        *,
        image: str,
        command: str | list[str] | None = None,
        environment: dict | None = None,
        mounts: list[Mount] | None = None,
        **kwargs,
    ) -> None: ...

    def execute(self, context: Context) -> list[str] | str | None: ...
```
[Docker Container Operations](./docker-operations.md)
### Docker Swarm Services
Deploy and manage Docker Swarm services for distributed containerized workloads. Provides orchestration capabilities for multi-container applications with service discovery, load balancing, and scaling features.
```python { .api }
class DockerSwarmOperator(DockerOperator):
    def __init__(
        self,
        *,
        image: str,
        configs: list | None = None,
        secrets: list | None = None,
        networks: list | None = None,
        **kwargs,
    ) -> None: ...

    def execute(self, context: Context) -> None: ...
```
[Docker Swarm Orchestration](./docker-swarm.md)
### Docker API Integration
Low-level Docker API client for direct Docker daemon interactions, connection management, and custom Docker operations not covered by the operators.
```python { .api }
class DockerHook(BaseHook):
    def __init__(
        self,
        docker_conn_id: str | None = "docker_default",
        base_url: str | list[str] | None = None,
        version: str | None = None,
        **kwargs,
    ) -> None: ...

    def get_conn(self) -> APIClient: ...
    def api_client(self) -> APIClient: ...
```
[Docker API Client](./docker-api.md)
### Task Decorators
Transform Python functions into containerized tasks using the @docker_task decorator. Provides seamless integration of containerized execution with Python function workflows.
```python { .api }
def docker_task(
    image: str,
    python_command: str = "python",
    serializer: Literal["pickle", "dill", "cloudpickle"] = "pickle",
    multiple_outputs: bool | None = None,
    **kwargs
) -> TaskDecorator: ...
```
[Containerized Task Decorators](./docker-decorators.md)
### Error Handling
Custom exception classes for Docker container execution failures, providing detailed error information and logs for debugging containerized tasks.
```python { .api }
class DockerContainerFailedException(AirflowException):
    def __init__(
        self,
        message: str | None = None,
        logs: list[str | bytes] | None = None,
    ) -> None: ...

class DockerContainerFailedSkipException(AirflowSkipException):
    def __init__(
        self,
        message: str | None = None,
        logs: list[str | bytes] | None = None,
    ) -> None: ...
[Error Management](./error-handling.md)
## Types
```python { .api }
# Docker mount configuration
Mount = docker.types.Mount

# Docker device requests for GPU access
DeviceRequest = docker.types.DeviceRequest

# Docker ulimit configuration
Ulimit = docker.types.Ulimit

# Docker log configuration
LogConfig = docker.types.LogConfig

# Airflow task execution context passed to execute()
Context = dict[str, Any]
```