Apache Airflow provider package for IMAP email server integration and attachment processing
npx @tessl/cli install tessl/pypi-apache-airflow-providers-imap@3.9.00
# Apache Airflow IMAP Provider

Apache Airflow provider package for IMAP (Internet Message Access Protocol) integration, enabling workflows to connect to mail servers, search for emails with specific attachments, and download them for processing within Airflow DAGs.

## Package Information

- **Package Name**: apache-airflow-providers-imap
- **Language**: Python
- **Installation**: `pip install apache-airflow-providers-imap`
- **Airflow Version**: Requires Apache Airflow >=2.10.0

## Core Imports

Main classes and functionality:

```python
from airflow.providers.imap.hooks.imap import ImapHook
from airflow.providers.imap.sensors.imap_attachment import ImapAttachmentSensor
```

Package version information:

```python
from airflow.providers.imap import __version__
```

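When the provider may not be installed in every environment, the version can also be read from package metadata without importing Airflow. The following is a minimal sketch using only the standard library; the helper name `installed_provider_version` is hypothetical and not part of the provider.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_provider_version(dist_name: str = "apache-airflow-providers-imap"):
    """Return the installed distribution version, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the version string, or None when the provider is absent.
print(installed_provider_version())
```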
## Basic Usage

```python
from airflow.providers.imap.hooks.imap import ImapHook
from airflow.providers.imap.sensors.imap_attachment import ImapAttachmentSensor
from airflow import DAG
from datetime import datetime

# Using ImapHook to download attachments
def download_email_attachments():
    with ImapHook(imap_conn_id="my_imap_conn") as hook:
        # Check if attachment exists
        has_attachment = hook.has_mail_attachment(
            name="report.csv",
            mail_folder="INBOX"
        )

        if has_attachment:
            # Download attachments to local directory
            hook.download_mail_attachments(
                name="report.csv",
                local_output_directory="/tmp/downloads"
            )

# Using ImapAttachmentSensor to wait for attachments
dag = DAG('email_processing', start_date=datetime(2024, 1, 1))

sensor = ImapAttachmentSensor(
    task_id="wait_for_report",
    attachment_name="daily_report.xlsx",
    check_regex=False,
    mail_folder="INBOX",
    conn_id="imap_default",
    dag=dag
)
```

## Architecture

The IMAP provider follows Airflow's standard provider architecture:

- **Hooks**: Low-level interfaces for interacting with external systems (IMAP servers)
- **Sensors**: Operators that wait for specific conditions (email attachments)
- **Connection Management**: Secure connection handling with SSL/TLS support
- **Version Compatibility**: Supports both Airflow 2.x (from 2.10.0) and 3.x

## Capabilities

### IMAP Connection and Email Operations

`ImapHook` connects to an IMAP server with SSL/TLS support, searches mailboxes, detects attachments, and downloads files with path traversal protection.

```python { .api }
class ImapHook:
    def __init__(self, imap_conn_id: str = "imap_default") -> None: ...
    def get_conn(self) -> ImapHook: ...
    def has_mail_attachment(
        self,
        name: str,
        *,
        check_regex: bool = False,
        mail_folder: str = "INBOX",
        mail_filter: str = "All"
    ) -> bool: ...
    def retrieve_mail_attachments(
        self,
        name: str,
        *,
        check_regex: bool = False,
        latest_only: bool = False,
        mail_folder: str = "INBOX",
        mail_filter: str = "All",
        not_found_mode: str = "raise",
    ) -> list[tuple]: ...
    def download_mail_attachments(
        self,
        name: str,
        local_output_directory: str,
        *,
        check_regex: bool = False,
        latest_only: bool = False,
        mail_folder: str = "INBOX",
        mail_filter: str = "All",
        not_found_mode: str = "raise",
    ) -> None: ...
```

[IMAP Hooks](./imap-hooks.md)

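`retrieve_mail_attachments` returns attachments in memory instead of writing files, which suits tasks that parse the content directly. The sketch below assumes each returned tuple is `(filename, payload)` with the payload as bytes, and that the attachment is UTF-8 encoded CSV. `parse_csv_attachments` is a hypothetical helper, and the sample data stands in for a real hook call.

```python
import csv
import io


def parse_csv_attachments(attachments):
    """Map each attachment filename to its parsed CSV rows.

    Assumes `attachments` is a list of (filename, payload) tuples
    and that every payload is UTF-8 encoded CSV bytes.
    """
    parsed = {}
    for filename, payload in attachments:
        text = payload.decode("utf-8")
        parsed[filename] = list(csv.DictReader(io.StringIO(text)))
    return parsed


# Stand-in for a real call such as:
#   with ImapHook(imap_conn_id="imap_default") as hook:
#       attachments = hook.retrieve_mail_attachments(name="report.csv")
sample = [("report.csv", b"id,total\n1,10\n2,25\n")]
rows = parse_csv_attachments(sample)["report.csv"]
print(rows)
```

Keeping the parsing separate from the hook call makes the logic testable without a mail server.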
### Email Attachment Monitoring

Sensor operators that monitor email mailboxes for specific attachments, enabling event-driven workflows triggered by incoming emails.

```python { .api }
class ImapAttachmentSensor:
    def __init__(
        self,
        *,
        attachment_name,
        check_regex=False,
        mail_folder="INBOX",
        mail_filter="All",
        conn_id="imap_default",
        **kwargs,
    ) -> None: ...
    def poke(self, context) -> bool: ...
```

[IMAP Sensors](./imap-sensors.md)

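Conceptually, each `poke` is a single check of whether a matching attachment is present yet. The sketch below illustrates that idea with a stand-in hook so it runs without Airflow or a mail server. `FakeImapHook` and `poke_once` are hypothetical names, and this is an illustration of the polling pattern, not the provider's source code.

```python
import re


class FakeImapHook:
    """Stand-in for ImapHook that answers from a fixed list of attachment names."""

    def __init__(self, attachment_names):
        self.attachment_names = attachment_names

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        return False

    def has_mail_attachment(self, name, *, check_regex=False,
                            mail_folder="INBOX", mail_filter="All"):
        if check_regex:
            # Treat `name` as a regular expression matched from the start.
            return any(re.match(name, n) for n in self.attachment_names)
        return name in self.attachment_names


def poke_once(hook, attachment_name, check_regex=False):
    """One polling attempt: True once a matching attachment is present."""
    with hook as h:
        return h.has_mail_attachment(attachment_name, check_regex=check_regex)


hook = FakeImapHook(["daily_report.xlsx"])
print(poke_once(hook, "daily_report.xlsx"))
print(poke_once(hook, r"daily_.*\.xlsx", check_regex=True))
```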
## Connection Configuration

The IMAP provider uses Airflow connections with the connection type `imap`:

- **Host**: IMAP server hostname (e.g., `imap.gmail.com`)
- **Port**: Server port (optional; when omitted, the standard IMAP port for the connection mode is used, 993 with SSL and 143 without)
- **Login**: Username for authentication
- **Password**: Password for authentication
- **Extra**: JSON configuration for additional options

### SSL Configuration

```json
{
  "use_ssl": true,
  "ssl_context": "default"
}
```

- `use_ssl` (bool): Enable SSL/TLS connection (default: `true`)
- `ssl_context` (str): SSL context setting (`"default"` or `"none"`)

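One way to supply these fields is an `AIRFLOW_CONN_<CONN_ID>` environment variable holding a JSON document, a format Airflow accepts for connections. The sketch below builds such a value for the connection id `imap_default`. The hostname, login, and password are placeholders, and port 993 is the conventional IMAP-over-TLS port.

```python
import json
import os

# Placeholder values; replace with your own server and credentials.
imap_connection = {
    "conn_type": "imap",
    "host": "imap.example.com",
    "port": 993,
    "login": "reports@example.com",
    "password": "app-password",
    "extra": {"use_ssl": True, "ssl_context": "default"},
}

# Airflow looks up the connection id "imap_default" under this variable name.
os.environ["AIRFLOW_CONN_IMAP_DEFAULT"] = json.dumps(imap_connection)
```

In a real deployment the variable would normally be set by the environment or a secrets backend rather than in Python code, so that credentials stay out of the repository.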
## Error Handling

The provider reports failures and guards file writes as follows:

- **AirflowException**: Raised when attachments are not found (configurable)
- **RuntimeError**: Raised for SSL configuration errors or connection issues
- **Path Security**: Automatic protection against directory traversal attacks
- **File Security**: Symlink detection and blocking for downloaded files

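To make the path and file safeguards concrete, the sketch below shows one common way to validate an attachment name before writing it to disk. It is an illustration under stated assumptions, not the provider's implementation: `safe_output_path` is a hypothetical helper, and the containment check via `os.path.realpath` and `os.path.commonpath` is a technique chosen for this example.

```python
import os
import tempfile


def safe_output_path(output_dir: str, attachment_name: str) -> str:
    """Return the resolved target path, or raise ValueError if it is unsafe."""
    base = os.path.realpath(output_dir)
    candidate = os.path.join(base, attachment_name)
    target = os.path.realpath(candidate)
    # Reject names that resolve outside the output directory (e.g. "../x")
    # or that resolve to the directory itself (e.g. an empty name).
    if target == base or os.path.commonpath([base, target]) != base:
        raise ValueError(f"unsafe attachment name: {attachment_name!r}")
    # Reject an existing symlink at the target location.
    if os.path.islink(candidate):
        raise ValueError(f"refusing to write through symlink: {attachment_name!r}")
    return target


with tempfile.TemporaryDirectory() as tmp:
    print(safe_output_path(tmp, "report.csv"))
    try:
        safe_output_path(tmp, "../escape.csv")
    except ValueError as err:
        print("rejected:", err)
```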
### Error Modes

Attachment operations accept a `not_found_mode` argument that controls what happens when no attachment matches:

- `"raise"`: Raise `AirflowException` if no attachment is found
- `"warn"`: Log a warning message if no attachment is found
- `"ignore"`: Do nothing if no attachment is found
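The three modes can be pictured as a small dispatch on the mode string. The sketch below is an illustration only: `AttachmentNotFound` stands in for `AirflowException` so the code runs without Airflow, `handle_not_found` is a hypothetical name, and rejecting unknown mode values is an assumption of this sketch rather than documented provider behavior.

```python
import logging

log = logging.getLogger(__name__)


class AttachmentNotFound(Exception):
    """Stand-in for AirflowException so the sketch runs without Airflow."""


def handle_not_found(not_found_mode: str) -> None:
    """Apply one of the three documented modes when nothing matched."""
    if not_found_mode == "raise":
        raise AttachmentNotFound("No mail attachments found!")
    if not_found_mode == "warn":
        log.warning("No mail attachments found!")
    elif not_found_mode == "ignore":
        pass  # deliberately silent
    else:
        # Assumption of this sketch: unknown values are rejected.
        raise ValueError(f"invalid not_found_mode: {not_found_mode!r}")


handle_not_found("warn")    # logs a warning and returns
handle_not_found("ignore")  # returns silently
```

In a DAG where a report is legitimately missing on some days, passing `not_found_mode="warn"` lets the task succeed while still leaving a trace in the log.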