Provider package for Apache Airflow that enables FTP file transfer protocol operations including hooks, operators, and sensors for workflow integration.
npx @tessl/cli install tessl/pypi-apache-airflow-providers-ftp@3.13.00
# Apache Airflow FTP Provider

The Apache Airflow FTP provider package adds File Transfer Protocol (FTP) operations to data workflows. It includes hooks for managing FTP connections, operators for file transfer tasks, and sensors for monitoring file availability, which makes it useful for ETL pipelines and data integration workflows that move files over FTP.

## Package Information

- **Package Name**: apache-airflow-providers-ftp
- **Language**: Python
- **Installation**: `pip install apache-airflow-providers-ftp`
- **Minimum Airflow Version**: 2.10.0
- **Python Versions**: 3.10, 3.11, 3.12, 3.13

## Core Imports

```python
# Hook imports for FTP connections
from airflow.providers.ftp.hooks.ftp import FTPHook, FTPSHook

# Operator imports for file transfers
from airflow.providers.ftp.operators.ftp import FTPFileTransmitOperator, FTPSFileTransmitOperator, FTPOperation

# Sensor imports for file monitoring
from airflow.providers.ftp.sensors.ftp import FTPSensor, FTPSSensor
```

## Basic Usage

```python
from airflow import DAG
from airflow.providers.ftp.hooks.ftp import FTPHook
from airflow.providers.ftp.operators.ftp import FTPFileTransmitOperator, FTPOperation
from airflow.providers.ftp.sensors.ftp import FTPSensor
from datetime import datetime

# Using FTP Hook directly in a task
def transfer_files_with_hook():
    hook = FTPHook(ftp_conn_id='my_ftp_connection')

    # Download a file (remote path first, then local path)
    hook.retrieve_file('/remote/path/file.txt', '/local/path/file.txt')

    # Upload a file (remote path first, then local path)
    hook.store_file('/remote/path/upload.txt', '/local/path/upload.txt')

    # List directory contents
    files = hook.list_directory('/remote/directory')
    return files

# Using FTP Operator in a DAG
dag = DAG('ftp_example', start_date=datetime(2023, 1, 1))

# Wait for file to appear on FTP server
wait_for_file = FTPSensor(
    task_id='wait_for_data_file',
    path='/remote/data/input.csv',
    ftp_conn_id='my_ftp_connection',
    dag=dag
)

# Download the file when available
download_file = FTPFileTransmitOperator(
    task_id='download_data_file',
    ftp_conn_id='my_ftp_connection',
    operation=FTPOperation.GET,
    remote_filepath='/remote/data/input.csv',
    local_filepath='/local/data/input.csv',
    dag=dag
)

# Run the download only after the sensor succeeds
wait_for_file >> download_file
```

## Architecture

The FTP provider follows Apache Airflow's standard provider architecture:

- **Hooks**: Low-level connection and operation management
- **Operators**: Task execution with Airflow integration and templating
- **Sensors**: Conditional task triggering based on FTP resource availability
- **Connection Management**: Integration with Airflow's connection system for credential management

All components support both standard FTP and secure FTPS (FTP over SSL/TLS) protocols.

## Capabilities

### FTP Connection Management

Low-level hook classes for establishing and managing FTP connections with authentication, SSL support, and comprehensive file operations.

```python { .api }
class FTPHook(BaseHook):
    def __init__(self, ftp_conn_id: str = "ftp_default") -> None: ...
    def get_conn(self) -> ftplib.FTP: ...
    def close_conn(self) -> None: ...
    def test_connection(self) -> tuple[bool, str]: ...

class FTPSHook(FTPHook):
    def get_conn(self) -> ftplib.FTP: ...
```

[FTP Hooks](./ftp-hooks.md)

### File Transfer Operations

Operator classes for performing file uploads, downloads, and transfers between local and remote FTP servers with support for directory creation and batch operations.

```python { .api }
class FTPFileTransmitOperator(BaseOperator):
    def __init__(
        self,
        *,
        ftp_conn_id: str = "ftp_default",
        local_filepath: str | list[str],
        remote_filepath: str | list[str],
        operation: str = FTPOperation.PUT,
        create_intermediate_dirs: bool = False,
        **kwargs
    ) -> None: ...

class FTPSFileTransmitOperator(FTPFileTransmitOperator): ...

class FTPOperation:
    PUT = "put"
    GET = "get"
```

[FTP Operators](./ftp-operators.md)

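Because `local_filepath` and `remote_filepath` accept either a single string or a list, a batch transfer can be expressed by passing two lists. The sketch below assumes the operator pairs the two lists by position, so both must have the same length. The helpers `pair_paths` and `build_upload_task`, the directory names, and the task ID are all hypothetical and only illustrate the idea.

```python
import os
import posixpath


def pair_paths(local_dir: str, remote_dir: str, names: list[str]) -> tuple[list[str], list[str]]:
    """Build equal-length local and remote path lists for the given file names."""
    local_paths = [os.path.join(local_dir, name) for name in names]
    # FTP paths use forward slashes regardless of the worker's operating system.
    remote_paths = [posixpath.join(remote_dir, name) for name in names]
    return local_paths, remote_paths


def build_upload_task(dag, names: list[str]):
    """Create one operator that uploads every named file (illustrative only)."""
    # Imported lazily so pair_paths can be used without Airflow installed.
    from airflow.providers.ftp.operators.ftp import FTPFileTransmitOperator, FTPOperation

    local_paths, remote_paths = pair_paths("/local/reports", "/remote/reports", names)
    return FTPFileTransmitOperator(
        task_id="upload_reports",
        ftp_conn_id="my_ftp_connection",
        operation=FTPOperation.PUT,
        local_filepath=local_paths,
        remote_filepath=remote_paths,
        create_intermediate_dirs=True,  # ask the operator to create missing directories
        dag=dag,
    )
```

Building both lists from one list of names keeps them aligned, which avoids pairing a local file with the wrong remote destination.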
### File Monitoring

Sensor classes for waiting and monitoring file or directory availability on FTP servers with configurable error handling and retry logic.

```python { .api }
class FTPSensor(BaseSensorOperator):
    def __init__(
        self,
        *,
        path: str,
        ftp_conn_id: str = "ftp_default",
        fail_on_transient_errors: bool = True,
        **kwargs
    ) -> None: ...

class FTPSSensor(FTPSensor): ...
```

[FTP Sensors](./ftp-sensors.md)

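To make the role of `fail_on_transient_errors` more concrete, here is a minimal sketch of the kind of decision such a flag could control, written against the standard library's `ftplib` exceptions. It is not the provider's implementation: the function names, the set of reply codes treated as transient, and the treatment of `550` as "file not there yet" are assumptions chosen for illustration.

```python
import ftplib
import re
from typing import Optional

# Assumed for illustration: 4xx reply codes that usually indicate a temporary problem.
TRANSIENT_CODES = frozenset({421, 425, 426, 434, 450, 451, 452})
# Assumed for illustration: 550 commonly means the requested file is unavailable.
FILE_UNAVAILABLE = 550

_CODE_PATTERN = re.compile(r"\d{3}")


def reply_code(exc: ftplib.Error) -> Optional[int]:
    """Extract the leading three-digit FTP reply code from an ftplib error, if any."""
    match = _CODE_PATTERN.match(str(exc))
    return int(match.group()) if match else None


def should_keep_poking(exc: ftplib.Error, fail_on_transient_errors: bool = True) -> bool:
    """Return True to wait and retry later, False to let the error fail the task."""
    code = reply_code(exc)
    if code == FILE_UNAVAILABLE:
        return True  # the file is simply not there yet
    if code in TRANSIENT_CODES and not fail_on_transient_errors:
        return True  # tolerate temporary server problems
    return False
```

Under these assumptions, setting `fail_on_transient_errors=False` trades faster failure for resilience against short server outages, while unrelated errors such as a rejected login still fail immediately.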
## Connection Configuration

The FTP provider uses Airflow's connection system. Configure FTP connections in the Airflow UI or programmatically:

- **Connection Type**: `ftp`
- **Host**: FTP server hostname or IP address
- **Port**: FTP server port (default: 21)
- **Login**: Username for authentication
- **Password**: Password for authentication
- **Extra**: JSON configuration with optional parameters:
  - `"passive": true/false` - Enable/disable passive mode (default: true)

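One way to define such a connection programmatically is through an `AIRFLOW_CONN_<CONN_ID>` environment variable holding a JSON document, a format Airflow supports in the versions this package targets. The sketch below builds that JSON from the fields listed above. The host name, credentials, and the connection ID `my_ftp_connection` are placeholders, not values from the provider.

```python
import json
import os

# Placeholder values; replace them with the details of a real FTP server.
connection = {
    "conn_type": "ftp",
    "host": "ftp.example.com",
    "port": 21,
    "login": "ftp_user",
    "password": "change-me",
    "extra": {"passive": True},
}

# Airflow resolves the connection ID "my_ftp_connection" from the
# upper-cased variable name AIRFLOW_CONN_MY_FTP_CONNECTION.
os.environ["AIRFLOW_CONN_MY_FTP_CONNECTION"] = json.dumps(connection)
```

In a real deployment the variable would normally be exported in the environment of the scheduler and workers rather than set from Python; the assignment here only shows the expected shape of the value.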
## Types

```python { .api }
# Connection context manager support
FTPHook.__enter__() -> FTPHook
FTPHook.__exit__(exc_type: Any, exc_val: Any, exc_tb: Any) -> None

# Error handling types
tuple[bool, str]  # test_connection return type
```
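
The context manager methods above suggest that a hook can be used in a `with` statement, the expectation being that the connection is closed when the block exits. A minimal sketch follows. The function `list_remote_directory` and its `hook_factory` parameter are hypothetical; the parameter exists only so the function can be exercised with a stand-in class when no FTP server is available.

```python
from typing import Any, Callable, Optional


def list_remote_directory(
    path: str,
    ftp_conn_id: str = "ftp_default",
    hook_factory: Optional[Callable[..., Any]] = None,
) -> list[str]:
    """List a remote directory, releasing the hook when the with-block exits."""
    if hook_factory is None:
        # Imported lazily so the function can be defined without Airflow installed.
        from airflow.providers.ftp.hooks.ftp import FTPHook

        hook_factory = FTPHook

    with hook_factory(ftp_conn_id=ftp_conn_id) as hook:
        return hook.list_directory(path)
```

Called as `list_remote_directory('/remote/directory', ftp_conn_id='my_ftp_connection')`, this avoids a separate `close_conn()` call, including when `list_directory` raises.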