tessl/pypi-airflow

Core operators and hooks for Apache Airflow workflow orchestration, including BashOperator, PythonOperator, EmailOperator, and essential database and HTTP connectivity.

Describes pkg:pypi/airflow@1.6.x

To install, run

npx @tessl/cli install tessl/pypi-airflow@1.6.0

# Apache Airflow

A platform to programmatically author, schedule, and monitor workflows. Airflow lets you author workflows as Directed Acyclic Graphs (DAGs) of tasks and execute them on an array of workers while respecting the specified dependencies, and it provides rich monitoring and troubleshooting capabilities.

## Package Information

- **Package Name**: airflow
- **Version**: 1.6.0
- **Language**: Python
- **Installation**: `pip install airflow`

## Core Imports

Basic operator imports:

```python
from airflow.operators.bash_operator import BashOperator
from airflow.operators.python_operator import PythonOperator, BranchPythonOperator, ShortCircuitOperator
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.email_operator import EmailOperator
from airflow.operators.mysql_operator import MySqlOperator
from airflow.operators.postgres_operator import PostgresOperator
from airflow.operators.sqlite_operator import SqliteOperator
from airflow.operators.http_operator import SimpleHttpOperator
from airflow.operators.subdag_operator import SubDagOperator
from airflow.operators.sensors import BaseSensorOperator, SqlSensor, HdfsSensor
```

Hook imports:

```python
from airflow.hooks.base_hook import BaseHook
from airflow.hooks.dbapi_hook import DbApiHook
from airflow.hooks.http_hook import HttpHook
```

Core utilities:

```python
from airflow.models import BaseOperator, DAG
from airflow.utils import State, TriggerRule, AirflowException, apply_defaults
```

## Basic Usage

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator
from airflow.operators.python_operator import PythonOperator
from airflow.operators.dummy_operator import DummyOperator

# Default arguments inherited by every task in the DAG
default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2023, 1, 1),
    'retries': 1,
    'retry_delay': timedelta(minutes=5)
}

# Define DAG
dag = DAG(
    'example_workflow',
    default_args=default_args,
    schedule_interval=timedelta(days=1)
)

# Define tasks
start_task = DummyOperator(
    task_id='start',
    dag=dag
)

def process_data(**context):
    # 'ds' is the execution date as a YYYY-MM-DD string
    print("Processing data for %s" % context['ds'])
    return "Data processed successfully"

python_task = PythonOperator(
    task_id='process_data',
    python_callable=process_data,
    provide_context=True,
    dag=dag
)

bash_task = BashOperator(
    task_id='cleanup',
    bash_command='echo "Cleanup completed"',
    dag=dag
)

# Set dependencies: start -> process_data -> cleanup
start_task.set_downstream(python_task)
python_task.set_downstream(bash_task)
```

## Architecture

Apache Airflow's architecture centers around several key concepts:

- **Operators**: Define atomic units of work (tasks) in workflows
- **Hooks**: Provide interfaces to external systems and services
- **DAGs**: Directed Acyclic Graphs that define workflow structure and dependencies
- **Task Instances**: Runtime representations of tasks with execution context

**Operators vs Hooks**: Operators focus on task execution and workflow logic, while hooks abstract external system connectivity. Operators often use hooks internally for data source interactions.
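
A minimal sketch of that division of labor. The class is hypothetical, and it assumes `HttpHook` accepts its connection id as a constructor argument (see System Integration Hooks below):

```python
from airflow.models import BaseOperator
from airflow.hooks.http_hook import HttpHook

class PingOperator(BaseOperator):
    """Hypothetical operator that delegates connectivity to a hook."""

    def execute(self, context):
        # The operator owns the workflow step; the hook owns connection details
        hook = HttpHook(http_conn_id='http_default')
        response = hook.run('ping')
        return response.status_code
```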

**Task Lifecycle**: Tasks progress through states (queued → running → success/failed) with support for retries, upstream dependency checking, and conditional execution via trigger rules.
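
For example, a final cleanup task can be made to run even when an upstream task fails. A short sketch, reusing the `dag` from Basic Usage and assuming `BaseOperator` accepts a `trigger_rule` keyword in this release:

```python
from airflow.operators.dummy_operator import DummyOperator
from airflow.utils import TriggerRule

# Runs once all upstream tasks have finished, regardless of their outcome
always_cleanup = DummyOperator(
    task_id='always_cleanup',
    trigger_rule=TriggerRule.ALL_DONE,
    dag=dag
)
```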

## Capabilities

### Core Task Operators

Essential operators for task execution including bash commands, Python functions, workflow branching, and email notifications. These operators form the building blocks of most Airflow workflows.

```python { .api }
class BashOperator(BaseOperator):
    def __init__(self, bash_command, xcom_push=False, env=None, **kwargs): ...

class PythonOperator(BaseOperator):
    def __init__(self, python_callable, op_args=[], op_kwargs={}, provide_context=False, **kwargs): ...

class DummyOperator(BaseOperator):
    def __init__(self, **kwargs): ...

class EmailOperator(BaseOperator):
    def __init__(self, to, subject, html_content, files=None, **kwargs): ...
```

[Core Operators](./operators.md)
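
As a usage sketch, `BranchPythonOperator` picks one downstream path at runtime by returning its task_id; the task ids here are illustrative and `dag` is the object from Basic Usage:

```python
from airflow.operators.python_operator import BranchPythonOperator
from airflow.operators.dummy_operator import DummyOperator

def choose_branch(**context):
    # Return the task_id of the branch to follow; the other path is skipped
    if context['execution_date'].weekday() < 5:
        return 'weekday_path'
    return 'weekend_path'

branch = BranchPythonOperator(
    task_id='branch',
    python_callable=choose_branch,
    provide_context=True,
    dag=dag
)

weekday_path = DummyOperator(task_id='weekday_path', dag=dag)
weekend_path = DummyOperator(task_id='weekend_path', dag=dag)
branch.set_downstream(weekday_path)
branch.set_downstream(weekend_path)
```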

### Database Operations

SQL execution operators for various database systems including MySQL, PostgreSQL, and SQLite with connection management, parameter binding, and transaction control.

```python { .api }
class MySqlOperator(BaseOperator):
    def __init__(self, sql, mysql_conn_id='mysql_default', parameters=None, **kwargs): ...

class PostgresOperator(BaseOperator):
    def __init__(self, sql, postgres_conn_id='postgres_default', autocommit=False, parameters=None, **kwargs): ...

class SqliteOperator(BaseOperator):
    def __init__(self, sql, sqlite_conn_id='sqlite_default', parameters=None, **kwargs): ...
```

[Database Operators](./operators.md#database-operations)
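
For example, a parameterized insert against MySQL, assuming a connection named `mysql_default` is configured; the table name is illustrative:

```python
from airflow.operators.mysql_operator import MySqlOperator

load_summary = MySqlOperator(
    task_id='load_summary',
    mysql_conn_id='mysql_default',
    # Parameters are bound by the DB-API driver rather than string-formatted
    sql="INSERT INTO daily_summary (ds) VALUES (%(ds)s)",
    parameters={'ds': '2023-01-01'},
    dag=dag
)
```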

### HTTP and Web Operations

Execute HTTP requests against external APIs and web services with response validation and customizable request parameters.

```python { .api }
class SimpleHttpOperator(BaseOperator):
    def __init__(self, endpoint, method='POST', data=None, headers=None,
                 response_check=None, http_conn_id='http_default', **kwargs): ...
```

[HTTP Operations](./operators.md#http-operations)
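
A usage sketch with an illustrative endpoint, assuming the `http_default` connection points at the target service and that `response_check` receives a requests-style response object:

```python
from airflow.operators.http_operator import SimpleHttpOperator

call_api = SimpleHttpOperator(
    task_id='call_api',
    http_conn_id='http_default',
    endpoint='api/v1/refresh',
    method='POST',
    data='{"source": "airflow"}',
    headers={'Content-Type': 'application/json'},
    # The task fails unless response_check returns True
    response_check=lambda response: response.status_code == 200,
    dag=dag
)
```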

### Sensor Operations

Monitor external systems and wait for conditions to be met with configurable polling intervals and timeout handling.

```python { .api }
class BaseSensorOperator(BaseOperator):
    def __init__(self, poke_interval=60, timeout=60*60*24*7, **kwargs): ...

class SqlSensor(BaseSensorOperator):
    def __init__(self, conn_id, sql, **kwargs): ...

class HdfsSensor(BaseSensorOperator):
    def __init__(self, filepath, hdfs_conn_id='hdfs_default', **kwargs): ...
```

[Sensor Operations](./operators.md#sensor-operations)
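
For instance, a `SqlSensor` that blocks until its query returns a truthy first cell; the connection id and query are illustrative:

```python
from airflow.operators.sensors import SqlSensor

wait_for_rows = SqlSensor(
    task_id='wait_for_rows',
    conn_id='mysql_default',
    # Keeps poking while the count is zero; succeeds once rows appear
    sql="SELECT COUNT(*) FROM staging_events",
    poke_interval=300,    # re-check every 5 minutes
    timeout=60 * 60 * 6,  # give up after 6 hours
    dag=dag
)
```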

### Workflow Composition

Create complex workflows with sub-DAGs for modular and reusable workflow components.

```python { .api }
class SubDagOperator(BaseOperator):
    def __init__(self, subdag, executor=DEFAULT_EXECUTOR, **kwargs): ...
```

[Workflow Composition](./operators.md#workflow-composition)
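
A sketch of the usual factory pattern (the helper is illustrative); by convention the sub-DAG's id must be `<parent_dag_id>.<task_id>`:

```python
from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.subdag_operator import SubDagOperator

def make_subdag(parent_dag_id, child_id, args):
    # The sub-DAG id '<parent>.<child>' ties it to its SubDagOperator task
    subdag = DAG('%s.%s' % (parent_dag_id, child_id), default_args=args)
    DummyOperator(task_id='inner_task', dag=subdag)
    return subdag

section = SubDagOperator(
    task_id='section',
    subdag=make_subdag('example_workflow', 'section', default_args),
    dag=dag
)
```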

### System Integration Hooks

Hooks provide standardized interfaces for connecting to external systems including databases, HTTP APIs, and custom services with built-in connection management and error handling.

```python { .api }
class BaseHook:
    @classmethod
    def get_connection(cls, conn_id): ...
    @classmethod
    def get_connections(cls, conn_id): ...

class DbApiHook(BaseHook):
    def get_records(self, sql, parameters=None): ...
    def run(self, sql, autocommit=False, parameters=None): ...

class HttpHook(BaseHook):
    def run(self, endpoint, data=None, headers=None, extra_options=None): ...
```

[System Hooks](./hooks.md)
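
As a usage sketch, a task callable can resolve a stored connection through `BaseHook.get_connection` instead of hard-coding credentials; the connection id is illustrative, and `host`/`port` are standard Connection fields:

```python
from airflow.hooks.base_hook import BaseHook

def describe_connection(**context):
    # Looks up connection metadata stored under 'mysql_default'
    conn = BaseHook.get_connection('mysql_default')
    print("host=%s port=%s" % (conn.host, conn.port))
```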

### Foundation Classes and Utilities

Base classes and utility functions that provide the core framework for operator development, state management, error handling, and workflow control.

```python { .api }
class BaseOperator:
    def __init__(self, task_id, owner='airflow', retries=0, **kwargs): ...
    def execute(self, context): ...

class State:
    QUEUED = "queued"
    RUNNING = "running"
    SUCCESS = "success"
    FAILED = "failed"

@apply_defaults
def operator_constructor(self, **kwargs): ...
```

[Core Framework](./core.md)
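
To round out the framework, a minimal custom-operator sketch (the class is hypothetical) showing how `apply_defaults` and `execute` fit together:

```python
from airflow.models import BaseOperator
from airflow.utils import apply_defaults

class GreetOperator(BaseOperator):
    """Hypothetical operator illustrating the BaseOperator contract."""

    @apply_defaults
    def __init__(self, name, *args, **kwargs):
        # apply_defaults merges the DAG's default_args into this constructor call
        super(GreetOperator, self).__init__(*args, **kwargs)
        self.name = name

    def execute(self, context):
        # 'context' carries runtime metadata such as the execution date string 'ds'
        print("Hello %s on %s" % (self.name, context['ds']))
```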