Apache Flink is an open source stream processing framework with powerful stream- and batch-processing capabilities
npx @tessl/cli install tessl/maven-org-apache-flink--flink-parent@2.1.0
# Apache Flink

Apache Flink is a distributed stream processing framework that provides unified batch and stream processing capabilities with low-latency, high-throughput data processing. It offers elegant APIs in Java for building streaming and batch applications, supports event time processing with exactly-once guarantees, provides flexible windowing mechanisms, and includes advanced features like fault tolerance, natural back-pressure, and custom memory management.

## Package Information

- **Package Name**: flink-parent
- **Package Type**: maven
- **Language**: Java
- **Group ID**: org.apache.flink
- **Artifact ID**: flink-parent
- **Version**: 2.1.0

### Installation

Add the appropriate Flink dependencies to your `pom.xml`:

**DataStream API (Traditional):**
```xml
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-java</artifactId>
    <version>2.1.0</version>
</dependency>
```

**DataStream API (New v2):**
```xml
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-datastream-api</artifactId>
    <version>2.1.0</version>
</dependency>
```

**Table API & SQL:**
```xml
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-api-java</artifactId>
    <version>2.1.0</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-table-runtime</artifactId>
    <version>2.1.0</version>
</dependency>
```

**Core APIs:**
```xml
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-core</artifactId>
    <version>2.1.0</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-core-api</artifactId>
    <version>2.1.0</version>
</dependency>
```

**For complete applications, also include:**
```xml
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-clients</artifactId>
    <version>2.1.0</version>
</dependency>
```

## Core Imports

For DataStream API (new v2):
```java
import org.apache.flink.datastream.api.ExecutionEnvironment;
import org.apache.flink.datastream.api.stream.DataStream;
import org.apache.flink.datastream.api.function.OneInputStreamProcessFunction;
```

For traditional DataStream API:
```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.ProcessFunction;
```

For Table API:
```java
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.Table;
```

## Basic Usage

### Stream Processing Example

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.api.common.functions.MapFunction;

// Create execution environment
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

// Create a data stream from a source
DataStream<String> stream = env.socketTextStream("localhost", 9999);

// Transform the data
DataStream<String> transformed = stream
    .map(new MapFunction<String, String>() {
        @Override
        public String map(String value) throws Exception {
            return value.toUpperCase();
        }
    });

// Add a sink
transformed.print();

// Execute the program
env.execute("Basic Stream Processing");
```
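
The anonymous `MapFunction` above can also be written as a lambda. A sketch of the same transformation; note that `returns(...)` may be needed because Java erases the lambda's generic types:

```java
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
DataStream<String> stream = env.socketTextStream("localhost", 9999);

// Lambda equivalent of the MapFunction above; returns(...) supplies the
// result type that erasure hides from Flink's type extractor.
DataStream<String> upperCased = stream
    .map(value -> value.toUpperCase())
    .returns(Types.STRING);
```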
### Table API Example

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.Table;

// Create table environment (create() requires EnvironmentSettings)
TableEnvironment tableEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());

// Create table from source
tableEnv.executeSql("CREATE TABLE Orders (" +
    "order_id BIGINT, " +
    "product STRING, " +
    "amount DECIMAL(10,2)" +
    ") WITH (...)");

// Query the table
Table result = tableEnv.sqlQuery(
    "SELECT product, SUM(amount) as total_amount " +
    "FROM Orders " +
    "GROUP BY product"
);

// Execute and print results
result.execute().print();
```
## Architecture

Apache Flink is built around several key architectural components:

- **Execution Environments**: Entry points for creating Flink programs (`StreamExecutionEnvironment`, `TableEnvironment`)
- **DataStream API**: Two generations - traditional (`flink-streaming-java`) and new v2 (`flink-datastream-api`)
- **Table API & SQL**: Declarative programming model for relational operations
- **State Management**: Both synchronous and asynchronous state APIs for fault-tolerant stateful processing
- **Windowing System**: Complete event-time and processing-time windowing with triggers and evictors
- **Type System**: Strong type safety with generic type preservation and serialization
- **Connector Framework**: Unified source and sink abstractions for data integration
- **Execution Runtime**: Distributed execution with fault tolerance, checkpointing, and savepoints
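
As a sketch of how several of these components fit together, using the traditional DataStream API (assuming `fromData` and `Duration`-based window assigners, as in recent Flink versions):

```java
import java.time.Duration;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;

// Execution environment (entry point) -> keyed partitioning (state) ->
// event-time window (windowing system) -> aggregation -> sink.
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
DataStream<Tuple2<String, Integer>> counts = env
    .fromData(Tuple2.of("a", 1), Tuple2.of("b", 2), Tuple2.of("a", 3))
    .keyBy(t -> t.f0)
    .window(TumblingEventTimeWindows.of(Duration.ofMinutes(1)))
    .sum(1);
counts.print();
```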

## Capabilities

### Core Functions & Types

Fundamental function interfaces and type system that form the building blocks for all Flink applications. Includes user-defined functions, tuple system, and core abstractions.

```java { .api }
// Core function interfaces
interface Function {}

interface MapFunction<T, O> extends Function {
    O map(T value) throws Exception;
}

interface ReduceFunction<T> extends Function {
    T reduce(T value1, T value2) throws Exception;
}

// Tuple system
class Tuple2<T0, T1> {
    public T0 f0;
    public T1 f1;
    public Tuple2(T0 f0, T1 f1);
}
```
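
A minimal sketch of these building blocks in use (the `toLengthPair` function is hypothetical, not part of the Flink API):

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple2;

public class LengthPairExample {
    public static void main(String[] args) throws Exception {
        // A MapFunction pairing each string with its length as a Tuple2.
        MapFunction<String, Tuple2<String, Integer>> toLengthPair =
            value -> Tuple2.of(value, value.length());

        Tuple2<String, Integer> pair = toLengthPair.map("flink");
        System.out.println(pair.f0 + " -> " + pair.f1); // flink -> 5
    }
}
```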

[Core Functions & Types](./core-functions.md)

### State Management

Comprehensive state management API supporting both synchronous and asynchronous operations. Includes value state, list state, map state, and specialized state types for different use cases.

```java { .api }
interface State {}

interface ValueState<T> extends State {
    T value() throws Exception;
    void update(T value) throws Exception;
}

interface ListState<T> extends State {
    Iterable<T> get() throws Exception;
    void add(T value) throws Exception;
}
```
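
A hedged sketch of keyed state in use, assuming the traditional API's `KeyedProcessFunction` (Flink 2.x replaces `open(Configuration)` with `open(OpenContext)`):

```java
import org.apache.flink.api.common.functions.OpenContext;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Emits a running count per key, stored in checkpointed ValueState.
public class CountPerKey extends KeyedProcessFunction<String, String, Long> {
    private transient ValueState<Long> count;

    @Override
    public void open(OpenContext openContext) {
        // The descriptor names the state so it can be restored after failure.
        count = getRuntimeContext().getState(
            new ValueStateDescriptor<>("count", Long.class));
    }

    @Override
    public void processElement(String value, Context ctx, Collector<Long> out)
            throws Exception {
        Long current = count.value(); // null on the first element for this key
        long next = (current == null ? 0L : current) + 1L;
        count.update(next);
        out.collect(next);
    }
}
```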

[State Management](./state-management.md)

### DataStream API (New v2)

Next-generation DataStream API with improved type safety, better performance, and enhanced functionality. Provides a streamlined programming model for stream processing applications.

```java { .api }
interface ExecutionEnvironment {
    <T> DataStream<T> fromSource(Source<T> source);
}

interface DataStream<T> {
    <OUT> DataStream<OUT> process(OneInputStreamProcessFunction<T, OUT> function);
    <K> KeyedPartitionStream<K, T> keyBy(KeySelector<T, K> keySelector);
}
```

[DataStream API v2](./datastream-v2.md)

### DataStream API (Traditional)

Traditional DataStream API providing comprehensive stream processing capabilities with windowing, state management, and complex event processing features.

```java { .api }
class StreamExecutionEnvironment {
    static StreamExecutionEnvironment getExecutionEnvironment();
    <T> DataStream<T> fromSource(Source<T> source, WatermarkStrategy<T> watermarkStrategy, String sourceName);
    JobExecutionResult execute(String jobName) throws Exception;
}
```

[DataStream API Traditional](./datastream-traditional.md)

### Table API & SQL

Declarative programming model for relational data processing with SQL support, catalog integration, and comprehensive type system for structured data operations.

```java { .api }
interface TableEnvironment {
    Table sqlQuery(String query);
    TableResult executeSql(String statement);
    Table from(String path);
}

interface Table {
    Table select(Expression... fields);
    Table where(Expression predicate);
    TableResult execute();
}
```
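
A brief sketch of the fluent `select`/`where` style (hedged; `fromValues` and the `$` expression helper are assumed, as in recent Flink versions):

```java
import static org.apache.flink.table.api.Expressions.$;

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.types.Row;

TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inBatchMode());

// Inline rows stand in for a real connector-backed table; columns of
// fromValues default to f0, f1, ...
Table orders = tEnv.fromValues(Row.of("apple", 3), Row.of("banana", 2));
Table result = orders
    .where($("f1").isGreater(2))
    .select($("f0"), $("f1"));
result.execute().print();
```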

[Table API & SQL](./table-api.md)

### Windowing System

Complete windowing system for time-based and count-based data aggregation, supporting event time and processing time semantics with customizable triggers and evictors.

```java { .api }
class TumblingEventTimeWindows extends WindowAssigner<Object, TimeWindow> {
    static TumblingEventTimeWindows of(Duration size);
}

class SlidingEventTimeWindows extends WindowAssigner<Object, TimeWindow> {
    static SlidingEventTimeWindows of(Duration size, Duration slide);
}
```
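
For example, a 10-minute window sliding every 5 minutes assigns each element to two overlapping windows (size / slide). A sketch, assuming `fromData` is available:

```java
import java.time.Duration;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.SlidingEventTimeWindows;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
DataStream<Tuple2<String, Integer>> readings =
    env.fromData(Tuple2.of("sensor-1", 20), Tuple2.of("sensor-1", 22));

// Max reading per sensor over overlapping 10-minute event-time windows.
readings
    .keyBy(t -> t.f0)
    .window(SlidingEventTimeWindows.of(Duration.ofMinutes(10), Duration.ofMinutes(5)))
    .max(1)
    .print();
```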

[Windowing System](./windowing.md)

### Connector Framework

Unified connector framework for integrating with external systems, supporting both source and sink operations with exactly-once processing guarantees.

```java { .api }
interface Source<T> {
    SourceReader<T, ?> createReader(SourceReaderContext readerContext);
}

interface Sink<InputT> {
    SinkWriter<InputT> createWriter(WriterInitContext context);
}
```

[Connector Framework](./connectors.md)

### Configuration & Utilities

Configuration system and utility classes for program execution, memory management, parameter handling, and system integration.

```java { .api }
class Configuration {
    <T> T get(ConfigOption<T> option);
    <T> Configuration set(ConfigOption<T> option, T value);
}

class MemorySize {
    static MemorySize parse(String text);
    long getBytes();
}
```
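
A short sketch of typed configuration access (`TaskManagerOptions.TOTAL_PROCESS_MEMORY` is one example of a built-in `ConfigOption`; Flink's size units are 1024-based):

```java
import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.MemorySize;
import org.apache.flink.configuration.TaskManagerOptions;

// ConfigOptions carry their key, type, and default, so access is type-safe.
Configuration conf = new Configuration();
conf.set(TaskManagerOptions.TOTAL_PROCESS_MEMORY, MemorySize.parse("1728m"));
MemorySize total = conf.get(TaskManagerOptions.TOTAL_PROCESS_MEMORY);

// MemorySize parses human-readable sizes into bytes.
long bytes = MemorySize.parse("64mb").getBytes(); // 67108864
```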

[Configuration & Utilities](./configuration.md)