
tessl/maven-org-apache-flink--flink-streaming-scala_2-11


- **Workspace**: tessl
- **Visibility**: Public
- **Describes**: `mavenpkg:maven/org.apache.flink/flink-streaming-scala_2.11@1.14.x`

To install, run:

```
npx @tessl/cli install tessl/maven-org-apache-flink--flink-streaming-scala_2-11@1.14.0
```

# Apache Flink Streaming Scala API

Apache Flink Streaming Scala API provides elegant and fluent Scala APIs for building high-throughput, low-latency stream processing applications with fault tolerance and exactly-once processing guarantees. This library wraps Flink's Java DataStream API with Scala-idiomatic interfaces, offering type-safe streaming data processing with functional programming constructs.

## Package Information

- **Package Name**: flink-streaming-scala_2.11
- **Package Type**: Maven
- **Language**: Scala
- **Installation**: Add to your Maven `pom.xml`:

```xml
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-scala_2.11</artifactId>
  <version>1.14.6</version>
</dependency>
```

For sbt:

```scala
// %% appends the Scala binary suffix (_2.11) automatically,
// so the artifact name is used without it.
libraryDependencies += "org.apache.flink" %% "flink-streaming-scala" % "1.14.6"
```

## Core Imports

```scala
// The wildcard import brings in StreamExecutionEnvironment and the
// implicit TypeInformation instances most transformations require.
import org.apache.flink.streaming.api.scala._
```

## Basic Usage

```scala
import org.apache.flink.streaming.api.scala._

// Create the execution environment
val env = StreamExecutionEnvironment.getExecutionEnvironment

// Create a data stream from a collection
val dataStream = env.fromCollection(List(1, 2, 3, 4, 5))

// Transform the data
val result = dataStream
  .map(_ * 2)
  .filter(_ > 5)
  .keyBy(identity)
  .sum(0)

// Add a sink and execute
result.print()
env.execute("Basic Flink Job")
```

## Architecture

The Flink Streaming Scala API is built around several key components:

- **Execution Environment**: Entry point for creating and configuring streaming jobs
- **DataStream**: Core abstraction for unbounded streams of data with transformation operations
- **Keyed Streams**: Partitioned streams enabling stateful operations and aggregations
- **Windowing**: Time-based and count-based grouping of stream elements
- **Functions**: User-defined functions for custom processing logic
- **Type System**: Automatic TypeInformation generation for Scala types

## Capabilities

### Stream Execution Environment

Main entry point for creating streaming applications and configuring execution parameters such as parallelism, checkpointing, and state backends.

```scala { .api }
object StreamExecutionEnvironment {
  // Factory method on the companion object, not an instance method
  def getExecutionEnvironment: StreamExecutionEnvironment
}

class StreamExecutionEnvironment(javaEnv: JavaEnv) {
  def setParallelism(parallelism: Int): Unit
  def enableCheckpointing(interval: Long): StreamExecutionEnvironment
}
```

[Stream Execution Environment](./stream-execution-environment.md)
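As a sketch of how these calls fit together (the parallelism and checkpoint interval values below are illustrative, not defaults):

```scala
import org.apache.flink.streaming.api.scala._

// Companion-object factory: yields a local environment in the IDE
// and the cluster environment when the job is submitted.
val env = StreamExecutionEnvironment.getExecutionEnvironment

// Default parallelism for operators that do not set their own.
env.setParallelism(4)

// Take a checkpoint every 10 seconds (the interval is in milliseconds).
env.enableCheckpointing(10000L)
```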

### Data Stream Operations

Core streaming operations for transforming, filtering, and processing unbounded data streams with type safety and functional programming patterns.

```scala { .api }
class DataStream[T] {
  def map[R](mapper: T => R): DataStream[R]
  def filter(predicate: T => Boolean): DataStream[T]
  def keyBy[K](keySelector: T => K): KeyedStream[T, K]
  def union(otherStreams: DataStream[T]*): DataStream[T]
}
```

[Data Stream](./data-stream.md)
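A short sketch combining these operations (the numeric streams are illustrative; `union` requires streams of the same element type):

```scala
import org.apache.flink.streaming.api.scala._

val env = StreamExecutionEnvironment.getExecutionEnvironment

// union merges streams of the same element type into one stream.
val a = env.fromElements(1, 2, 3)
val b = env.fromElements(10, 20, 30)
val merged: DataStream[Int] = a.union(b)

// map and filter as declared in the signatures above.
val evens = merged
  .map(_ + 1)
  .filter(_ % 2 == 0)

evens.print()
env.execute("union example")
```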

### Keyed Stream Operations

Partitioned stream operations enabling stateful computations, aggregations, and key-based processing with state management.

```scala { .api }
class KeyedStream[T, K] {
  def sum(field: Int): DataStream[T]
  def reduce(reducer: (T, T) => T): DataStream[T]
  def window[W <: Window](assigner: WindowAssigner[T, W]): WindowedStream[T, K, W]
  def process[R](function: KeyedProcessFunction[K, T, R]): DataStream[R]
}
```

[Keyed Stream](./keyed-stream.md)
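For example, a per-key running total with `reduce` (the word-count pairs are hypothetical):

```scala
import org.apache.flink.streaming.api.scala._

val env = StreamExecutionEnvironment.getExecutionEnvironment

// Hypothetical (word, count) records.
val words = env.fromElements(("a", 1), ("b", 1), ("a", 1))

// keyBy partitions by word; reduce emits an updated running total per key.
val counts = words
  .keyBy(_._1)
  .reduce((l, r) => (l._1, l._2 + r._2))

counts.print()
env.execute("keyed reduce")
```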

### Windowing Operations

Time-based and count-based grouping of stream elements for aggregations and computations over bounded sets of data.

```scala { .api }
class WindowedStream[T, K, W <: Window] {
  def reduce(reducer: (T, T) => T): DataStream[T]
  def aggregate[ACC, R](aggregateFunction: AggregateFunction[T, ACC, R]): DataStream[R]
  def apply[R](function: WindowFunction[T, R, K, W]): DataStream[R]
}
```

[Windowing](./windowing.md)
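A sketch of a tumbling processing-time window with an incremental `reduce` (assuming the standard window assigners shipped with Flink 1.14):

```scala
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows
import org.apache.flink.streaming.api.windowing.time.Time

val env = StreamExecutionEnvironment.getExecutionEnvironment

val sums = env
  .fromElements(("k", 1), ("k", 2))
  .keyBy(_._1)
  // Group each key's elements into 5-second tumbling windows.
  .window(TumblingProcessingTimeWindows.of(Time.seconds(5)))
  // Incremental aggregation: one running value per key and window.
  .reduce((l, r) => (l._1, l._2 + r._2))

sums.print()
env.execute("windowed sum")
```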

### Stream Joining Operations

Operations for combining multiple data streams based on keys, time windows, or custom join conditions. The API is a builder chain: `where` on the first stream's key returns a `Where`, `equalTo` on the second stream's key returns an `EqualTo`, and `window` completes the join.

```scala { .api }
class JoinedStreams[T1, T2] {
  def where[KEY](keySelector: T1 => KEY): Where[T1, T2, KEY]
}

class Where[T1, T2, KEY] {
  def equalTo(keySelector: T2 => KEY): EqualTo[T1, T2, KEY]
}

class EqualTo[T1, T2, KEY] {
  def window[W <: Window](assigner: WindowAssigner[TaggedUnion[T1, T2], W]): WithWindow[T1, T2, KEY, W]
}
```

[Joining](./joining.md)
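The builder chain in practice: a windowed join of two hypothetical streams keyed on their first field:

```scala
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows
import org.apache.flink.streaming.api.windowing.time.Time

val env = StreamExecutionEnvironment.getExecutionEnvironment

val orders = env.fromElements(("u1", "order-1"))
val users  = env.fromElements(("u1", "Alice"))

// Elements sharing a key within the same window are paired up.
val joined = orders
  .join(users)
  .where(_._1)   // key of the first stream
  .equalTo(_._1) // key of the second stream
  .window(TumblingProcessingTimeWindows.of(Time.seconds(5)))
  .apply((o, u) => (u._2, o._2))

joined.print()
env.execute("window join")
```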

### Async Operations

Asynchronous I/O operations for non-blocking external system interactions with configurable timeouts and capacity management.

```scala { .api }
object AsyncDataStream {
  def unorderedWait[IN, OUT](
    stream: DataStream[IN],
    function: AsyncFunction[IN, OUT],
    timeout: Long,
    timeUnit: TimeUnit
  ): DataStream[OUT]
}
```

[Async Operations](./async-operations.md)
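A sketch of a non-blocking enrichment step; `fetchScore` is a hypothetical stand-in for a real asynchronous client call, while `AsyncFunction` and `ResultFuture` come from `org.apache.flink.streaming.api.scala.async`:

```scala
import java.util.concurrent.TimeUnit
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.scala.async.{AsyncFunction, ResultFuture}

class EnrichFunction extends AsyncFunction[String, (String, Int)] {
  override def asyncInvoke(id: String, resultFuture: ResultFuture[(String, Int)]): Unit =
    // Complete the result future when the async call finishes.
    fetchScore(id).foreach(score => resultFuture.complete(Iterable((id, score))))

  // Placeholder for a non-blocking database or service client.
  private def fetchScore(id: String): Future[Int] = Future(id.length)
}

val env = StreamExecutionEnvironment.getExecutionEnvironment
val ids = env.fromElements("a", "bb")

// Results may be emitted out of order; requests time out after 5 seconds.
val enriched = AsyncDataStream.unorderedWait(ids, new EnrichFunction, 5000L, TimeUnit.MILLISECONDS)
```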

### User-Defined Functions

Interfaces for creating custom processing functions, including window functions, process functions, and rich functions with lifecycle management.

```scala { .api }
abstract class ProcessFunction[I, O] {
  def processElement(value: I, ctx: ProcessFunction[I, O]#Context, out: Collector[O]): Unit
}

trait WindowFunction[IN, OUT, KEY, W <: Window] {
  def apply(key: KEY, window: W, input: Iterable[IN], out: Collector[OUT]): Unit
}
```

[Functions](./functions.md)
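As a sketch, a `WindowFunction` that emits one count per key and window (using the Scala-API trait from `org.apache.flink.streaming.api.scala.function`):

```scala
import org.apache.flink.streaming.api.scala.function.WindowFunction
import org.apache.flink.streaming.api.windowing.windows.TimeWindow
import org.apache.flink.util.Collector

class CountPerWindow extends WindowFunction[(String, Int), (String, Int), String, TimeWindow] {
  // Called once per key and window, with all buffered elements of that window.
  override def apply(key: String, window: TimeWindow,
                     input: Iterable[(String, Int)], out: Collector[(String, Int)]): Unit =
    out.collect((key, input.size))
}
```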

## Types

```scala { .api }
// Core stream types
class DataStream[T]
class KeyedStream[T, K]
class WindowedStream[T, K, W <: Window]
class AllWindowedStream[T, W <: Window]
class ConnectedStreams[T1, T2]
class BroadcastConnectedStream[IN1, IN2]

// Environment and configuration
class StreamExecutionEnvironment
class ExecutionConfig
class CheckpointConfig

// Output and utility types
class OutputTag[T]
trait CloseableIterator[T]
```
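As an example of the utility types, `OutputTag` routes records to a side-output stream from inside a `ProcessFunction` (a sketch; the threshold and tag name are illustrative):

```scala
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.functions.ProcessFunction
import org.apache.flink.util.Collector

// The tag needs an implicit TypeInformation, supplied by the wildcard import.
val bigTag = OutputTag[Int]("big")

val env = StreamExecutionEnvironment.getExecutionEnvironment

// Route values above the threshold to the side output, the rest to the main stream.
val main = env.fromElements(1, 42, 7).process(new ProcessFunction[Int, Int] {
  override def processElement(v: Int, ctx: ProcessFunction[Int, Int]#Context,
                              out: Collector[Int]): Unit =
    if (v > 10) ctx.output(bigTag, v) else out.collect(v)
})

// Retrieve the side output as its own stream.
val big: DataStream[Int] = main.getSideOutput(bigTag)
```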