Apache Flink streaming examples demonstrating various stream processing patterns and use cases
Basic streaming word count implementations demonstrating fundamental DataStream operations, tuple-based processing, and POJO-based alternatives.
Classic streaming word count example processing text files or default sample data.
```java
/**
 * Streaming word count program that computes word occurrence histogram
 * @param args Command line arguments (--input path, --output path)
 */
public class WordCount {
    public static void main(String[] args) throws Exception;
}
```

Usage Example:

```shell
# Running with file input/output
java -cp flink-examples-streaming_2.10-1.3.3.jar \
  org.apache.flink.streaming.examples.wordcount.WordCount \
  --input /path/to/input.txt \
  --output /path/to/output.txt

# Running with default sample data (prints to stdout)
java -cp flink-examples-streaming_2.10-1.3.3.jar \
  org.apache.flink.streaming.examples.wordcount.WordCount
```

User-defined function that splits text into word-count pairs for stream processing.
```java
/**
 * Splits sentences into words as a FlatMapFunction.
 * Takes a line (String) and splits it into multiple (word, 1) pairs.
 */
public static final class Tokenizer
        implements FlatMapFunction<String, Tuple2<String, Integer>> {

    /**
     * Tokenizes input text and emits word-count pairs
     * @param value Input text line
     * @param out Collector for emitting Tuple2<word, count> pairs
     */
    public void flatMap(String value, Collector<Tuple2<String, Integer>> out)
            throws Exception;
}
```

The tokenizer:

- splits each input line on non-word characters (regex `\W+`)
- emits a `Tuple2<String, Integer>` of the form `(word, 1)` for every word found

Alternative word count implementation using Plain Old Java Objects (POJOs) instead of tuples.
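The tokenizer's splitting logic can be sketched in plain Java, with no Flink runtime required. `TokenizerSketch` and `tokenize` are hypothetical names for illustration only; lower-casing the input is an assumption, matching the normalization the Flink example applies before splitting:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Plain-Java sketch of the tokenizer's splitting logic (no Flink dependency). */
public class TokenizerSketch {

    /** Splits a line on non-word characters and emits (word, 1) pairs. */
    static List<Map.Entry<String, Integer>> tokenize(String value) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        // Lower-case the line, then split on non-word characters ("\\W+")
        for (String word : value.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) { // skip empty tokens from leading separators
                out.add(Map.entry(word, 1));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // → [to=1, be=1, or=1, not=1, to=1, be=1]
        System.out.println(tokenize("To be, or not to be"));
    }
}
```

In the real job, each emitted pair flows downstream to `keyBy(0)` rather than being collected into a list.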
```java
/**
 * Word count example demonstrating POJO usage instead of tuples
 * @param args Command line arguments (--input path, --output path)
 */
public class PojoExample {
    public static void main(String[] args) throws Exception;
}
```

Usage Example:
```shell
# POJO-based word count
java -cp flink-examples-streaming_2.10-1.3.3.jar \
  org.apache.flink.streaming.examples.wordcount.PojoExample \
  --input /path/to/text.txt \
  --output /path/to/results.txt
```

Both programs follow the same DataStream pattern: obtain the environment with `StreamExecutionEnvironment.getExecutionEnvironment()`, create the input stream with `env.readTextFile(path)` or `env.fromElements()`, and build the pipeline from `flatMap()`, `keyBy()`, and `sum()`.

```java
final ParameterTool params = ParameterTool.fromArgs(args);
env.getConfig().setGlobalJobParameters(params);

// Check for input parameter
if (params.has("input")) {
    text = env.readTextFile(params.get("input"));
} else {
    // Use default data
    text = env.fromElements(WordCountData.WORDS);
}
```

```java
// File output
if (params.has("output")) {
    counts.writeAsText(params.get("output"));
} else {
    // Console output
    counts.print();
}

// Execute the job
env.execute("Streaming WordCount");
```

```java
DataStream<Tuple2<String, Integer>> counts = text
    .flatMap(new Tokenizer()) // Split into words
    .keyBy(0)                 // Group by word (field 0)
    .sum(1);                  // Sum counts (field 1)
```

```xml
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-java_2.10</artifactId>
  <version>1.3.3</version>
</dependency>
<!-- For WordCountData sample data -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-examples-batch_2.10</artifactId>
  <version>1.3.3</version>
</dependency>
```

```java
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.utils.ParameterTool;
import org.apache.flink.examples.java.wordcount.util.WordCountData;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;
```

Install with Tessl CLI

```shell
npx tessl i tessl/maven-org-apache-flink--flink-examples-streaming-2-10
```
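To see what the `flatMap → keyBy(0) → sum(1)` chain computes, the aggregation can be simulated in plain Java with a map of running totals. `WordCountSim` is a hypothetical sketch, not part of the Flink examples, and it assumes the same lower-casing and `\W+` split as the tokenizer:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical plain-Java simulation of flatMap -> keyBy(0) -> sum(1). */
public class WordCountSim {

    /** Computes per-word totals, analogous to Flink's keyed running sums. */
    static Map<String, Integer> count(String[] lines) {
        // Totals keyed by word, playing the role of Flink's keyed state
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            // flatMap step: split each line into words
            for (String word : line.toLowerCase().split("\\W+")) {
                if (word.isEmpty()) continue;
                // keyBy(0) + sum(1): accumulate this word's running total
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] lines = {"to be or not to be", "that is the question"};
        System.out.println(count(lines).get("to")); // prints 2
        System.out.println(count(lines));
    }
}
```

One difference from the sketch: in the streaming job, `sum(1)` emits an updated running count for every incoming element rather than a single final value per word.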