or run

npx @tessl/cli init
Log in

Version

Tile

Overview

Evals

Files

docs

index.md
tile.json

tessl/maven-org-apache-flink--flink-walkthrough-datastream-scala

Maven archetype for building fraud detection applications using Apache Flink DataStream API with Scala

Workspace
tessl
Visibility
Public
Created
Last updated
Describes
mavenpkg:maven/org.apache.flink/flink-walkthrough-datastream-scala@1.16.x

To install, run

npx @tessl/cli install tessl/maven-org-apache-flink--flink-walkthrough-datastream-scala@1.16.0

index.mddocs/

Flink Walkthrough DataStream Scala

Apache Flink DataStream Scala walkthrough is a Maven archetype that provides a complete template for building fraud detection applications using Apache Flink's DataStream API with Scala. This archetype generates skeleton code that developers can customize to implement real-time stream processing applications for fraud detection and similar use cases.

Package Information

  • Package Name: flink-walkthrough-datastream-scala
  • Package Type: maven archetype
  • Group ID: org.apache.flink
  • Language: Scala
  • Installation: mvn archetype:generate -DarchetypeGroupId=org.apache.flink -DarchetypeArtifactId=flink-walkthrough-datastream-scala -DarchetypeVersion=1.16.3

Core Imports

The generated fraud detection application uses the following core imports:

import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.functions.KeyedProcessFunction
import org.apache.flink.util.Collector
import org.apache.flink.walkthrough.common.sink.AlertSink
import org.apache.flink.walkthrough.common.entity.{Alert, Transaction}
import org.apache.flink.walkthrough.common.source.TransactionSource

Core Usage

Generate a new fraud detection project from the archetype:

mvn archetype:generate \
  -DarchetypeGroupId=org.apache.flink \
  -DarchetypeArtifactId=flink-walkthrough-datastream-scala \
  -DarchetypeVersion=1.16.3 \
  -DgroupId=com.example \
  -DartifactId=my-fraud-detection \
  -Dversion=1.0-SNAPSHOT \
  -Dpackage=com.example.fraud

Basic Usage

After generating the project, the archetype provides a complete skeleton for a fraud detection application:

// Generated FraudDetectionJob.scala
import org.apache.flink.streaming.api.scala._
import org.apache.flink.walkthrough.common.sink.AlertSink
import org.apache.flink.walkthrough.common.entity.{Alert, Transaction}
import org.apache.flink.walkthrough.common.source.TransactionSource

object FraudDetectionJob {
  def main(args: Array[String]): Unit = {
    val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment

    val transactions: DataStream[Transaction] = env
      .addSource(new TransactionSource)
      .name("transactions")

    val alerts: DataStream[Alert] = transactions
      .keyBy(transaction => transaction.getAccountId)
      .process(new FraudDetector)
      .name("fraud-detector")

    alerts
      .addSink(new AlertSink)
      .name("send-alerts")

    env.execute("Fraud Detection")
  }
}

Architecture

The archetype generates a complete fraud detection application structure:

  • Main Application: FraudDetectionJob object serving as the entry point
  • Stream Processing Logic: FraudDetector class implementing the fraud detection algorithm
  • Data Sources: Integration with TransactionSource for transaction data streams
  • Data Sinks: Integration with AlertSink for outputting fraud alerts
  • Maven Configuration: Complete POM with Flink dependencies and build configuration

Capabilities

Main Application Entry Point

The primary application class that sets up the Flink streaming environment and defines the data processing pipeline.

object FraudDetectionJob {
  /**
   * Main entry point for the fraud detection streaming application
   * @param args Command line arguments passed to the application
   * @throws Exception if the streaming job fails to execute
   */
  @throws[Exception]
  def main(args: Array[String]): Unit
}

Fraud Detection Processing Function

A stateful stream processing function that analyzes transaction patterns to detect fraudulent activity.

/**
 * Stateful fraud detection processor extending KeyedProcessFunction
 * Processes transactions keyed by account ID to detect suspicious patterns
 */
@SerialVersionUID(1L)
class FraudDetector extends KeyedProcessFunction[Long, Transaction, Alert] {
  
  /**
   * Processes each transaction event and emits alerts for suspicious activity
   * @param transaction The input transaction to analyze
   * @param context Processing context providing access to timers and state
   * @param collector Output collector for emitting fraud alerts
   * @throws Exception if processing fails
   */
  @throws[Exception]
  def processElement(
      transaction: Transaction,
      context: KeyedProcessFunction[Long, Transaction, Alert]#Context,
      collector: Collector[Alert]): Unit
}

Fraud Detection Constants

Configuration constants used in fraud detection logic.

object FraudDetector {
  /** Threshold for small transaction amounts */
  val SMALL_AMOUNT: Double = 1.00
  
  /** Threshold for large transaction amounts */
  val LARGE_AMOUNT: Double = 500.00
  
  /** Time constant representing one minute in milliseconds */
  val ONE_MINUTE: Long = 60 * 1000L
}

Generated Project Structure

When instantiated, the archetype creates the following project structure:

my-fraud-detection/
├── pom.xml                          # Maven build configuration with Flink dependencies
└── src/
    └── main/
        ├── scala/
        │   ├── FraudDetectionJob.scala  # Main application entry point
        │   └── FraudDetector.scala      # Fraud detection logic
        └── resources/
            └── log4j2.properties        # Preconfigured logging for Flink applications

Dependencies and Integration

The generated project includes the following key dependencies:

Flink Framework Dependencies

  • flink-streaming-scala: Core Flink streaming API for Scala (provided scope)
  • flink-clients: Flink client libraries (provided scope)
  • flink-walkthrough-common: Common entities and utilities for walkthrough examples

External Entity Types

The archetype integrates with common Flink walkthrough entities:

// From flink-walkthrough-common dependency
import org.apache.flink.walkthrough.common.entity.Transaction
import org.apache.flink.walkthrough.common.entity.Alert
import org.apache.flink.walkthrough.common.source.TransactionSource
import org.apache.flink.walkthrough.common.sink.AlertSink

// From Flink framework
import org.apache.flink.streaming.api.functions.KeyedProcessFunction
import org.apache.flink.util.Collector

Build Configuration

The generated project includes:

  • Maven Shade Plugin: Creates fat JAR with dependencies
  • Scala Maven Plugin: Compiles Scala source code
  • Java 8 Target: Compatible with Flink runtime requirements
  • Main Class Configuration: ${package}.FraudDetectionJob as entry point

Logging Configuration

The archetype includes a preconfigured log4j2.properties file with optimized settings for Flink applications:

rootLogger.level = WARN
rootLogger.appenderRef.console.ref = ConsoleAppender

logger.sink.name = org.apache.flink.walkthrough.common.sink.AlertSink
logger.sink.level = INFO

appender.console.name = ConsoleAppender
appender.console.type = CONSOLE
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d{HH:mm:ss,SSS} %-5p %-60c %x - %m%n

This configuration enables INFO-level logging for the AlertSink to display fraud alerts in the console while keeping other components at WARN level to reduce noise.

Usage Examples

Customizing Fraud Detection Logic

Developers can customize the FraudDetector.processElement method to implement specific fraud detection rules:

class FraudDetector extends KeyedProcessFunction[Long, Transaction, Alert] {
  
  def processElement(
      transaction: Transaction,
      context: KeyedProcessFunction[Long, Transaction, Alert]#Context,
      collector: Collector[Alert]): Unit = {
    
    // Custom fraud detection logic
    if (transaction.getAmount > FraudDetector.LARGE_AMOUNT) {
      val alert = new Alert
      alert.setId(transaction.getAccountId)
      alert.setMessage(s"Large transaction detected: ${transaction.getAmount}")
      collector.collect(alert)
    }
  }
}

Running the Generated Application

# Build the project
mvn clean package

# Run locally
mvn exec:java -Dexec.mainClass="com.example.fraud.FraudDetectionJob"

# Or run the fat JAR
java -jar target/my-fraud-detection-1.0-SNAPSHOT.jar

Template Variables

The archetype uses the following Maven template variables that are replaced during project generation:

  • ${package}: Package name for generated classes
  • ${groupId}: Maven group ID for the project
  • ${artifactId}: Maven artifact ID for the project
  • ${version}: Version for the generated project

These variables are automatically substituted when running mvn archetype:generate with the corresponding -D parameters.