tessl/maven-org-apache-flink--flink-ml_2-10

Apache Flink Machine Learning Library for Scala 2.10 - provides distributed machine learning algorithms for scalable data processing on Flink's streaming and batch processing engine

Workspace: tessl
Visibility: Public
Describes: mavenpkg:maven/org.apache.flink/flink-ml_2.10@1.3.x

To install, run

npx @tessl/cli install tessl/maven-org-apache-flink--flink-ml_2-10@1.3.0


Flink ML (flink-ml_2.10)

Apache Flink Machine Learning Library for Scala 2.10. The library is intended to provide distributed machine learning on top of Apache Flink's streaming and batch processing engine. Note: the version documented here (1.3.3) contains only a minimal implementation with stub functionality.

Package Information

  • Package Name: flink-ml_2.10
  • Package Type: maven
  • Language: Scala 2.10
  • Installation:
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-ml_2.10</artifactId>
        <version>1.3.3</version>
    </dependency>
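For sbt builds, the same Maven coordinates can be written as a single dependency line. This is a sketch that assumes the artifact resolves from the repositories your build already uses. The explicit _2.10 suffix is spelled out (rather than using %%) so the line does not depend on the build's scalaVersion setting.

```scala
// build.sbt (fragment): same groupId, artifactId and version as the Maven snippet above
libraryDependencies += "org.apache.flink" % "flink-ml_2.10" % "1.3.3"
```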

Core Imports

import org.apache.flink.ml.MLPackage
import org.apache.flink.ml.regression.MultipleLinearRegression

For the Flink Scala API (ExecutionEnvironment, DataSet, and the implicit type information they require):

import org.apache.flink.api.scala._

Basic Usage

import org.apache.flink.api.scala._
import org.apache.flink.ml.MLPackage
import org.apache.flink.ml.regression.MultipleLinearRegression

// Access package information
val version = MLPackage.version
val scalaVersion = MLPackage.scalaVersion

// Create regression model (stub implementation)
val regression = new MultipleLinearRegression()

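// Note: fit is a stub and returns a placeholder (Array(0.0)), not fitted coefficients.
// trainingData is not defined in this example, so the call is left commented out:
// val model = regression.fit(trainingData)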

Capabilities

Package Information

Access version and compatibility information for the ML library.

object MLPackage {
  val version: String
  val scalaVersion: String
}

Multiple Linear Regression

Basic linear regression implementation for distributed machine learning. Note: This is a stub implementation in version 1.3.3.

class MultipleLinearRegression extends Serializable {
  /**
   * Fit the linear regression model.
   *
   * @param trainingData training dataset of labeled vectors
   * @return model coefficients (the stub implementation returns Array(0.0))
   */
  def fit(trainingData: DataSet[LabeledVector]): DataSet[Array[Double]]
}
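Because the stub is documented to return Array(0.0), calling code may want to notice when it has received that placeholder instead of fitted coefficients. The following is a minimal sketch in plain Scala with no Flink dependency. The object name StubModelCheck and the helper looksLikeStubOutput are hypothetical and not part of the library, and the second coefficient array contains made-up values for illustration only.

```scala
object StubModelCheck {
  // Hypothetical helper (not part of flink-ml): true when the coefficients are
  // exactly the single-zero placeholder the stub is documented to return.
  def looksLikeStubOutput(coefficients: Array[Double]): Boolean =
    coefficients.length == 1 && coefficients(0) == 0.0

  def main(args: Array[String]): Unit = {
    val placeholder = Array(0.0)         // value documented for the stub
    val illustrative = Array(1.5, -0.25) // made-up values, not real output

    println("placeholder looks like stub output: " + looksLikeStubOutput(placeholder))
    println("illustrative looks like stub output: " + looksLikeStubOutput(illustrative))
  }
}
```

The check is only a heuristic: a genuinely fitted one-coefficient model whose value happens to be zero would also match. In a Flink job the coefficients would first have to be brought back to the client (for example with collect() on the DataSet) before a check like this could run.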

Types

The following types are used in the API:

// DataSet comes from Flink's Scala API (org.apache.flink.api.scala.DataSet,
// in scope via import org.apache.flink.api.scala._) and represents a distributed dataset.
type DataSet[T]

// Note: The following types are imported in MultipleLinearRegression.scala 
// but are NOT defined in this stub implementation:
// - org.apache.flink.ml.common.LabeledVector
// - org.apache.flink.ml.common.LinearAlgebra
// 
// These imports exist in the source code but reference non-existent classes,
// indicating this is an incomplete stub implementation.
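One way to confirm which of these classes are actually present in a given build is to probe the classpath by name. The sketch below uses only standard JVM reflection. The object name ClasspathProbe and the helper isOnClasspath are hypothetical, and the class names are the ones mentioned in this document. What it prints depends entirely on the jars on your classpath, so no expected output is shown.

```scala
object ClasspathProbe {
  // Hypothetical helper: true if the named class can be located by this
  // object's class loader. The class is looked up without being initialized.
  def isOnClasspath(className: String): Boolean =
    try {
      Class.forName(className, false, getClass.getClassLoader)
      true
    } catch {
      case _: ClassNotFoundException => false
      case _: LinkageError           => false // e.g. class found but a dependency is missing
    }

  def main(args: Array[String]): Unit = {
    val names = Seq(
      "org.apache.flink.ml.MLPackage",
      "org.apache.flink.ml.regression.MultipleLinearRegression",
      "org.apache.flink.ml.common.LabeledVector",
      "org.apache.flink.ml.common.LinearAlgebra"
    )
    names.foreach { name =>
      println(name + " -> " + isOnClasspath(name))
    }
  }
}
```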

Implementation Status

Important: This version (1.3.3) appears to be a minimal stub implementation. The upstream Flink ML documentation, by contrast, describes a comprehensive set of machine learning capabilities, including:

  • ALS (Alternating Least Squares)
  • SVM using CoCoA
  • k-Nearest Neighbors Join
  • Cross Validation
  • MinMax/Standard Scalers
  • Polynomial Features
  • Stochastic Outlier Selection
  • Distance Metrics
  • Pipelines

None of these features is present in the source code for this version. Only the package information object (MLPackage) and a stub MultipleLinearRegression class are available.

Dependencies

This library depends on:

  • flink-scala_2.10: Core Flink Scala API
  • flink-streaming-scala_2.10: Flink streaming Scala API
  • scala-library: Scala 2.10.6 standard library
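The _2.10 suffix on these artifact names is the Scala binary version, so the artifacts are meant to run against a 2.10.x scala-library. A small sketch of checking that pairing follows. The object name ScalaBinaryVersionCheck and both helpers are hypothetical, and the version parsing assumes a plain major.minor.patch string.

```scala
object ScalaBinaryVersionCheck {
  // Hypothetical helper: "2.10.6" -> "2.10" (assumes a major.minor.patch string).
  def binaryVersion(fullVersion: String): String =
    fullVersion.split("\\.").take(2).mkString(".")

  // Hypothetical helper: does the artifact's _<binary version> suffix match
  // the given full Scala version?
  def suffixMatches(artifactId: String, fullScalaVersion: String): Boolean =
    artifactId.endsWith("_" + binaryVersion(fullScalaVersion))

  def main(args: Array[String]): Unit = {
    // Version of the Scala library this program is actually running on.
    val running = scala.util.Properties.versionNumberString
    Seq("flink-ml_2.10", "flink-scala_2.10", "flink-streaming-scala_2.10").foreach { id =>
      println(id + " matches Scala " + running + ": " + suffixMatches(id, running))
    }
  }
}
```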