tessl/maven-org-apache-flink--flink-ml-2-10

Apache Flink Machine Learning Library for Scala 2.10 - provides distributed machine learning algorithms for scalable data processing on Flink's streaming and batch processing engine

Flink ML (flink-ml_2.10)

Apache Flink Machine Learning Library for Scala 2.10. This library provides distributed machine learning capabilities built on top of Apache Flink's stream and batch processing engine. Note: version 1.3.3, as described here, contains only a minimal implementation: package metadata and a stub MultipleLinearRegression class.

Package Information

  • Package Name: flink-ml_2.10
  • Package Type: maven
  • Language: Scala 2.10
  • Installation:
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-ml_2.10</artifactId>
        <version>1.3.3</version>
    </dependency>
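
For sbt builds, the same coordinates can be declared as below. This is a minimal sketch that assumes a Scala 2.10.x project (2.10.6 is the scala-library version listed under Dependencies); the rest of the build definition is left to the reader.

```scala
// build.sbt (sketch): assumes the project is built with Scala 2.10.x
scalaVersion := "2.10.6"

// Plain % is used instead of %% because the artifact ID
// already carries the _2.10 suffix.
libraryDependencies += "org.apache.flink" % "flink-ml_2.10" % "1.3.3"
```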

Core Imports

import org.apache.flink.ml.MLPackage
import org.apache.flink.ml.regression.MultipleLinearRegression

For the Flink Scala API (execution environment, DataSet, and implicit type information):

import org.apache.flink.api.scala._

Basic Usage

import org.apache.flink.api.scala._
import org.apache.flink.ml.MLPackage
import org.apache.flink.ml.regression.MultipleLinearRegression

// Access package information
val version = MLPackage.version
val scalaVersion = MLPackage.scalaVersion

// Create regression model (stub implementation)
val regression = new MultipleLinearRegression()

// Note: fit is a stub and returns dummy coefficients (Array(0.0)).
// trainingData would be a DataSet[LabeledVector]; it is not defined in this snippet.
// val model = regression.fit(trainingData)

Capabilities

Package Information

Access version and compatibility information for the ML library.

object MLPackage {
  val version: String
  val scalaVersion: String
}
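
One plausible use of these two fields is a startup check that the library's Scala binary version matches the running Scala runtime. The sketch below is self-contained and takes the version string as a parameter instead of reading MLPackage directly. It assumes, without confirmation from the source, that MLPackage.scalaVersion holds a string such as "2.10" or "2.10.6". The names ScalaVersionCheck, binaryVersion, and isCompatible are hypothetical.

```scala
object ScalaVersionCheck {
  /** Reduce a full version such as "2.10.6" to its binary version "2.10". */
  def binaryVersion(full: String): String =
    full.split('.').take(2).mkString(".")

  /** True when both versions share the same major.minor prefix. */
  def isCompatible(libraryScala: String, runtimeScala: String): Boolean =
    binaryVersion(libraryScala) == binaryVersion(runtimeScala)

  def main(args: Array[String]): Unit = {
    // versionNumberString reports the version of the running Scala library.
    val runtime = scala.util.Properties.versionNumberString
    // "2.10" stands in for MLPackage.scalaVersion (assumed value).
    println("compatible: " + isCompatible("2.10", runtime))
  }
}
```

In an application that has the library on its classpath, the literal "2.10" would be replaced by MLPackage.scalaVersion.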

Multiple Linear Regression

Basic linear regression implementation for distributed machine learning. Note: This is a stub implementation in version 1.3.3.

class MultipleLinearRegression extends Serializable {
  /**
   * Fit the linear regression model.
   *
   * @param trainingData training dataset of labeled vectors
   * @return model coefficients (the stub implementation returns Array(0.0))
   */
  def fit(trainingData: DataSet[LabeledVector]): DataSet[Array[Double]]
}
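
Because the stub's fit is documented to return Array(0.0) regardless of input, calling code may want to detect that placeholder before treating the result as real coefficients. The following is a minimal sketch that works on a plain Array[Double] instead of a Flink DataSet, so it runs without the library. StubGuard and looksLikeStubOutput are hypothetical names, and the check is only a heuristic, since a genuine single-coefficient model could also be exactly zero.

```scala
object StubGuard {
  /** Heuristic: the documented stub result is exactly Array(0.0). */
  def looksLikeStubOutput(coefficients: Array[Double]): Boolean =
    coefficients.length == 1 && coefficients(0) == 0.0

  def main(args: Array[String]): Unit = {
    val fromStub = Array(0.0)        // what the stub is documented to return
    val fitted   = Array(0.5, -1.2)  // illustrative values only, not real output
    println(looksLikeStubOutput(fromStub))
    println(looksLikeStubOutput(fitted))
  }
}
```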

Types

The following types are used in the API:

// DataSet is from Flink's core API (imported via org.apache.flink.api.scala._)
// Represents a distributed dataset in Flink
type DataSet[T] // Flink distributed dataset

// Note: The following types are imported in MultipleLinearRegression.scala 
// but are NOT defined in this stub implementation:
// - org.apache.flink.ml.common.LabeledVector
// - org.apache.flink.ml.common.LinearAlgebra
// 
// These imports exist in the source code but reference non-existent classes,
// indicating this is an incomplete stub implementation.

Implementation Status

Important: This version (1.3.3) appears to be a minimal stub implementation. The upstream Flink ML documentation describes a much broader set of machine learning capabilities, including:

  • ALS (Alternating Least Squares)
  • SVM using CoCoA
  • k-Nearest Neighbors Join
  • Cross Validation
  • MinMax/Standard Scalers
  • Polynomial Features
  • Stochastic Outlier Selection
  • Distance Metrics
  • Pipelines

However, these features are not present in the actual source code for this version. Only the basic package information and a stub MultipleLinearRegression class are available.

Dependencies

This library depends on:

  • flink-scala_2.10: Core Flink Scala API
  • flink-streaming-scala_2.10: Flink streaming Scala API
  • scala-library: Scala 2.10.6 standard library