# Flink ML (flink-ml_2.10)

Apache Flink Machine Learning Library for Scala 2.10. Flink ML is designed to provide distributed machine learning on top of Apache Flink's stream and batch processing engine. Note: The version documented here (1.3.3) contains only a minimal implementation with stub functionality.

## Package Information

- **Package Name**: flink-ml_2.10
- **Package Type**: maven
- **Language**: Scala 2.10
- **Installation**:
  ```xml
  <dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-ml_2.10</artifactId>
    <version>1.3.3</version>
  </dependency>
  ```
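For sbt builds, the same Maven coordinates can be declared as below. This is a minimal sketch that assumes a standard `build.sbt`; the Scala suffix is written out in the artifact name, so a single `%` is used rather than `%%`.

```scala
// build.sbt (assumed file name for a standard sbt project)
// Same coordinates as the Maven snippet: group, artifact (with _2.10 suffix), version
libraryDependencies += "org.apache.flink" % "flink-ml_2.10" % "1.3.3"
```

If the project's `scalaVersion` is already set to a 2.10.x release, `"org.apache.flink" %% "flink-ml" % "1.3.3"` should resolve to the same artifact.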

## Core Imports

```scala
import org.apache.flink.ml.MLPackage
import org.apache.flink.ml.regression.MultipleLinearRegression
```

For the Flink execution context:

```scala
import org.apache.flink.api.scala._
```

## Basic Usage

```scala
import org.apache.flink.api.scala._
import org.apache.flink.ml.MLPackage
import org.apache.flink.ml.regression.MultipleLinearRegression

// Access package information
val version = MLPackage.version
val scalaVersion = MLPackage.scalaVersion

// Create regression model (stub implementation)
val regression = new MultipleLinearRegression()

// Note: The fit method is a stub and returns dummy data
// val model = regression.fit(trainingData)
```

## Capabilities

### Package Information

Access version and compatibility information for the ML library.

```scala { .api }
object MLPackage {
  val version: String
  val scalaVersion: String
}
```

### Multiple Linear Regression

Basic linear regression implementation for distributed machine learning. Note: This is a stub implementation in version 1.3.3.

```scala { .api }
class MultipleLinearRegression extends Serializable {
  /**
   * Fit the linear regression model
   *
   * Parameters:
   * - trainingData: DataSet[LabeledVector] - Training dataset with labeled vectors
   *
   * Returns:
   * DataSet[Array[Double]] - Model coefficients (stub implementation returns Array(0.0))
   */
  def fit(trainingData: DataSet[LabeledVector]): DataSet[Array[Double]]
}
```

## Types

The following types are used in the API:

```scala { .api }
// DataSet is from Flink's core API (imported via org.apache.flink.api.scala._)
// Represents a distributed dataset in Flink
type DataSet[T] // Flink distributed dataset

// Note: The following types are imported in MultipleLinearRegression.scala
// but are NOT defined in this stub implementation:
// - org.apache.flink.ml.common.LabeledVector
// - org.apache.flink.ml.common.LinearAlgebra
//
// These imports exist in the source code but reference non-existent classes,
// indicating this is an incomplete stub implementation.
```

## Implementation Status

**Important**: This version (1.3.3) appears to be a minimal stub implementation. The actual Flink ML library documentation describes comprehensive machine learning capabilities including:

- ALS (Alternating Least Squares)
- SVM using CoCoA
- k-Nearest Neighbors Join
- Cross Validation
- MinMax/Standard Scalers
- Polynomial Features
- Stochastic Outlier Selection
- Distance Metrics
- Pipelines

However, these features are not present in the actual source code for this version. Only the basic package information and a stub MultipleLinearRegression class are available.

## Dependencies

This library depends on:

- `flink-scala_2.10`: Core Flink Scala API
- `flink-streaming-scala_2.10`: Flink streaming Scala API
- `scala-library`: Scala 2.10.6 standard library