
tessl/maven-org-apache-spark--spark-repl-2-11

Interactive Scala shell for Apache Spark with distributed computing capabilities

Workspace: tessl
Visibility: Public
Describes: mavenpkg:maven/org.apache.spark/spark-repl_2.11@2.4.x

To install, run:

npx @tessl/cli install tessl/maven-org-apache-spark--spark-repl-2-11@2.4.0
# Apache Spark REPL

Apache Spark REPL is an interactive Scala shell that provides a command-line interface to Apache Spark. It lets users interactively execute Spark code, explore data, run SQL queries, and perform distributed computing operations. The REPL extends the standard Scala interpreter with Spark-specific functionality: it automatically creates a SparkSession and SparkContext and provides seamless access to Spark's core APIs, including RDDs, DataFrames, and Datasets.

## Package Information

- **Package Name**: spark-repl_2.11
- **Package Type**: maven
- **Language**: Scala
- **Installation**: `<dependency><groupId>org.apache.spark</groupId><artifactId>spark-repl_2.11</artifactId><version>2.4.8</version></dependency>`
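
The same coordinate can be expressed in an sbt build. This fragment is illustrative: it assumes `scalaVersion` is set to a 2.11.x release, so that the `%%` operator resolves `spark-repl` to the `spark-repl_2.11` artifact:

```scala
// sbt build definition fragment (illustrative)
scalaVersion := "2.11.12"
libraryDependencies += "org.apache.spark" %% "spark-repl" % "2.4.8"
```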

## Core Imports

```scala
import org.apache.spark.repl._
```

For the main entry point:

```scala
import org.apache.spark.repl.Main
```

For the interactive loop:

```scala
import org.apache.spark.repl.SparkILoop
```

For custom class loading:

```scala
import org.apache.spark.repl.ExecutorClassLoader
```

## Basic Usage

### Command Line Usage

```bash
# Start the Spark REPL
spark-shell

# Or launch the main class directly
scala -cp <spark-classpath> org.apache.spark.repl.Main
```

### Programmatic Usage

```scala
import org.apache.spark.repl.SparkILoop
import scala.tools.nsc.Settings

// Execute code in a fresh REPL and capture the output
val code = """
val data = sc.parallelize(1 to 10)
data.sum()
"""
val result = SparkILoop.run(code)

// Create and drive a custom REPL instance
val settings = new Settings
val repl = new SparkILoop()
repl.process(settings)
```

## Architecture

Apache Spark REPL is built around several key components:

- **Main Entry Point**: the `Main` object provides the application entry point and creates the SparkSession and SparkContext
- **Interactive Shell**: `SparkILoop` extends Scala's standard REPL with Spark-specific initialization and commands
- **Distributed Class Loading**: `ExecutorClassLoader` loads REPL-compiled classes on remote executors
- **Signal Handling**: integration with Spark's job-cancellation system for interactive interruption
- **Scala Version Support**: special handling for Scala 2.11 compatibility issues with imports and type inference

## Capabilities

### REPL Entry Point and Session Management

Main application entry point and SparkSession/SparkContext lifecycle management for the interactive shell.

```scala { .api }
object Main extends Logging {
  var sparkContext: SparkContext
  var sparkSession: SparkSession
  var interp: SparkILoop
  val conf: SparkConf

  def main(args: Array[String]): Unit
  def createSparkSession(): SparkSession
}
```

[REPL Entry Point](./main-entry.md)

### Interactive Shell Loop

Core interactive shell functionality with Spark-specific initialization, commands, and REPL processing.

```scala { .api }
class SparkILoop(in0: Option[BufferedReader], out: JPrintWriter) extends ILoop {
  def this(in0: BufferedReader, out: JPrintWriter)
  def this()

  def initializeSpark(): Unit
  def process(settings: Settings): Boolean
  override def createInterpreter(): Unit
  override def printWelcome(): Unit
  override def commands: List[LoopCommand]
  override def resetCommand(line: String): Unit
  override def replay(): Unit
}

object SparkILoop {
  def run(code: String, sets: Settings = new Settings): String
  def run(lines: List[String]): String
}
```

[Interactive Shell](./interactive-shell.md)

### Distributed Class Loading

Custom class loader system for loading REPL-compiled classes on remote Spark executors, with support for RPC and Hadoop filesystem access.

```scala { .api }
class ExecutorClassLoader(
  conf: SparkConf,
  env: SparkEnv,
  classUri: String,
  parent: ClassLoader,
  userClassPathFirst: Boolean
) extends ClassLoader with Logging {

  override def findClass(name: String): Class[_]
  def findClassLocally(name: String): Option[Class[_]]
  def readAndTransformClass(name: String, in: InputStream): Array[Byte]
  def urlEncode(str: String): String
  override def getResource(name: String): URL
  override def getResources(name: String): java.util.Enumeration[URL]
  override def getResourceAsStream(name: String): InputStream
}
```

[Distributed Class Loading](./class-loading.md)
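
The `userClassPathFirst` flag controls whether locally defined classes shadow classes from the parent class path. The two delegation orders can be sketched without Spark; `ToyLoader` below is an illustrative stand-in, not Spark's implementation, and it carries no real class bytes, so its `findClass` only records the attempt and fails:

```scala
// Illustrative sketch of the two delegation orders controlled by
// userClassPathFirst. A real loader would fetch and define class bytes
// in findClass; this toy version just records that it was consulted.
class ToyLoader(parent: ClassLoader, userClassPathFirst: Boolean)
    extends ClassLoader(parent) {

  var localAttempts: List[String] = Nil // names we tried to load locally

  override def loadClass(name: String, resolve: Boolean): Class[_] =
    if (userClassPathFirst) {
      // local-first: try our own classes before delegating to the parent
      try findClass(name)
      catch { case _: ClassNotFoundException => super.loadClass(name, resolve) }
    } else {
      // standard parent-first delegation
      super.loadClass(name, resolve)
    }

  override def findClass(name: String): Class[_] = {
    localAttempts ::= name // a real loader would define bytes here
    throw new ClassNotFoundException(name)
  }
}
```

With `userClassPathFirst = true` the loader consults its own `findClass` before the parent, so REPL-compiled definitions can shadow classes already on the executor's class path; with `false` it behaves like an ordinary parent-first loader.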

143

144

### Signal Handling

145

146

Signal handling utilities for interactive job cancellation and REPL interrupt management.

147

148

```scala { .api }

149

object Signaling extends Logging {

150

def cancelOnInterrupt(): Unit

151

}

152

```

153

154

[Signal Handling](./signal-handling.md)
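
`cancelOnInterrupt` wires Ctrl-C to job cancellation rather than killing the shell. The cooperative-cancellation pattern behind this can be sketched without Spark; `ToyJobGroup` is a hypothetical stand-in (in Spark, the installed handler calls `sparkContext.cancelAllJobs()`):

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Illustrative sketch of cooperative job cancellation: an interrupt handler
// flips a shared flag, and running "tasks" poll it and stop early.
class ToyJobGroup {
  private val cancelled = new AtomicBoolean(false)

  // What the interrupt handler would call
  def cancelAllJobs(): Unit = cancelled.set(true)

  // A "task" doing `iterations` units of work, checking the flag each step;
  // returns how many units actually ran
  def runJob(iterations: Int): Int = {
    var done = 0
    while (done < iterations && !cancelled.get()) done += 1
    done
  }
}
```

Because the flag is only polled between units of work, cancellation is cooperative: a task that never checks the flag cannot be stopped this way.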

155

156

### Scala 2.11 Compatibility Components

157

158

Specialized interpreter and expression typing components for Scala 2.11 compatibility fixes.

159

160

```scala { .api }

161

class SparkILoopInterpreter(settings: Settings, out: JPrintWriter) extends IMain {

162

def symbolOfLine(code: String): global.Symbol

163

def typeOfExpression(expr: String, silent: Boolean): global.Type

164

def importsCode(wanted: Set[Name], wrapper: Request#Wrapper,

165

definesClass: Boolean, generousImports: Boolean): ComputedImports

166

}

167

168

trait SparkExprTyper extends ExprTyper {

169

def doInterpret(code: String): IR.Result

170

def symbolOfLine(code: String): Symbol

171

}

172

```

173

174

[Scala 2.11 Compatibility](./scala-compatibility.md)

## Types

### Core Configuration Types

```scala { .api }
// From Spark
class SparkConf
class SparkContext
class SparkSession
class SparkEnv

// From the Scala toolchain and Java standard library
type Settings = scala.tools.nsc.Settings
type BufferedReader = java.io.BufferedReader
type JPrintWriter = scala.tools.nsc.interpreter.JPrintWriter
```

192

193

### REPL-Specific Types

194

195

```scala { .api }

196

// REPL interpreter result types

197

object IR {

198

sealed abstract class Result

199

case object Success extends Result

200

case class Error(exception: Throwable) extends Result

201

case object Incomplete extends Result

202

}

203

204

// Class loading types

205

trait ClassLoader extends java.lang.ClassLoader

206

trait Logging

207

```
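
Code that drives the interpreter typically branches on these result values. A self-contained sketch (the `IR` stub is included here so the example stands alone without the Scala compiler on the class path; `nextAction` is illustrative, not part of the REPL API):

```scala
// Stub of the IR.Result ADT so this sketch is self-contained; the real
// definitions live in scala.tools.nsc.interpreter.
object IR {
  sealed abstract class Result
  case object Success extends Result    // line compiled and evaluated
  case object Error extends Result      // compilation or runtime error
  case object Incomplete extends Result // parser needs more input
}

// Illustrative consumer: decide what a REPL driver should do next
def nextAction(result: IR.Result): String = result match {
  case IR.Success    => "print result, show prompt"
  case IR.Error      => "report error, show prompt"
  case IR.Incomplete => "show continuation prompt, keep buffering"
}
```

The `Incomplete` case is what lets a multi-line definition span several prompts: the driver keeps buffering input until the parser can produce a complete expression.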