tessl/maven-spark-repl-2-10

Apache Spark interactive Scala shell (REPL) for Scala 2.10, providing a read-eval-print loop interface for interactive data analysis and cluster computing.

Workspace: tessl
Visibility: Public
Describes: pkg:maven/org.apache.spark/spark-repl_2.10@2.2.x (maven)

To install, run

npx @tessl/cli install tessl/maven-spark-repl-2-10@2.2.0


Apache Spark REPL (Scala 2.10)

Apache Spark REPL provides an interactive Scala shell environment for distributed data processing and cluster computing. It offers a read-eval-print loop interface that enables users to interactively execute Scala code against Spark clusters, providing real-time data analysis capabilities with automatic SparkContext and SparkSession initialization.

Package Information

  • Package Name: spark-repl_2.10
  • Package Type: maven
  • Language: Scala
  • Installation: mvn dependency:get -Dartifact=org.apache.spark:spark-repl_2.10:2.2.3
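
If the project is built with sbt rather than by invoking Maven directly, an equivalent dependency declaration is sketched below; the coordinates mirror the mvn command above.

// build.sbt (sketch): pulls the same artifact as the mvn dependency:get command above
libraryDependencies += "org.apache.spark" % "spark-repl_2.10" % "2.2.3"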

Core Imports

import org.apache.spark.repl.{Main, SparkILoop, SparkIMain}
import org.apache.spark.repl.SparkJLineCompletion
import org.apache.spark.repl.{ExecutorClassLoader, Signaling}
import scala.tools.nsc.interpreter.JPrintWriter
import scala.tools.nsc.Settings
import java.io.BufferedReader

Basic Usage

Starting the REPL programmatically

import org.apache.spark.repl.{Main, SparkILoop}

// Start REPL with default settings
Main.main(Array())

// Or create custom REPL instance
val repl = new SparkILoop()
repl.process(Array("-master", "local[*]"))

Using the interpreter directly

import org.apache.spark.repl.SparkIMain
import scala.tools.nsc.interpreter.{Results => IR}

val interpreter = new SparkIMain()
interpreter.initializeSynchronous()

// Execute Scala code
val result = interpreter.interpret("val x = 1 + 1")
val value = interpreter.valueOfTerm("x")
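
interpret returns an IR.Result (see Types below). A minimal sketch of branching on that result, continuing the example above:

// Sketch: check the interpretation outcome before using bound values
result match {
  case IR.Success    => println(s"x = ${value.getOrElse("<unbound>")}")
  case IR.Error      => println("interpretation failed: compile or runtime error")
  case IR.Incomplete => println("input was a syntactically incomplete expression")
}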

Architecture

The Spark REPL is built around several key components:

  • Main Entry Point: Main object provides the application entry point and global interpreter access
  • Interactive Loop: SparkILoop class manages the read-eval-print cycle with Spark-specific features
  • Interpreter Core: SparkIMain class handles code compilation, execution, and state management
  • Completion System: SparkJLineCompletion provides auto-completion functionality using JLine
  • Class Loading: ExecutorClassLoader enables loading REPL-defined classes on Spark executors
  • Signal Handling: Signaling utilities for proper interrupt handling and job cancellation
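
For embedding outside the stock shell, these components can be wired by hand. A minimal sketch follows; the usejavacp setting is an assumption for compiling against the JVM classpath.

import org.apache.spark.repl.SparkIMain
import scala.tools.nsc.Settings
import scala.tools.nsc.interpreter.JPrintWriter

// Sketch: construct the interpreter core directly, as SparkILoop does internally.
// SparkJLineCompletion and ExecutorClassLoader both hang off this interpreter.
val settings = new Settings()
settings.usejavacp.value = true          // assumption: compile against the JVM classpath

val out  = new JPrintWriter(Console.out, true)
val intp = new SparkIMain(settings, out)
intp.initializeSynchronous()

intp.interpret("val answer = 21 * 2")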

Capabilities

Interactive Shell

Main REPL functionality providing an interactive shell with Spark integration, command processing, and session management.

object Main {
  def main(args: Array[String]): Unit
  def interp: SparkILoop
  def interp_=(i: SparkILoop): Unit
}

@DeveloperApi
class SparkILoop(
  in0: Option[BufferedReader] = None,
  out: JPrintWriter = new JPrintWriter(Console.out, true),
  master: Option[String] = None
) {
  def process(args: Array[String]): Boolean
  def createSparkSession(): SparkSession
  var sparkContext: SparkContext
}
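
A sketch of embedding the loop with scripted input instead of the console; the -usejavacp argument and the local master are assumptions for a standalone run.

import java.io.{BufferedReader, StringReader}
import scala.tools.nsc.interpreter.JPrintWriter
import org.apache.spark.repl.SparkILoop

// Sketch: drive the shell from a scripted reader rather than interactive input
val input  = new BufferedReader(new StringReader("sc.parallelize(1 to 10).sum\n:quit\n"))
val output = new JPrintWriter(Console.out, true)

val repl = new SparkILoop(Some(input), output, Some("local[*]"))
val ok   = repl.process(Array("-usejavacp"))   // Boolean indicates whether the session completed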

See docs/interactive-shell.md for details.

Code Interpreter

Core interpreter providing code compilation, execution, variable management, and introspection capabilities for interactive Scala code evaluation.

@DeveloperApi
class SparkIMain(
  initialSettings: Settings,
  out: JPrintWriter,
  propagateExceptions: Boolean = false
) {
  def initializeSynchronous(): Unit
  def interpret(line: String): IR.Result
  def compileSources(sources: SourceFile*): Boolean
  def compileString(code: String): Boolean
}
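
Beyond interpret, compileString reports whether a source string compiles without evaluating it. A short sketch, reusing the interpreter instance from Basic Usage:

// Sketch: compile code without binding results into the session
val compiled = interpreter.compileString("object Probe { val answer = 42 }")
println(s"compilation succeeded: $compiled")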

See docs/code-interpreter.md for details.

Auto-Completion

Interactive code completion functionality providing context-aware suggestions and symbol completion using JLine integration.

@DeveloperApi
class SparkJLineCompletion(val intp: SparkIMain) {
  var verbosity: Int
  def resetVerbosity(): Unit
  def completer(): JLineTabCompletion
}
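
A sketch of building a completion provider over the interpreter instance from Basic Usage; completion quality depends on the interpreter having been initialized first.

import org.apache.spark.repl.SparkJLineCompletion

// Sketch: completion is backed by the interpreter's compiler symbol table
val completion = new SparkJLineCompletion(interpreter)
completion.resetVerbosity()
val tabCompleter = completion.completer()   // JLine handler the shell installs on the prompt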

See docs/auto-completion.md for details.

Distributed Class Loading

Class loading system that distributes REPL-compiled classes to Spark executors across the cluster, loading them from sources such as HTTP, Hadoop FS, and Spark RPC.

class ExecutorClassLoader(
  conf: SparkConf,
  env: SparkEnv,
  classUri: String,
  parent: ClassLoader,
  userClassPathFirst: Boolean
) extends ClassLoader {
  override def findClass(name: String): Class[_]
  def readAndTransformClass(name: String, in: InputStream): Array[Byte]
}
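
Spark constructs this loader on executors itself; the sketch below only illustrates the shape of that wiring. The spark.repl.class.uri key and the wrapper class name are assumptions shown for illustration.

import org.apache.spark.{SparkConf, SparkEnv}
import org.apache.spark.repl.ExecutorClassLoader

// Sketch: executor-side construction. In a real cluster Spark does this for you,
// using the class-server URI the driver REPL publishes in its configuration.
val conf     = new SparkConf()
val classUri = conf.get("spark.repl.class.uri", "")   // assumption: URI advertised by the driver
val loader   = new ExecutorClassLoader(
  conf,
  SparkEnv.get,                                        // current executor environment
  classUri,
  Thread.currentThread.getContextClassLoader,
  userClassPathFirst = false
)
// "$line3.$read" is an illustrative REPL-generated wrapper class name
val wrapper = loader.loadClass("$line3.$read")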

See docs/distributed-class-loading.md for details.

Types

// Result enumeration for code interpretation
object IR {
  sealed abstract class Result
  case object Success extends Result
  case object Error extends Result  
  case object Incomplete extends Result
}

// Command line settings
@DeveloperApi
class SparkRunnerSettings(error: String => Unit) extends Settings {
  val loadfiles: MultiStringSetting
}

// Signal handling utilities
object Signaling {
  def cancelOnInterrupt(): Unit
}
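
A sketch tying these together: install interrupt handling so Ctrl-C cancels running Spark jobs rather than killing the shell, and parse command-line arguments for preloaded files. The -i flag is assumed here to populate loadfiles.

import org.apache.spark.repl.{Signaling, SparkRunnerSettings}

// Sketch: interrupt handling plus command-line parsing for files to preload
Signaling.cancelOnInterrupt()

val runnerSettings = new SparkRunnerSettings(msg => Console.err.println(msg))
runnerSettings.processArguments(List("-i", "init.scala"), processAll = true)
val preload = runnerSettings.loadfiles.value   // files the REPL loads before the first prompt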