tessl/maven-org-apache-spark--spark-repl-2-10

Apache Spark interactive Scala shell (REPL) for Scala 2.10, providing a read-eval-print loop for interactive data analysis and cluster computing.

Apache Spark REPL (Scala 2.10)

Apache Spark REPL provides an interactive Scala shell for distributed data processing and cluster computing. Its read-eval-print loop lets users execute Scala code against Spark clusters interactively, with automatic SparkContext and SparkSession initialization for real-time data analysis.

Package Information

  • Package Name: spark-repl_2.10
  • Package Type: maven
  • Language: Scala
  • Installation: mvn dependency:get -Dartifact=org.apache.spark:spark-repl_2.10:2.2.3
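
For build-managed projects, the same coordinates used in the installation command translate to a pom.xml dependency:

```xml
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-repl_2.10</artifactId>
  <version>2.2.3</version>
</dependency>
```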

Core Imports

import org.apache.spark.repl.{Main, SparkILoop, SparkIMain}
import org.apache.spark.repl.SparkJLineCompletion
import org.apache.spark.repl.{ExecutorClassLoader, Signaling}
import scala.tools.nsc.interpreter.JPrintWriter
import scala.tools.nsc.Settings
import java.io.BufferedReader

Basic Usage

Starting the REPL programmatically

import org.apache.spark.repl.{Main, SparkILoop}
import scala.tools.nsc.interpreter.JPrintWriter

// Start REPL with default settings
Main.main(Array())

// Or create a custom REPL instance bound to a specific master
val repl = new SparkILoop(None, new JPrintWriter(Console.out, true), Some("local[*]"))
repl.process(Array())

Using the interpreter directly

import org.apache.spark.repl.SparkIMain
import scala.tools.nsc.Settings
import scala.tools.nsc.interpreter.{JPrintWriter, Results => IR}

val settings = new Settings
settings.usejavacp.value = true // compile against the JVM classpath

val interpreter = new SparkIMain(settings, new JPrintWriter(Console.out, true))
interpreter.initializeSynchronous()

// Execute Scala code
val result: IR.Result = interpreter.interpret("val x = 1 + 1")
val value = interpreter.valueOfTerm("x") // Option[AnyRef]

Architecture

The Spark REPL is built around several key components:

  • Main Entry Point: Main object provides the application entry point and global interpreter access
  • Interactive Loop: SparkILoop class manages the read-eval-print cycle with Spark-specific features
  • Interpreter Core: SparkIMain class handles code compilation, execution, and state management
  • Completion System: SparkJLineCompletion provides auto-completion functionality using JLine
  • Class Loading: ExecutorClassLoader enables loading REPL-defined classes on Spark executors
  • Signal Handling: Signaling utilities for proper interrupt handling and job cancellation

Capabilities

Interactive Shell

Main REPL functionality providing an interactive shell with Spark integration, command processing, and session management.

object Main {
  def main(args: Array[String]): Unit
  def interp: SparkILoop
  def interp_=(i: SparkILoop): Unit
}

@DeveloperApi
class SparkILoop(
  in0: Option[BufferedReader] = None,
  out: JPrintWriter = new JPrintWriter(Console.out, true),
  master: Option[String] = None
) {
  def process(args: Array[String]): Boolean
  def createSparkSession(): SparkSession
  var sparkContext: SparkContext
}
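
A scripted session can be driven through the constructor's in0 reader. This is a sketch under two assumptions: that arguments passed to process are forwarded to the underlying compiler settings (hence -usejavacp), and that :quit ends the loop:

```scala
import java.io.{BufferedReader, StringReader}
import org.apache.spark.repl.SparkILoop
import scala.tools.nsc.interpreter.JPrintWriter

// Scripted input: each line is evaluated as if typed; ":quit" ends the session
val script = new BufferedReader(new StringReader(
  """val rdd = sc.parallelize(1 to 10)
    |rdd.sum
    |:quit
    |""".stripMargin))

val loop = new SparkILoop(Some(script), new JPrintWriter(Console.out, true), Some("local[*]"))
val ok = loop.process(Array("-usejavacp")) // true if the session ran to completion
```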

Code Interpreter

Core interpreter providing code compilation, execution, variable management, and introspection capabilities for interactive Scala code evaluation.

@DeveloperApi
class SparkIMain(
  initialSettings: Settings,
  out: JPrintWriter,
  propagateExceptions: Boolean = false
) {
  def initializeSynchronous(): Unit
  def interpret(line: String): IR.Result
  def compileSources(sources: SourceFile*): Boolean
  def compileString(code: String): Boolean
}
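
The IR.Result returned by interpret distinguishes success, errors, and incomplete input. A sketch, assuming a JVM-classpath-backed Settings as in Basic Usage:

```scala
import org.apache.spark.repl.SparkIMain
import scala.tools.nsc.Settings
import scala.tools.nsc.interpreter.{JPrintWriter, Results => IR}

val settings = new Settings
settings.usejavacp.value = true

val intp = new SparkIMain(settings, new JPrintWriter(Console.out, true))
intp.initializeSynchronous()

// Each outcome maps to one of the three IR.Result cases
intp.interpret("val doubled = List(1, 2, 3).map(_ * 2)") match {
  case IR.Success    => println("bound: " + intp.valueOfTerm("doubled"))
  case IR.Error      => println("compilation or runtime error")
  case IR.Incomplete => println("statement needs more input")
}
```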

Auto-Completion

Interactive code completion functionality providing context-aware suggestions and symbol completion using JLine integration.

@DeveloperApi
class SparkJLineCompletion(val intp: SparkIMain) {
  var verbosity: Int
  def resetVerbosity(): Unit
  def completer(): JLineTabCompletion
}
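
A sketch of wiring completion to an interpreter; the returned completer is what a JLine console reader consults for tab-completion candidates:

```scala
import org.apache.spark.repl.{SparkIMain, SparkJLineCompletion}
import scala.tools.nsc.Settings
import scala.tools.nsc.interpreter.JPrintWriter

val settings = new Settings
settings.usejavacp.value = true
val intp = new SparkIMain(settings, new JPrintWriter(Console.out, true))
intp.initializeSynchronous()

// Completion reflects over the interpreter's compiled symbols
val completion = new SparkJLineCompletion(intp)
val tabCompleter = completion.completer()
```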

Distributed Class Loading

Class-loading system that distributes REPL-compiled classes to Spark executors across the cluster, fetching class bytes over HTTP, from Hadoop-compatible filesystems, or via Spark RPC depending on the classUri scheme.

class ExecutorClassLoader(
  conf: SparkConf,
  env: SparkEnv,
  classUri: String,
  parent: ClassLoader,
  userClassPathFirst: Boolean
) extends ClassLoader {
  override def findClass(name: String): Class[_]
  def readAndTransformClass(name: String, in: InputStream): Array[Byte]
}
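
Spark wires this loader up internally on executors from the driver-published class URI; purely as an illustration (the URI default and the REPL-generated class name below are hypothetical):

```scala
import org.apache.spark.{SparkConf, SparkEnv}
import org.apache.spark.repl.ExecutorClassLoader

val conf = new SparkConf()
// The driver publishes REPL output classes at this URI (hypothetical value)
val classUri = conf.get("spark.repl.class.uri", "spark://driver:7078/classes")

val loader = new ExecutorClassLoader(
  conf, SparkEnv.get, classUri,
  getClass.getClassLoader,
  /* userClassPathFirst = */ false)

// findClass fetches the bytes from classUri and defines the class locally
val cls = loader.loadClass("$line3.$read") // REPL-generated class name (illustrative)
```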

Types

// Result statuses returned by interpret() — an alias for
// scala.tools.nsc.interpreter.Results, conventionally imported as IR
object IR {
  sealed abstract class Result
  case object Success extends Result
  case object Error extends Result
  case object Incomplete extends Result
}

// Command line settings
@DeveloperApi
class SparkRunnerSettings(error: String => Unit) extends Settings {
  val loadfiles: MultiStringSetting
}

// Signal handling utilities
object Signaling {
  def cancelOnInterrupt(): Unit
}
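
In a custom entry point, the interrupt handler would typically be installed before the interactive loop starts, so that Ctrl-C cancels running jobs instead of killing the shell. A sketch:

```scala
import org.apache.spark.repl.{Main, Signaling}

// Install the SIGINT handler first: an interrupt then cancels active Spark
// jobs on the running SparkContext rather than terminating the JVM
Signaling.cancelOnInterrupt()
Main.main(Array())
```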

Install with Tessl CLI

npx tessl i tessl/maven-org-apache-spark--spark-repl-2-10
Describes: pkg:maven/org.apache.spark/spark-repl_2.10@2.2.x