Interactive Scala shell for Apache Spark with distributed computing capabilities
—
Core interactive shell functionality with Spark-specific initialization, commands, and REPL processing.
Spark-specific interactive shell loop that extends Scala's standard ILoop with Spark initialization and custom behavior.
/**
* A Spark-specific interactive shell extending Scala's ILoop
* Provides automatic Spark context/session creation and initialization
*/
class SparkILoop(in0: Option[BufferedReader], out: JPrintWriter) extends ILoop(in0, out) {
/**
* Alternative constructor with BufferedReader
* @param in0 Input reader for REPL commands
* @param out Output writer for REPL responses
*/
def this(in0: BufferedReader, out: JPrintWriter)
/**
* Default constructor using console I/O
*/
def this()
/**
* Initialize Spark context and session in the REPL environment
* Executes initialization commands to create 'spark' and 'sc' variables
* Imports common Spark APIs automatically
*/
def initializeSpark(): Unit
/**
* Main REPL processing loop
* Handles startup, interpreter creation, and command processing
* @param settings Scala compiler settings
* @return true if processing completed successfully
*/
def process(settings: Settings): Boolean
/**
* Create the Scala interpreter with Spark-specific customizations
* Uses SparkILoopInterpreter for Scala 2.11 compatibility
*/
override def createInterpreter(): Unit
/** Print Spark welcome message with version info */
override def printWelcome(): Unit
/** Available REPL commands (uses standard commands) */
override def commands: List[LoopCommand]
/**
* Handle :reset command
* Preserves SparkSession and SparkContext state after reset
* @param line Command line input
*/
override def resetCommand(line: String): Unit
/** Replay command history with Spark re-initialization */
override def replay(): Unit
}

Usage Examples:
import org.apache.spark.repl.SparkILoop
import java.io.{BufferedReader, StringReader, PrintWriter, StringWriter}
// Create REPL with custom I/O
val input = new BufferedReader(new StringReader("val data = sc.parallelize(1 to 10)\ndata.sum()"))
val output = new StringWriter()
val repl = new SparkILoop(input, new PrintWriter(output))
// Process with default settings
import scala.tools.nsc.Settings
val settings = new Settings
repl.process(settings)
// Access output
val result = output.toString

Utility methods for running code in REPL instances programmatically.
object SparkILoop {
/**
* Creates an interpreter loop with default settings and feeds
* the given code to it as input
* @param code Scala code to execute
* @param sets Scala compiler settings (optional)
* @return String output from REPL execution
*/
def run(code: String, sets: Settings = new Settings): String
/**
* Run multiple lines of code in REPL
* @param lines List of code lines to execute
* @return String output from REPL execution
*/
def run(lines: List[String]): String
}

Usage Examples:
// Execute single code block
val result = SparkILoop.run("""
val rdd = sc.parallelize(1 to 100)
rdd.filter(_ % 2 == 0).count()
""")
// Execute multiple lines
val lines = List(
"val data = sc.parallelize(1 to 10)",
"val doubled = data.map(_ * 2)",
"doubled.collect()"
)
val output = SparkILoop.run(lines)

Pre-defined commands executed during REPL startup to set up the Spark environment.
/**
* Commands run automatically during REPL initialization
* Creates 'spark' and 'sc' variables and imports common APIs
*/
val initializationCommands: Seq[String]

The initialization commands include:
- the spark variable
- the sc variable

// Actual initialization commands:
"""
@transient val spark = if (org.apache.spark.repl.Main.sparkSession != null) {
org.apache.spark.repl.Main.sparkSession
} else {
org.apache.spark.repl.Main.createSparkSession()
}
@transient val sc = {
val _sc = spark.sparkContext
// UI URL display logic
_sc
}
"""
"import org.apache.spark.SparkContext._"
"import spark.implicits._"
"import spark.sql"
"import org.apache.spark.sql.functions._"Custom Spark ASCII art welcome message with version information:
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 2.4.8
/_/
Using Scala 2.11.12 (OpenJDK 64-Bit Server VM, Java 1.8.0_275)
Type in expressions to have them evaluated.
Type :help for more information.

Special handling for Scala 2.11 compatibility issues:
Uses SparkILoopInterpreter on Scala 2.11 to fix import handling bugs.

Enhanced command processing with Spark-specific features:
if (!intp.reporter.hasErrors) {
// Proceed with initialization
} else {
throw new RuntimeException(s"Scala $versionString interpreter encountered errors during initialization")
}

Special handling for Scala 2.11 classloader bugs:
private def runClosure(body: => Boolean): Boolean = {
if (isScala2_11) {
val original = Thread.currentThread().getContextClassLoader
try {
body
} finally {
Thread.currentThread().setContextClassLoader(original)
}
} else {
body
}
}

Automatic import of commonly used Spark APIs:
- SparkContext._: RDD operations and implicits
- spark.implicits._: Dataset/DataFrame encoders
- spark.sql: SQL interface access
- org.apache.spark.sql.functions._: SQL functions

Automatic display of Spark UI information: the sc initialization above includes logic to display the Spark UI URL.
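The import spark.implicits._ is an import from a value (the session created at startup), not from a package. A minimal sketch of that mechanism, using a stand-in session class and a hypothetical toFakeDF extension method in place of Spark's real encoders:

```scala
// Stand-in for the class that holds a session's implicit conversions.
class ReplImplicits {
  // Hypothetical extension method, analogous in spirit to Spark's toDF.
  implicit class RichSeq[A](xs: Seq[A]) {
    def toFakeDF(col: String): String = s"DataFrame[$col]: ${xs.mkString(", ")}"
  }
}

val session = new ReplImplicits  // stand-in for the `spark` value

// Importing the members of a stable value brings its implicits into scope,
// which is exactly how `import spark.implicits._` works in the REPL.
import session._
println(Seq(1, 2, 3).toFakeDF("n"))
```

In the real shell the same pattern applies: implicits is a member of the SparkSession value, so the initialization command can only run after spark exists.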
Support for loading Scala files during startup:
- :load command support for script files
- :paste command support for code blocks

Install with Tessl CLI
npx tessl i tessl/maven-org-apache-spark--spark-repl-2-11