Interactive Scala shell for Apache Spark with distributed computing capabilities.

Main application entry point: the `Main` object manages the REPL application lifecycle, including SparkSession/SparkContext creation and the interactive `SparkILoop` interpreter.
```scala
/**
 * Main entry point for the Spark REPL application.
 * Manages SparkContext, SparkSession, and SparkILoop instances.
 */
object Main extends Logging {
  /** Spark configuration object */
  val conf: SparkConf

  /** Current Spark context (mutable, null initially) */
  var sparkContext: SparkContext

  /** Current Spark session (mutable, null initially) */
  var sparkSession: SparkSession

  /** Current interpreter instance (mutable, used by tests) */
  var interp: SparkILoop

  /**
   * Main application entry point.
   * @param args command-line arguments passed to the REPL
   */
  def main(args: Array[String]): Unit

  /**
   * Creates and configures a new SparkSession with proper catalog support.
   * Automatically determines Hive support based on classpath availability.
   * @return configured SparkSession instance
   */
  def createSparkSession(): SparkSession
}
```

Usage Examples:
```scala
// Start the REPL programmatically
org.apache.spark.repl.Main.main(Array("-classpath", "/path/to/jars"))

// Access the current session and context
val session = org.apache.spark.repl.Main.sparkSession
val context = org.apache.spark.repl.Main.sparkContext

// Create a new session
val newSession = org.apache.spark.repl.Main.createSparkSession()
```

Internal main method used for testing and custom REPL initialization.
```scala
/**
 * Internal main method used by tests and custom initialization.
 * @param args command-line arguments
 * @param _interp custom SparkILoop instance to use
 */
private[repl] def doMain(args: Array[String], _interp: SparkILoop): Unit
```

The Main object uses a pre-configured SparkConf instance with REPL-specific settings:
- `spark.repl.classdir`: directory for REPL class files
- `spark.repl.class.outputDir`: output directory for compiled classes
- `spark.app.name`: default application name ("Spark shell")
- `spark.executor.uri`: executor URI, taken from the environment
- `spark.home`: Spark home directory, taken from the environment

The REPL manages temporary directories for dynamically compiled classes:
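The actual snippet below relies on Spark-internal `Utils.createTempDir`. As a JDK-only approximation of the same setup (object and method names here are illustrative, not part of the Spark API):

```scala
import java.nio.file.{Files, Path, Paths}

// Illustrative stand-in for Utils.createTempDir: ensure the root directory
// exists, then create a uniquely named "repl"-prefixed subdirectory in it.
object ReplDirSketch {
  def createReplOutputDir(rootDir: String): Path = {
    val root = Paths.get(rootDir)
    Files.createDirectories(root)            // create the root if missing
    Files.createTempDirectory(root, "repl")  // unique repl* subdirectory
  }
}

val outputDir = ReplDirSketch.createReplOutputDir(System.getProperty("java.io.tmpdir"))
```

Each shell session gets its own directory, so concurrently running REPLs never overwrite each other's compiled classes.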
```scala
val rootDir = conf.getOption("spark.repl.classdir").getOrElse(Utils.getLocalDir(conf))
val outputDir = Utils.createTempDir(root = rootDir, namePrefix = "repl")
```

Hive catalog support is enabled automatically based on the classpath:
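This decision can be sketched as a simple classpath probe; the probed class name (`org.apache.hadoop.hive.conf.HiveConf`) is an assumption modeled on Spark's known check, not something exposed by this API:

```scala
import scala.util.Try

// Sketch: enable the Hive catalog only when Hive classes are loadable.
// The probed class name is an assumption, not taken from this API.
def hiveClassesArePresent: Boolean =
  Try(Class.forName("org.apache.hadoop.hive.conf.HiveConf")).isSuccess

val catalogImplementation = if (hiveClassesArePresent) "hive" else "in-memory"
```

Probing at session-creation time means a single build of the REPL works both with and without Hive on the classpath.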
- `spark.sql.catalogImplementation=hive`: enables Hive support; otherwise the `spark.sql.catalogImplementation` property is left at its non-Hive default

A failure to initialize the shell session is fatal:
```scala
case e: Exception if isShellSession =>
  logError("Failed to initialize Spark session.", e)
  sys.exit(1)
```

Invalid Scala compiler arguments are handled via an error callback:
```scala
private def scalaOptionError(msg: String): Unit = {
  hasErrors = true
  Console.err.println(msg)
}
```

Environment variables:

- `SPARK_EXECUTOR_URI`: sets the executor URI for distributed execution
- `SPARK_HOME`: sets the Spark installation directory

Arguments are processed and passed to the Scala interpreter:
```scala
val interpArguments = List(
  "-Yrepl-class-based",
  "-Yrepl-outdir", s"${outputDir.getAbsolutePath}",
  "-classpath", jars
) ++ args.toList
```

The REPL automatically registers interrupt signal handling:
```scala
Signaling.cancelOnInterrupt()
```

Install with Tessl CLI:
```shell
npx tessl i tessl/maven-org-apache-spark--spark-repl-2-11
```