# Main REPL API

Core REPL application functionality, including entry points, SparkSession management, and signal handling.

## Capabilities

### Main Application Entry Point

The `Main` object serves as the primary entry point for the Spark REPL application and manages the global SparkSession instance.

```scala { .api }
/**
 * Main entry point for the Spark REPL application
 * @param args Command line arguments for REPL configuration
 */
def main(args: Array[String]): Unit

/**
 * Creates and configures a SparkSession for the REPL with appropriate defaults
 * @return Configured SparkSession instance with Hive support if available
 */
def createSparkSession(): SparkSession

/**
 * Internal main method used for testing and custom REPL configurations.
 * Package-private for testing purposes.
 * @param args Command line arguments
 * @param _interp Custom SparkILoop interpreter instance
 */
private[repl] def doMain(args: Array[String], _interp: SparkILoop): Unit
```

**Usage Examples:**

```scala
import org.apache.spark.repl.Main

// Start an interactive REPL from the command line
// (Array.empty needs the explicit [String] because Scala arrays are invariant)
Main.main(Array.empty[String])

// Start the REPL with custom arguments
Main.main(Array("-classpath", "/path/to/jars"))

// Create a SparkSession programmatically
val spark = Main.createSparkSession()
println(s"Spark version: ${spark.version}")
```

### Global State Management

The `Main` object maintains global state for the REPL session, including the SparkContext, SparkSession, and interpreter instances.

```scala { .api }
/**
 * Current SparkContext instance, created by createSparkSession().
 * This is a mutable variable so that it can be reset for testing.
 */
var sparkContext: SparkContext

/**
 * Current SparkSession instance, created by createSparkSession().
 * This is a mutable variable so that it can be reset for testing.
 */
var sparkSession: SparkSession

/**
 * Current SparkILoop interpreter instance.
 * This is a public variable because tests need to reset it.
 */
var interp: SparkILoop

/**
 * Spark configuration instance used for creating the SparkContext.
 * Initialized with default REPL-specific settings.
 */
val conf: SparkConf

/**
 * Output directory for REPL-generated class files.
 * Created as a temporary directory under spark.repl.classdir or the local dir.
 */
val outputDir: File
```

**Usage Examples:**

```scala
import org.apache.spark.repl.Main

// Access the current SparkContext
if (Main.sparkContext != null) {
  println(s"Master: ${Main.sparkContext.master}")
  println(s"App ID: ${Main.sparkContext.applicationId}")
}

// Access the current SparkSession
if (Main.sparkSession != null) {
  val df = Main.sparkSession.range(10)
  df.show()
}

// Check REPL configuration
println(s"Output directory: ${Main.outputDir.getAbsolutePath}")
val appName = Main.conf.get("spark.app.name", "Unknown")
println(s"App name: $appName")
```

### Configuration and Initialization

The `Main` object handles REPL-specific configuration, including class output directories, executor URIs, and Spark home detection.

**Configuration Behavior:**

- Sets `spark.app.name` to "Spark shell" if not specified
- Configures `spark.repl.class.outputDir` for class distribution
- Detects and sets `SPARK_HOME` from environment variables
- Handles `SPARK_EXECUTOR_URI` for custom executor configurations
- Enables Hive support automatically if the Hive classes are present
- Falls back to the in-memory catalog if Hive is not available
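
The defaulting behavior above can be illustrated with a small sketch. This is a simplified model, not the actual Spark implementation: the helper name `applyReplDefaults` and the use of a plain mutable `Map` in place of a real `SparkConf` are assumptions for illustration; only the configuration keys come from the list above.

```scala
import scala.collection.mutable

// Sketch: apply REPL-style configuration defaults to a Map standing in
// for a SparkConf. User-supplied values take precedence over defaults.
def applyReplDefaults(
    conf: mutable.Map[String, String],
    env: Map[String, String],
    outputDirPath: String): mutable.Map[String, String] = {
  // Set the app name only if the user has not specified one
  if (!conf.contains("spark.app.name")) {
    conf("spark.app.name") = "Spark shell"
  }
  // Point executors at the directory holding REPL-compiled classes
  conf("spark.repl.class.outputDir") = outputDirPath
  // Propagate an executor URI from the environment, if present
  env.get("SPARK_EXECUTOR_URI").foreach { uri =>
    conf("spark.executor.uri") = uri
  }
  conf
}

val conf = applyReplDefaults(
  mutable.Map.empty[String, String],
  env = Map("SPARK_EXECUTOR_URI" -> "hdfs:///spark/spark.tgz"),
  outputDirPath = "/tmp/repl-classes")
println(conf("spark.app.name"))  // Spark shell
```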

**Class Output Directory:**

The REPL creates a temporary directory for compiled classes that need to be distributed to executors:

```scala
// Directory creation logic
val rootDir = conf.getOption("spark.repl.classdir").getOrElse(Utils.getLocalDir(conf))
val outputDir = Utils.createTempDir(root = rootDir, namePrefix = "repl")
```
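
The same pattern can be reproduced with just the JDK. A minimal sketch, assuming nothing Spark-specific: `Utils.createTempDir` is internal to Spark, so `java.nio.file` stands in for it here, and a system property stands in for the config lookup.

```scala
import java.nio.file.{Files, Paths}

// Sketch of the directory-creation pattern using the JDK instead of Spark's
// internal Utils: choose a root (configured value or the system temp dir),
// then create a uniquely named, "repl"-prefixed directory beneath it.
val rootDir = sys.props.getOrElse("spark.repl.classdir", System.getProperty("java.io.tmpdir"))
val outputDir = Files.createTempDirectory(Paths.get(rootDir), "repl").toFile
outputDir.deleteOnExit()
```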

### Signal Handling

Graceful interrupt handling for canceling running Spark jobs.

```scala { .api }
object Signaling {
  /**
   * Registers a SIGINT handler that cancels all active Spark jobs.
   * Allows users to interrupt long-running operations with Ctrl+C.
   * If no jobs are running, the signal terminates the REPL.
   */
  def cancelOnInterrupt(): Unit
}
```

**Usage Examples:**

```scala
import org.apache.spark.repl.Signaling

// Enable interrupt handling (called automatically by Main)
Signaling.cancelOnInterrupt()

// After this, users can press Ctrl+C to:
// 1. Cancel running Spark jobs if any are active
// 2. Exit the REPL if no jobs are running
```

**Signal Handling Behavior:**

1. When Ctrl+C is pressed and Spark jobs are running:
   - Displays the warning: "Cancelling all active jobs, this can take a while. Press Ctrl+C again to exit now."
   - Calls `SparkContext.cancelAllJobs()`
   - Returns control to the REPL prompt

2. When Ctrl+C is pressed and no jobs are running:
   - Terminates the REPL session immediately
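
This two-branch behavior can be sketched as follows, with the decision logic factored out so it can be exercised without a real SparkContext. This is an illustrative sketch, not Spark's actual code: `hasActiveJobs` and `cancelAllJobs` are stand-ins for the SparkContext calls, and `sun.misc.Signal` is a JDK-internal API.

```scala
import sun.misc.{Signal, SignalHandler}

// Decision logic, separated from signal plumbing:
// cancel active jobs if any exist, otherwise exit.
def onInterrupt(hasActiveJobs: Boolean,
                cancelAllJobs: () => Unit,
                exit: () => Unit): Unit =
  if (hasActiveJobs) {
    Console.err.println(
      "Cancelling all active jobs, this can take a while. " +
        "Press Ctrl+C again to exit now.")
    cancelAllJobs()
  } else {
    exit()
  }

// Wire the decision to SIGINT, roughly what cancelOnInterrupt does
// against the real SparkContext job-cancellation API.
def installInterruptHandler(hasActiveJobs: () => Boolean,
                            cancelAllJobs: () => Unit): Unit =
  Signal.handle(new Signal("INT"), new SignalHandler {
    def handle(sig: Signal): Unit =
      onInterrupt(hasActiveJobs(), cancelAllJobs, () => System.exit(0))
  })
```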

## Error Handling

The `Main` object includes error handling for common REPL initialization scenarios:

- **Scala Option Errors**: Command-line argument parsing errors are written to stderr
- **SparkSession Creation Failures**: In shell sessions, initialization errors cause `sys.exit(1)`
- **Non-shell Sessions**: Exceptions are propagated to the caller for custom handling
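
The shell vs. non-shell split can be sketched as follows. The helper name `initializeSparkSession` and the `isShellSession` flag are illustrative assumptions, not the exact Spark code.

```scala
// Hedged sketch of the error-handling split described above: shell sessions
// report the failure and exit hard; non-shell callers get the exception back
// (the unguarded case simply does not catch it, so it propagates).
def initializeSparkSession[T](isShellSession: Boolean)(create: () => T): T =
  try {
    create()
  } catch {
    case e: Exception if isShellSession =>
      Console.err.println(s"Failed to initialize Spark session: ${e.getMessage}")
      sys.exit(1)
  }
```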

## Thread Safety Notes

- Global variables (`sparkContext`, `sparkSession`, `interp`) are not thread-safe
- These variables are designed for single-threaded REPL usage
- Tests may reset these variables between test cases
- The `conf` and `outputDir` values are immutable after initialization