# Main REPL API

Core REPL application functionality, including entry points, SparkSession management, and signal handling.

## Capabilities

### Main Application Entry Point

The `Main` object serves as the primary entry point for the Spark REPL application and manages the global SparkSession instance.

```scala { .api }
/**
 * Main entry point for the Spark REPL application.
 * @param args Command line arguments for REPL configuration
 */
def main(args: Array[String]): Unit

/**
 * Creates and configures a SparkSession for the REPL with appropriate defaults.
 * @return Configured SparkSession instance with Hive support if available
 */
def createSparkSession(): SparkSession

/**
 * Internal main method used for testing and custom REPL configurations.
 * Package-private for testing purposes.
 * @param args Command line arguments
 * @param _interp Custom SparkILoop interpreter instance
 */
private[repl] def doMain(args: Array[String], _interp: SparkILoop): Unit
```

**Usage Examples:**

```scala
import org.apache.spark.repl.Main

// Start the interactive REPL from the command line
Main.main(Array.empty)

// Start the REPL with custom arguments
Main.main(Array("-classpath", "/path/to/jars"))

// Create a SparkSession programmatically
val spark = Main.createSparkSession()
println(s"Spark version: ${spark.version}")
```

### Global State Management

The `Main` object maintains global state for the REPL session, including the SparkContext, SparkSession, and interpreter instances.

```scala { .api }
/**
 * Current SparkContext instance, created by createSparkSession().
 * This is a mutable variable that can be reset for testing.
 */
var sparkContext: SparkContext

/**
 * Current SparkSession instance, created by createSparkSession().
 * This is a mutable variable that can be reset for testing.
 */
var sparkSession: SparkSession

/**
 * Current SparkILoop interpreter instance.
 * This is a public variable because tests need to reset it.
 */
var interp: SparkILoop

/**
 * Spark configuration instance used for creating the SparkContext.
 * Initialized with default REPL-specific settings.
 */
val conf: SparkConf

/**
 * Output directory for REPL-generated class files.
 * Created as a temporary directory under spark.repl.classdir or the local dir.
 */
val outputDir: File
```

**Usage Examples:**

```scala
import org.apache.spark.repl.Main

// Access the current SparkContext
if (Main.sparkContext != null) {
  println(s"Master: ${Main.sparkContext.master}")
  println(s"App ID: ${Main.sparkContext.applicationId}")
}

// Access the SparkSession
if (Main.sparkSession != null) {
  val df = Main.sparkSession.range(10)
  df.show()
}

// Check the REPL configuration
println(s"Output directory: ${Main.outputDir.getAbsolutePath}")
val appName = Main.conf.get("spark.app.name", "Unknown")
println(s"App name: $appName")
```

### Configuration and Initialization

The `Main` object handles REPL-specific configuration, including class output directories, executor URIs, and Spark home detection.

**Configuration Behavior:**

- Sets `spark.app.name` to "Spark shell" if not specified
- Configures `spark.repl.class.outputDir` for class distribution
- Detects and sets `SPARK_HOME` from environment variables
- Handles `SPARK_EXECUTOR_URI` for custom executor configurations
- Enables Hive support automatically if Hive classes are present
- Falls back to the in-memory catalog if Hive is not available
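The Hive fallback can be sketched in plain Scala. This is a minimal illustration, assuming availability is probed by class loading; the helper name and marker class below are hypothetical stand-ins, not Spark's actual implementation.

```scala
// Hedged sketch: probe for Hive support by attempting to load a Hive class.
// The class name below is an assumed marker, chosen for illustration only.
def classIsLoadable(className: String): Boolean =
  try {
    Class.forName(className)
    true
  } catch {
    case _: ClassNotFoundException => false
  }

// Fall back to the in-memory catalog when Hive classes are absent.
val catalogImplementation =
  if (classIsLoadable("org.apache.hadoop.hive.conf.HiveConf")) "hive" else "in-memory"
```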

**Class Output Directory:**

The REPL creates a temporary directory for compiled classes that need to be distributed to executors:

```scala
// Directory creation logic
val rootDir = conf.getOption("spark.repl.classdir").getOrElse(Utils.getLocalDir(conf))
val outputDir = Utils.createTempDir(root = rootDir, namePrefix = "repl")
```
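For comparison, the same pattern can be approximated with just the JDK. This sketch assumes `spark.repl.classdir` may be read as a system property for illustration; Spark's `Utils.createTempDir` additionally registers the directory for cleanup on shutdown, which is omitted here.

```scala
import java.io.File
import java.nio.file.{Files, Paths}

// Approximation of the directory-creation logic using only java.nio:
// use spark.repl.classdir if set as a system property, else the JVM temp dir.
val rootDir: String =
  sys.props.getOrElse("spark.repl.classdir", System.getProperty("java.io.tmpdir"))
val outputDir: File =
  Files.createTempDirectory(Paths.get(rootDir), "repl").toFile
```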

### Signal Handling

The `Signaling` object provides graceful interrupt handling for canceling running Spark jobs.

```scala { .api }
object Signaling {
  /**
   * Registers a SIGINT handler that cancels all active Spark jobs.
   * Allows users to interrupt long-running operations with Ctrl+C.
   * If no jobs are running, the signal terminates the REPL.
   */
  def cancelOnInterrupt(): Unit
}
```

**Usage Examples:**

```scala
import org.apache.spark.repl.Signaling

// Enable interrupt handling (called automatically by Main)
Signaling.cancelOnInterrupt()

// After this, users can press Ctrl+C to:
// 1. Cancel running Spark jobs if any are active
// 2. Exit the REPL if no jobs are running
```

**Signal Handling Behavior:**

1. When Ctrl+C is pressed and Spark jobs are running:
   - Displays the warning: "Cancelling all active jobs, this can take a while. Press Ctrl+C again to exit now."
   - Calls `SparkContext.cancelAllJobs()`
   - Returns control to the REPL prompt

2. When Ctrl+C is pressed and no jobs are running:
   - Terminates the REPL session immediately
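The decision behind this behavior can be sketched as a small pure function. Here `activeJobs`, `cancelAll`, and `exitRepl` are hypothetical stand-ins for the SparkContext state and actions involved, injected as callbacks for testability; this is not the actual handler code.

```scala
// Hedged sketch of the interrupt decision: cancel jobs if any are active,
// otherwise terminate the session.
def onInterrupt(activeJobs: Int, cancelAll: () => Unit, exitRepl: () => Unit): String =
  if (activeJobs > 0) {
    println("Cancelling all active jobs, this can take a while. " +
      "Press Ctrl+C again to exit now.")
    cancelAll()
    "cancelled" // control returns to the REPL prompt
  } else {
    exitRepl()
    "exited"
  }
```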

## Error Handling

The `Main` object includes error handling for common REPL initialization scenarios:

- **Scala Option Errors**: Command-line argument parsing errors are displayed to stderr
- **SparkSession Creation Failures**: In shell sessions, initialization errors cause `sys.exit(1)`
- **Non-shell Sessions**: Exceptions are propagated to the caller for custom handling
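The shell vs. non-shell split can be illustrated as follows. The `isShellSession` flag and `initSession` helper are hypothetical names introduced for this sketch; the document does not specify how `Main` distinguishes the two cases.

```scala
// Hedged sketch: in a shell session, report the error and exit with status 1;
// in a non-shell session, propagate the exception to the caller.
def initSession(isShellSession: Boolean)(create: () => Unit): Unit =
  try create() catch {
    case e: Exception =>
      if (isShellSession) {
        System.err.println(s"Failed to initialize Spark session: ${e.getMessage}")
        sys.exit(1)
      } else {
        throw e // non-shell: let the caller handle it
      }
  }
```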

## Thread Safety Notes

- The global variables (`sparkContext`, `sparkSession`, `interp`) are not thread-safe
- These variables are designed for single-threaded REPL usage
- Tests may reset these variables between test cases
- The `conf` and `outputDir` values are immutable after initialization