# Environment Management

Singleton environment management for Spark and SQL contexts, providing initialization and cleanup operations.

## Capabilities

### SparkSQLEnv Object

Singleton object for managing the master program's SparkSQL environment. Provides centralized initialization and cleanup of Spark and SQL contexts.

```scala { .api }
/**
 * A singleton object for the master program. The slaves should not access this.
 */
private[hive] object SparkSQLEnv extends Logging {
  /** The current SQL context - null until initialized */
  var sqlContext: SQLContext

  /** The current Spark context - null until initialized */
  var sparkContext: SparkContext

  /**
   * Initialize the SparkSQL environment.
   * Creates a SparkSession with Hive support and sets up contexts.
   * Safe to call multiple times - only initializes once.
   */
  def init(): Unit

  /**
   * Cleans up and shuts down the SparkSQL environment.
   * Stops the SparkContext and nullifies references.
   */
  def stop(): Unit
}
```

**Usage Examples:**

```scala
import org.apache.spark.sql.hive.thriftserver.SparkSQLEnv

// Initialize the environment - must be called before using contexts
SparkSQLEnv.init()

// Access the SQL context
val sqlContext = SparkSQLEnv.sqlContext
val df = sqlContext.sql("SELECT * FROM my_table")

// Access the Spark context
val sparkContext = SparkSQLEnv.sparkContext
println(s"Application ID: ${sparkContext.applicationId}")

// Clean shutdown when done
SparkSQLEnv.stop()
```

### Environment Initialization Details

The `init()` method performs the following initialization steps:

1. **Spark Configuration**: Creates a SparkConf with `loadDefaults = true`
2. **Application Naming**: Sets the application name to `SparkSQL::<hostname>` if none is specified
3. **Hive Support**: Enables Hive support in the SparkSession
4. **Context Setup**: Initializes both the SparkContext and SQLContext references
5. **Session State**: Forces initialization of session state with the proper class loader
6. **Metadata Setup**: Configures the Hive metadata client with the proper output streams
7. **Version Configuration**: Sets a fake Hive version for compatibility
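
The steps above can be condensed into a sketch (illustrative only, not the actual Spark source; it assumes the standard `SparkConf`/`SparkSession` builder APIs, and `SparkSQLEnvSketch` is a hypothetical name):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{SQLContext, SparkSession}

object SparkSQLEnvSketch {
  var sqlContext: SQLContext = _
  var sparkContext: SparkContext = _

  def init(): Unit = {
    if (sqlContext == null) {
      // Step 1: load defaults from system properties
      val sparkConf = new SparkConf(loadDefaults = true)
      // Step 2: fall back to SparkSQL::<hostname> when no name is configured
      val appName = sparkConf.getOption("spark.app.name")
        .getOrElse(s"SparkSQL::${java.net.InetAddress.getLocalHost.getHostName}")
      sparkConf.setAppName(appName)
      // Steps 3-4: build a Hive-enabled session and capture both contexts
      val session = SparkSession.builder().config(sparkConf).enableHiveSupport().getOrCreate()
      sparkContext = session.sparkContext
      sqlContext = session.sqlContext
      // Step 5: touching sessionState forces it (and its class loader) to initialize
      session.sessionState
    }
  }
}
```

Steps 6-7 (metadata client output streams and the fake Hive version) are internal details omitted from this sketch.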

### Environment Configuration

The environment respects standard Spark configuration properties:

**Application Configuration:**

```scala
// Custom application name (optional)
sparkConf.set("spark.app.name", "MyThriftServer")

// Enable or disable the Spark UI
sparkConf.set("spark.ui.enabled", "true")
```

**Hive Configuration:**

```scala
// Hive warehouse directory
sparkConf.set("spark.sql.warehouse.dir", "/path/to/warehouse")

// Hive metastore URIs (propagated to the Hadoop/Hive configuration)
sparkConf.set("spark.hadoop.hive.metastore.uris", "thrift://localhost:9083")
```

**Session Configuration:**

```scala
// Single-session mode: set to "true" to share one session's state
// (temporary views, current database) across all connections
sparkConf.set("spark.sql.hive.thriftServer.singleSession", "true")
```

### Lifecycle Management

**Initialization Safety:**

- `init()` can be called multiple times safely
- Only initializes once; subsequent calls are no-ops
- Thread-safe for concurrent access

**Shutdown Behavior:**

- `stop()` shuts down the SparkContext if it exists
- Nullifies context references to prevent memory leaks
- Integrates with shutdown hooks for a clean exit

**State Checking:**

```scala
// Check whether the environment is initialized
if (SparkSQLEnv.sparkContext != null) {
  // Environment is ready
  println("SparkSQL environment is initialized")
} else {
  // Need to initialize first
  SparkSQLEnv.init()
}

// Check whether the environment has been stopped (guard against null first)
if (SparkSQLEnv.sparkContext != null && SparkSQLEnv.sparkContext.isStopped) {
  println("SparkContext has been stopped")
}
```
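
The shutdown-hook integration mentioned above can be approximated in user code with a plain JVM hook (a sketch; Spark itself registers its hooks through an internal `ShutdownHookManager`):

```scala
import org.apache.spark.sql.hive.thriftserver.SparkSQLEnv

// Register a JVM shutdown hook so stop() runs on normal exit or SIGTERM
sys.addShutdownHook {
  SparkSQLEnv.stop()
}
```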

### Integration with Other Components

The SparkSQLEnv is used throughout the thrift server components:

**CLI Integration:**

```scala
// The CLI driver forces initialization unless running in remote mode
object SparkSQLCLIDriver {
  def main(args: Array[String]): Unit = {
    if (!isRemoteMode) {
      SparkSQLEnv.init()
    }
    // ... CLI setup
  }
}
```

**Server Integration:**

```scala
// The server's main method initializes the environment before anything else
object HiveThriftServer2 {
  def main(args: Array[String]): Unit = {
    SparkSQLEnv.init()
    // ... server setup
  }
}
```

**Driver Integration:**

```scala
// The SQL driver defaults to the environment's context
class SparkSQLDriver(val context: SQLContext = SparkSQLEnv.sqlContext)
```

### Error Handling

Common environment initialization errors:

- **Configuration Errors**: Invalid Spark configuration parameters
- **Hive Errors**: Metastore connection or configuration issues
- **Resource Errors**: Insufficient memory or failure to acquire resources
- **Classpath Errors**: Missing dependencies or version conflicts

The environment logs initialization failures for troubleshooting and integrates with Spark's standard error-handling mechanisms.
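
All of the error classes above surface as exceptions thrown from `init()`, so a defensive caller can wrap initialization like this (a sketch, not part of the documented API; the `initOrExit` helper name is illustrative):

```scala
import org.apache.spark.sql.hive.thriftserver.SparkSQLEnv

def initOrExit(): Unit = {
  try {
    SparkSQLEnv.init()
  } catch {
    case e: Exception =>
      // Configuration, Hive, resource, and classpath failures all land here
      System.err.println(s"Failed to initialize SparkSQL environment: ${e.getMessage}")
      // Tear down anything partially started, then exit non-zero
      SparkSQLEnv.stop()
      sys.exit(1)
  }
}
```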