# Environment Management

Singleton environment management for Spark and SQL contexts, providing initialization and cleanup operations.

## Capabilities

### SparkSQLEnv Object

Singleton object for managing the master program's SparkSQL environment. Provides centralized initialization and cleanup of the Spark and SQL contexts.

```scala { .api }
/**
 * A singleton object for the master program. The slaves should not access this.
 */
private[hive] object SparkSQLEnv extends Logging {
  /** The current SQL context; null until initialized. */
  var sqlContext: SQLContext

  /** The current Spark context; null until initialized. */
  var sparkContext: SparkContext

  /**
   * Initialize the SparkSQL environment.
   * Creates a SparkSession with Hive support and sets up both contexts.
   * Safe to call multiple times; only the first call initializes.
   */
  def init(): Unit

  /**
   * Cleans up and shuts down the Spark SQL environment.
   * Stops the SparkContext and nullifies both references.
   */
  def stop(): Unit
}
```

**Usage Examples:**

```scala
import org.apache.spark.sql.hive.thriftserver.SparkSQLEnv

// Initialize the environment before using either context
SparkSQLEnv.init()

// Access the SQL context
val sqlContext = SparkSQLEnv.sqlContext
val df = sqlContext.sql("SELECT * FROM my_table")

// Access the Spark context
val sparkContext = SparkSQLEnv.sparkContext
println(s"Application ID: ${sparkContext.applicationId}")

// Clean shutdown when done
SparkSQLEnv.stop()
```

### Environment Initialization Details

The `init()` method performs the following initialization steps:

1. **Spark Configuration**: Creates a SparkConf with `loadDefaults = true`
2. **Application Naming**: Sets the application name to `SparkSQL::<hostname>` if none is specified
3. **Hive Support**: Enables Hive support in the SparkSession
4. **Context Setup**: Initializes both the SparkContext and SQLContext references
5. **Session State**: Forces initialization of the session state with the proper class loader
6. **Metadata Setup**: Configures the Hive metadata client with the proper output streams
7. **Version Configuration**: Sets a fake Hive version for compatibility

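The steps above can be sketched as follows. This is a simplified, hypothetical reconstruction, not the exact Spark source: it assumes `SparkConf`, `SparkSession`, and Spark's internal `Utils.localHostName()` helper, and omits the Hive metadata-stream and version wiring.

```scala
// Simplified sketch of init(); illustrative only.
def init(): Unit = {
  if (sqlContext == null) {
    // 1. Load defaults from spark.* system properties.
    val sparkConf = new SparkConf(loadDefaults = true)

    // 2. Default the application name when the user did not set one.
    if (!sparkConf.contains("spark.app.name")) {
      sparkConf.setAppName(s"SparkSQL::${Utils.localHostName()}")
    }

    // 3-4. Build a Hive-enabled session and capture both contexts.
    val sparkSession = SparkSession.builder()
      .config(sparkConf)
      .enableHiveSupport()
      .getOrCreate()
    sparkContext = sparkSession.sparkContext
    sqlContext = sparkSession.sqlContext

    // 5. Touch the session state so it materializes with the proper class loader.
    sparkSession.sessionState
  }
}
```
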
### Environment Configuration

The environment respects standard Spark configuration properties:

**Application Configuration:**

```scala
// Custom application name (optional)
sparkConf.set("spark.app.name", "MyThriftServer")

// Enable or disable the Spark UI
sparkConf.set("spark.ui.enabled", "true")
```

**Hive Configuration:**

```scala
// Hive warehouse directory
sparkConf.set("spark.sql.warehouse.dir", "/path/to/warehouse")

// Hive metastore URIs (a Hive/Hadoop property, passed through via the spark.hadoop. prefix)
sparkConf.set("spark.hadoop.hive.metastore.uris", "thrift://localhost:9083")
```

**Session Configuration:**

```scala
// Single-session mode: set to "true" to share one context across all
// sessions (the default is "false", i.e. separate session state per connection)
sparkConf.set("spark.sql.hive.thriftServer.singleSession", "true")
```
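
Because `init()` builds its `SparkConf` with `loadDefaults = true`, these properties can also be supplied as JVM system properties (or `--conf` flags) before initialization. A minimal sketch, with illustrative values:

```scala
// Set spark.* system properties before init(); the default-loading
// SparkConf picks them up when the environment is created.
System.setProperty("spark.app.name", "MyThriftServer")
System.setProperty("spark.sql.warehouse.dir", "/path/to/warehouse")

SparkSQLEnv.init()
```
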

### Lifecycle Management

**Initialization Safety:**
- `init()` can be called multiple times; only the first call initializes, and subsequent calls are no-ops
- The guard is a simple null check on the context reference rather than explicit synchronization, so first-time initialization should happen from a single thread

**Shutdown Behavior:**
- `stop()` shuts down the SparkContext if it exists
- Nullifies the context references to prevent memory leaks
- Integrates with shutdown hooks for a clean exit

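The shutdown-hook integration can be approximated with a plain JVM hook. This is a sketch; the server itself registers its hook through Spark's shutdown machinery:

```scala
// Register a hook so stop() runs on JVM exit.
sys.addShutdownHook {
  if (SparkSQLEnv.sparkContext != null) {
    SparkSQLEnv.stop()
  }
}
```
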
**State Checking:**

```scala
// Check whether the environment is initialized
if (SparkSQLEnv.sparkContext != null) {
  // Environment is ready
  println("SparkSQL environment is initialized")
} else {
  // Need to initialize first
  SparkSQLEnv.init()
}

// Check whether the environment has been stopped (guard against null first)
if (SparkSQLEnv.sparkContext != null && SparkSQLEnv.sparkContext.isStopped) {
  println("SparkContext has been stopped")
}
```

### Integration with Other Components

`SparkSQLEnv` is used throughout the Thrift server components:

**CLI Integration:**

```scala
// The CLI driver uses the environment contexts and forces
// initialization when not connecting to a remote HiveServer2
if (!isRemoteMode) {
  SparkSQLEnv.init()
}
```

**Server Integration:**

```scala
// The server's main method initializes the environment before setup
object HiveThriftServer2 {
  def main(args: Array[String]): Unit = {
    SparkSQLEnv.init()
    // ... server setup
  }
}
```

**Driver Integration:**

```scala
// The SQL driver defaults to the environment's SQLContext
class SparkSQLDriver(val context: SQLContext = SparkSQLEnv.sqlContext)
```

### Error Handling

Common environment initialization errors:

- **Configuration Errors**: Invalid Spark configuration parameters
- **Hive Errors**: Metastore connection or configuration issues
- **Resource Errors**: Insufficient memory or inability to acquire resources
- **Classpath Errors**: Missing dependencies or version conflicts

The environment provides logging for troubleshooting initialization issues and integrates with Spark's standard error handling mechanisms.
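
A defensive initialization sketch; the catch clause and error message here are illustrative, not part of the API:

```scala
// Surface initialization failures with context before aborting startup.
try {
  SparkSQLEnv.init()
} catch {
  case e: Exception =>
    System.err.println(s"Failed to initialize SparkSQL environment: ${e.getMessage}")
    throw e
}
```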