# Spark Hive Thrift Server

Spark Hive Thrift Server provides a Thrift-based JDBC/ODBC interface for Spark SQL, making it compatible with HiveServer2 clients. It enables remote access to Spark SQL through standard database connectivity protocols, allowing users to connect with JDBC drivers and execute SQL queries against Spark datasets and tables.

The server implements the HiveServer2 Thrift interface but uses Spark SQL as the execution engine instead of Hive, providing better performance and broader data source support. It supports concurrent sessions, query execution management, and a web UI for monitoring active connections and queries.

## Package Information

- **Package Name**: spark-hive-thriftserver_2.11
- **Package Type**: Maven
- **Language**: Scala
- **Group ID**: org.apache.spark
- **Version**: 1.6.3
- **Installation**: Add the Maven dependency or use a pre-built Spark distribution

## Core Imports

```scala
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
import org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver
import org.apache.spark.sql.hive.thriftserver.SparkSQLEnv
import org.apache.spark.sql.hive.thriftserver.SparkSQLCLIService
import org.apache.spark.sql.hive.thriftserver.ReflectionUtils
import org.apache.spark.sql.hive.HiveContext
```

## Basic Usage

### Server Mode

```scala
// Start the thrift server as a standalone application
object MyThriftServer extends App {
  HiveThriftServer2.main(args)
}

// Or start it programmatically against an existing context
import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sparkContext)
HiveThriftServer2.startWithContext(hiveContext)
```

### CLI Mode

```scala
// Start the interactive SQL CLI
object MySQLCLI extends App {
  SparkSQLCLIDriver.main(args)
}
```

### Environment Setup

```scala
// Initialize the shared Spark SQL environment
SparkSQLEnv.init()

// Access the shared contexts
val sparkContext = SparkSQLEnv.sparkContext
val hiveContext = SparkSQLEnv.hiveContext

// Clean shutdown
SparkSQLEnv.stop()
```

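With the server running (port 10000 by default), any HiveServer2-compatible JDBC client can connect. The sketch below only builds the connection URL; `ThriftClientSketch` and `buildJdbcUrl` are illustrative names, and an actual connection additionally requires the Hive JDBC driver on the client classpath:

```scala
// Sketch of a JDBC client for the thrift server. The helper below is
// illustrative; connecting for real needs the Hive JDBC driver
// (org.apache.hive:hive-jdbc), which is not bundled here.
object ThriftClientSketch {
  // HiveServer2-style JDBC URL: jdbc:hive2://<host>:<port>/<database>
  def buildJdbcUrl(host: String, port: Int, database: String = "default"): String =
    s"jdbc:hive2://$host:$port/$database"

  def main(args: Array[String]): Unit = {
    val url = buildJdbcUrl("localhost", 10000)
    println(url)
    // With the driver on the classpath, a connection would look like:
    // val conn = java.sql.DriverManager.getConnection(url, "user", "")
    // val rs = conn.createStatement().executeQuery("SHOW TABLES")
  }
}
```

Because the URL scheme matches HiveServer2, clients such as Beeline can connect the same way.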
## Architecture

The Spark Hive Thrift Server is built around several key components:

- **Server Entry Points**: `HiveThriftServer2` and `SparkSQLCLIDriver` provide the main application entry points for server and CLI modes
- **Environment Management**: `SparkSQLEnv` manages shared Spark and Hive contexts with optimized configurations
- **Session Management**: `SparkSQLSessionManager` handles client session lifecycle and isolation
- **Query Execution**: `SparkExecuteStatementOperation` and `SparkSQLDriver` process SQL statements and manage results
- **Service Layer**: `SparkSQLCLIService` implements the Thrift service interface compatible with HiveServer2
- **Web UI Integration**: monitoring and statistics through Spark's web UI with a dedicated JDBC/ODBC server tab
- **Reflection Utilities**: `ReflectionUtils` provides a compatibility layer for Hive integration

## Capabilities

### Server Management

Core server lifecycle management, including startup, configuration, and shutdown operations.

```scala { .api }
object HiveThriftServer2 {
  def main(args: Array[String]): Unit

  @DeveloperApi
  def startWithContext(sqlContext: HiveContext): Unit

  var LOG: Log
  var uiTab: Option[ThriftServerTab]
  var listener: HiveThriftServer2Listener
}
```

[Server Management](./server-management.md)

### CLI Interface

Interactive command-line interface for executing SQL queries with Hive CLI compatibility.

```scala { .api }
object SparkSQLCLIDriver {
  def main(args: Array[String]): Unit
  def installSignalHandler(): Unit
}

private[hive] class SparkSQLCLIDriver extends CliDriver {
  override def processCmd(cmd: String): Int
}
```

[CLI Interface](./cli-interface.md)

### Environment Management

Centralized management of Spark and Hive execution contexts with optimized configurations.

```scala { .api }
object SparkSQLEnv {
  var hiveContext: HiveContext
  var sparkContext: SparkContext

  def init(): Unit
  def stop(): Unit
}
```

[Environment Management](./environment-management.md)

### Session Management

Client session lifecycle management with isolation and resource cleanup.

```scala { .api }
private[hive] class SparkSQLSessionManager(
    hiveServer: HiveServer2,
    hiveContext: HiveContext)
  extends SessionManager {
  override def openSession(...): SessionHandle
  override def closeSession(sessionHandle: SessionHandle): Unit
}
```

[Session Management](./session-management.md)

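The session manager's job can be pictured as a small registry: hand out a handle on `openSession`, drop the associated state on `closeSession`. A minimal sketch of that bookkeeping (class and method names are illustrative, not the real `SparkSQLSessionManager` internals):

```scala
import java.util.UUID
import java.util.concurrent.ConcurrentHashMap

// Illustrative session registry: open returns a fresh handle, close
// releases it. Real sessions also carry per-session SQL state.
class SessionRegistrySketch {
  private val sessions = new ConcurrentHashMap[String, String]() // handle -> user

  def openSession(user: String): String = {
    val handle = UUID.randomUUID().toString
    sessions.put(handle, user)
    handle
  }

  def closeSession(handle: String): Unit = sessions.remove(handle)

  def onlineSessionCount: Int = sessions.size
}
```

The concurrent map mirrors the fact that sessions are opened and closed from multiple Thrift worker threads.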
### Query Execution

SQL statement execution with result management and schema introspection.

```scala { .api }
private[hive] class SparkExecuteStatementOperation(
    parentSession: HiveSession,
    statement: String,
    confOverlay: JMap[String, String],
    runInBackground: Boolean)
  extends ExecuteStatementOperation {
  def close(): Unit
  def getNextRowSet(order: FetchOrientation, maxRowsL: Long): RowSet
  def getResultSetSchema: TableSchema
  def cancel(): Unit
}

private[hive] class SparkSQLDriver(
    context: HiveContext = SparkSQLEnv.hiveContext)
  extends Driver {
  def init(): Unit
  def run(command: String): CommandProcessorResponse
  def close(): Int
  def getResults(res: JList[_]): Boolean
  def getSchema: Schema
  def destroy(): Unit
}
```

[Query Execution](./query-execution.md)

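The `getNextRowSet` contract is easiest to see in miniature: `FETCH_NEXT` returns the next batch of up to `maxRows` rows, while `FETCH_FIRST` rewinds to the start of the buffered result. A hypothetical sketch of that cursor logic (the types here are stand-ins, not the Thrift `FetchOrientation`):

```scala
// Sketch of fetch-orientation semantics: FETCH_NEXT streams the next batch,
// FETCH_FIRST rewinds to the start of the buffered result.
object FetchSketch {
  sealed trait Orientation
  case object FetchNext extends Orientation
  case object FetchFirst extends Orientation

  class BufferedResult[A](rows: Seq[A]) {
    private var cursor = 0

    def next(order: Orientation, maxRows: Int): Seq[A] = {
      if (order == FetchFirst) cursor = 0          // rewind on FETCH_FIRST
      val batch = rows.slice(cursor, cursor + maxRows)
      cursor += batch.size
      batch
    }
  }
}
```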
### Monitoring and UI

Web-based monitoring interface with session tracking and query statistics.

```scala { .api }
private[thriftserver] class HiveThriftServer2Listener(
    server: HiveServer2,
    conf: SQLConf)
  extends SparkListener {
  def getOnlineSessionNum: Int
  def getTotalRunning: Int
  def getSessionList: Seq[SessionInfo]
  def getSession(sessionId: String): Option[SessionInfo]
  def getExecutionList: Seq[ExecutionInfo]
}

private[thriftserver] class ThriftServerTab(
    sparkContext: SparkContext)
  extends SparkUITab {
  def detach(): Unit
}
```

[Monitoring and UI](./monitoring-ui.md)

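The listener's statistics boil down to bookkeeping over `SessionInfo`-like records, where a missing finish timestamp means the session is still online. A hedged sketch of that accounting (the tracker class itself is hypothetical):

```scala
import scala.collection.mutable

// Illustrative session tracker: a finish timestamp of 0 marks a session
// as still online, so the online count is just a filter over the records.
class ListenerSketch {
  private case class Session(start: Long, var finish: Long = 0L)
  private val sessions = mutable.LinkedHashMap[String, Session]()

  def onSessionCreated(id: String, now: Long): Unit =
    sessions(id) = Session(now)

  def onSessionClosed(id: String, now: Long): Unit =
    sessions.get(id).foreach(_.finish = now)

  def getOnlineSessionNum: Int = sessions.values.count(_.finish == 0L)
}
```

A `LinkedHashMap` keeps insertion order, which is what a UI listing sessions chronologically would want.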
### Service Layer Integration

Core Thrift service implementation providing the HiveServer2 compatibility layer.

```scala { .api }
private[hive] class SparkSQLCLIService(
    hiveServer: HiveServer2,
    hiveContext: HiveContext)
  extends CLIService(hiveServer) {
  override def init(hiveConf: HiveConf): Unit
  override def start(): Unit
  override def stop(): Unit
}
```

### Reflection Utilities

Utility methods for accessing private fields and methods of Hive classes for compatibility.

```scala { .api }
private[hive] object ReflectionUtils {
  def setSuperField(obj: Object, fieldName: String, fieldValue: Object): Unit
  def setAncestorField(obj: AnyRef, level: Int, fieldName: String, fieldValue: AnyRef): Unit
  def getSuperField[T](obj: AnyRef, fieldName: String): T
  def getAncestorField[T](clazz: Object, level: Int, fieldName: String): T
  def invokeStatic(clazz: Class[_], methodName: String, args: (Class[_], AnyRef)*): AnyRef
  def invoke(clazz: Class[_], obj: AnyRef, methodName: String, args: (Class[_], AnyRef)*): AnyRef
}
```

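What `getSuperField` does can be sketched with plain `java.lang.reflect`: walk to the superclass, open up the private field, and read it. This illustrates the technique under simple assumptions; it is not the actual `ReflectionUtils` implementation:

```scala
// Minimal reflection sketch: read a private field declared on the
// superclass of the given object. Base/Derived are toy classes for the demo.
object ReflectionSketch {
  class Base {
    private val secret: String = "hive"
  }
  class Derived extends Base

  def getSuperField[T](obj: AnyRef, fieldName: String): T = {
    val field = obj.getClass.getSuperclass.getDeclaredField(fieldName)
    field.setAccessible(true)   // bypass the private modifier
    field.get(obj).asInstanceOf[T]
  }
}
```

This kind of access is fragile by nature (field names are not a stable API), which is why the real utilities are kept private to the package.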
## Configuration

### Spark Configuration Properties

- `spark.app.name` - Application name (default: "SparkSQL::{hostname}")
- `spark.serializer` - Serializer class (default: KryoSerializer)
- `spark.kryo.referenceTracking` - Kryo reference tracking (default: false)
- `spark.ui.enabled` - Enable the Spark web UI (default: true)

### Hive Server Configuration Properties

- `hive.server2.transport.mode` - Transport mode ("binary" or "http")
- `hive.server2.async.exec.threads` - Background execution thread pool size
- `hive.server2.logging.operation.enabled` - Enable operation logging

### SQL Configuration Properties

- `SQLConf.THRIFTSERVER_POOL.key` - Scheduler pool for query execution
- `SQLConf.THRIFTSERVER_UI_STATEMENT_LIMIT` - Maximum statements retained in the UI
- `SQLConf.THRIFTSERVER_UI_SESSION_LIMIT` - Maximum sessions retained in the UI

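In a pre-built Spark distribution, these properties are typically passed at launch time through the bundled scripts; the invocation below assumes the stock `sbin`/`bin` layout:

```shell
# Launch the thrift server, overriding Hive server properties with --hiveconf
./sbin/start-thriftserver.sh \
  --master local[*] \
  --hiveconf hive.server2.thrift.port=10000 \
  --hiveconf hive.server2.transport.mode=binary

# Connect with the bundled Beeline client
./bin/beeline -u jdbc:hive2://localhost:10000
```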
## Common Types

```scala { .api }
private[thriftserver] class SessionInfo(
    val sessionId: String,
    val startTimestamp: Long,
    val ip: String,
    val userName: String) {
  var finishTimestamp: Long
  var totalExecution: Int
  def totalTime: Long
}

private[thriftserver] class ExecutionInfo(
    val statement: String,
    val sessionId: String,
    val startTimestamp: Long,
    val userName: String) {
  var finishTimestamp: Long
  var executePlan: String
  var detail: String
  var state: ExecutionState.Value
  val jobId: ArrayBuffer[String]
  var groupId: String
  def totalTime: Long
}

private[thriftserver] object ExecutionState extends Enumeration {
  val STARTED, COMPILED, FAILED, FINISHED = Value
  type ExecutionState = Value
}
```
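The `totalTime` accessors on both types follow the same convention: a `finishTimestamp` of 0 means the session or statement is still running, so elapsed time is measured against the current clock instead. A small sketch of that logic (the standalone helper is illustrative):

```scala
// Sketch of the totalTime convention: finishTimestamp == 0 means
// "still running", so fall back to the current clock.
object TimingSketch {
  def totalTime(startTimestamp: Long,
                finishTimestamp: Long,
                now: Long = System.currentTimeMillis()): Long =
    if (finishTimestamp == 0L) now - startTimestamp
    else finishTimestamp - startTimestamp
}
```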