
tessl/maven-org-apache-spark--spark-hive-thriftserver-2-12

Hive-compatible Thrift server for Spark SQL that enables JDBC/ODBC connectivity

Workspace: tessl
Visibility: Public
Describes: mavenpkg:maven/org.apache.spark/spark-hive-thriftserver_2.12@3.0.x

To install, run

npx @tessl/cli install tessl/maven-org-apache-spark--spark-hive-thriftserver-2-12@3.0.0

# Spark Hive Thrift Server

The Spark Hive Thrift Server provides JDBC/ODBC connectivity to Spark SQL through the HiveServer2 protocol. It lets SQL clients connect to Spark over standard database connectivity protocols while remaining compatible with existing Hive-based tools and applications.
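Once the server is running (see the usage sections below), any JDBC client can talk to it. The sketch below assumes a server listening on localhost:10000 and the Hive JDBC driver (e.g. `org.apache.hive:hive-jdbc`) on the classpath; the `thriftUrl` and `runQuery` helpers are illustrative, not part of this package:

```scala
import java.sql.{Connection, DriverManager, ResultSet}

// Illustrative helper: build a HiveServer2 JDBC URL for the Thrift server.
def thriftUrl(host: String, port: Int, db: String = "default"): String =
  s"jdbc:hive2://$host:$port/$db"

// Illustrative helper: run a query against a live server (requires a
// running Thrift server and the Hive JDBC driver on the classpath).
def runQuery(sql: String): Unit = {
  val conn: Connection = DriverManager.getConnection(thriftUrl("localhost", 10000))
  try {
    val rs: ResultSet = conn.createStatement().executeQuery(sql)
    while (rs.next()) println(rs.getString(1))
  } finally conn.close()
}
```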

## Package Information

- **Package Name**: spark-hive-thriftserver_2.12
- **Package Type**: maven
- **Language**: Scala
- **Installation**: Include as a dependency in your Spark application or use the pre-built server

```xml
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-hive-thriftserver_2.12</artifactId>
  <version>3.0.1</version>
</dependency>
```

## Core Imports

```scala
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
import org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver
import org.apache.spark.sql.SQLContext
```

## Basic Usage

### Starting the Thrift Server Programmatically

```scala
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
import org.apache.spark.sql.SparkSession

// Create a Spark session with Hive support
val spark = SparkSession.builder()
  .appName("ThriftServerExample")
  .enableHiveSupport()
  .getOrCreate()

// Start the thrift server with the session's SQL context
val server = HiveThriftServer2.startWithContext(spark.sqlContext)
```

### Using the CLI Driver

```scala
import org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver

// Launch the interactive SQL CLI
SparkSQLCLIDriver.main(Array("--conf", "spark.sql.warehouse.dir=/tmp/warehouse"))
```

### Command Line Usage

```bash
# Start the thrift server (the script lives in sbin/, not bin/)
./sbin/start-thriftserver.sh --hiveconf hive.server2.thrift.port=10000

# Connect with beeline
./bin/beeline -u jdbc:hive2://localhost:10000
```
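The same settings can live in `conf/hive-site.xml` instead of being passed with `--hiveconf` on every start. A sketch using the standard HiveServer2 property names:

```xml
<configuration>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
  </property>
  <property>
    <name>hive.server2.transport.mode</name>
    <value>binary</value>
  </property>
</configuration>
```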


## Architecture

The Spark Hive Thrift Server consists of several key components:

- **Server Components**: Core thrift server implementation and CLI driver for different usage modes
- **Service Layer**: Thrift protocol implementation providing a HiveServer2-compatible interface
- **Session Management**: Multi-session support with isolated SQL contexts and configuration management
- **Operation Management**: SQL execution engine with support for DDL, DML, and metadata operations
- **Authentication**: Multiple authentication mechanisms, including Kerberos, LDAP, and custom providers
- **Web UI**: Built-in monitoring interface for tracking sessions, queries, and performance metrics

## Capabilities

### Server Management

Core server lifecycle management, including startup, configuration, and shutdown operations.

```scala { .api }
object HiveThriftServer2 {
  def main(args: Array[String]): Unit
  def startWithContext(sqlContext: SQLContext): HiveThriftServer2

  var uiTab: Option[ThriftServerTab]
  var listener: HiveThriftServer2Listener
  var eventManager: HiveThriftServer2EventManager
}

class HiveThriftServer2(sqlContext: SQLContext) extends HiveServer2 {
  def init(hiveConf: HiveConf): Unit
  def start(): Unit
  def stop(): Unit
}
```

[Server Management](./server-management.md)

### CLI Operations

Command-line interface for interactive SQL query execution and batch processing.

```scala { .api }
object SparkSQLCLIDriver {
  def main(args: Array[String]): Unit
  def installSignalHandler(): Unit
  def isRemoteMode(state: CliSessionState): Boolean
}

class SparkSQLCLIDriver extends CliDriver {
  def setHiveVariables(hiveVariables: java.util.Map[String, String]): Unit
  def printMasterAndAppId(): Unit
  def processCmd(cmd: String): Int
  def processLine(line: String, allowInterrupting: Boolean): Int
}
```

[CLI Operations](./cli-operations.md)

### Session Management

Multi-client session handling with isolated contexts and configuration management.

```scala { .api }
class SparkSQLSessionManager(
  hiveServer: HiveServer2,
  sqlContext: SQLContext
) extends SessionManager {
  def init(hiveConf: HiveConf): Unit
  def openSession(
    protocol: ThriftserverShimUtils.TProtocolVersion,
    username: String,
    passwd: String,
    ipAddress: String,
    sessionConf: java.util.Map[String, String],
    withImpersonation: Boolean,
    delegationToken: String
  ): SessionHandle
  def closeSession(sessionHandle: SessionHandle): Unit
  def setConfMap(conf: SQLContext, confMap: java.util.Map[String, String]): Unit
}
```

[Session Management](./session-management.md)

### SQL Operations

SQL statement execution and metadata operations for database introspection.

```scala { .api }
class SparkSQLOperationManager extends OperationManager {
  val handleToOperation: JMap[OperationHandle, Operation]
  val sessionToContexts: ConcurrentHashMap[SessionHandle, SQLContext]

  def newExecuteStatementOperation(
    parentSession: HiveSession,
    statement: String,
    confOverlay: JMap[String, String],
    async: Boolean
  ): ExecuteStatementOperation
}

class SparkExecuteStatementOperation(
  sqlContext: SQLContext,
  parentSession: HiveSession,
  statement: String,
  confOverlay: JMap[String, String],
  runInBackground: Boolean
) extends ExecuteStatementOperation
```

[SQL Operations](./sql-operations.md)

### Metadata Operations

Database metadata introspection covering catalogs, schemas, tables, columns, and functions.

```scala { .api }
class SparkGetCatalogsOperation(
  sqlContext: SQLContext,
  parentSession: HiveSession
) extends GetCatalogsOperation

class SparkGetSchemasOperation(
  sqlContext: SQLContext,
  parentSession: HiveSession,
  catalogName: String,
  schemaName: String
) extends GetSchemasOperation

class SparkGetTablesOperation(
  sqlContext: SQLContext,
  parentSession: HiveSession,
  catalogName: String,
  schemaName: String,
  tableName: String,
  tableTypes: JList[String]
) extends MetadataOperation
```

[Metadata Operations](./metadata-operations.md)

### Web UI Integration

Built-in web interface for monitoring active sessions, query execution, and server metrics.

```scala { .api }
class ThriftServerTab(
  store: HiveThriftServer2AppStatusStore,
  sparkUI: SparkUI
) extends SparkUITab {
  val name: String
  def detach(): Unit
}

object ThriftServerTab {
  def getSparkUI(sparkContext: SparkContext): SparkUI
}

class HiveThriftServer2Listener extends SparkListener
class HiveThriftServer2EventManager
class HiveThriftServer2AppStatusStore
```

[Web UI Integration](./web-ui.md)

## Environment Management

```scala { .api }
object SparkSQLEnv {
  var sqlContext: SQLContext
  var sparkContext: SparkContext

  def init(): Unit
  def stop(): Unit
}
```

## Service Layer

```scala { .api }
class SparkSQLCLIService(
  hiveServer: HiveServer2,
  sqlContext: SQLContext
) extends CLIService {
  def init(hiveConf: HiveConf): Unit
  def getInfo(sessionHandle: SessionHandle, getInfoType: GetInfoType): GetInfoValue
}
```

## Utility Components

```scala { .api }
// Base trait for Spark operations with session management
trait SparkOperation extends Operation with Logging {
  protected def sqlContext: SQLContext
  protected var statementId: String
  protected def cleanup(): Unit

  def withLocalProperties[T](f: => T): T
  def tableTypeString(tableType: CatalogTableType): String
}

// Reflection utilities for internal server operations
object ReflectionUtils {
  def setSuperField(obj: Object, fieldName: String, fieldValue: Object): Unit
  def setAncestorField(obj: AnyRef, level: Int, fieldName: String, fieldValue: AnyRef): Unit
  def getSuperField[T](obj: AnyRef, fieldName: String): T
  def getAncestorField[T](clazz: Object, level: Int, fieldName: String): T
  def invokeStatic(clazz: Class[_], methodName: String, args: (Class[_], AnyRef)*): AnyRef
  def invoke(clazz: Class[_], obj: AnyRef, methodName: String, args: (Class[_], AnyRef)*): AnyRef
}
```
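To make the reflection helpers concrete, here is a minimal standalone sketch of what `setSuperField`/`getSuperField` do internally. These are local re-implementations for illustration (along with the `Base`/`Child` toy classes), not the server's code:

```scala
// Toy class hierarchy: the field we want lives on the superclass.
class Base { private val secret: String = "original" }
class Child extends Base

// Sketch of setSuperField: set a field declared on the immediate
// superclass via java.lang.reflect.
def setSuperField(obj: AnyRef, fieldName: String, value: AnyRef): Unit = {
  val field = obj.getClass.getSuperclass.getDeclaredField(fieldName)
  field.setAccessible(true) // bypass `private`
  field.set(obj, value)
}

// Sketch of getSuperField: read a field declared on the immediate superclass.
def getSuperField[T](obj: AnyRef, fieldName: String): T = {
  val field = obj.getClass.getSuperclass.getDeclaredField(fieldName)
  field.setAccessible(true)
  field.get(obj).asInstanceOf[T]
}

val c = new Child
setSuperField(c, "secret", "patched")
```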


## Common Types

```scala { .api }
// From Hive libraries
import org.apache.hive.service.cli.SessionHandle
import org.apache.hive.service.cli.OperationHandle
import org.apache.hadoop.hive.conf.HiveConf
import org.apache.hive.service.cli.session.HiveSession

// Execution states
object ExecutionState extends Enumeration {
  val STARTED, COMPILED, CANCELED, FAILED, FINISHED, CLOSED = Value
  type ExecutionState = Value
}

// Authentication and transport
import org.apache.hive.service.cli.thrift.ThriftBinaryCLIService
import org.apache.hive.service.cli.thrift.ThriftHttpCLIService
```
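As a sketch of how the execution states might be consumed, using a local copy of the enumeration (the real one is internal to the server package) and an illustrative `isTerminal` helper: once a statement is canceled, failed, finished, or closed, it does not transition again.

```scala
// Local mirror of the server's ExecutionState enumeration (illustrative).
object ExecutionState extends Enumeration {
  val STARTED, COMPILED, CANCELED, FAILED, FINISHED, CLOSED = Value
  type ExecutionState = Value
}

// Illustrative helper: terminal states never transition again.
def isTerminal(state: ExecutionState.ExecutionState): Boolean =
  Set(ExecutionState.CANCELED, ExecutionState.FAILED,
      ExecutionState.FINISHED, ExecutionState.CLOSED).contains(state)
```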