
tessl/maven-org-apache-spark--spark-hive-thriftserver-2-12

Hive-compatible Thrift server for Spark SQL that enables JDBC/ODBC connectivity

Workspace: tessl
Visibility: Public
Describes: mavenpkg:maven/org.apache.spark/spark-hive-thriftserver_2.12@3.0.x

To install, run

npx @tessl/cli install tessl/maven-org-apache-spark--spark-hive-thriftserver-2-12@3.0.0

# Spark Hive Thrift Server

The Spark Hive Thrift Server provides JDBC/ODBC connectivity to Spark SQL through the HiveServer2 protocol. It lets SQL clients connect to Spark over standard database connectivity protocols while remaining compatible with existing Hive-based tools and applications.
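Once the server is running (see the usage sections below), any JDBC client can talk to it. The sketch below assumes a server listening on localhost:10000 and the Hive JDBC driver (e.g. `org.apache.hive:hive-jdbc`) on the classpath; the `thriftUrl` and `runQuery` helpers are illustrative, not part of this package:

```scala
import java.sql.{Connection, DriverManager, ResultSet}

// Illustrative helper: build a HiveServer2 JDBC URL for the Thrift server.
def thriftUrl(host: String, port: Int, db: String = "default"): String =
  s"jdbc:hive2://$host:$port/$db"

// Illustrative helper: run a query against a live server (requires a
// running Thrift server and the Hive JDBC driver on the classpath).
def runQuery(sql: String): Unit = {
  val conn: Connection = DriverManager.getConnection(thriftUrl("localhost", 10000))
  try {
    val rs: ResultSet = conn.createStatement().executeQuery(sql)
    while (rs.next()) println(rs.getString(1))
  } finally conn.close()
}
```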

## Package Information

- **Package Name**: spark-hive-thriftserver_2.12
- **Package Type**: maven
- **Language**: Scala
- **Installation**: Include as a dependency in your Spark application or use the pre-built server

```xml
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-hive-thriftserver_2.12</artifactId>
  <version>3.0.1</version>
</dependency>
```

## Core Imports

```scala
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
import org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver
import org.apache.spark.sql.SQLContext
```

## Basic Usage

### Starting the Thrift Server Programmatically

```scala
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
import org.apache.spark.sql.SparkSession

// Create a Spark session with Hive support
val spark = SparkSession.builder()
  .appName("ThriftServerExample")
  .enableHiveSupport()
  .getOrCreate()

// Start the thrift server with the session's SQL context
val server = HiveThriftServer2.startWithContext(spark.sqlContext)
```

### Using the CLI Driver

```scala
import org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver

// Launch the interactive SQL CLI
SparkSQLCLIDriver.main(Array("--conf", "spark.sql.warehouse.dir=/tmp/warehouse"))
```

### Command Line Usage

```bash
# Start the thrift server (the script lives in sbin/, not bin/)
./sbin/start-thriftserver.sh --hiveconf hive.server2.thrift.port=10000

# Connect with beeline
./bin/beeline -u jdbc:hive2://localhost:10000
```
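The same settings can live in `conf/hive-site.xml` instead of being passed with `--hiveconf` on every start. A sketch using the standard HiveServer2 property names:

```xml
<configuration>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
  </property>
  <property>
    <name>hive.server2.transport.mode</name>
    <value>binary</value>
  </property>
</configuration>
```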


## Architecture

The Spark Hive Thrift Server consists of several key components:

- **Server Components**: Core thrift server implementation and CLI driver for different usage modes
- **Service Layer**: Thrift protocol implementation providing a HiveServer2-compatible interface
- **Session Management**: Multi-session support with isolated SQL contexts and configuration management
- **Operation Management**: SQL execution engine with support for DDL, DML, and metadata operations
- **Authentication**: Multiple authentication mechanisms, including Kerberos, LDAP, and custom providers
- **Web UI**: Built-in monitoring interface for tracking sessions, queries, and performance metrics

## Capabilities

### Server Management

Core server lifecycle management, including startup, configuration, and shutdown operations.

```scala { .api }
object HiveThriftServer2 {
  def main(args: Array[String]): Unit
  def startWithContext(sqlContext: SQLContext): HiveThriftServer2

  var uiTab: Option[ThriftServerTab]
  var listener: HiveThriftServer2Listener
  var eventManager: HiveThriftServer2EventManager
}

class HiveThriftServer2(sqlContext: SQLContext) extends HiveServer2 {
  def init(hiveConf: HiveConf): Unit
  def start(): Unit
  def stop(): Unit
}
```

[Server Management](./server-management.md)

### CLI Operations

Command-line interface for interactive SQL query execution and batch processing.

```scala { .api }
object SparkSQLCLIDriver {
  def main(args: Array[String]): Unit
  def installSignalHandler(): Unit
  def isRemoteMode(state: CliSessionState): Boolean
}

class SparkSQLCLIDriver extends CliDriver {
  def setHiveVariables(hiveVariables: java.util.Map[String, String]): Unit
  def printMasterAndAppId(): Unit
  def processCmd(cmd: String): Int
  def processLine(line: String, allowInterrupting: Boolean): Int
}
```

[CLI Operations](./cli-operations.md)

### Session Management

Multi-client session handling with isolated contexts and configuration management.

```scala { .api }
class SparkSQLSessionManager(
  hiveServer: HiveServer2,
  sqlContext: SQLContext
) extends SessionManager {
  def init(hiveConf: HiveConf): Unit
  def openSession(
    protocol: ThriftserverShimUtils.TProtocolVersion,
    username: String,
    passwd: String,
    ipAddress: String,
    sessionConf: java.util.Map[String, String],
    withImpersonation: Boolean,
    delegationToken: String
  ): SessionHandle
  def closeSession(sessionHandle: SessionHandle): Unit
  def setConfMap(conf: SQLContext, confMap: java.util.Map[String, String]): Unit
}
```

[Session Management](./session-management.md)

### SQL Operations

SQL statement execution and metadata operations for database introspection.

```scala { .api }
class SparkSQLOperationManager extends OperationManager {
  val handleToOperation: JMap[OperationHandle, Operation]
  val sessionToContexts: ConcurrentHashMap[SessionHandle, SQLContext]

  def newExecuteStatementOperation(
    parentSession: HiveSession,
    statement: String,
    confOverlay: JMap[String, String],
    async: Boolean
  ): ExecuteStatementOperation
}

class SparkExecuteStatementOperation(
  sqlContext: SQLContext,
  parentSession: HiveSession,
  statement: String,
  confOverlay: JMap[String, String],
  runInBackground: Boolean
) extends ExecuteStatementOperation
```

[SQL Operations](./sql-operations.md)

### Metadata Operations

Database metadata introspection covering catalogs, schemas, tables, columns, and functions.

```scala { .api }
class SparkGetCatalogsOperation(
  sqlContext: SQLContext,
  parentSession: HiveSession
) extends GetCatalogsOperation

class SparkGetSchemasOperation(
  sqlContext: SQLContext,
  parentSession: HiveSession,
  catalogName: String,
  schemaName: String
) extends GetSchemasOperation

class SparkGetTablesOperation(
  sqlContext: SQLContext,
  parentSession: HiveSession,
  catalogName: String,
  schemaName: String,
  tableName: String,
  tableTypes: JList[String]
) extends MetadataOperation
```

[Metadata Operations](./metadata-operations.md)

### Web UI Integration

Built-in web interface for monitoring active sessions, query execution, and server metrics.

```scala { .api }
class ThriftServerTab(
  store: HiveThriftServer2AppStatusStore,
  sparkUI: SparkUI
) extends SparkUITab {
  val name: String
  def detach(): Unit
}

object ThriftServerTab {
  def getSparkUI(sparkContext: SparkContext): SparkUI
}

class HiveThriftServer2Listener extends SparkListener
class HiveThriftServer2EventManager
class HiveThriftServer2AppStatusStore
```

[Web UI Integration](./web-ui.md)

## Environment Management

```scala { .api }
object SparkSQLEnv {
  var sqlContext: SQLContext
  var sparkContext: SparkContext

  def init(): Unit
  def stop(): Unit
}
```

## Service Layer

```scala { .api }
class SparkSQLCLIService(
  hiveServer: HiveServer2,
  sqlContext: SQLContext
) extends CLIService {
  def init(hiveConf: HiveConf): Unit
  def getInfo(sessionHandle: SessionHandle, getInfoType: GetInfoType): GetInfoValue
}
```

## Utility Components

```scala { .api }
// Base trait for Spark operations with session management
trait SparkOperation extends Operation with Logging {
  protected def sqlContext: SQLContext
  protected var statementId: String
  protected def cleanup(): Unit

  def withLocalProperties[T](f: => T): T
  def tableTypeString(tableType: CatalogTableType): String
}

// Reflection utilities for internal server operations
object ReflectionUtils {
  def setSuperField(obj: Object, fieldName: String, fieldValue: Object): Unit
  def setAncestorField(obj: AnyRef, level: Int, fieldName: String, fieldValue: AnyRef): Unit
  def getSuperField[T](obj: AnyRef, fieldName: String): T
  def getAncestorField[T](clazz: Object, level: Int, fieldName: String): T
  def invokeStatic(clazz: Class[_], methodName: String, args: (Class[_], AnyRef)*): AnyRef
  def invoke(clazz: Class[_], obj: AnyRef, methodName: String, args: (Class[_], AnyRef)*): AnyRef
}
```
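To make the reflection helpers concrete, here is a minimal standalone sketch of what `setSuperField`/`getSuperField` do internally. These are local re-implementations for illustration (along with the `Base`/`Child` toy classes), not the server's code:

```scala
// Toy class hierarchy: the field we want lives on the superclass.
class Base { private val secret: String = "original" }
class Child extends Base

// Sketch of setSuperField: set a field declared on the immediate
// superclass via java.lang.reflect.
def setSuperField(obj: AnyRef, fieldName: String, value: AnyRef): Unit = {
  val field = obj.getClass.getSuperclass.getDeclaredField(fieldName)
  field.setAccessible(true) // bypass `private`
  field.set(obj, value)
}

// Sketch of getSuperField: read a field declared on the immediate superclass.
def getSuperField[T](obj: AnyRef, fieldName: String): T = {
  val field = obj.getClass.getSuperclass.getDeclaredField(fieldName)
  field.setAccessible(true)
  field.get(obj).asInstanceOf[T]
}

val c = new Child
setSuperField(c, "secret", "patched")
```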


## Common Types

```scala { .api }
// From Hive libraries
import org.apache.hive.service.cli.SessionHandle
import org.apache.hive.service.cli.OperationHandle
import org.apache.hadoop.hive.conf.HiveConf
import org.apache.hive.service.cli.session.HiveSession

// Execution states
object ExecutionState extends Enumeration {
  val STARTED, COMPILED, CANCELED, FAILED, FINISHED, CLOSED = Value
  type ExecutionState = Value
}

// Authentication and transport
import org.apache.hive.service.cli.thrift.ThriftBinaryCLIService
import org.apache.hive.service.cli.thrift.ThriftHttpCLIService
```
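As a sketch of how the execution states might be consumed, using a local copy of the enumeration (the real one is internal to the server package) and an illustrative `isTerminal` helper: once a statement is canceled, failed, finished, or closed, it does not transition again.

```scala
// Local mirror of the server's ExecutionState enumeration (illustrative).
object ExecutionState extends Enumeration {
  val STARTED, COMPILED, CANCELED, FAILED, FINISHED, CLOSED = Value
  type ExecutionState = Value
}

// Illustrative helper: terminal states never transition again.
def isTerminal(state: ExecutionState.ExecutionState): Boolean =
  Set(ExecutionState.CANCELED, ExecutionState.FAILED,
      ExecutionState.FINISHED, ExecutionState.CLOSED).contains(state)
```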