
tessl/maven-org-apache-spark--spark-hive-thriftserver_2-11

Spark Project Hive Thrift Server - A Thrift server implementation that provides JDBC/ODBC access to Spark SQL

Describes: mavenpkg:maven/org.apache.spark/spark-hive-thriftserver_2.11@2.4.x

To install, run

npx @tessl/cli install tessl/maven-org-apache-spark--spark-hive-thriftserver_2-11@2.4.0

# Spark Hive Thrift Server

Apache Spark Hive Thrift Server provides JDBC/ODBC access to Spark SQL through the HiveServer2 protocol, enabling remote clients to execute SQL queries against Spark clusters using standard database connectivity tools and BI applications.

## Package Information

- **Package Name**: spark-hive-thriftserver_2.11
- **Package Type**: maven
- **Language**: Scala
- **Artifact ID**: org.apache.spark:spark-hive-thriftserver_2.11:2.4.8
- **Installation**: Include as a Maven dependency, or use the copy bundled with the Spark distribution

## Core Imports

```scala
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
import org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver
import org.apache.spark.sql.SQLContext
```

## Basic Usage

### Starting the Thrift Server Programmatically

```scala
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
import org.apache.spark.sql.SQLContext
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf

// Create a Spark SQL context
val conf = new SparkConf().setAppName("ThriftServer")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)

// Start the Thrift Server against the existing context
HiveThriftServer2.startWithContext(sqlContext)
```
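For applications built around the Spark 2.x `SparkSession` API, the same entry point can be reached through the session's underlying `SQLContext`. The sketch below is illustrative (the app name and the `numbers` view are made-up examples, not part of this package); it registers a temp view before starting the server so that connected JDBC/ODBC clients can query it:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

// Build a session with Hive support so the server can serve Hive metadata
val spark = SparkSession.builder()
  .appName("ThriftServerWithData")
  .enableHiveSupport()
  .getOrCreate()

// Views registered on this context become visible to connected clients
spark.range(100).createOrReplaceTempView("numbers")

// startWithContext takes an SQLContext; SparkSession exposes one
HiveThriftServer2.startWithContext(spark.sqlContext)
```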

### Starting from Command Line

```bash
# Start the Thrift Server
$SPARK_HOME/sbin/start-thriftserver.sh --master spark://master:7077

# Start the interactive SQL CLI
$SPARK_HOME/bin/spark-sql
```
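Once the server is listening, any JDBC client can connect. A minimal Scala sketch using the plain `java.sql` API (it assumes the `hive-jdbc` driver is on the classpath, a server on the default port 10000 of `localhost`, and placeholder credentials):

```scala
import java.sql.DriverManager

// Register the Hive JDBC driver (provided by the hive-jdbc artifact)
Class.forName("org.apache.hive.jdbc.HiveDriver")

val conn = DriverManager.getConnection(
  "jdbc:hive2://localhost:10000/default", "user", "")
try {
  // Any SQL the server supports can be issued over this connection
  val rs = conn.createStatement().executeQuery("SHOW TABLES")
  while (rs.next()) println(rs.getString(1))
} finally {
  conn.close()
}
```

Spark distributions also ship `$SPARK_HOME/bin/beeline`, a ready-made JDBC client for the same `jdbc:hive2://` URL.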

## Architecture

The Spark Hive Thrift Server consists of several key components:

- **HiveThriftServer2**: Main server entry point and lifecycle management
- **Service Layer**: CLI service, session management, and operation handling
- **Transport Layer**: HTTP and binary Thrift protocol support
- **Web UI**: Monitoring interface for sessions and queries
- **Authentication**: Kerberos and delegation token support

## Capabilities

### Server Management

Main entry points for starting and managing the Thrift Server with lifecycle control and configuration.

```scala { .api }
object HiveThriftServer2 {
  def startWithContext(sqlContext: SQLContext): Unit
  def main(args: Array[String]): Unit
  var uiTab: Option[ThriftServerTab]
  var listener: HiveThriftServer2Listener
}
```

[Server Management](./server-management.md)

### CLI Interface

Command-line interface for interactive SQL execution with Spark SQL integration.

```scala { .api }
object SparkSQLCLIDriver {
  def main(args: Array[String]): Unit
  def installSignalHandler(): Unit
}
```

[CLI Interface](./cli-interface.md)

### Session Management

Session lifecycle management with SQL context handling and client connection management.

```scala { .api }
class SparkSQLSessionManager(hiveServer: HiveServer2, sqlContext: SQLContext) extends SessionManager {
  def openSession(protocol: TProtocolVersion, username: String, passwd: String,
      ipAddress: String, sessionConf: java.util.Map[String, String],
      withImpersonation: Boolean, delegationToken: String): SessionHandle
  def closeSession(sessionHandle: SessionHandle): Unit
}
```

[Session Management](./session-management.md)

### Query Operations

SQL query execution operations with result handling and asynchronous processing support.

```scala { .api }
class SparkSQLOperationManager extends OperationManager {
  val sessionToActivePool: ConcurrentHashMap[SessionHandle, String]
  val sessionToContexts: ConcurrentHashMap[SessionHandle, SQLContext]
  def newExecuteStatementOperation(parentSession: HiveSession, statement: String,
      confOverlay: JMap[String, String], async: Boolean): ExecuteStatementOperation
}
```

[Query Operations](./query-operations.md)

### Web UI Monitoring

Web-based monitoring interface for active sessions, query execution, and server performance metrics.

```scala { .api }
class ThriftServerTab(sparkContext: SparkContext) extends SparkUITab {
  val name: String = "JDBC/ODBC Server"
  def detach(): Unit
}
```

[Web UI Monitoring](./web-ui-monitoring.md)

### Environment Management

Spark SQL environment initialization and cleanup with configuration management.

```scala { .api }
object SparkSQLEnv {
  var sqlContext: SQLContext
  var sparkContext: SparkContext
  def init(): Unit
  def stop(): Unit
}
```
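As a rough sketch of the lifecycle this object manages (in the actual Spark codebase these members are internal to the CLI driver, so treat this as illustrative rather than a supported public API):

```scala
import org.apache.spark.sql.hive.thriftserver.SparkSQLEnv

SparkSQLEnv.init()                                  // creates SparkContext and SQLContext
val rows = SparkSQLEnv.sqlContext.sql("SELECT 1").collect()
SparkSQLEnv.stop()                                  // tears both contexts down
```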

[Environment Management](./environment-management.md)

## Types

### Core Types

```scala { .api }
// Session information tracking
class SessionInfo(sessionId: String, startTimestamp: Long, ip: String, userName: String) {
  var finishTimestamp: Long
  var totalExecution: Int
  def totalTime: Long
}

// Query execution tracking
class ExecutionInfo(statement: String, sessionId: String, startTimestamp: Long, userName: String) {
  var finishTimestamp: Long
  var executePlan: String
  var detail: String
  var state: ExecutionState.Value
  val jobId: ArrayBuffer[String]
  var groupId: String
  def totalTime: Long
}

// Execution states
object ExecutionState extends Enumeration {
  val STARTED, COMPILED, FAILED, FINISHED = Value
  type ExecutionState = Value
}

// Server listener for events
class HiveThriftServer2Listener(server: HiveServer2, conf: SQLConf) extends SparkListener {
  def getOnlineSessionNum: Int
  def getTotalRunning: Int
  def getSessionList: Seq[SessionInfo]
  def getSession(sessionId: String): Option[SessionInfo]
  def getExecutionList: Seq[ExecutionInfo]
}
```
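Once the server is running in the same JVM, the listener can be polled for simple health metrics. An illustrative sketch mirroring what the Web UI does internally (in the actual Spark codebase these types are package-private, so this is not a stable external API):

```scala
import org.apache.spark.sql.hive.thriftserver.{ExecutionState, HiveThriftServer2}

val listener = HiveThriftServer2.listener
println(s"Open sessions:      ${listener.getOnlineSessionNum}")
println(s"Running statements: ${listener.getTotalRunning}")

// Count statements that ended in the FAILED state
val failed = listener.getExecutionList.count(_.state == ExecutionState.FAILED)
println(s"Failed statements:  $failed")
```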

### Hive Integration Types

```scala { .api }
// From the Hive Service API
import org.apache.hive.service.cli.SessionHandle
import org.apache.hive.service.cli.OperationHandle
import org.apache.hive.service.cli.thrift.TProtocolVersion
import org.apache.hive.service.server.HiveServer2
import org.apache.hadoop.hive.conf.HiveConf
```

## Configuration

### Transport Modes

- **Binary**: Default TCP transport using the Thrift binary protocol
- **HTTP**: HTTP-based transport for firewall-friendly connections

### Authentication

- **Kerberos**: Enterprise authentication with keytab support
- **SPNEGO**: HTTP authentication for web-based access
- **Delegation Tokens**: Secure token-based authentication

### Key Configuration Properties

- `hive.server2.transport.mode`: "binary" or "http"
- `hive.server2.thrift.port`: Server port (default: 10000)
- `hive.server2.thrift.bind.host`: Bind address
- `spark.sql.hive.thriftServer.singleSession`: Share a single session (and its temp views) across all connections
- `spark.sql.thriftServer.incrementalCollect`: Collect results incrementally to reduce driver memory pressure
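These properties can be supplied at startup through `--hiveconf` (Hive settings) and `--conf` (Spark settings). A sketch of a combined invocation (the master URL, host, and port values are placeholders to adapt to your deployment):

```bash
"$SPARK_HOME"/sbin/start-thriftserver.sh \
  --master spark://master:7077 \
  --hiveconf hive.server2.transport.mode=binary \
  --hiveconf hive.server2.thrift.port=10001 \
  --hiveconf hive.server2.thrift.bind.host=0.0.0.0 \
  --conf spark.sql.hive.thriftServer.singleSession=true \
  --conf spark.sql.thriftServer.incrementalCollect=true
```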