tessl/maven-org-apache-spark--spark-hive-thriftserver_2-11

Spark Project Hive Thrift Server - A Thrift server implementation that provides JDBC/ODBC access to Spark SQL

Workspace: tessl
Visibility: Public
Describes: pkg:maven/org.apache.spark/spark-hive-thriftserver_2.11@2.4.x (maven)

To install, run

npx @tessl/cli install tessl/maven-org-apache-spark--spark-hive-thriftserver_2-11@2.4.0


Spark Hive Thrift Server

Apache Spark Hive Thrift Server provides JDBC/ODBC access to Spark SQL through the HiveServer2 protocol, enabling remote clients to execute SQL queries against Spark clusters using standard database connectivity tools and BI applications.
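
Because the server implements the HiveServer2 Thrift protocol, any HiveServer2-compatible client can connect. The snippet below is a minimal JDBC client sketch, assuming the Hive JDBC driver (org.apache.hive:hive-jdbc) is on the classpath and the server is reachable at localhost:10000 with no authentication; adjust the URL and credentials for your deployment.

import java.sql.DriverManager

// Register the HiveServer2 JDBC driver and connect to an (illustrative) local endpoint
Class.forName("org.apache.hive.jdbc.HiveDriver")
val connection = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "spark", "")

val statement = connection.createStatement()
val resultSet = statement.executeQuery("SHOW TABLES")
while (resultSet.next()) {
  println(resultSet.getString(1))
}

resultSet.close()
statement.close()
connection.close()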

Package Information

  • Package Name: spark-hive-thriftserver_2.11
  • Package Type: maven
  • Language: Scala
  • Maven Coordinates: org.apache.spark:spark-hive-thriftserver_2.11:2.4.8
  • Installation: Add as a Maven/sbt dependency (see the sketch below) or use the copy bundled with the Spark distribution
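
For build-tool installation, a minimal sbt declaration might look like the following (a sketch assuming sbt and Scala 2.11; for Maven, use an equivalent <dependency> element with the coordinates above):

scalaVersion := "2.11.12"

// build.sbt sketch: Spark artifacts are typically "provided" because the Spark distribution supplies them at runtime
libraryDependencies += "org.apache.spark" %% "spark-hive-thriftserver" % "2.4.8" % "provided"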

Core Imports

import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
import org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver
import org.apache.spark.sql.SQLContext

Basic Usage

Starting the Thrift Server Programmatically

import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
import org.apache.spark.sql.SQLContext
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf

// Create Spark SQL context
val conf = new SparkConf().setAppName("ThriftServer")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)

// Start the thrift server
HiveThriftServer2.startWithContext(sqlContext)
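
In Spark 2.x the same thing is more commonly done through a Hive-enabled SparkSession, which avoids the deprecated SQLContext constructor. A minimal sketch, assuming Hive support is available on the classpath:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

// Build a Hive-enabled session so the served catalog includes the Hive metastore
val spark = SparkSession.builder()
  .appName("ThriftServer")
  .enableHiveSupport()
  .getOrCreate()

// Expose the session's SQLContext over JDBC/ODBC
HiveThriftServer2.startWithContext(spark.sqlContext)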

Starting from Command Line

# Start Thrift Server
$SPARK_HOME/sbin/start-thriftserver.sh --master spark://master:7077

# Start CLI
$SPARK_HOME/bin/spark-sql

Architecture

The Spark Hive Thrift Server consists of several key components:

  • HiveThriftServer2: Main server entry point and lifecycle management
  • Service Layer: CLI service, session management, and operation handling
  • Transport Layer: HTTP and binary Thrift protocol support
  • Web UI: Monitoring interface for sessions and queries
  • Authentication: Kerberos and delegation token support

Capabilities

Server Management

Main entry points for starting and managing the Thrift Server with lifecycle control and configuration.

object HiveThriftServer2 {
  def startWithContext(sqlContext: SQLContext): Unit
  def main(args: Array[String]): Unit
  var uiTab: Option[ThriftServerTab]
  var listener: HiveThriftServer2Listener
}

See docs/server-management.md for the full reference.

CLI Interface

Command-line interface for interactive SQL execution with Spark SQL integration.

object SparkSQLCLIDriver {
  def main(args: Array[String]): Unit
  def installSignalHandler(): Unit
}

See docs/cli-interface.md for the full reference.

Session Management

Session lifecycle management with SQL context handling and client connection management.

class SparkSQLSessionManager(hiveServer: HiveServer2, sqlContext: SQLContext) extends SessionManager {
  def openSession(protocol: TProtocolVersion, username: String, passwd: String, 
                 ipAddress: String, sessionConf: java.util.Map[String, String], 
                 withImpersonation: Boolean, delegationToken: String): SessionHandle
  def closeSession(sessionHandle: SessionHandle): Unit
}

See docs/session-management.md for the full reference.

Query Operations

SQL query execution operations with result handling and asynchronous processing support.

class SparkSQLOperationManager extends OperationManager {
  val sessionToActivePool: ConcurrentHashMap[SessionHandle, String]
  val sessionToContexts: ConcurrentHashMap[SessionHandle, SQLContext]  
  def newExecuteStatementOperation(parentSession: HiveSession, statement: String,
                                  confOverlay: JMap[String, String], async: Boolean): ExecuteStatementOperation
}

See docs/query-operations.md for the full reference.

Web UI Monitoring

Web-based monitoring interface for active sessions, query execution, and server performance metrics.

class ThriftServerTab(sparkContext: SparkContext) extends SparkUITab {
  val name: String = "JDBC/ODBC Server"
  def detach(): Unit
}

See docs/web-ui-monitoring.md for the full reference.

Environment Management

Spark SQL environment initialization and cleanup with configuration management.

object SparkSQLEnv {
  var sqlContext: SQLContext
  var sparkContext: SparkContext
  def init(): Unit
  def stop(): Unit
}
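
A rough sketch of the init/stop lifecycle. Note that in the Spark source this object is package-private (private[hive]), so it is exercised internally by SparkSQLCLIDriver and HiveThriftServer2 rather than called from application code:

// Only accessible to code in the org.apache.spark.sql.hive package tree
SparkSQLEnv.init()                          // creates or reuses the SparkContext and SQLContext
val ctx = SparkSQLEnv.sqlContext            // shared context used by the CLI driver and the server
ctx.sql("SHOW DATABASES").collect().foreach(println)
SparkSQLEnv.stop()                          // tears the environment down on shutdown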

See docs/environment-management.md for the full reference.

Types

Core Types

// Session information tracking
class SessionInfo(sessionId: String, startTimestamp: Long, ip: String, userName: String) {
  var finishTimestamp: Long
  var totalExecution: Int
  def totalTime: Long
}

// Query execution tracking  
class ExecutionInfo(statement: String, sessionId: String, startTimestamp: Long, userName: String) {
  var finishTimestamp: Long
  var executePlan: String
  var detail: String
  var state: ExecutionState.Value
  val jobId: ArrayBuffer[String]
  var groupId: String
  def totalTime: Long
}

// Execution states
object ExecutionState extends Enumeration {
  val STARTED, COMPILED, FAILED, FINISHED = Value
  type ExecutionState = Value
}

// Server listener for events
class HiveThriftServer2Listener(server: HiveServer2, conf: SQLConf) extends SparkListener {
  def getOnlineSessionNum: Int
  def getTotalRunning: Int
  def getSessionList: Seq[SessionInfo]
  def getSession(sessionId: String): Option[SessionInfo]
  def getExecutionList: Seq[ExecutionInfo]
}
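
A monitoring sketch built only from the members listed above; in the Spark source these types are private[thriftserver], so this pattern applies to code compiled into that package (the Web UI uses it this way), not to arbitrary client code:

// Aggregate counters exposed by the running server's listener
val listener = HiveThriftServer2.listener
println(s"Online sessions: ${listener.getOnlineSessionNum}, running statements: ${listener.getTotalRunning}")

// Sessions still open (no finish timestamp yet)
listener.getSessionList.filter(_.finishTimestamp == 0).foreach { s =>
  println(s"session ${s.sessionId} from ${s.ip} by ${s.userName}")
}

// Executions that ended in a failure state
listener.getExecutionList.filter(_.state == ExecutionState.FAILED).foreach { e =>
  println(s"failed: ${e.statement} (${e.detail})")
}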

Hive Integration Types

// From Hive Service API
import org.apache.hive.service.cli.SessionHandle
import org.apache.hive.service.cli.OperationHandle  
import org.apache.hive.service.cli.thrift.TProtocolVersion
import org.apache.hive.service.server.HiveServer2
import org.apache.hadoop.hive.conf.HiveConf

Configuration

Transport Modes

  • Binary: Default TCP transport using Thrift binary protocol
  • HTTP: HTTP-based transport for firewall-friendly connections

Authentication

  • Kerberos: Enterprise authentication with keytab support
  • SPNEGO: HTTP authentication for web-based access
  • Delegation Tokens: Secure token-based authentication

Key Configuration Properties

  • hive.server2.transport.mode: "binary" or "http"
  • hive.server2.thrift.port: Server port (default: 10000)
  • hive.server2.thrift.bind.host: Bind address
  • spark.sql.hive.thriftServer.singleSession: Share single session
  • spark.sql.thriftServer.incrementalCollect: Incremental result collection
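
One way to combine these properties for a programmatic start is sketched below. The values are illustrative, and setting the hive.server2.* keys as system properties is only one option: in most deployments they come from hive-site.xml or --hiveconf flags on start-thriftserver.sh, while the spark.sql.* properties go through the normal SparkConf/--conf mechanism.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

// Hive-side server settings (illustrative; normally supplied via hive-site.xml or --hiveconf)
System.setProperty("hive.server2.transport.mode", "binary")
System.setProperty("hive.server2.thrift.port", "10001")
System.setProperty("hive.server2.thrift.bind.host", "0.0.0.0")

// Spark-side settings use the regular configuration mechanism
val spark = SparkSession.builder()
  .appName("ThriftServer")
  .config("spark.sql.hive.thriftServer.singleSession", "true")
  .config("spark.sql.thriftServer.incrementalCollect", "true")
  .enableHiveSupport()
  .getOrCreate()

HiveThriftServer2.startWithContext(spark.sqlContext)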