# Session and Configuration

Core session management and configuration for enabling Hive support in Spark SQL sessions.

## Core Imports

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.hive.HiveSessionStateBuilder
import org.apache.spark.sql.hive.HiveSessionCatalog
import org.apache.spark.sql.hive.HiveUtils
import org.apache.spark.sql.hive.HiveContext
```
## Capabilities

### Enable Hive Support

The primary method for enabling Hive integration in Spark SQL sessions.

```scala { .api }
/**
 * Enable Hive support in SparkSession, allowing access to Hive tables and the Hive metastore
 * @return Builder instance with Hive support enabled
 */
def enableHiveSupport(): Builder
```
**Usage Example:**

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("Hive Integration")
  .enableHiveSupport()
  .getOrCreate()

// Now you can access Hive tables
spark.sql("SHOW TABLES").show()
```
### Configuration Options

Key configuration options controlling Hive integration behavior.

```scala { .api }
/**
 * Set a configuration property on the session builder
 * @param key Configuration key
 * @param value Configuration value
 * @return Builder instance with the configuration set
 */
def config(key: String, value: String): Builder
```
**Key Configuration Properties:**

```scala
val spark = SparkSession.builder()
  .enableHiveSupport()
  // Use Hive as the catalog implementation (static conf; must be set before the session starts)
  .config("spark.sql.catalogImplementation", "hive")
  // Convert Parquet tables from the Hive metastore to Spark's native reader
  .config("spark.sql.hive.convertMetastoreParquet", "true")
  // Convert ORC tables from the Hive metastore to Spark's native reader
  .config("spark.sql.hive.convertMetastoreOrc", "true")
  // Enable conversion for partitioned table inserts
  .config("spark.sql.hive.convertInsertingPartitionedTable", "true")
  // Hive metastore warehouse directory
  .config("spark.sql.warehouse.dir", "/user/hive/warehouse")
  .getOrCreate()
```

Note that `config(...)` is a method on `SparkSession.Builder`, not on an existing session; static properties such as `spark.sql.catalogImplementation` take effect only when set before the session is created.
### Legacy HiveContext (Deprecated)

**Note:** This API has been deprecated since Spark 2.0.0. Use `SparkSession.builder().enableHiveSupport()` instead.

```scala { .api }
@deprecated("Use SparkSession.builder.enableHiveSupport instead", "2.0.0")
class HiveContext private[hive](sparkSession: SparkSession) extends SQLContext(sparkSession) {
  /**
   * Create a new HiveContext instance
   * @param sc SparkContext instance
   */
  def this(sc: SparkContext)

  /**
   * Invalidate and refresh cached metadata for the given table
   * @param tableName Name of the table to refresh
   */
  def refreshTable(tableName: String): Unit

  /**
   * Create a new HiveContext session with a separated SQLConf and temporary tables
   * @return New HiveContext instance
   */
  override def newSession(): HiveContext
}
```
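Migration from the deprecated API is mostly mechanical. A minimal sketch (the `SparkContext` named `sc` and the table name `my_table` are hypothetical placeholders):

```scala
import org.apache.spark.sql.SparkSession

// Before (deprecated since 2.0.0):
//   val hiveCtx = new org.apache.spark.sql.hive.HiveContext(sc)
//   hiveCtx.sql("SHOW TABLES").show()
//   hiveCtx.refreshTable("my_table")

// After: build a Hive-enabled SparkSession instead
val spark = SparkSession.builder()
  .appName("HiveContext Migration")
  .enableHiveSupport()
  .getOrCreate()

spark.sql("SHOW TABLES").show()         // replaces hiveCtx.sql(...)
spark.catalog.refreshTable("my_table")  // replaces hiveCtx.refreshTable(...)
val isolated = spark.newSession()       // replaces hiveCtx.newSession()
```

Like `HiveContext.newSession()`, `SparkSession.newSession()` shares the underlying `SparkContext` but isolates SQL configuration and temporary views.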
## Configuration Constants

```scala { .api }
object HiveUtils {
  /** Convert metastore Parquet tables to use Spark's native Parquet reader */
  val CONVERT_METASTORE_PARQUET: ConfigEntry[Boolean]

  /** Convert metastore ORC tables to use Spark's native ORC reader */
  val CONVERT_METASTORE_ORC: ConfigEntry[Boolean]

  /** Convert partitioned table inserts to use the data source API */
  val CONVERT_INSERTING_PARTITIONED_TABLE: ConfigEntry[Boolean]

  /** Convert CREATE TABLE AS SELECT to use the data source API */
  val CONVERT_METASTORE_CTAS: ConfigEntry[Boolean]

  /** Convert INSERT OVERWRITE DIRECTORY to use the data source API */
  val CONVERT_METASTORE_INSERT_DIR: ConfigEntry[Boolean]
}
```
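Each constant is a `ConfigEntry`, whose `key` field exposes the underlying property name, so the same flags can be read back from a running session. A brief sketch (assuming an active Hive-enabled session `spark`):

```scala
import org.apache.spark.sql.hive.HiveUtils

// ConfigEntry.key yields the string property name,
// e.g. "spark.sql.hive.convertMetastoreParquet"
val parquetKey = HiveUtils.CONVERT_METASTORE_PARQUET.key

// These conversion flags are session confs, so they can be
// inspected and changed at runtime (unlike static confs).
println(spark.conf.get(parquetKey))
spark.conf.set(parquetKey, "false")
```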
## Session State Components

### HiveSessionStateBuilder

Internal builder for creating Hive-enabled session state.

```scala { .api }
class HiveSessionStateBuilder(
  session: SparkSession,
  parentState: Option[SessionState]
) extends BaseSessionStateBuilder(session, parentState) {

  /** Create external catalog with Hive metastore support */
  override protected def externalCatalog: ExternalCatalog

  /** Create session catalog with Hive integration */
  override protected def catalog: SessionCatalog

  /** Create analyzer with Hive-specific rules */
  override protected def analyzer: Analyzer
}
```
### HiveSessionCatalog

Session catalog with Hive metastore integration.

```scala { .api }
private[sql] class HiveSessionCatalog(
  externalCatalogBuilder: () => ExternalCatalog,
  globalTempViewManagerBuilder: () => GlobalTempViewManager,
  val metastoreCatalog: HiveMetastoreCatalog,
  functionRegistry: FunctionRegistry,
  tableFunctionRegistry: TableFunctionRegistry,
  hadoopConf: Configuration,
  parser: ParserInterface,
  functionResourceLoader: FunctionResourceLoader,
  functionExpressionBuilder: FunctionExpressionBuilder
) extends SessionCatalog
```
## Error Handling

Common configuration-related exceptions:

- **AnalysisException**: Thrown when the Hive metastore cannot be accessed or tables cannot be found
- **IllegalStateException**: Thrown when Hive support is required but not enabled
- **UnsupportedOperationException**: Thrown for operations not supported with Hive integration

**Example Error Handling:**

```scala
import org.apache.spark.sql.AnalysisException

try {
  val df = spark.sql("SELECT * FROM non_existent_table")
} catch {
  case e: AnalysisException =>
    println(s"Table not found: ${e.getMessage}")
}
```
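Beyond try/catch, `scala.util.Try` offers a functional alternative that carries the failure as a value. The sketch below stands in for `spark.sql` with a hypothetical `lookupTable` helper, since the handling pattern is the same; a real Hive-enabled session would throw `AnalysisException` instead:

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical stand-in for spark.sql(...): throws when the
// requested table does not exist, like a missing Hive table.
def lookupTable(name: String): Try[String] = Try {
  if (name == "known_table") s"rows of $name"
  else throw new IllegalArgumentException(s"Table or view not found: $name")
}

// The failure travels as a value instead of unwinding the stack.
lookupTable("non_existent_table") match {
  case Success(rows) => println(rows)
  case Failure(e)    => println(s"Query failed: ${e.getMessage}")
}
```

This style composes well when several metastore lookups are chained, since `Try` supports `map`, `flatMap`, and `recover`.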