0
# Monitoring and UI
1
2
Web-based monitoring interface with session tracking and query statistics. Provides real-time visibility into thrift server operations through Spark's web UI.
3
4
## Capabilities
5
6
### Event Listener
7
8
Comprehensive event tracking for sessions and query execution with real-time statistics.
9
10
```scala { .api }
11
private[thriftserver] class HiveThriftServer2Listener(
12
server: HiveServer2,
13
conf: SQLConf
14
) extends SparkListener {
15
16
/**
17
* Get the current number of active sessions
18
* @return Number of currently connected sessions
19
*/
20
def getOnlineSessionNum: Int
21
22
/**
23
* Get the current number of running statements
24
* @return Number of statements currently executing
25
*/
26
def getTotalRunning: Int
27
28
/**
29
* Get information about all sessions (active and completed)
30
* @return Sequence of SessionInfo objects with session details
31
*/
32
def getSessionList: Seq[SessionInfo]
33
34
/**
35
* Get information about a specific session
36
* @param sessionId Unique session identifier
37
* @return Optional SessionInfo if session exists
38
*/
39
def getSession(sessionId: String): Option[SessionInfo]
40
41
/**
42
* Get information about all statement executions
43
* @return Sequence of ExecutionInfo objects with execution details
44
*/
45
def getExecutionList: Seq[ExecutionInfo]
46
47
/**
48
* Handle session creation events
49
* @param ip Client IP address
50
* @param sessionId Unique session identifier
51
* @param userName Connected user name (default: "UNKNOWN")
52
*/
53
def onSessionCreated(ip: String, sessionId: String, userName: String = "UNKNOWN"): Unit
54
55
/**
56
* Handle session closure events
57
* @param sessionId Session identifier being closed
58
*/
59
def onSessionClosed(sessionId: String): Unit
60
61
/**
62
* Handle statement execution start events
63
* @param id Unique statement execution identifier
64
* @param sessionId Parent session identifier
65
* @param statement SQL statement being executed
66
* @param groupId Spark job group identifier
67
* @param userName Executing user name (default: "UNKNOWN")
68
*/
69
def onStatementStart(
70
id: String,
71
sessionId: String,
72
statement: String,
73
groupId: String,
74
userName: String = "UNKNOWN"
75
): Unit
76
77
/**
78
* Handle statement parsing completion events
79
* @param id Statement execution identifier
80
* @param executionPlan Query execution plan text
81
*/
82
def onStatementParsed(id: String, executionPlan: String): Unit
83
84
/**
85
* Handle statement execution error events
86
* @param id Statement execution identifier
87
* @param errorMessage Error description
88
* @param errorTrace Full error stack trace
89
*/
90
def onStatementError(id: String, errorMessage: String, errorTrace: String): Unit
91
92
/**
93
* Handle successful statement completion events
94
* @param id Statement execution identifier
95
*/
96
def onStatementFinish(id: String): Unit
97
}
98
```
99
100
**Usage Example:**
101
102
```scala
103
// Access listener through server object
104
val listener = HiveThriftServer2.listener
105
106
// Get current server statistics
107
println(s"Active sessions: ${listener.getOnlineSessionNum}")
108
println(s"Running queries: ${listener.getTotalRunning}")
109
110
// List all sessions with details
111
listener.getSessionList.foreach { session =>
112
println(s"Session ${session.sessionId}:")
113
println(s" User: ${session.userName}")
114
println(s" IP: ${session.ip}")
115
println(s" Duration: ${session.totalTime}ms")
116
println(s" Queries executed: ${session.totalExecution}")
117
}
118
119
// List recent query executions
120
listener.getExecutionList.take(10).foreach { execution =>
121
println(s"Query: ${execution.statement}")
122
println(s" State: ${execution.state}")
123
println(s" Duration: ${execution.totalTime}ms")
124
println(s" User: ${execution.userName}")
125
}
126
```
127
128
### Web UI Tab
129
130
Integration with Spark's web UI providing dedicated monitoring interface.
131
132
```scala { .api }
133
private[thriftserver] class ThriftServerTab(
134
sparkContext: SparkContext
135
) extends SparkUITab {
136
137
/**
138
* Tab display name in Spark UI
139
*/
140
val name = "JDBC/ODBC Server"
141
142
/**
143
* Remove the tab from Spark UI
144
* Called during server shutdown
145
*/
146
def detach(): Unit
147
}
148
```
149
150
**Web UI Features:**
151
- **Server Overview**: Active sessions, running queries, server uptime
152
- **Session Details**: Per-session statistics, query history, resource usage
153
- **Query Execution**: Real-time query status, execution plans, performance metrics
154
- **Historical Data**: Completed sessions and queries with full details
155
156
### UI Pages
157
158
#### Server Overview Page
159
160
Main monitoring page showing server-wide statistics and active operations.
161
162
```scala { .api }
163
private[thriftserver] class ThriftServerPage(parent: ThriftServerTab) extends WebUIPage {
164
/**
165
* Render the main server overview page
166
* @param request HTTP request with optional filtering parameters
167
* @return HTML content for the server overview
168
*/
169
def render(request: HttpServletRequest): Seq[Node]
170
}
171
```
172
173
**Page Content:**
174
- Server status and uptime
175
- Connection statistics (total, active, failed)
176
- Query execution metrics (total, running, completed, failed)
177
- Resource utilization summary
178
- Recent activity timeline
179
180
#### Session Detail Page
181
182
Detailed view of individual session information and query history.
183
184
```scala { .api }
185
private[thriftserver] class ThriftServerSessionPage(parent: ThriftServerTab) extends WebUIPage {
186
/**
187
* Render detailed session information page
188
* @param request HTTP request with session ID parameter
189
* @return HTML content for session details
190
*/
191
def render(request: HttpServletRequest): Seq[Node]
192
}
193
```
194
195
**Page Content:**
196
- Session metadata (ID, user, IP, connection time)
197
- Configuration settings applied to session
198
- Complete query history with execution times
199
- Resource usage and performance metrics
200
- Active and completed operations
201
202
### Monitoring Data Models
203
204
#### Session Information
205
206
```scala { .api }
207
private[thriftserver] class SessionInfo(
208
val sessionId: String,
209
val startTimestamp: Long,
210
val ip: String,
211
val userName: String
212
) {
213
var finishTimestamp: Long = 0L
214
var totalExecution: Int = 0
215
216
/**
217
* Calculate total session duration
218
* @return Duration in milliseconds (ongoing sessions use current time)
219
*/
220
def totalTime: Long = {
221
if (finishTimestamp == 0L) {
222
System.currentTimeMillis - startTimestamp
223
} else {
224
finishTimestamp - startTimestamp
225
}
226
}
227
}
228
```
229
230
#### Execution Information
231
232
```scala { .api }
233
private[thriftserver] class ExecutionInfo(
234
val statement: String,
235
val sessionId: String,
236
val startTimestamp: Long,
237
val userName: String
238
) {
239
var finishTimestamp: Long = 0L
240
var executePlan: String = ""
241
var detail: String = ""
242
var state: ExecutionState.Value = ExecutionState.STARTED
243
val jobId: ArrayBuffer[String] = ArrayBuffer[String]()
244
var groupId: String = ""
245
246
/**
247
* Calculate execution duration
248
* @return Duration in milliseconds (running queries use current time)
249
*/
250
def totalTime: Long = {
251
if (finishTimestamp == 0L) {
252
System.currentTimeMillis - startTimestamp
253
} else {
254
finishTimestamp - startTimestamp
255
}
256
}
257
}
258
```
259
260
#### Execution States
261
262
```scala { .api }
263
private[thriftserver] object ExecutionState extends Enumeration {
264
val STARTED, COMPILED, FAILED, FINISHED = Value
265
type ExecutionState = Value
266
}
267
```
268
269
### Data Retention Policies
270
271
The monitoring system implements configurable data retention:
272
273
```scala
274
// Configurable limits for UI data retention
275
private val retainedStatements = conf.getConf(SQLConf.THRIFTSERVER_UI_STATEMENT_LIMIT)
276
private val retainedSessions = conf.getConf(SQLConf.THRIFTSERVER_UI_SESSION_LIMIT)
277
278
// Automatic cleanup of old data
279
private def trimExecutionIfNecessary() = {
280
if (executionList.size > retainedStatements) {
281
val toRemove = math.max(retainedStatements / 10, 1)
282
executionList.filter(_._2.finishTimestamp != 0).take(toRemove).foreach { s =>
283
executionList.remove(s._1)
284
}
285
}
286
}
287
288
private def trimSessionIfNecessary() = {
289
if (sessionList.size > retainedSessions) {
290
val toRemove = math.max(retainedSessions / 10, 1)
291
sessionList.filter(_._2.finishTimestamp != 0).take(toRemove).foreach { s =>
292
sessionList.remove(s._1)
293
}
294
}
295
}
296
```
297
298
### Integration with Spark Metrics
299
300
The monitoring system integrates with Spark's event system:
301
302
```scala
303
override def onJobStart(jobStart: SparkListenerJobStart): Unit = synchronized {
304
for {
305
props <- Option(jobStart.properties)
306
groupId <- Option(props.getProperty(SparkContext.SPARK_JOB_GROUP_ID))
307
(_, info) <- executionList if info.groupId == groupId
308
} {
309
info.jobId += jobStart.jobId.toString
310
info.groupId = groupId
311
}
312
}
313
314
override def onApplicationEnd(applicationEnd: SparkListenerApplicationEnd): Unit = {
315
server.stop()
316
}
317
```
318
319
### Performance Monitoring
320
321
The UI provides detailed performance insights:
322
323
**Query Performance:**
324
- Compilation time vs execution time
325
- Row processing rates
326
- Resource utilization per query
327
- Bottleneck identification
328
329
**Session Performance:**
330
- Average query duration per session
331
- Query success/failure rates
332
- Resource usage patterns
333
- Concurrent query handling
334
335
**Server Performance:**
336
- Overall throughput metrics
337
- Connection handling efficiency
338
- Memory and CPU utilization
339
- I/O performance characteristics
340
341
### REST API Access
342
343
While not explicitly exposed, the monitoring data can be accessed programmatically:
344
345
```scala
346
// Access monitoring data through the listener
347
val listener = HiveThriftServer2.listener
348
349
// Get real-time metrics
350
val metrics = Map(
351
"activeSessions" -> listener.getOnlineSessionNum,
352
"runningQueries" -> listener.getTotalRunning,
353
"totalSessions" -> listener.getSessionList.size,
354
"totalExecutions" -> listener.getExecutionList.size
355
)
356
357
// Convert to JSON for API responses
358
val jsonMetrics = org.json4s.jackson.Serialization.write(metrics)
359
```
360
361
### Configuration Options
362
363
Monitoring behavior can be customized through configuration:
364
365
```scala
366
// UI retention limits
367
spark.sql.thriftServer.ui.retainedSessions = 200
368
spark.sql.thriftServer.ui.retainedStatements = 1000
369
370
// Enable/disable UI components
371
spark.ui.enabled = true
372
spark.ui.port = 4040
373
374
// Monitoring detail level
375
spark.eventLog.enabled = true
376
spark.eventLog.dir = /var/log/spark-events
377
```
378
379
### Security Considerations
380
381
The monitoring UI respects Spark's security settings:
382
383
- **Authentication**: Inherits Spark UI authentication settings
384
- **Authorization**: Access control based on Spark ACLs
385
- **Data Privacy**: Sensitive query content can be masked
386
- **Network Security**: HTTPS support when configured in Spark UI