# Logging Infrastructure

SLF4J-based logging trait providing consistent logging methods and formatting across all Spark components, with support for lazy evaluation and configurable log levels.

## Capabilities

### Logging Trait

Core logging functionality that can be mixed into any class to provide standardized logging methods with SLF4J integration.
```scala { .api }
/**
 * Utility trait for classes that want to log data using SLF4J.
 * Provides standardized logging methods with lazy message evaluation.
 */
trait Logging {

  /**
   * Logs an info-level message with lazy evaluation.
   * @param msg Message to log (evaluated only if info logging is enabled)
   */
  protected def logInfo(msg: => String): Unit

  /**
   * Logs a debug-level message with lazy evaluation.
   * @param msg Message to log (evaluated only if debug logging is enabled)
   */
  protected def logDebug(msg: => String): Unit

  /**
   * Logs a trace-level message with lazy evaluation.
   * @param msg Message to log (evaluated only if trace logging is enabled)
   */
  protected def logTrace(msg: => String): Unit

  /**
   * Logs a warning-level message with lazy evaluation.
   * @param msg Message to log (evaluated only if warn logging is enabled)
   */
  protected def logWarning(msg: => String): Unit

  /**
   * Logs an error-level message with lazy evaluation.
   * @param msg Message to log (evaluated only if error logging is enabled)
   */
  protected def logError(msg: => String): Unit

  /**
   * Logs a warning-level message together with an exception.
   * @param msg Message to log
   * @param throwable Exception to log with its stack trace
   */
  protected def logWarning(msg: => String, throwable: Throwable): Unit

  /**
   * Logs an error-level message together with an exception.
   * @param msg Message to log
   * @param throwable Exception to log with its stack trace
   */
  protected def logError(msg: => String, throwable: Throwable): Unit

  /**
   * Checks whether trace logging is enabled.
   * @return true if trace-level logging is enabled
   */
  protected def isTraceEnabled(): Boolean

  /**
   * Initializes logging if it has not been initialized yet.
   * @param isInterpreter Whether running in interpreter (shell) mode
   */
  protected def initializeLogIfNecessary(isInterpreter: Boolean): Unit
}
```
**Usage Examples:**

```scala
import org.apache.spark.internal.Logging

class MySparkComponent extends Logging {

  def processData(data: List[String]): List[String] = {
    logInfo(s"Starting to process ${data.length} items")

    try {
      val result = data.map(_.toUpperCase)
      logDebug(s"Processed data: ${result.take(5)}")
      result
    } catch {
      case ex: Exception =>
        logError("Failed to process data", ex)
        throw ex
    }
  }

  def expensiveOperation(): Unit = {
    // Lazy evaluation: the message is only computed if debug is enabled
    logDebug(s"Debug info: ${computeExpensiveDebugInfo()}")

    if (isTraceEnabled()) {
      logTrace("Detailed trace information")
    }
  }

  private def computeExpensiveDebugInfo(): String = {
    // Only called if debug logging is enabled
    Thread.sleep(100) // Simulate expensive computation
    "Expensive debug computation result"
  }
}
```
### Basic Logging Methods

Core logging methods for different severity levels with lazy message evaluation.

```scala { .api }
// Info level logging for general information
protected def logInfo(msg: => String): Unit

// Debug level logging for detailed debugging information
protected def logDebug(msg: => String): Unit

// Trace level logging for very detailed execution traces
protected def logTrace(msg: => String): Unit

// Warning level logging for potentially problematic situations
protected def logWarning(msg: => String): Unit

// Error level logging for error conditions
protected def logError(msg: => String): Unit
```
**Usage Examples:**

```scala
class DataProcessor extends Logging {

  def run(): Unit = {
    logInfo("DataProcessor starting up")

    logDebug("Loading configuration from environment")
    val config = loadConfig()

    logTrace(s"Configuration details: $config")

    if (config.isEmpty) {
      logWarning("No configuration found, using defaults")
    }

    try {
      process(config)
      logInfo("DataProcessor completed successfully")
    } catch {
      case ex: Exception =>
        // Pass the exception so its stack trace is logged too
        logError("DataProcessor failed", ex)
        throw ex
    }
  }
}
```
### Exception Logging Methods

Specialized logging methods for capturing exceptions with stack traces.

```scala { .api }
/**
 * Logs a warning with exception details.
 * @param msg Warning message
 * @param throwable Exception to include in the log
 */
protected def logWarning(msg: => String, throwable: Throwable): Unit

/**
 * Logs an error with exception details.
 * @param msg Error message
 * @param throwable Exception to include in the log with its full stack trace
 */
protected def logError(msg: => String, throwable: Throwable): Unit
```
**Usage Examples:**

```scala
class NetworkClient extends Logging {

  def connect(): Unit = {
    try {
      establishConnection()
    } catch {
      case ex: java.net.ConnectException =>
        logWarning("Connection failed, will retry", ex)
        scheduleRetry()

      case ex: java.io.IOException =>
        logError("Fatal I/O error during connection", ex)
        throw ex
    }
  }

  def processRequest(request: String): String = {
    try {
      sendRequest(request)
    } catch {
      case ex: Exception =>
        logError(s"Failed to process request: $request", ex)
        throw new RuntimeException("Request processing failed", ex)
    }
  }
}
```
### Logging Control Methods

Methods for controlling logging behavior and checking log level configuration.

```scala { .api }
/**
 * Checks whether trace logging is enabled, so callers can skip expensive trace-only work.
 * @return true if trace-level logging is enabled
 */
protected def isTraceEnabled(): Boolean

/**
 * Initializes logging configuration if it has not been initialized yet.
 * @param isInterpreter Whether running in Spark shell/interpreter mode
 */
protected def initializeLogIfNecessary(isInterpreter: Boolean): Unit
```
**Usage Examples:**

```scala
class PerformanceMonitor extends Logging {

  def monitorOperation[T](operation: => T): T = {
    val startTime = System.currentTimeMillis()

    // Only do expensive trace logging if enabled
    if (isTraceEnabled()) {
      logTrace(s"Starting operation at $startTime")
      logTrace(s"Thread: ${Thread.currentThread().getName}")
      logTrace(s"Memory: ${Runtime.getRuntime.freeMemory()}")
    }

    try {
      val result = operation
      val duration = System.currentTimeMillis() - startTime

      logInfo(s"Operation completed in ${duration}ms")

      if (isTraceEnabled()) {
        logTrace(s"Result type: ${result.getClass.getSimpleName}")
      }

      result
    } catch {
      case ex: Exception =>
        val duration = System.currentTimeMillis() - startTime
        logError(s"Operation failed after ${duration}ms", ex)
        throw ex
    }
  }

  def initialize(): Unit = {
    // Initialize logging for the interpreter if needed
    initializeLogIfNecessary(isInterpreter = false)
    logInfo("PerformanceMonitor initialized")
  }
}
```
## Logging Best Practices

### Lazy Message Evaluation

Because each `msg` parameter is passed by name (`msg: => String`), the message expression is evaluated only when the corresponding log level is enabled. Take advantage of this to avoid expensive work when logging is disabled:

```scala
class QueryEngine extends Logging {

  def executeQuery(sql: String): DataFrame = {
    // Good: the by-name parameter means analyzeQuery only runs
    // if debug logging is enabled
    logDebug(s"Query plan: ${analyzeQuery(sql)}")

    // Bad: evaluating eagerly before the call defeats the laziness;
    // analyzeQuery runs even when debug logging is disabled
    // val plan = analyzeQuery(sql)
    // logDebug("Query plan: " + plan)

    // Good: guard larger blocks of expensive trace-only work explicitly
    if (isTraceEnabled()) {
      logTrace(s"Detailed metrics: ${computeDetailedMetrics()}")
    }

    executeSQL(sql)
  }
}
```
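The laziness above comes from Scala's by-name parameters. The following minimal, self-contained sketch (the `LazyLogDemo` object, its `debugEnabled` flag, and `expensiveMessage` are invented here for illustration, not Spark code) makes the behavior observable with a counter:

```scala
// Sketch of by-name laziness; not Spark's real Logging trait.
object LazyLogDemo {
  var debugEnabled: Boolean = false // stand-in for the logger's level check
  var evaluations: Int = 0          // counts how often a message was built

  // Like Logging.logDebug: `msg` is by-name, so the expression passed
  // at the call site runs only if this body actually forces it.
  def logDebug(msg: => String): Unit =
    if (debugEnabled) println(msg)

  def expensiveMessage(): String = {
    evaluations += 1 // side effect lets us observe evaluation
    "expensive debug detail"
  }
}
```

With `debugEnabled` false, a call like `LazyLogDemo.logDebug(s"detail: ${LazyLogDemo.expensiveMessage()}")` never builds the string; flip the flag and the same call evaluates it exactly once.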
### Structured Logging

Use consistent log message formats for better parsing and monitoring:

```scala
class TaskManager extends Logging {

  def submitTask(taskId: String, taskType: String): Unit = {
    logInfo(s"TASK_SUBMIT task_id=$taskId task_type=$taskType")

    try {
      executeTask(taskId, taskType)
      logInfo(s"TASK_COMPLETE task_id=$taskId duration=${getDuration()}ms")
    } catch {
      case ex: Exception =>
        logError(s"TASK_FAILED task_id=$taskId error=${ex.getMessage}", ex)
        throw ex
    }
  }
}
```
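One payoff of the `key=value` convention is that such lines can be parsed mechanically, e.g. for ad-hoc log analysis. A hedged sketch (the `LogLineParser` object is invented here, not a Spark API) that splits a line back into an event name and a field map:

```scala
// Sketch: parse "EVENT key1=v1 key2=v2" lines produced by the
// convention above into (event, fields). Not part of Spark.
object LogLineParser {
  private val KeyValue = """(\w+)=(\S+)""".r

  def parse(line: String): (String, Map[String, String]) = {
    val event = line.split("\\s+").head // leading event name, e.g. TASK_COMPLETE
    val fields = KeyValue.findAllMatchIn(line)
      .map(m => m.group(1) -> m.group(2))
      .toMap
    (event, fields)
  }
}
```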
### Log Level Guidelines

Choose appropriate log levels for different types of messages:

```scala
class DataLoader extends Logging {

  def loadData(path: String): Dataset[Row] = {
    // Info: high-level operations users should know about
    logInfo(s"Loading data from $path")

    // Debug: detailed information useful for debugging
    logDebug(s"Using schema inference: ${shouldInferSchema}")
    logDebug(s"Partitions: ${getPartitionCount(path)}")

    // Trace: very detailed execution information
    if (isTraceEnabled()) {
      logTrace(s"File list: ${listFiles(path).take(10)}")
      logTrace(s"Memory usage: ${getMemoryUsage()}")
    }

    // Warning: potential issues that don't prevent the operation
    if (isDeprecatedFormat(path)) {
      logWarning(s"Using deprecated file format for $path")
    }

    // Error: actual problems that prevent the operation
    if (!exists(path)) {
      logError(s"Data path does not exist: $path")
      throw new java.io.FileNotFoundException(path)
    }

    readData(path)
  }
}
```
## Integration with SLF4J

The Logging trait integrates with SLF4J, allowing configuration through standard logging frameworks:

### Logback Configuration Example

```xml
<configuration>
  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>

  <!-- Spark-specific logging levels -->
  <logger name="org.apache.spark" level="INFO"/>
  <logger name="org.apache.spark.sql" level="DEBUG"/>

  <root level="WARN">
    <appender-ref ref="STDOUT"/>
  </root>
</configuration>
```
### Log4j Configuration Example

```properties
# Root logger
log4j.rootLogger=WARN, console

# Console appender
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{HH:mm:ss} %-5p %c{1}:%L - %m%n

# Spark-specific levels
log4j.logger.org.apache.spark=INFO
log4j.logger.org.apache.spark.sql=DEBUG
```
## Type Definitions

```scala { .api }
// Core logging trait for SLF4J integration
trait Logging {
  // Basic logging methods with lazy evaluation
  protected def logInfo(msg: => String): Unit
  protected def logDebug(msg: => String): Unit
  protected def logTrace(msg: => String): Unit
  protected def logWarning(msg: => String): Unit
  protected def logError(msg: => String): Unit

  // Exception logging methods
  protected def logWarning(msg: => String, throwable: Throwable): Unit
  protected def logError(msg: => String, throwable: Throwable): Unit

  // Logging control methods
  protected def isTraceEnabled(): Boolean
  protected def initializeLogIfNecessary(isInterpreter: Boolean): Unit
}
```
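For experimenting with this trait shape outside a Spark build, its surface can be imitated with a println-backed stub. This is a sketch under stated assumptions: `SimpleLogging`, `SimpleLog`, `StubComponent`, and the numeric level ordering are invented here, whereas the real trait delegates to an SLF4J `Logger` and takes its levels from the logging backend's configuration:

```scala
// Println-backed stand-in mirroring the trait's surface; not Spark's code.
object SimpleLog {
  // 0=TRACE, 1=DEBUG, 2=INFO, 3=WARN, 4=ERROR; lower levels are dropped
  var threshold: Int = 2
}

trait SimpleLogging {
  private def log(level: Int, name: String, msg: => String): Unit =
    if (level >= SimpleLog.threshold)
      println(s"[$name] ${getClass.getSimpleName}: $msg") // msg forced only here

  protected def logTrace(msg: => String): Unit = log(0, "TRACE", msg)
  protected def logDebug(msg: => String): Unit = log(1, "DEBUG", msg)
  protected def logInfo(msg: => String): Unit = log(2, "INFO", msg)
  protected def logWarning(msg: => String): Unit = log(3, "WARN", msg)
  protected def logError(msg: => String): Unit = log(4, "ERROR", msg)
  protected def isTraceEnabled(): Boolean = SimpleLog.threshold <= 0
}

// Example mixin, matching the usage pattern shown throughout this document
class StubComponent extends SimpleLogging {
  def run(): Unit = {
    logInfo("starting")                      // printed at the default threshold
    logDebug("dropped at default threshold") // suppressed
  }
  def traceOn: Boolean = isTraceEnabled()    // expose the protected check
}
```

Keeping the level state in a shared object rather than per instance is a simplification; it makes the threshold easy to change globally in a test or REPL session.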