# Application Master

ApplicationMaster functionality for managing Spark applications running on YARN. The ApplicationMaster serves as the central coordinator for a Spark application, handling resource negotiation with the YARN ResourceManager and managing the executor lifecycle.

## Capabilities

### ApplicationMaster Class

Core ApplicationMaster implementation that manages the Spark application lifecycle on YARN clusters.

```scala { .api }
/**
 * ApplicationMaster for managing Spark applications on YARN
 * Handles resource negotiation with the ResourceManager and executor management
 */
private[spark] class ApplicationMaster(
    args: ApplicationMasterArguments,
    client: YarnRMClient) {
  // Application lifecycle management
  // Resource negotiation with the YARN ResourceManager
  // Executor management and monitoring
  // Integration with the Spark driver (cluster mode) or client (client mode)
}
```

**Usage Examples:**

```scala
import org.apache.spark.deploy.yarn.{ApplicationMaster, ApplicationMasterArguments}

// The ApplicationMaster is typically instantiated by the YARN runtime,
// not by user code; this only illustrates the constructor shape.
val args = new ApplicationMasterArguments(Array("--class", "MyMainClass"))
val rmClient = new YarnRMClient()  // RM client; exact constructor varies by Spark version
val appMaster = new ApplicationMaster(args, rmClient)

// The ApplicationMaster lifecycle is managed by the YARN container runtime
```

### ApplicationMasterArguments

Argument parsing and configuration for ApplicationMaster operations.

```scala { .api }
/**
 * Argument parsing for ApplicationMaster
 * Handles command-line arguments passed to the ApplicationMaster container
 */
class ApplicationMasterArguments(val args: Array[String]) {
  /** User application JAR file */
  var userJar: String = null

  /** User application main class */
  var userClass: String = null

  /** Arguments to pass to the user application */
  var userArgs: Seq[String] = Seq[String]()

  /** Executor memory in MB (default: 1024) */
  var executorMemory: Int = 1024

  /** Number of cores per executor (default: 1) */
  var executorCores: Int = 1

  /** Total number of executors to request (default: DEFAULT_NUMBER_EXECUTORS) */
  var numExecutors: Int = DEFAULT_NUMBER_EXECUTORS

  /**
   * Print usage information and exit with the specified exit code
   * @param exitCode Exit code to use
   * @param unknownParam Optional unknown parameter that caused the error
   */
  def printUsageAndExit(exitCode: Int, unknownParam: Any = null): Unit
}

/**
 * Companion object for ApplicationMasterArguments
 */
object ApplicationMasterArguments {
  val DEFAULT_NUMBER_EXECUTORS = 2
}
```

**Usage Examples:**

```scala
import org.apache.spark.deploy.yarn.ApplicationMasterArguments

// Parse ApplicationMaster arguments (typically taken from the YARN container launch command)
val args = Array(
  "--class", "com.example.SparkApp",
  "--jar", "/path/to/app.jar",
  "--executor-memory", "2g",
  "--executor-cores", "2"
)

val amArgs = new ApplicationMasterArguments(args)
```

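
The flag-style parsing above can be sketched as a small tail-recursive matcher, the same general pattern `ApplicationMasterArguments` uses internally. This is a simplified, self-contained model: the names and the restriction to four flags are illustrative, and the real parser also handles `--args`, `--properties-file`, and memory suffixes such as `"2g"` (omitted here, so memory is given in plain MB).

```scala
// Simplified sketch of tail-recursive flag parsing; names are illustrative.
object AmArgsSketch {
  final case class ParsedArgs(
      userJar: String = null,
      userClass: String = null,
      executorMemoryMb: Int = 1024,
      executorCores: Int = 1)

  @annotation.tailrec
  def parse(args: List[String], acc: ParsedArgs = ParsedArgs()): ParsedArgs = args match {
    case "--jar" :: value :: tail             => parse(tail, acc.copy(userJar = value))
    case "--class" :: value :: tail           => parse(tail, acc.copy(userClass = value))
    case "--executor-memory" :: value :: tail => parse(tail, acc.copy(executorMemoryMb = value.toInt))
    case "--executor-cores" :: value :: tail  => parse(tail, acc.copy(executorCores = value.toInt))
    case Nil                                  => acc
    case unknown :: _                         => throw new IllegalArgumentException(s"Unknown option: $unknown")
  }
}
```

Each recognized flag consumes two list elements and recurses on the remainder, which keeps the parser a single pattern match and makes unknown options fail loudly, as `printUsageAndExit` does in the real class.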
### Main Entry Points

Command-line entry points for ApplicationMaster and ExecutorLauncher operations.

```scala { .api }
/**
 * Main entry point for the ApplicationMaster
 * Invoked by YARN when starting the ApplicationMaster container
 */
object ApplicationMaster {
  def main(args: Array[String]): Unit
}

/**
 * Entry point for executor launcher functionality
 * Used in client mode, when the ApplicationMaster only manages executors
 */
object ExecutorLauncher {
  def main(args: Array[String]): Unit
}
```

## ApplicationMaster Responsibilities

### Resource Management

The ApplicationMaster negotiates with the YARN ResourceManager to:

- Request executor containers based on application requirements
- Monitor executor health and performance
- Handle executor failures and replacement
- Release unused resources back to the cluster
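
The negotiation loop above can be modeled in miniature: track target, running, and pending counts, request the difference each round, and let failures re-open demand. This is an illustrative sketch under assumed names, not Spark's actual `YarnAllocator`, which delegates the real requests to the YARN AMRMClient.

```scala
// Toy model of the executor allocation loop; all names are illustrative.
final class AllocationSketch(targetExecutors: Int) {
  private var running = 0
  private var pendingRequests = 0

  /** Number of new containers to ask the ResourceManager for. */
  def containersToRequest: Int =
    math.max(0, targetExecutors - running - pendingRequests)

  /** Issue outstanding requests; returns how many were sent this round. */
  def requestContainers(): Int = {
    val n = containersToRequest
    pendingRequests += n
    n
  }

  /** Called when the RM grants a container and the executor launches. */
  def onContainerAllocated(): Unit = {
    if (pendingRequests > 0) pendingRequests -= 1
    running += 1
  }

  /** Called when an executor dies; the next loop iteration re-requests it. */
  def onExecutorFailed(): Unit = {
    running = math.max(0, running - 1)
  }
}
```

Because demand is recomputed from the target on every round, a failed executor is replaced automatically on the next pass without any special-case logic.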

### Application Coordination

In **cluster mode**, the ApplicationMaster:
- Hosts the Spark driver (SparkContext)
- Manages the complete application lifecycle
- Handles application completion and cleanup

In **client mode**, the ApplicationMaster:
- Acts as an ExecutorLauncher
- Manages only executor containers
- Communicates with the remote Spark driver

### Communication Patterns

```scala
// Example of ApplicationMaster integration patterns
// (internal implementation details, shown for understanding)

class ApplicationMaster(args: ApplicationMasterArguments, client: YarnRMClient) {
  // Resource negotiation loop
  private def allocateExecutors(): Unit = {
    // Request containers from the ResourceManager
    // Launch executor processes in allocated containers
    // Monitor executor health and handle failures
  }

  // Driver integration (cluster mode)
  private def runDriver(): Unit = {
    // Start the Spark driver within the ApplicationMaster JVM
    // Handle driver completion and cleanup
  }

  // Client communication (client mode)
  private def connectToDriver(): Unit = {
    // Establish a connection to the remote Spark driver
    // Report executor status and handle commands
  }
}
```

## Configuration Integration

### Spark Configuration

Key Spark configuration properties read by the ApplicationMaster:

```
spark.yarn.am.memory                     // ApplicationMaster memory (client mode)
spark.yarn.am.cores                      // ApplicationMaster CPU cores (client mode)
spark.yarn.am.waitTime                   // Max time to wait for the SparkContext to initialize
spark.yarn.containerLauncherMaxThreads   // Executor launch parallelism
spark.yarn.executor.memoryOverhead       // Extra off-heap memory per executor
```
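
As a sketch of how such properties might be read with sensible defaults, the snippet below substitutes a plain `Map` for `SparkConf`; the helper names and default values are illustrative assumptions, not Spark's internal API.

```scala
// Illustrative config lookup with defaults; a Map stands in for SparkConf.
object AmConfSketch {
  def amCores(conf: Map[String, String]): Int =
    conf.getOrElse("spark.yarn.am.cores", "1").toInt

  /** Parse "512m" / "2g" style sizes into MB, defaulting to 512 MB. */
  def amMemoryMb(conf: Map[String, String]): Int =
    conf.getOrElse("spark.yarn.am.memory", "512m").toLowerCase match {
      case s if s.endsWith("g") => s.dropRight(1).toInt * 1024
      case s if s.endsWith("m") => s.dropRight(1).toInt
      case s                    => s.toInt
    }
}
```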

### YARN Configuration

Hadoop YARN properties that affect ApplicationMaster behavior:

```
yarn.scheduler.minimum-allocation-mb   // Lower bound on container memory requests
yarn.scheduler.maximum-allocation-mb   // Upper bound on container memory requests
yarn.nodemanager.aux-services          // Auxiliary services (e.g. the external shuffle service)
```

## Deployment Modes

### Cluster Mode

```scala
// In cluster mode, the ApplicationMaster hosts the driver
object ApplicationMaster {
  def main(args: Array[String]): Unit = {
    // Parse arguments
    val amArgs = new ApplicationMasterArguments(args)

    // Create the ApplicationMaster instance (yarnRMClient provided by the runtime)
    val appMaster = new ApplicationMaster(amArgs, yarnRMClient)

    // Run the driver within the ApplicationMaster
    // Handle application completion
  }
}
```

### Client Mode

```scala
// In client mode, the ApplicationMaster manages only executors
object ExecutorLauncher {
  def main(args: Array[String]): Unit = {
    // The ApplicationMaster acts as an executor launcher:
    // it connects to the remote driver and manages executor containers only
  }
}
```

## Error Handling

The ApplicationMaster handles various failure scenarios:

- **Driver failures**: Application termination and cleanup
- **Executor failures**: Container replacement and task rescheduling
- **ResourceManager failures**: Reconnection and state recovery
- **Network partitions**: Timeout handling and graceful degradation
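
Executor-failure handling typically tolerates a bounded number of failures before failing the whole application. The sketch below models that accounting; the class and threshold names are illustrative assumptions, not Spark's internal API (Spark exposes the threshold via configuration).

```scala
// Illustrative failure accounting with an abort threshold.
final class FailureTrackerSketch(maxExecutorFailures: Int) {
  private var failures = 0

  /** Record one executor failure; returns true if the application should abort. */
  def recordFailure(): Boolean = {
    failures += 1
    failures >= maxExecutorFailures
  }

  def failureCount: Int = failures
}
```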

## Monitoring and Metrics

The ApplicationMaster provides monitoring capabilities:

- Executor resource utilization tracking
- Application progress reporting to YARN
- Integration with the Spark metrics system
- Health check and heartbeat mechanisms
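
Progress reporting can be illustrated as a clamped fraction of completed work, roughly the kind of value an ApplicationMaster reports to YARN on each heartbeat; the exact computation shown here is an assumption for illustration.

```scala
// Illustrative progress fraction in [0, 1] for YARN heartbeat reporting.
object ProgressSketch {
  def progress(completedTasks: Int, totalTasks: Int): Float =
    if (totalTasks <= 0) 0.0f
    else math.min(1.0f, completedTasks.toFloat / totalTasks)
}
```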