tessl/maven-org-apache-spark--yarn-parent-2-10

YARN integration support for Apache Spark cluster computing, enabling Spark applications to run on Hadoop YARN clusters

Application Master

ApplicationMaster functionality for managing Spark applications running on YARN. The ApplicationMaster serves as the central coordinator for Spark applications, handling resource negotiation with the YARN ResourceManager and managing executor lifecycle.

Capabilities

ApplicationMaster Class

Core ApplicationMaster implementation that manages the Spark application lifecycle on YARN clusters.

/**
 * ApplicationMaster for managing Spark applications on YARN
 * Handles resource negotiation with ResourceManager and executor management
 */
private[spark] class ApplicationMaster(
  args: ApplicationMasterArguments, 
  client: YarnRMClient
) {
  // Application lifecycle management
  // Resource negotiation with YARN ResourceManager
  // Executor management and monitoring
  // Integration with Spark driver (cluster mode) or client (client mode)
}

Usage Examples:

import org.apache.spark.deploy.yarn.{ApplicationMaster, ApplicationMasterArguments}

// The ApplicationMaster is normally instantiated by the YARN container runtime,
// not by user code; the construction below is illustrative only.
val args = new ApplicationMasterArguments(Array(
  "--jar", "/path/to/app.jar",
  "--class", "MyMainClass"))
val rmClient = new YarnRMClient(args)  // client for talking to the YARN ResourceManager
val appMaster = new ApplicationMaster(args, rmClient)

// ApplicationMaster lifecycle is managed by YARN container runtime

ApplicationMasterArguments

Argument parsing and configuration for ApplicationMaster operations.

/**
 * Argument parsing for ApplicationMaster
 * Handles command-line arguments passed to the ApplicationMaster container
 */
class ApplicationMasterArguments(val args: Array[String]) {
  /** User application JAR file */
  var userJar: String = null
  
  /** User application main class */
  var userClass: String = null
  
  /** Arguments to pass to user application */
  var userArgs: Seq[String] = Seq[String]()
  
  /** Executor memory in MB (default: 1024) */
  var executorMemory: Int = 1024
  
  /** Number of cores per executor (default: 1) */
  var executorCores: Int = 1
  
  /** Total number of executors to request (default: DEFAULT_NUMBER_EXECUTORS) */
  var numExecutors: Int = DEFAULT_NUMBER_EXECUTORS
  
  /**
   * Print usage information and exit with specified exit code
   * @param exitCode Exit code to use
   * @param unknownParam Optional unknown parameter that caused the error
   */
  def printUsageAndExit(exitCode: Int, unknownParam: Any = null): Unit
}

/**
 * Companion object for ApplicationMasterArguments
 */
object ApplicationMasterArguments {
  val DEFAULT_NUMBER_EXECUTORS = 2
}

Usage Examples:

import org.apache.spark.deploy.yarn.ApplicationMasterArguments

// Parse ApplicationMaster arguments (typically from YARN container launch)
val args = Array(
  "--class", "com.example.SparkApp",
  "--jar", "/path/to/app.jar",
  "--executor-memory", "2g",
  "--executor-cores", "2"
)

val amArgs = new ApplicationMasterArguments(args)
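Internally, argument parsing of this kind is typically a recursive match over the argument list. The standalone sketch below mirrors that style without any Spark dependency; the names (`AmArgsSketch`, `Parsed`, `toMb`) are illustrative, not Spark's actual implementation:

```scala
// Standalone sketch of a flag-parsing loop in the style used by
// ApplicationMasterArguments (illustrative; not Spark's code).
object AmArgsSketch {
  case class Parsed(
    userJar: Option[String] = None,
    userClass: Option[String] = None,
    executorMemoryMb: Int = 1024,
    executorCores: Int = 1
  )

  // Accept plain MB values as well as "2g"/"512m" style suffixes.
  private def toMb(s: String): Int = s.toLowerCase match {
    case g if g.endsWith("g") => g.dropRight(1).toInt * 1024
    case m if m.endsWith("m") => m.dropRight(1).toInt
    case other                => other.toInt
  }

  // Consume flags two at a time until the list is exhausted.
  def parse(args: List[String], acc: Parsed = Parsed()): Parsed = args match {
    case "--jar" :: value :: tail             => parse(tail, acc.copy(userJar = Some(value)))
    case "--class" :: value :: tail           => parse(tail, acc.copy(userClass = Some(value)))
    case "--executor-memory" :: value :: tail => parse(tail, acc.copy(executorMemoryMb = toMb(value)))
    case "--executor-cores" :: value :: tail  => parse(tail, acc.copy(executorCores = value.toInt))
    case Nil                                  => acc
    case unknown :: _ =>
      throw new IllegalArgumentException(s"Unknown parameter: $unknown")
  }
}
```

Unrecognized flags fail fast, matching the `printUsageAndExit` behavior described above.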

Main Entry Points

Command-line entry points for ApplicationMaster and ExecutorLauncher operations.

/**
 * Main entry point for ApplicationMaster
 * Invoked by YARN when starting the ApplicationMaster container
 */
object ApplicationMaster {
  def main(args: Array[String]): Unit
}

/**
 * Entry point for executor launcher functionality  
 * Used in client mode when ApplicationMaster only manages executors
 */
object ExecutorLauncher {
  def main(args: Array[String]): Unit
}

ApplicationMaster Responsibilities

Resource Management

The ApplicationMaster negotiates with the YARN ResourceManager to:

  • Request executor containers based on application requirements
  • Monitor executor health and performance
  • Handle executor failures and replacement
  • Release unused resources back to the cluster
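The accounting at the heart of this negotiation loop can be shown in isolation: each round, the allocator compares the target executor count against what is already running or pending and requests only the difference. This is a simplification of what Spark's allocator does; the names are illustrative:

```scala
// Illustrative request accounting for the ResourceManager negotiation
// loop (simplified; not Spark's actual allocator code).
object AllocatorSketch {
  /** Containers still needed = target - running - already requested, floored at 0. */
  def containersToRequest(target: Int, running: Int, pending: Int): Int =
    math.max(0, target - running - pending)

  /** A failed executor leaves the running count, so the next round re-requests it. */
  def afterFailure(running: Int): Int = math.max(0, running - 1)
}
```

For example, with a target of 4 executors, 2 running and 1 pending, only one new container is requested; an executor failure drops the running count and is made up on the next round.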

Application Coordination

In cluster mode, the ApplicationMaster:

  • Hosts the Spark driver (SparkContext)
  • Manages the complete application lifecycle
  • Handles application completion and cleanup

In client mode, the ApplicationMaster:

  • Acts as an ExecutorLauncher
  • Manages only executor containers
  • Communicates with the remote Spark driver

Communication Patterns

// Example of ApplicationMaster integration patterns
// (Internal implementation details - shown for understanding)

class ApplicationMaster(args: ApplicationMasterArguments, client: YarnRMClient) {
  // Resource negotiation loop
  private def allocateExecutors(): Unit = {
    // Request containers from ResourceManager
    // Launch executor processes in allocated containers
    // Monitor executor health and handle failures
  }
  
  // Driver integration (cluster mode)
  private def runDriver(): Unit = {
    // Start Spark driver within ApplicationMaster JVM
    // Handle driver completion and cleanup
  }
  
  // Client communication (client mode)  
  private def connectToDriver(): Unit = {
    // Establish connection to remote Spark driver
    // Report executor status and handle commands
  }
}

Configuration Integration

Spark Configuration

The ApplicationMaster integrates with Spark configuration for:

// Key configuration properties used by the ApplicationMaster
spark.yarn.am.memory          // ApplicationMaster memory (client mode)
spark.yarn.am.cores           // ApplicationMaster CPU cores (client mode)
spark.yarn.am.waitTime        // Max wait time for the SparkContext (cluster mode)
spark.yarn.containerLauncherMaxThreads  // Executor launch parallelism
spark.yarn.executor.memoryOverhead      // Off-heap overhead added to each executor request
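These properties follow Spark's usual key/default lookup convention. A minimal sketch of that pattern in plain Scala (no Spark dependency; `ConfSketch` and its seed values are hypothetical):

```scala
// Minimal sketch of Spark-style configuration lookup with defaults
// (the keys mirror the properties listed above; values are examples).
object ConfSketch {
  private val settings = Map(
    "spark.yarn.am.memory" -> "512m",
    "spark.yarn.am.cores"  -> "1"
  )

  /** Return the configured value, or the caller-supplied default if unset. */
  def get(key: String, default: String): String = settings.getOrElse(key, default)
}
```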

YARN Configuration

Integration with Hadoop YARN configuration:

// YARN properties affecting ApplicationMaster behavior
yarn.scheduler.minimum-allocation-mb  // Container requests are rounded up to this
yarn.scheduler.maximum-allocation-mb  // Upper bound on container memory requests
yarn.nodemanager.aux-services         // Required auxiliary services (e.g. shuffle service)

Deployment Modes

Cluster Mode

// In cluster mode, the ApplicationMaster hosts the driver
object ApplicationMaster {
  def main(args: Array[String]): Unit = {
    // Parse arguments passed by the YARN container launch context
    val amArgs = new ApplicationMasterArguments(args)

    // Create the ApplicationMaster with a ResourceManager client
    val appMaster = new ApplicationMaster(amArgs, new YarnRMClient(amArgs))

    // Run the driver within the ApplicationMaster JVM,
    // then report final status and clean up
  }
}

Client Mode

// In client mode, ApplicationMaster manages only executors
object ExecutorLauncher {
  def main(args: Array[String]): Unit = {
    // ApplicationMaster acts as executor launcher
    // Connects to remote driver
    // Manages executor containers only
  }
}

Error Handling

The ApplicationMaster handles various failure scenarios:

  • Driver failures: Application termination and cleanup
  • Executor failures: Container replacement and task rescheduling
  • ResourceManager failures: Reconnection and state recovery
  • Network partitions: Timeout handling and graceful degradation
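Reconnection after a ResourceManager failure is typically a bounded retry loop: attempt, back off, give up after a fixed number of failures. A self-contained sketch of that pattern (illustrative only; not Spark's recovery code):

```scala
// Bounded-retry sketch for reconnecting to the ResourceManager
// (illustrative pattern; not Spark's actual recovery implementation).
object RetrySketch {
  /**
   * Try `op` up to `maxAttempts` times, passing the attempt number.
   * Returns Some(result) on the first success, None if all attempts fail.
   */
  def withRetries[T](maxAttempts: Int)(op: Int => Either[String, T]): Option[T] = {
    var attempt = 1
    while (attempt <= maxAttempts) {
      op(attempt) match {
        case Right(value) => return Some(value)
        case Left(_)      => attempt += 1 // in real code: sleep/back off here
      }
    }
    None
  }
}
```

Exhausting the retry budget would surface as an application failure reported back to YARN.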

Monitoring and Metrics

ApplicationMaster provides monitoring capabilities:

  • Executor resource utilization tracking
  • Application progress reporting to YARN
  • Integration with Spark metrics system
  • Health check and heartbeat mechanisms

Install with Tessl CLI

npx tessl i tessl/maven-org-apache-spark--yarn-parent-2-10
