Library for launching Spark applications programmatically with monitoring and control capabilities.
```bash
npx @tessl/cli install tessl/maven-org-apache-spark--spark-launcher_2-11@2.4.0
```

Apache Spark Launcher provides a programmatic API for launching and monitoring Spark applications from Java. It offers two primary launch modes: child-process execution with full monitoring capabilities, and in-process execution for cluster deployments. The library handles the Spark application lifecycle, including configuration, execution, and state monitoring, and provides comprehensive control interfaces for running applications.
```xml
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-launcher_2.11</artifactId>
    <version>2.4.8</version>
</dependency>
```

```java
import org.apache.spark.launcher.SparkLauncher;
import org.apache.spark.launcher.InProcessLauncher;
import org.apache.spark.launcher.SparkAppHandle;
```

```java
import org.apache.spark.launcher.SparkLauncher;
import org.apache.spark.launcher.SparkAppHandle;

// Configure and launch a Spark application as a child process
SparkAppHandle handle = new SparkLauncher()
    .setAppResource("/path/to/my-app.jar")
    .setMainClass("com.example.MySparkApp")
    .setMaster("local[*]")
    .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
    .setConf(SparkLauncher.EXECUTOR_MEMORY, "1g")
    .setAppName("My Spark Application")
    .startApplication();

// Monitor application state
handle.addListener(new SparkAppHandle.Listener() {
    @Override
    public void stateChanged(SparkAppHandle handle) {
        System.out.println("State: " + handle.getState());
        if (handle.getState().isFinal()) {
            System.out.println("Application finished with ID: " + handle.getAppId());
        }
    }

    @Override
    public void infoChanged(SparkAppHandle handle) {
        System.out.println("Info updated for app: " + handle.getAppId());
    }
});

// Wait for completion, or stop/kill if needed
if (handle.getState() == SparkAppHandle.State.RUNNING) {
    handle.stop(); // Graceful shutdown
    // handle.kill(); // Force kill if needed
}
```
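SparkAppHandle has no blocking wait, so callers that need to wait for completion typically pair a listener with a synchronizer. A minimal sketch using java.util.concurrent.CountDownLatch; the latch wiring is illustrative, not part of the library:

```java
import java.util.concurrent.CountDownLatch;

import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

// Block the calling thread until the application reaches a final state.
CountDownLatch done = new CountDownLatch(1);
SparkAppHandle handle = new SparkLauncher()
    .setAppResource("/path/to/my-app.jar")
    .setMainClass("com.example.MySparkApp")
    .setMaster("local[*]")
    .startApplication(new SparkAppHandle.Listener() {
        @Override
        public void stateChanged(SparkAppHandle h) {
            if (h.getState().isFinal()) {
                done.countDown(); // FINISHED, FAILED, KILLED, or LOST
            }
        }

        @Override
        public void infoChanged(SparkAppHandle h) {
            // No-op; only the terminal state matters here.
        }
    });
done.await();
System.out.println("Final state: " + handle.getState());
```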
```java
import org.apache.spark.launcher.SparkLauncher;

// Launch as a raw process (manual management required)
Process sparkProcess = new SparkLauncher()
    .setAppResource("/path/to/my-app.jar")
    .setMainClass("com.example.MySparkApp")
    .setMaster("yarn")
    .setDeployMode("cluster")
    .setConf(SparkLauncher.DRIVER_MEMORY, "4g")
    .launch();

// Manual process management
int exitCode = sparkProcess.waitFor();
System.out.println("Spark application exited with code: " + exitCode);
```
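When using launch(), the child's stdout and stderr must be consumed, or the process can stall once the pipe buffers fill. SparkLauncher provides redirect methods for this; a short sketch in which the log file location is a placeholder:

```java
import java.io.File;

import org.apache.spark.launcher.SparkLauncher;

// Redirect child output so the raw process cannot block on full pipes.
Process sparkProcess = new SparkLauncher()
    .setAppResource("/path/to/my-app.jar")
    .setMainClass("com.example.MySparkApp")
    .setMaster("yarn")
    .setDeployMode("cluster")
    .redirectError()                                 // merge stderr into stdout
    .redirectOutput(new File("/tmp/spark-app.out"))  // placeholder path
    .launch();

int exitCode = sparkProcess.waitFor();
```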
```java
import org.apache.spark.launcher.InProcessLauncher;
import org.apache.spark.launcher.SparkAppHandle;

// Launch the application in the same JVM (cluster deploy mode recommended)
SparkAppHandle handle = new InProcessLauncher()
    .setAppResource("/path/to/my-app.jar")
    .setMainClass("com.example.MySparkApp")
    .setMaster("yarn")
    .setDeployMode("cluster")
    .setConf("spark.sql.adaptive.enabled", "true")
    .startApplication();
```
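Listeners can also be handed to startApplication() directly through its varargs parameter, so no early state change slips past registration. A sketch reusing the placeholder application from above:

```java
import org.apache.spark.launcher.InProcessLauncher;
import org.apache.spark.launcher.SparkAppHandle;

// Register the listener at launch time instead of after the fact.
SparkAppHandle handle = new InProcessLauncher()
    .setAppResource("/path/to/my-app.jar")
    .setMainClass("com.example.MySparkApp")
    .setMaster("yarn")
    .setDeployMode("cluster")
    .startApplication(new SparkAppHandle.Listener() {
        @Override
        public void stateChanged(SparkAppHandle h) {
            System.out.println("State: " + h.getState());
        }

        @Override
        public void infoChanged(SparkAppHandle h) {
            System.out.println("App ID: " + h.getAppId());
        }
    });
```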
The Spark Launcher library is built around several key components:

- SparkLauncher and InProcessLauncher provide fluent configuration APIs for the two launch modes
- AbstractLauncher provides common configuration methods shared by both launcher implementations
- SparkAppHandle provides runtime application control and monitoring with state-based lifecycle management
- SparkAppHandle.State is an enum with final-state detection via isFinal()

SparkLauncher and InProcessLauncher are the primary interfaces for launching Spark applications, offering comprehensive configuration options and supporting child-process and in-process execution modes, respectively.
```java
// Child-process launcher with monitoring
public class SparkLauncher extends AbstractLauncher<SparkLauncher> {
    public SparkLauncher();
    public SparkLauncher(Map<String, String> env);
    public SparkAppHandle startApplication(SparkAppHandle.Listener... listeners) throws IOException;
    public Process launch() throws IOException;
}

// In-process launcher (cluster deploy mode recommended)
public class InProcessLauncher extends AbstractLauncher<InProcessLauncher> {
    public SparkAppHandle startApplication(SparkAppHandle.Listener... listeners) throws IOException;
}
```
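The Map-based constructor controls the environment of the spawned spark-submit process, which is handy for variables such as HADOOP_CONF_DIR. A brief sketch; the configuration directory is an assumed path:

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

// Supply a custom child-process environment at construction time.
Map<String, String> env = new HashMap<>();
env.put("HADOOP_CONF_DIR", "/etc/hadoop/conf"); // assumed location

SparkAppHandle handle = new SparkLauncher(env)
    .setAppResource("/path/to/my-app.jar")
    .setMainClass("com.example.MySparkApp")
    .setMaster("yarn")
    .startApplication();
```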
Runtime control and monitoring of a launched application goes through SparkAppHandle, which provides state tracking, application control, and event notifications.

```java
public interface SparkAppHandle {
    void addListener(Listener l);
    State getState();
    String getAppId();
    void stop();
    void kill();
    void disconnect();

    enum State {
        UNKNOWN(false), CONNECTED(false), SUBMITTED(false), RUNNING(false),
        FINISHED(true), FAILED(true), KILLED(true), LOST(true);

        public boolean isFinal();
    }

    interface Listener {
        void stateChanged(SparkAppHandle handle);
        void infoChanged(SparkAppHandle handle);
    }
}
```
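Since disconnect() detaches the handle without stopping the application, it supports a fire-and-forget pattern: submit, record the application ID, detach. A sketch in which the polling loop is illustrative:

```java
import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

SparkAppHandle handle = new SparkLauncher()
    .setAppResource("/path/to/my-app.jar")
    .setMainClass("com.example.MySparkApp")
    .setMaster("yarn")
    .setDeployMode("cluster")
    .startApplication();

// Poll until the cluster assigns an application ID (or the app dies early).
while (handle.getAppId() == null && !handle.getState().isFinal()) {
    Thread.sleep(100);
}
System.out.println("Submitted as: " + handle.getAppId());

handle.disconnect(); // the application keeps running; the handle stops reporting
```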
Configuration is handled through a comprehensive system of fluent methods on AbstractLauncher plus predefined constants for common Spark settings.

```java
public abstract class AbstractLauncher<T extends AbstractLauncher<T>> {
    public T setPropertiesFile(String path);
    public T setConf(String key, String value);
    public T setAppName(String appName);
    public T setMaster(String master);
    public T setDeployMode(String mode);
    public T setAppResource(String resource);
    public T setMainClass(String mainClass);
    public T addJar(String jar);
    public T addFile(String file);
    public T addPyFile(String file);
    public T addAppArgs(String... args);
    public T setVerbose(boolean verbose);
}

// Configuration constants in SparkLauncher
public static final String DRIVER_MEMORY = "spark.driver.memory";
public static final String EXECUTOR_MEMORY = "spark.executor.memory";
public static final String EXECUTOR_CORES = "spark.executor.cores";
// ... additional constants
```
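The constants combine naturally with the fluent configuration methods. A configuration-heavy sketch; the extra jar and application arguments are placeholders:

```java
import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

// Resource sizing via constants, plus dependencies and program arguments.
SparkAppHandle handle = new SparkLauncher()
    .setAppResource("/path/to/my-app.jar")
    .setMainClass("com.example.MySparkApp")
    .setMaster("yarn")
    .setConf(SparkLauncher.DRIVER_MEMORY, "4g")
    .setConf(SparkLauncher.EXECUTOR_MEMORY, "2g")
    .setConf(SparkLauncher.EXECUTOR_CORES, "4")
    .addJar("/path/to/dependency.jar")    // placeholder dependency
    .addAppArgs("--input", "/data/input") // placeholder arguments
    .setVerbose(true)
    .startApplication();
```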
Common use cases:

- Use SparkLauncher with monitoring to manage batch processing pipelines, track job completion, and handle failures gracefully.
- Leverage SparkAppHandle state notifications to build interactive dashboards that display real-time Spark application status.
- Deploy applications to YARN, Mesos, or Kubernetes clusters using cluster deploy mode, with resource allocation set through the configuration constants.
- Use local mode for development and testing, with simplified configuration and immediate feedback.
For child-process launches, the runtime environment can be controlled through SparkLauncher's setSparkHome() configuration and setJavaHome() method.
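A brief sketch of explicit environment setup for a child-process launch; both install locations are assumptions:

```java
import org.apache.spark.launcher.SparkLauncher;

// Point the launcher at specific Spark and JVM installations instead of
// relying on SPARK_HOME / JAVA_HOME from the parent environment.
Process process = new SparkLauncher()
    .setSparkHome("/opt/spark")         // assumed Spark distribution
    .setJavaHome("/usr/lib/jvm/java-8") // assumed JDK location
    .setAppResource("/path/to/my-app.jar")
    .setMainClass("com.example.MySparkApp")
    .setMaster("local[*]")
    .launch();
```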
The library provides multiple layers of error handling: invalid configuration and launch failures surface as exceptions from launch() and startApplication(), while runtime failures are reported through the terminal SparkAppHandle.State.FAILED state (with KILLED and LOST covering forced termination and lost launcher connections).