Library for launching Spark applications programmatically
```
npx @tessl/cli install tessl/maven-org-apache-spark--spark-launcher-2-10@1.6.0
```

Apache Spark Launcher is a Java library that provides a programmatic API for launching and controlling Apache Spark applications. It offers two main approaches: SparkLauncher.startApplication(), which returns a SparkAppHandle for monitoring and controlling the running application, and SparkLauncher.launch(), which starts the application as a raw child process.
Maven: org.apache.spark:spark-launcher_2.10:1.6.3

```java
import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

public class MyLauncher {
  public static void main(String[] args) throws Exception {
    // Launch with a monitoring handle
    SparkAppHandle handle = new SparkLauncher()
        .setAppResource("/my/app.jar")
        .setMainClass("my.spark.app.Main")
        .setMaster("local")
        .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
        .startApplication();

    // Monitor application state; note that the app ID may be null
    // until the application has connected back to the launcher.
    System.out.println("Application State: " + handle.getState());
    System.out.println("Application ID: " + handle.getAppId());
  }
}
```

The Spark Launcher is built around two key components: the SparkLauncher class and the SparkAppHandle interface.
The SparkLauncher class uses a builder pattern to configure Spark applications and provides two launch modes: a monitored launch via startApplication() and a raw child-process launch via launch().
```java
/**
 * Launcher for Spark applications, configured via a builder pattern.
 */
public class SparkLauncher {

  // Configuration key constants
  public static final String SPARK_MASTER = "spark.master";
  public static final String DRIVER_MEMORY = "spark.driver.memory";
  public static final String DRIVER_EXTRA_CLASSPATH = "spark.driver.extraClassPath";
  public static final String DRIVER_EXTRA_JAVA_OPTIONS = "spark.driver.extraJavaOptions";
  public static final String DRIVER_EXTRA_LIBRARY_PATH = "spark.driver.extraLibraryPath";
  public static final String EXECUTOR_MEMORY = "spark.executor.memory";
  public static final String EXECUTOR_EXTRA_CLASSPATH = "spark.executor.extraClassPath";
  public static final String EXECUTOR_EXTRA_JAVA_OPTIONS = "spark.executor.extraJavaOptions";
  public static final String EXECUTOR_EXTRA_LIBRARY_PATH = "spark.executor.extraLibraryPath";
  public static final String EXECUTOR_CORES = "spark.executor.cores";
  public static final String CHILD_PROCESS_LOGGER_NAME = "spark.launcher.childProcLoggerName";
  // Note: the misspelled value below ("Conection") matches the actual key used by Spark.
  public static final String CHILD_CONNECTION_TIMEOUT = "spark.launcher.childConectionTimeout";

  // Static launcher-library configuration
  public static void setConfig(String name, String value);

  // Constructors
  public SparkLauncher();
  public SparkLauncher(Map<String, String> env);

  // Environment configuration
  public SparkLauncher setJavaHome(String javaHome);
  public SparkLauncher setSparkHome(String sparkHome);
  public SparkLauncher setPropertiesFile(String path);

  // Application configuration
  public SparkLauncher setConf(String key, String value);
  public SparkLauncher setAppName(String appName);
  public SparkLauncher setMaster(String master);
  public SparkLauncher setDeployMode(String mode);
  public SparkLauncher setAppResource(String resource);
  public SparkLauncher setMainClass(String mainClass);

  // Arguments and options
  public SparkLauncher addSparkArg(String arg);
  public SparkLauncher addSparkArg(String name, String value);
  public SparkLauncher addAppArgs(String... args);
  public SparkLauncher setVerbose(boolean verbose);

  // Resource files
  public SparkLauncher addJar(String jar);
  public SparkLauncher addFile(String file);
  public SparkLauncher addPyFile(String file);

  // Launch methods
  public Process launch() throws IOException;
  public SparkAppHandle startApplication(SparkAppHandle.Listener... listeners) throws IOException;
}
```
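The static setConfig() method configures the launcher library itself rather than any particular application. A minimal sketch, assuming the timeout value is interpreted as milliseconds (the 10-second figure is illustrative, not a recommendation):

```java
// Launcher-library setting: how long startApplication() waits for the
// child process to connect back before giving up. Assumed milliseconds.
SparkLauncher.setConfig(SparkLauncher.CHILD_CONNECTION_TIMEOUT, "10000");
```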
Usage Examples:

```java
// Basic configuration
SparkLauncher launcher = new SparkLauncher()
    .setAppResource("/path/to/my-app.jar")
    .setMainClass("com.example.MySparkApp")
    .setMaster("yarn")
    .setDeployMode("cluster")
    .setConf(SparkLauncher.DRIVER_MEMORY, "4g")
    .setConf(SparkLauncher.EXECUTOR_MEMORY, "2g")
    .setConf(SparkLauncher.EXECUTOR_CORES, "2");

// Launch with a monitoring handle
SparkAppHandle handle = launcher.startApplication();

// Or launch as a raw process
Process process = launcher.launch();
```
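Because launch() hands back a plain java.lang.Process, the caller owns output handling and exit-code collection. A minimal sketch with placeholder paths and class names; in production the child's stderr should be drained on a separate thread for the same reason stdout is drained here:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.spark.launcher.SparkLauncher;

public class RawLaunch {
  public static void main(String[] args) throws IOException, InterruptedException {
    Process spark = new SparkLauncher()
        .setAppResource("/path/to/my-app.jar")   // placeholder path
        .setMainClass("com.example.MySparkApp")  // placeholder class
        .setMaster("local")
        .launch();

    // Drain stdout so the child cannot block on a full pipe buffer.
    BufferedReader out = new BufferedReader(new InputStreamReader(spark.getInputStream()));
    String line;
    while ((line = out.readLine()) != null) {
      System.out.println("[spark] " + line);
    }

    int exitCode = spark.waitFor();
    System.out.println("spark-submit exited with code " + exitCode);
  }
}
```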
The SparkAppHandle interface provides runtime information and control capabilities for a running Spark application.

```java
/**
 * Handle to a running Spark application, providing monitoring and control.
 */
public interface SparkAppHandle {

  // State monitoring
  void addListener(Listener l);
  State getState();
  String getAppId();

  // Application control
  void stop();
  void kill();
  void disconnect();

  /**
   * Application state enumeration; states flagged true are final.
   */
  public enum State {
    UNKNOWN(false),   // Application has not reported back yet
    CONNECTED(false), // Application has connected to the handle
    SUBMITTED(false), // Application has been submitted to the cluster
    RUNNING(false),   // Application is running
    FINISHED(true),   // Application finished with successful status
    FAILED(true),     // Application finished with failed status
    KILLED(true);     // Application was killed

    public boolean isFinal();
  }

  /**
   * Listener interface for handle state and information changes.
   */
  public interface Listener {
    void stateChanged(SparkAppHandle handle);
    void infoChanged(SparkAppHandle handle);
  }
}
```

Usage Examples:
```java
// Application monitoring with a listener
SparkAppHandle handle = new SparkLauncher()
    .setAppResource("/my/app.jar")
    .setMainClass("my.spark.app.Main")
    .startApplication(new SparkAppHandle.Listener() {
      @Override
      public void stateChanged(SparkAppHandle handle) {
        System.out.println("State changed to: " + handle.getState());
        if (handle.getState().isFinal()) {
          System.out.println("Application finished");
        }
      }

      @Override
      public void infoChanged(SparkAppHandle handle) {
        System.out.println("App ID: " + handle.getAppId());
      }
    });

// Application control: graceful stop
handle.stop();

// Application control: force kill
handle.kill();

// Disconnect from the application without stopping it
handle.disconnect();
```
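One common shutdown pattern, sketched here with an arbitrary 30-second deadline and 500 ms poll interval, is to request a graceful stop and fall back to kill() if the handle never reaches a final state:

```java
// Ask for a graceful stop, then force-kill if the app does not finish in time.
handle.stop();
long deadline = System.currentTimeMillis() + 30000;
while (!handle.getState().isFinal() && System.currentTimeMillis() < deadline) {
  Thread.sleep(500); // polling interval; caller handles InterruptedException
}
if (!handle.getState().isFinal()) {
  handle.kill();
}
```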
Both launch methods throw IOException when the child process cannot be created or when application startup fails. A running application may also end in the FAILED state, which can be detected through the SparkAppHandle.State enum.

```java
try {
  SparkAppHandle handle = new SparkLauncher()
      .setAppResource("/my/app.jar")
      .setMainClass("my.spark.app.Main")
      .setMaster("local")
      .startApplication();
} catch (IOException e) {
  System.err.println("Failed to start Spark application: " + e.getMessage());
}
```
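To detect the FAILED state, one option is to block until the handle reports a final state. A minimal sketch using a CountDownLatch; the resource path and main class are placeholders:

```java
import java.util.concurrent.CountDownLatch;

import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

public class WaitForFinalState {
  public static void main(String[] args) throws Exception {
    final CountDownLatch done = new CountDownLatch(1);

    SparkAppHandle handle = new SparkLauncher()
        .setAppResource("/my/app.jar")      // placeholder
        .setMainClass("my.spark.app.Main")  // placeholder
        .setMaster("local")
        .startApplication(new SparkAppHandle.Listener() {
          @Override
          public void stateChanged(SparkAppHandle h) {
            // Release the latch once the application reaches a final state.
            if (h.getState().isFinal()) {
              done.countDown();
            }
          }

          @Override
          public void infoChanged(SparkAppHandle h) {
            // No-op: only state transitions matter here.
          }
        });

    done.await();
    if (handle.getState() == SparkAppHandle.State.FAILED) {
      System.err.println("Spark application failed");
    }
  }
}
```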