# Apache Spark Launcher
Apache Spark Launcher provides a programmatic API for launching and monitoring Spark applications from Java code. It offers two launch modes: child-process execution with full monitoring capabilities, and in-process execution intended for cluster deploy mode. The library manages the Spark application lifecycle (configuration, execution, and state monitoring) and exposes control interfaces for running applications.
## Package Information

- **Package Name**: org.apache.spark:spark-launcher_2.11
- **Package Type**: Maven (Java)
- **Language**: Java
- **Installation**: Add to Maven dependencies:

```xml
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-launcher_2.11</artifactId>
    <version>2.4.8</version>
</dependency>
```

## Core Imports

```java
import org.apache.spark.launcher.SparkLauncher;
import org.apache.spark.launcher.InProcessLauncher;
import org.apache.spark.launcher.SparkAppHandle;
```

## Basic Usage

### Child Process Launch with Monitoring

```java
import org.apache.spark.launcher.SparkLauncher;
import org.apache.spark.launcher.SparkAppHandle;

// Configure and launch the Spark application as a child process
SparkAppHandle handle = new SparkLauncher()
    .setAppResource("/path/to/my-app.jar")
    .setMainClass("com.example.MySparkApp")
    .setMaster("local[*]")
    .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
    .setConf(SparkLauncher.EXECUTOR_MEMORY, "1g")
    .setAppName("My Spark Application")
    .startApplication();

// Monitor application state
handle.addListener(new SparkAppHandle.Listener() {
    @Override
    public void stateChanged(SparkAppHandle handle) {
        System.out.println("State: " + handle.getState());
        if (handle.getState().isFinal()) {
            System.out.println("Application finished with ID: " + handle.getAppId());
        }
    }

    @Override
    public void infoChanged(SparkAppHandle handle) {
        System.out.println("Info updated for app: " + handle.getAppId());
    }
});

// Stop or kill the application if needed
if (handle.getState() == SparkAppHandle.State.RUNNING) {
    handle.stop();   // Graceful shutdown
    // handle.kill(); // Force kill if stop() is not honored
}
```

### Raw Process Launch

```java
import org.apache.spark.launcher.SparkLauncher;

// Launch as a raw process (no handle; manual process management required)
Process sparkProcess = new SparkLauncher()
    .setAppResource("/path/to/my-app.jar")
    .setMainClass("com.example.MySparkApp")
    .setMaster("yarn")
    .setDeployMode("cluster")
    .setConf(SparkLauncher.DRIVER_MEMORY, "4g")
    .launch();

// Manual process management (waitFor() throws InterruptedException)
int exitCode = sparkProcess.waitFor();
System.out.println("Spark application exited with code: " + exitCode);
```

### In-Process Launch (Cluster Mode)

```java
import org.apache.spark.launcher.InProcessLauncher;
import org.apache.spark.launcher.SparkAppHandle;

// Launch application in same JVM (cluster mode recommended)
SparkAppHandle handle = new InProcessLauncher()
    .setAppResource("/path/to/my-app.jar")
    .setMainClass("com.example.MySparkApp")
    .setMaster("yarn")
    .setDeployMode("cluster")
    .setConf("spark.sql.adaptive.enabled", "true")
    .startApplication();
```

## Architecture

The Spark Launcher library is built around several key components:

- **Launcher Classes**: `SparkLauncher` and `InProcessLauncher` provide fluent configuration APIs for different launch modes
- **Abstract Base**: `AbstractLauncher` provides common configuration methods shared by both launcher implementations
- **Handle Interface**: `SparkAppHandle` provides runtime application control and monitoring with state-based lifecycle management
- **State Management**: Comprehensive state tracking through `SparkAppHandle.State` enum with final state detection
- **Event System**: Listener-based callbacks for real-time application state and information updates
- **Configuration System**: Extensive configuration options through constants and fluent methods
- **Process Management**: Robust child process handling with output redirection and logging capabilities
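
The output-redirection capabilities mentioned above can be sketched as follows; the paths and class name are illustrative placeholders, not part of the library:

```java
import java.io.File;
import org.apache.spark.launcher.SparkLauncher;

public class OutputRedirectionExample {
    public static void main(String[] args) throws Exception {
        // Merge the child process's stderr into stdout, then redirect
        // stdout to a log file; all paths here are placeholders.
        Process spark = new SparkLauncher()
            .setAppResource("/path/to/my-app.jar")
            .setMainClass("com.example.MySparkApp")
            .setMaster("local[*]")
            .redirectError()                               // merge stderr into stdout
            .redirectOutput(new File("/tmp/spark-app.out")) // stdout -> file
            .launch();
        System.exit(spark.waitFor());
    }
}
```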

## Capabilities

### Application Launchers

Primary interfaces for launching Spark applications with comprehensive configuration options. Supports both child process and in-process execution modes.

```java { .api }
// Child process launcher with monitoring
public class SparkLauncher extends AbstractLauncher<SparkLauncher> {
    public SparkLauncher();
    public SparkLauncher(Map<String, String> env);
    public SparkAppHandle startApplication(SparkAppHandle.Listener... listeners);
    public Process launch();
}

// In-process launcher (cluster mode recommended)
public class InProcessLauncher extends AbstractLauncher<InProcessLauncher> {
    public SparkAppHandle startApplication(SparkAppHandle.Listener... listeners);
}
```

[Application Launchers](./launchers.md)

### Application Handles

Runtime control and monitoring interface for launched Spark applications. Provides state tracking, application control, and event notifications.

```java { .api }
public interface SparkAppHandle {
    void addListener(Listener l);
    State getState();
    String getAppId();
    void stop();
    void kill();
    void disconnect();

    enum State {
        UNKNOWN(false), CONNECTED(false), SUBMITTED(false), RUNNING(false),
        FINISHED(true), FAILED(true), KILLED(true), LOST(true);

        public boolean isFinal();
    }

    interface Listener {
        void stateChanged(SparkAppHandle handle);
        void infoChanged(SparkAppHandle handle);
    }
}
```
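
A common control pattern is to escalate from `stop()` to `kill()` when an application does not exit within a grace period. This is a minimal sketch, assuming a handle obtained from `startApplication()`; the fixed sleep is illustrative only:

```java
import java.util.concurrent.TimeUnit;
import org.apache.spark.launcher.SparkAppHandle;

public final class ShutdownHelper {
    private ShutdownHelper() {}

    // Ask the application to exit gracefully, then force-kill it if it
    // has not reached a final state within the grace period.
    public static void shutdown(SparkAppHandle handle, long graceSeconds)
            throws InterruptedException {
        handle.stop();                         // request a graceful exit
        TimeUnit.SECONDS.sleep(graceSeconds);  // simple illustrative wait
        if (!handle.getState().isFinal()) {
            handle.kill();                     // terminate forcibly
        }
    }
}
```

By contrast, `disconnect()` only detaches the handle from the launcher server and leaves the running application untouched.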

[Application Handles](./application-handles.md)

### Configuration Management

Comprehensive configuration system with predefined constants for common Spark settings and fluent configuration methods.

```java { .api }
public abstract class AbstractLauncher<T extends AbstractLauncher<T>> {
    public T setPropertiesFile(String path);
    public T setConf(String key, String value);
    public T setAppName(String appName);
    public T setMaster(String master);
    public T setDeployMode(String mode);
    public T setAppResource(String resource);
    public T setMainClass(String mainClass);
    public T addJar(String jar);
    public T addFile(String file);
    public T addPyFile(String file);
    public T addAppArgs(String... args);
    public T setVerbose(boolean verbose);
}

// Configuration constants in SparkLauncher
public static final String DRIVER_MEMORY = "spark.driver.memory";
public static final String EXECUTOR_MEMORY = "spark.executor.memory";
public static final String EXECUTOR_CORES = "spark.executor.cores";
// ... additional constants
```
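
The fluent methods and constants compose naturally. The following sketch is purely illustrative (the paths, class name, property file, and argument values are hypothetical):

```java
import org.apache.spark.launcher.SparkLauncher;

public class ConfigurationExample {
    public static void main(String[] args) throws Exception {
        // Combine a properties file, typed constants, extra jars, and
        // application arguments; every value below is a placeholder.
        SparkLauncher launcher = new SparkLauncher()
            .setAppResource("/path/to/etl-job.jar")
            .setMainClass("com.example.EtlJob")
            .setMaster("yarn")
            .setPropertiesFile("/etc/spark/job.properties")
            .setConf(SparkLauncher.EXECUTOR_CORES, "4")
            .setConf(SparkLauncher.EXECUTOR_MEMORY, "8g")
            .addJar("/path/to/dependency.jar")
            .addAppArgs("--date", "2021-01-01")
            .setVerbose(true);
        launcher.startApplication();
    }
}
```

As with `spark-submit`, values passed through `setConf()` are expected to take precedence over those loaded from the properties file.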

[Configuration Management](./configuration.md)

## Common Use Cases

### Batch Job Orchestration
Use `SparkLauncher` with monitoring to manage batch processing pipelines, track job completion, and handle failures gracefully.
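
One way to block an orchestration thread until a job completes is to bridge the listener callback with a latch. This is a minimal sketch, assuming a handle obtained from `startApplication()`:

```java
import java.util.concurrent.CountDownLatch;
import org.apache.spark.launcher.SparkAppHandle;

public final class BatchWait {
    private BatchWait() {}

    // Block until the application reaches a final state
    // (FINISHED, FAILED, KILLED, or LOST), then return it.
    public static SparkAppHandle.State awaitFinalState(SparkAppHandle handle)
            throws InterruptedException {
        CountDownLatch done = new CountDownLatch(1);
        handle.addListener(new SparkAppHandle.Listener() {
            @Override
            public void stateChanged(SparkAppHandle h) {
                if (h.getState().isFinal()) {
                    done.countDown();
                }
            }

            @Override
            public void infoChanged(SparkAppHandle h) { /* not needed here */ }
        });
        // Guard against the app finishing before the listener was registered
        if (!handle.getState().isFinal()) {
            done.await();
        }
        return handle.getState();
    }
}
```

A caller can then branch on the returned state, for example retrying the job when it comes back as `FAILED`.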
### Interactive Application Management
Leverage `SparkAppHandle` state notifications to build interactive dashboards that display real-time Spark application status.
### Cluster Resource Management
Deploy applications to YARN, Mesos, or Kubernetes clusters using cluster mode with proper resource allocation through configuration constants.
### Development and Testing
Use local mode execution for development and testing with simplified configuration and immediate feedback.

## Environment Requirements

- **Spark Installation**: Child process launches require the `SPARK_HOME` environment variable or an explicit `setSparkHome()` configuration
- **Java Runtime**: A custom `JAVA_HOME` can be set via the `setJavaHome()` method
- **Classpath**: In-process launches require Spark dependencies on the application classpath
- **Cluster Integration**: Supports YARN, Mesos, Kubernetes, and Standalone cluster managers
- **Platform Support**: Cross-platform, with Windows-specific command handling
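
When the environment variables are not set, the locations can be supplied programmatically. A sketch with illustrative installation paths:

```java
import org.apache.spark.launcher.SparkLauncher;

public class EnvironmentExample {
    public static void main(String[] args) throws Exception {
        // Point the launcher at explicit Spark and Java installations
        // instead of relying on SPARK_HOME/JAVA_HOME; paths are placeholders.
        Process spark = new SparkLauncher()
            .setSparkHome("/opt/spark-2.4.8")
            .setJavaHome("/usr/lib/jvm/java-8-openjdk")
            .setAppResource("/path/to/my-app.jar")
            .setMainClass("com.example.MySparkApp")
            .setMaster("local[*]")
            .launch();
        spark.waitFor();
    }
}
```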

## Error Handling

The library provides multiple layers of error handling:
- **Configuration Validation**: Parameter validation with descriptive error messages
- **Launch Failures**: `IOException` handling for process creation failures
- **Runtime Monitoring**: State-based error detection through `SparkAppHandle.State.FAILED`
- **Connection Issues**: Timeout handling for launcher server communication
- **Process Management**: Robust child process lifecycle management with cleanup
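
These layers can be combined in calling code. A minimal sketch, with placeholder paths and class names:

```java
import java.io.IOException;
import org.apache.spark.launcher.SparkAppHandle;
import org.apache.spark.launcher.SparkLauncher;

public class ErrorHandlingExample {
    public static void main(String[] args) {
        try {
            // Launch failures surface as IOException; runtime failures
            // surface as the FAILED state on the handle.
            new SparkLauncher()
                .setAppResource("/path/to/my-app.jar")
                .setMainClass("com.example.MySparkApp")
                .setMaster("local[*]")
                .startApplication(new SparkAppHandle.Listener() {
                    @Override
                    public void stateChanged(SparkAppHandle h) {
                        if (h.getState() == SparkAppHandle.State.FAILED) {
                            System.err.println("Application failed: " + h.getAppId());
                        }
                    }

                    @Override
                    public void infoChanged(SparkAppHandle h) {}
                });
        } catch (IOException e) {
            // e.g. missing SPARK_HOME or the child process could not start
            System.err.println("Failed to launch Spark: " + e.getMessage());
        }
    }
}
```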