YARN Shuffle Service for Apache Spark

Overview

The spark-network-yarn_2.11 package provides an external shuffle service for Apache Spark applications running on YARN clusters. The service runs as a long-running auxiliary service inside the YARN NodeManager process and manages shuffle data storage and retrieval independently of individual Spark executors. Because shuffle files are served by the NodeManager rather than by the executor that wrote them, they remain available after that executor exits, which improves fault tolerance and lets executors be released without losing shuffle output.

Package Information

  • Name: spark-network-yarn_2.11
  • Type: Java Library
  • Language: Java (Spark 1.6.x targets Java 7 or later)
  • Version: 1.6.3
  • Maven Coordinates: org.apache.spark:spark-network-yarn_2.11:1.6.3
  • License: Apache-2.0

Installation

Add to your Maven pom.xml:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-network-yarn_2.11</artifactId>
    <version>1.6.3</version>
</dependency>

Core Imports

import org.apache.spark.network.yarn.YarnShuffleService;
import org.apache.spark.network.yarn.util.HadoopConfigProvider;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.server.api.*;

Basic Usage

The YARN Shuffle Service is typically deployed and managed by YARN cluster administrators rather than instantiated by application code. For testing or custom deployments, it can be created and initialized directly:

// Create and configure the shuffle service
YarnShuffleService shuffleService = new YarnShuffleService();

// Initialize with Hadoop configuration
Configuration hadoopConfig = new Configuration();
hadoopConfig.setInt("spark.shuffle.service.port", 7337);
hadoopConfig.setBoolean("spark.authenticate", false);

shuffleService.init(hadoopConfig);

// Service lifecycle is managed by YARN framework
// Applications connect by setting spark.shuffle.service.enabled=true

Architecture

The service integrates with YARN's auxiliary service framework and Spark's network transport layer:

  • YarnShuffleService: Main service class extending YARN's AuxiliaryService
  • HadoopConfigProvider: Configuration bridge between Hadoop and Spark network layer
  • ExternalShuffleBlockHandler: Handles shuffle block requests (from spark-network-shuffle dependency)
  • SASL Authentication: Optional security layer for multi-tenant clusters

Capabilities

YARN Auxiliary Service Integration

The primary interface for YARN integration through the auxiliary service framework:

public class YarnShuffleService extends AuxiliaryService

Constructor:

public YarnShuffleService()

Creates a new shuffle service instance with service name "spark_shuffle".

Application Lifecycle Management:

public void initializeApplication(ApplicationInitializationContext context)
public void stopApplication(ApplicationTerminationContext context)

Container Lifecycle Management:

public void initializeContainer(ContainerInitializationContext context)
public void stopContainer(ContainerTerminationContext context)

Service Lifecycle:

protected void serviceInit(Configuration conf)
protected void serviceStop()
public ByteBuffer getMetaData()

Configuration Integration

Provides integration between Hadoop Configuration and Spark's network layer:

public class HadoopConfigProvider extends ConfigProvider

Constructor:

public HadoopConfigProvider(Configuration conf)

Creates a configuration provider that uses Hadoop Configuration as the backing store.

Configuration Access:

public String get(String name) throws NoSuchElementException

Retrieves configuration values by name, throwing NoSuchElementException if the key is not found.
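
The contract described above (a single lookup that throws when a key is missing, with defaulted and typed lookups layered on top) can be sketched without Hadoop on the classpath. This is a minimal stand-in rather than Spark's source: SketchConfigProvider and MapConfigProvider are hypothetical names, and a Map substitutes for Hadoop's Configuration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical stand-in for the provider contract: one abstract lookup,
// with defaulted and typed helpers built on top of it.
abstract class SketchConfigProvider {
    /** Returns the value for name, or throws NoSuchElementException if absent. */
    abstract String get(String name);

    /** Falls back to defaultValue when the key is missing. */
    String get(String name, String defaultValue) {
        try {
            return get(name);
        } catch (NoSuchElementException e) {
            return defaultValue;
        }
    }

    int getInt(String name, int defaultValue) {
        return Integer.parseInt(get(name, Integer.toString(defaultValue)));
    }

    boolean getBoolean(String name, boolean defaultValue) {
        return Boolean.parseBoolean(get(name, Boolean.toString(defaultValue)));
    }
}

// A Map plays the role that Hadoop's Configuration plays in HadoopConfigProvider.
class MapConfigProvider extends SketchConfigProvider {
    private final Map<String, String> conf;

    MapConfigProvider(Map<String, String> conf) {
        this.conf = conf;
    }

    @Override
    String get(String name) {
        String value = conf.get(name);
        if (value == null) {
            throw new NoSuchElementException(name);
        }
        return value;
    }
}

class ConfigProviderDemo {
    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("spark.shuffle.service.port", "7337");
        SketchConfigProvider provider = new MapConfigProvider(conf);
        System.out.println(provider.getInt("spark.shuffle.service.port", 0));
        System.out.println(provider.getBoolean("spark.authenticate", false));
    }
}
```

Keeping the throwing lookup as the only abstract method means a new backing store only has to implement one method; the defaulted variants come for free.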

Service Configuration

Key Configuration Properties:

  • spark.shuffle.service.port (default: 7337): Port for the shuffle service
  • spark.authenticate (default: false): Enable SASL authentication
  • yarn.nodemanager.local-dirs: Local directories for executor state persistence

Example Configuration:

Configuration conf = new Configuration();
conf.setInt("spark.shuffle.service.port", 7337);
conf.setBoolean("spark.authenticate", true);

// For multi-directory setups
conf.set("yarn.nodemanager.local-dirs", "/tmp/yarn-local-1,/tmp/yarn-local-2");

Authentication and Security

When authentication is enabled, the service integrates with Spark's SASL authentication:

// Authentication is configured via YARN configuration
conf.setBoolean("spark.authenticate", true);

// Applications must also set spark.authenticate independently
// and provide shuffle secrets during application initialization
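
The exchange described in the comments above (an application hands the service a secret when it starts, and the service forgets it when the application stops) can be pictured as a per-application registry. The sketch below uses only the standard library: AppSecretRegistry and its methods are hypothetical names, and decoding the payload as UTF-8 is an assumption rather than a documented detail of the service.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical per-application secret store, keyed by application ID.
class AppSecretRegistry {
    private final ConcurrentMap<String, String> secrets = new ConcurrentHashMap<>();

    /** Called when an application starts; secretBytes is the payload it supplied. */
    void registerApp(String appId, ByteBuffer secretBytes) {
        // duplicate() leaves the caller's buffer position untouched.
        String secret = StandardCharsets.UTF_8.decode(secretBytes.duplicate()).toString();
        secrets.put(appId, secret);
    }

    /** Called when an application stops; its secret is no longer accepted. */
    void unregisterApp(String appId) {
        secrets.remove(appId);
    }

    /** Returns the secret for appId, or null if the application is unknown. */
    String getSecret(String appId) {
        return secrets.get(appId);
    }
}

class AppSecretRegistryDemo {
    public static void main(String[] args) {
        AppSecretRegistry registry = new AppSecretRegistry();
        ByteBuffer payload = ByteBuffer.wrap("s3cret".getBytes(StandardCharsets.UTF_8));
        registry.registerApp("application_0001", payload);
        System.out.println(registry.getSecret("application_0001") != null);
        registry.unregisterApp("application_0001");
        System.out.println(registry.getSecret("application_0001") == null);
    }
}
```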

Executor State Persistence

The service automatically persists executor registration information to survive NodeManager restarts:

// State is persisted to registeredExecutors.ldb in local directories
// Recovery happens automatically during service initialization
// No direct API for state management - handled internally
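
To make the persistence behavior more concrete, here is a rough sketch of how a state file could be located across the directories listed in yarn.nodemanager.local-dirs: reuse an existing registeredExecutors.ldb if one is found, otherwise fall back to the first directory. The class name, method name, and fallback rule are assumptions for illustration; the service's internal lookup may differ.

```java
import java.io.File;

// Hypothetical helper; not part of the spark-network-yarn API.
class StateFileLocator {
    static final String STATE_FILE_NAME = "registeredExecutors.ldb";

    /**
     * Returns an existing state file from any of the local dirs, or a path in
     * the first dir if none exists yet (assumed fallback rule).
     */
    static File locate(String[] localDirs) {
        if (localDirs == null || localDirs.length == 0) {
            throw new IllegalArgumentException("at least one local dir is required");
        }
        for (String dir : localDirs) {
            File candidate = new File(dir, STATE_FILE_NAME);
            if (candidate.exists()) {
                return candidate; // state left behind by an earlier run
            }
        }
        return new File(localDirs[0], STATE_FILE_NAME);
    }

    public static void main(String[] args) {
        String[] dirs = { "/tmp/yarn-local-1", "/tmp/yarn-local-2" };
        System.out.println(locate(dirs).getPath());
    }
}
```

Checking every directory before choosing a new location matters on multi-disk nodes, where the order of local directories can change between restarts.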

Testing and Debugging

Testing Support:

// Static fields available for testing (marked @VisibleForTesting)
public static int boundPort                  // Actual bound port
public static YarnShuffleService instance    // Service instance

// Package-visible testing access
ExternalShuffleBlockHandler blockHandler     // Block handler instance
File registeredExecutorFile                  // State persistence file

Error Handling

The service handles various error conditions gracefully:

  • Initialization Errors: Service continues startup even if shuffle handler initialization fails
  • Application Errors: Application lifecycle errors are logged but don't affect other applications
  • State Corruption: Corrupt executor state files are detected and recovered automatically
  • Network Errors: Handled by the underlying Spark transport layer
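
The "logged but don't affect other applications" behavior in the list above amounts to catching failures at the boundary of each application callback. A minimal sketch of that pattern follows; IsolatedLifecycle is a hypothetical name, and an in-memory list stands in for a real logger.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper showing log-and-continue isolation per application.
class IsolatedLifecycle {
    // Collected messages stand in for a real logger.
    final List<String> errorLog = new ArrayList<>();

    /** Runs one application's callback; a failure is recorded, never rethrown. */
    void runIsolated(String appId, Runnable callback) {
        try {
            callback.run();
        } catch (Exception e) {
            errorLog.add("Exception when handling application " + appId + ": " + e.getMessage());
        }
    }

    public static void main(String[] args) {
        IsolatedLifecycle lifecycle = new IsolatedLifecycle();
        lifecycle.runIsolated("application_0001", new Runnable() {
            @Override
            public void run() {
                throw new IllegalStateException("bad secret payload");
            }
        });
        // The second application is still processed after the first one failed.
        lifecycle.runIsolated("application_0002", new Runnable() {
            @Override
            public void run() {
                System.out.println("application_0002 initialized");
            }
        });
        System.out.println(lifecycle.errorLog.size());
    }
}
```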

Deployment Integration

YARN Configuration

Add to yarn-site.xml:

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>spark_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
    <value>org.apache.spark.network.yarn.YarnShuffleService</value>
  </property>
  <property>
    <name>spark.shuffle.service.port</name>
    <value>7337</value>
  </property>
</configuration>

Spark Application Configuration

// In Spark application configuration
sparkConf.set("spark.shuffle.service.enabled", "true");
// Port is automatically discovered from YARN configuration
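
One way to picture the discovery mentioned in the comment above: the client consults more than one configuration source before falling back to the default of 7337. The sketch below assumes a YARN-side value wins over a Spark-side value when both are present. That precedence, along with the ShufflePortResolver name, is an assumption for illustration and should be checked against the Spark version in use.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical resolver; Maps stand in for the Spark and YARN configurations.
class ShufflePortResolver {
    static final String KEY = "spark.shuffle.service.port";
    static final int DEFAULT_PORT = 7337;

    /** Assumed lookup order: YARN configuration, then Spark configuration, then default. */
    static int resolve(Map<String, String> sparkConf, Map<String, String> yarnConf) {
        String value = yarnConf.get(KEY);
        if (value == null) {
            value = sparkConf.get(KEY);
        }
        return value == null ? DEFAULT_PORT : Integer.parseInt(value.trim());
    }

    public static void main(String[] args) {
        Map<String, String> sparkConf = new HashMap<>();
        Map<String, String> yarnConf = new HashMap<>();
        yarnConf.put(KEY, "7447");
        System.out.println(resolve(sparkConf, yarnConf));
    }
}
```

Whatever the lookup order, the value the application resolves has to match the port the NodeManager-side service actually bound, so setting the property in one shared place is the least error-prone option.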

Compatibility

  • Spark Version: 1.6.3
  • Hadoop/YARN: Compatible with Hadoop 2.x YARN clusters
  • Java: Requires Java 7 or later
  • Scala: This artifact is built for Scala 2.11; Spark also publishes a separate spark-network-yarn_2.10 artifact for Scala 2.10