The spark-network-yarn_2.11 package provides an external shuffle service for Apache Spark applications running on YARN clusters. This service runs as a long-running auxiliary service within the YARN NodeManager process, enabling efficient shuffle operations by managing shuffle data storage and retrieval independently of individual Spark executors. The service improves resource utilization and fault tolerance by decoupling shuffle data management from compute resources.
Package: spark-network-yarn_2.11
Maven coordinates: org.apache.spark:spark-network-yarn_2.11:1.6.3
Add to your Maven pom.xml:
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-network-yarn_2.11</artifactId>
  <version>1.6.3</version>
</dependency>

import org.apache.spark.network.yarn.YarnShuffleService;
import org.apache.spark.network.yarn.util.HadoopConfigProvider;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.server.api.*;

The YARN Shuffle Service is typically deployed and managed by YARN cluster administrators rather than being instantiated directly by application code. However, for testing or custom deployments:
// Create and configure the shuffle service
YarnShuffleService shuffleService = new YarnShuffleService();
// Initialize with Hadoop configuration
Configuration hadoopConfig = new Configuration();
hadoopConfig.setInt("spark.shuffle.service.port", 7337);
hadoopConfig.setBoolean("spark.authenticate", false);
shuffleService.init(hadoopConfig);
// Service lifecycle is managed by YARN framework
// Applications connect by setting spark.shuffle.service.enabled=true
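When the service is embedded outside of a NodeManager, for example in a test harness, it should also be stopped explicitly so the shuffle port and local resources are released. A minimal sketch, continuing from the snippet above:

// Under YARN the NodeManager drives this call as part of the auxiliary
// service lifecycle; a standalone harness must do it itself.
shuffleService.stop();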
The service integrates with YARN's auxiliary service framework and Spark's network transport layer.

The primary interface for YARN integration through the auxiliary service framework:
public class YarnShuffleService extends AuxiliaryService { .api }

Constructor:
public YarnShuffleService() { .api }

Creates a new shuffle service instance with service name "spark_shuffle".
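A minimal sketch showing that the name reported by the constructed service is the same key used for the aux-service entries in yarn-site.xml (see the deployment configuration below):

YarnShuffleService service = new YarnShuffleService();
// getName() is inherited from AuxiliaryService/AbstractService
String name = service.getName(); // "spark_shuffle"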
Application Lifecycle Management:
public void initializeApplication(ApplicationInitializationContext context) { .api }
public void stopApplication(ApplicationTerminationContext context) { .api }
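A hedged sketch of how a test harness (standing in for the NodeManager) might drive these two calls; the user name, ApplicationId values, and empty payload are illustrative placeholders, and the shuffleService instance is reused from the earlier snippet:

import java.nio.ByteBuffer;
import org.apache.hadoop.yarn.api.records.ApplicationId;

// Placeholder application identity for the sketch
ApplicationId appId = ApplicationId.newInstance(System.currentTimeMillis(), 1);
// With spark.authenticate=false the payload is unused; with authentication
// enabled, Spark serializes the per-application shuffle secret into it.
ByteBuffer appData = ByteBuffer.allocate(0);

shuffleService.initializeApplication(
    new ApplicationInitializationContext("spark-user", appId, appData));

// ... executors register and fetch shuffle blocks while the application runs ...

shuffleService.stopApplication(new ApplicationTerminationContext(appId));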
Container Lifecycle Management:

public void initializeContainer(ContainerInitializationContext context) { .api }
public void stopContainer(ContainerTerminationContext context) { .api }

Service Lifecycle:
protected void serviceInit(Configuration conf) { .api }
protected void serviceStop() { .api }
public ByteBuffer getMetaData() { .api }

Provides integration between Hadoop Configuration and Spark's network layer:
public class HadoopConfigProvider extends ConfigProvider { .api }

Constructor:
public HadoopConfigProvider(Configuration conf) { .api }

Creates a configuration provider that uses Hadoop Configuration as the backing store.
Configuration Access:
public String get(String name) throws NoSuchElementException { .api }

Retrieves configuration values by name, throwing NoSuchElementException if the key is not found.
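A short sketch of the provider in isolation; the missing key below is a hypothetical name used only to show the failure mode described above:

import java.util.NoSuchElementException;
import org.apache.hadoop.conf.Configuration;
import org.apache.spark.network.yarn.util.HadoopConfigProvider;

Configuration conf = new Configuration();
conf.set("spark.shuffle.service.port", "7337");

HadoopConfigProvider provider = new HadoopConfigProvider(conf);
String port = provider.get("spark.shuffle.service.port"); // "7337"

try {
  provider.get("spark.example.missing.key"); // hypothetical key, never set above
} catch (NoSuchElementException e) {
  // Missing keys surface as NoSuchElementException rather than null
}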
Key Configuration Properties:
- spark.shuffle.service.port (default: 7337): Port for the shuffle service
- spark.authenticate (default: false): Enable SASL authentication
- yarn.nodemanager.local-dirs: Local directories for executor state persistence

Example Configuration:
Configuration conf = new Configuration();
conf.setInt("spark.shuffle.service.port", 7337);
conf.setBoolean("spark.authenticate", true);
// For multi-directory setups
conf.set("yarn.nodemanager.local-dirs", "/tmp/yarn-local-1,/tmp/yarn-local-2");When authentication is enabled, the service integrates with Spark's SASL authentication:
// Authentication is configured via YARN configuration
conf.setBoolean("spark.authenticate", true);
// Applications must also set spark.authenticate independently
// and provide shuffle secrets during application initialization
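A hedged sketch of the application-side settings referenced above; on YARN, Spark generates the per-application secret and hands it to the shuffle service during application initialization, so the application typically only enables the flags:

import org.apache.spark.SparkConf;

SparkConf sparkConf = new SparkConf()
    .set("spark.authenticate", "true")
    .set("spark.shuffle.service.enabled", "true");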
The service automatically persists executor registration information to survive NodeManager restarts:

// State is persisted to registeredExecutors.ldb in local directories
// Recovery happens automatically during service initialization
// No direct API for state management - handled internally

Testing Support:
// Static fields available for testing (marked @VisibleForTesting)
public static int boundPort { .api } // Actual bound port
public static YarnShuffleService instance { .api } // Service instance
// Package-visible testing access
ExternalShuffleBlockHandler blockHandler { .api } // Block handler instance
File registeredExecutorFile { .api } // State persistence file
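A hedged test-style sketch of reading these hooks; it assumes YARN (or a test harness) has already initialized the service:

// Static handles are populated once the auxiliary service has been initialized
YarnShuffleService running = YarnShuffleService.instance;
int port = YarnShuffleService.boundPort; // actual listening port, 7337 by default

// blockHandler and registeredExecutorFile are package-visible, so code that
// inspects them must live in the org.apache.spark.network.yarn package.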
The service handles various error conditions gracefully.

Add to yarn-site.xml:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>spark_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
    <value>org.apache.spark.network.yarn.YarnShuffleService</value>
  </property>
  <property>
    <name>spark.shuffle.service.port</name>
    <value>7337</value>
  </property>
</configuration>

// In Spark application configuration
sparkConf.set("spark.shuffle.service.enabled", "true");
// If the shuffle service listens on a non-default port, also set
// spark.shuffle.service.port in the application configuration to match
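A minimal application-side sketch tying the two sides together; it assumes the NodeManagers run the shuffle service configured in yarn-site.xml above, and the application name is a hypothetical placeholder:

import org.apache.spark.SparkConf;

SparkConf sparkConf = new SparkConf()
    .setAppName("external-shuffle-example") // hypothetical application name
    .set("spark.shuffle.service.enabled", "true")
    // Only needed when the service listens on a non-default port
    .set("spark.shuffle.service.port", "7337");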