Authentication and Security

SASL-based authentication system for securing shuffle operations between clients and external shuffle services.

Capabilities

ShuffleSecretManager

Manages shuffle secrets for external shuffle service authentication.

/**
 * Manages shuffle secrets for external shuffle service authentication
 */
public class ShuffleSecretManager implements SecretKeyHolder {
    /**
     * Default SASL user name for Spark shuffle operations
     */
    private static final String SPARK_SASL_USER = "sparkSaslUser";
    
    /**
     * Create a new shuffle secret manager
     */
    public ShuffleSecretManager();
    
    /**
     * Register an application with its shuffle secret
     * @param appId - Application ID to register
     * @param shuffleSecret - Secret key for the application as string
     */
    public void registerApp(String appId, String shuffleSecret);
    
    /**
     * Register an application with its shuffle secret
     * @param appId - Application ID to register
     * @param shuffleSecret - Secret key for the application as ByteBuffer
     */
    public void registerApp(String appId, ByteBuffer shuffleSecret);
    
    /**
     * Unregister an application and remove its secret
     * @param appId - Application ID to unregister
     */
    public void unregisterApp(String appId);
    
    /**
     * Get the SASL user name for an application
     * @param appId - Application ID
     * @return SASL user name (typically SPARK_SASL_USER)
     */
    @Override
    public String getSaslUser(String appId);
    
    /**
     * Get the secret key for an application
     * @param appId - Application ID
     * @return Secret key as string, or null if not registered
     */
    @Override
    public String getSecretKey(String appId);
}
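
To make the contract above concrete, the following is a minimal in-memory sketch of a holder with the same method shapes. It is not Spark's implementation: the class name `InMemorySecretHolder` is hypothetical, and both the overwrite-on-re-register behaviour and the UTF-8 decoding of the `ByteBuffer` variant are assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for illustration only; not Spark's ShuffleSecretManager.
final class InMemorySecretHolder {
    private static final String SASL_USER = "sparkSaslUser";

    // Thread-safe map from application ID to its shared secret.
    private final ConcurrentHashMap<String, String> secrets = new ConcurrentHashMap<>();

    void registerApp(String appId, String secret) {
        // Assumption: a later registration replaces an earlier one.
        secrets.put(appId, secret);
    }

    void registerApp(String appId, ByteBuffer secret) {
        // duplicate() leaves the caller's buffer position untouched.
        // Assumption: the bytes are UTF-8 text.
        registerApp(appId, StandardCharsets.UTF_8.decode(secret.duplicate()).toString());
    }

    void unregisterApp(String appId) {
        secrets.remove(appId);
    }

    String getSaslUser(String appId) {
        // The same user name is returned for every application.
        return SASL_USER;
    }

    String getSecretKey(String appId) {
        // Returns null when the application is not registered.
        return secrets.get(appId);
    }
}
```

The point of the sketch is the lookup semantics: a secret is visible only between `registerApp` and `unregisterApp`, and an unknown application yields `null` rather than an exception.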

Usage Examples:

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Collections;

import org.apache.spark.network.sasl.ShuffleSecretManager;
import org.apache.spark.network.shuffle.ExternalShuffleClient;
import org.apache.spark.network.util.MapConfigProvider;
import org.apache.spark.network.util.TransportConf;

// Create shuffle secret manager
ShuffleSecretManager secretManager = new ShuffleSecretManager();

// Register applications with their secrets
String appId1 = "app-20231201-001";
String appId2 = "app-20231201-002";
String secret1 = "mySecretKey123";
String secret2 = "anotherSecretKey456";

secretManager.registerApp(appId1, secret1);
secretManager.registerApp(appId2, secret2);

// Verify registration (print only the status, never the secret itself)
String retrievedSecret = secretManager.getSecretKey(appId1);
System.out.println("Retrieved secret for " + appId1 + ": " + (retrievedSecret != null ? "OK" : "MISSING"));

String saslUser = secretManager.getSaslUser(appId1);
System.out.println("SASL user for " + appId1 + ": " + saslUser);

// Use with external shuffle client for authenticated connections.
// TransportConf takes a ConfigProvider as its second argument; an empty
// MapConfigProvider falls back to default values (class availability may
// vary by Spark version).
TransportConf conf = new TransportConf("shuffle", new MapConfigProvider(Collections.emptyMap()));
ExternalShuffleClient authenticatedClient = new ExternalShuffleClient(
    conf, secretManager, true, 10000  // authEnabled = true, registrationTimeoutMs = 10000
);

// Register ByteBuffer secret (alternative method); name the charset explicitly
ByteBuffer secretBuffer = ByteBuffer.wrap("bufferSecret789".getBytes(StandardCharsets.UTF_8));
String appId3 = "app-20231201-003";
secretManager.registerApp(appId3, secretBuffer);

// Clean up - unregister applications when done
secretManager.unregisterApp(appId1);
secretManager.unregisterApp(appId2);
secretManager.unregisterApp(appId3);

// Verify cleanup
String cleanedSecret = secretManager.getSecretKey(appId1);
System.out.println("Secret after cleanup: " + (cleanedSecret == null ? "REMOVED" : "STILL PRESENT"));

Authentication Flow

The SASL authentication flow between shuffle clients and servers works as follows:

  1. Secret Registration: Applications register their secrets with ShuffleSecretManager
  2. Client Creation: ExternalShuffleClient is created with authentication enabled
  3. Connection Establishment: Client attempts to connect to shuffle server
  4. SASL Handshake: Client and server perform SASL authentication using shared secret
  5. Authenticated Communication: All subsequent shuffle operations are authenticated
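
Steps 3 and 4 can be sketched with the JDK's own `javax.security.sasl` API, without any Spark classes. The sketch assumes the DIGEST-MD5 mechanism, which Spark's SASL layer is generally understood to use, although the text above does not name it. The class name `SaslHandshakeSketch`, the protocol string `"shuffle"` and the realm `"default"` are illustrative choices, and the tokens are passed in memory rather than over a network connection.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.auth.callback.UnsupportedCallbackException;
import javax.security.sasl.AuthorizeCallback;
import javax.security.sasl.RealmCallback;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;
import javax.security.sasl.SaslException;
import javax.security.sasl.SaslServer;

// Illustrative only: runs both ends of a DIGEST-MD5 exchange in one process.
final class SaslHandshakeSketch {
    private static final String MECHANISM = "DIGEST-MD5";
    private static final String PROTOCOL = "shuffle"; // hypothetical protocol label
    private static final String REALM = "default";    // hypothetical server name / realm

    // Supplies the user name and the locally known secret to the SASL library.
    private static CallbackHandler handler(String user, String secret) {
        return callbacks -> {
            for (Callback cb : callbacks) {
                if (cb instanceof NameCallback) {
                    ((NameCallback) cb).setName(user);
                } else if (cb instanceof PasswordCallback) {
                    ((PasswordCallback) cb).setPassword(secret.toCharArray());
                } else if (cb instanceof RealmCallback) {
                    RealmCallback rc = (RealmCallback) cb;
                    rc.setText(rc.getDefaultText());
                } else if (cb instanceof AuthorizeCallback) {
                    // Server side only: accept when the authenticated and requested IDs match.
                    AuthorizeCallback ac = (AuthorizeCallback) cb;
                    boolean ok = ac.getAuthenticationID().equals(ac.getAuthorizationID());
                    ac.setAuthorized(ok);
                    if (ok) {
                        ac.setAuthorizedID(ac.getAuthorizationID());
                    }
                } else {
                    throw new UnsupportedCallbackException(cb);
                }
            }
        };
    }

    // Returns true only if both sides complete the exchange with the secrets given.
    static boolean handshake(String user, String clientSecret, String serverSecret) {
        Map<String, String> props = new HashMap<>();
        props.put(Sasl.QOP, "auth"); // authentication only, no integrity or privacy layer
        try {
            SaslServer server = Sasl.createSaslServer(
                MECHANISM, PROTOCOL, REALM, props, handler(user, serverSecret));
            SaslClient client = Sasl.createSaslClient(
                new String[] {MECHANISM}, null, PROTOCOL, REALM, props, handler(user, clientSecret));
            if (server == null || client == null) {
                return false; // mechanism not available in this JVM
            }
            byte[] challenge = server.evaluateResponse(new byte[0]); // server issues a challenge
            byte[] response = client.evaluateChallenge(challenge);   // client proves it knows the secret
            byte[] serverProof = server.evaluateResponse(response);  // throws if the secrets differ
            client.evaluateChallenge(serverProof);                   // client verifies the server's proof
            boolean complete = client.isComplete() && server.isComplete();
            client.dispose();
            server.dispose();
            return complete;
        } catch (SaslException e) {
            return false;
        }
    }
}
```

Calling `SaslHandshakeSketch.handshake("sparkSaslUser", "s", "s")` exercises the success path, while passing two different secrets exercises the failure path. The secret itself never crosses the exchange; only challenge and response tokens derived from it do.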

Security Best Practices

  1. Secret Management:

    • Use unique, randomly generated secrets for each application
    • Rotate secrets regularly in production environments
    • Never log or expose secrets in plain text
  2. Authentication Configuration:

    • Always enable authentication in production deployments
    • Use strong secrets with sufficient entropy
    • Configure appropriate timeouts for authentication operations
  3. Network Security:

    • Use TLS/SSL for additional transport security when possible
    • Implement proper firewall rules to restrict shuffle service access
    • Monitor authentication failures for potential security issues
  4. Secret Storage:

    • Store secrets securely outside of application code
    • Use secure key management systems in production
    • Implement proper secret cleanup and disposal
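
As one way to follow the advice on unique, high-entropy secrets, the sketch below draws random bytes from `java.security.SecureRandom` and encodes them as text. The class name `SecretGenerator` and the choice of URL-safe Base64 are assumptions for illustration; the source does not prescribe a secret format.

```java
import java.security.SecureRandom;
import java.util.Base64;

// Illustrative helper; not part of Spark.
final class SecretGenerator {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Returns a URL-safe Base64 string that carries numBytes of randomness.
    static String generateSecureSecret(int numBytes) {
        byte[] raw = new byte[numBytes];
        RANDOM.nextBytes(raw);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
    }
}
```

For example, `SecretGenerator.generateSecureSecret(32)` yields 256 bits of randomness, and the result can be passed to `registerApp` as the string secret. The value should still be kept out of logs, as noted above.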

Common Authentication Patterns

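// Pattern 1: Basic authentication setup
// generateSecureSecret() is an application-supplied helper, not a Spark API;
// conf is a TransportConf created elsewhere.
ShuffleSecretManager secretManager = new ShuffleSecretManager();
secretManager.registerApp("myApp", generateSecureSecret());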

ExternalShuffleClient client = new ExternalShuffleClient(
    conf, secretManager, true, 10000
);

// Pattern 2: Multiple application management
// loadAppSecretsFromSecureStorage() is an application-supplied helper, not a Spark API.
ShuffleSecretManager multiAppSecretManager = new ShuffleSecretManager();
Map<String, String> appSecrets = loadAppSecretsFromSecureStorage();

for (Map.Entry<String, String> entry : appSecrets.entrySet()) {
    multiAppSecretManager.registerApp(entry.getKey(), entry.getValue());
}

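// Pattern 3: Dynamic secret rotation
public void rotateAppSecret(String appId, String newSecret) {
    // Between these two calls getSecretKey(appId) returns null, so handshakes
    // that start in this window can fail; rotate during a quiet period.
    secretManager.unregisterApp(appId);
    secretManager.registerApp(appId, newSecret);
    // Notify clients to reconnect with new secret
}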

// Pattern 4: Cleanup on application termination
public void cleanupApplication(String appId) {
    try {
        // Perform any necessary cleanup operations
        client.close();
    } finally {
        // Always unregister the application secret
        secretManager.unregisterApp(appId);
    }
}
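
Pattern 4 can also be expressed with try-with-resources, so that the unregister step cannot be forgotten. The sketch below is an assumption-laden illustration: `ScopedSecret` is a hypothetical class, and a plain `Map` stands in for the secret manager so that the example runs without Spark on the classpath.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: registers a secret on construction, removes it on close().
final class ScopedSecret implements AutoCloseable {
    private final Map<String, String> registry; // stand-in for the secret manager
    private final String appId;

    ScopedSecret(Map<String, String> registry, String appId, String secret) {
        this.registry = registry;
        this.appId = appId;
        registry.put(appId, secret);
    }

    @Override
    public void close() {
        // Runs even if the body of the try block throws.
        registry.remove(appId);
    }

    // Usage: the secret exists only inside the try block.
    static boolean demo() {
        Map<String, String> registry = new ConcurrentHashMap<>();
        try (ScopedSecret ignored = new ScopedSecret(registry, "myApp", "example-secret")) {
            if (!registry.containsKey("myApp")) {
                return false;
            }
        }
        return !registry.containsKey("myApp");
    }
}
```

In real code the constructor and `close()` would call `registerApp` and `unregisterApp` on the secret manager instead of touching a map.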

Integration with Spark Security

The ShuffleSecretManager integrates with Spark's broader security framework:

  • Spark Authentication: Shuffle authentication is switched on through the spark.authenticate configuration setting
  • ACLs: Integrates with Spark's access control lists
  • Encryption: Can be combined with Spark's encryption features
  • Kerberos: Compatible with Kerberos-based Spark deployments

Troubleshooting Authentication Issues

Common authentication problems and solutions:

  1. Authentication Failures:

    • Verify secrets match between client and server
    • Check that authentication is enabled on both sides
    • Ensure proper secret registration before client initialization
  2. Connection Timeouts:

    • Increase registrationTimeoutMs (the last ExternalShuffleClient constructor argument, 10000 in the examples above) for slow networks
    • Check network connectivity between client and server
    • Verify shuffle service is running and accessible
  3. Secret Management Issues:

    • Ensure secrets are registered before client operations
    • Verify secret cleanup doesn't interfere with active connections
    • Check for secret string encoding issues
  4. Performance Impact:

    • Authentication adds a small overhead when a connection is established
    • Monitor connection establishment times
    • Consider connection pooling for high-frequency operations
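
The "secret string encoding issues" item above is easiest to see with a round trip. The sketch below is an illustration with hypothetical names (`SecretEncodingCheck`, `roundTrips`): it converts a secret to a `ByteBuffer` with one charset and reads it back with another, which is what happens when two sides rely on different default encodings.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative check; not part of Spark.
final class SecretEncodingCheck {
    static ByteBuffer encode(String secret, Charset charset) {
        return ByteBuffer.wrap(secret.getBytes(charset));
    }

    static String decode(ByteBuffer buffer, Charset charset) {
        // duplicate() keeps the original buffer's position unchanged.
        return charset.decode(buffer.duplicate()).toString();
    }

    // True when a secret written with one charset reads back unchanged with another.
    static boolean roundTrips(String secret, Charset writer, Charset reader) {
        return secret.equals(decode(encode(secret, writer), reader));
    }
}
```

ASCII-only secrets survive a UTF-8 / ISO-8859-1 mismatch, but a secret containing non-ASCII characters does not, so naming the charset explicitly on both sides (or restricting secrets to ASCII) avoids this class of failure.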