
tessl/maven-org-apache-spark--spark-repl-2-11

Interactive Scala shell for Apache Spark with distributed computing capabilities


docs/signal-handling.md

Signal Handling

Signal handling utilities for interactive job cancellation and REPL interrupt management.

Capabilities

Signaling Object

Provides signal handling functionality for graceful job cancellation in the REPL environment.

/**
 * Signal handling utilities for REPL interrupt management
 * Provides SIGINT handling to cancel running Spark jobs
 */
private[repl] object Signaling extends Logging {
  /**
   * Register a SIGINT handler that cancels all active Spark jobs,
   * or lets the process exit when no jobs are currently running.
   * Makes it possible to interrupt a running shell job by pressing Ctrl+C.
   */
  def cancelOnInterrupt(): Unit
}

Usage Examples:

import org.apache.spark.repl.Signaling

// Register interrupt handler (typically called during REPL startup)
Signaling.cancelOnInterrupt()

// Now Ctrl+C will:
// 1. Cancel active Spark jobs if any are running
// 2. Exit REPL if no jobs are active

Signal Handling Behavior

SIGINT (Ctrl+C) Handling

The interrupt handler's behavior depends on whether Spark jobs are currently running:

When Spark Jobs Are Active:

  1. First Ctrl+C: Cancels all active Spark jobs and shows the warning: "Cancelling all active jobs, this can take a while. Press Ctrl+C again to exit now."
  2. Second Ctrl+C: Exits immediately

When No Jobs Are Active:

  1. First Ctrl+C: Exits REPL normally
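The branch taken by the handler can be sketched as a small, Spark-free decision function (InterruptDecision and its sequence of job IDs are illustrative stand-ins for the real status-tracker lookup):

```scala
// Sketch of the interrupt decision: true = Ctrl+C absorbed (stay in the REPL),
// false = let the process exit. A second Ctrl+C normally finds the job list
// already drained by cancellation, so it falls through to the exit branch.
object InterruptDecision {
  def onInterrupt(activeJobIds: Seq[Int]): Boolean =
    if (activeJobIds.nonEmpty) {
      println("Cancelling all active jobs, this can take a while. " +
        "Press Ctrl+C again to exit now.")
      true   // jobs running: cancel them instead of exiting
    } else {
      false  // nothing running: allow normal exit
    }
}
```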

Implementation Details

The signal handler integrates with Spark's job tracking system:

def cancelOnInterrupt(): Unit = SignalUtils.register("INT") {
  SparkContext.getActive.map { ctx =>
    if (!ctx.statusTracker.getActiveJobIds().isEmpty) {
      logWarning("Cancelling all active jobs, this can take a while. " +
        "Press Ctrl+C again to exit now.")
      ctx.cancelAllJobs()
      true  // Handled - don't exit yet
    } else {
      false // Not handled - allow normal exit
    }
  }.getOrElse(false) // No active context - allow normal exit
}

Integration with Spark

SparkContext Integration

  • Uses SparkContext.getActive to find current Spark context
  • Leverages StatusTracker.getActiveJobIds() to check for running jobs
  • Calls SparkContext.cancelAllJobs() for graceful job termination

Job Cancellation Process

When jobs are cancelled via Ctrl+C:

  1. Signal Detection: SIGINT signal captured by handler
  2. Job Status Check: Verify active jobs exist
  3. Cancellation Request: Send cancellation to all active jobs
  4. User Notification: Display cancellation progress message
  5. Graceful Shutdown: Allow jobs to complete cancellation before exit
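The five steps can be exercised against a stub tracker; StubTracker and CancellableJob below are illustrative names, not Spark API:

```scala
import scala.collection.mutable

// Illustrative stand-ins for Spark's job tracking; no Spark dependency.
final case class CancellableJob(id: Int) { var cancelled: Boolean = false }

final class StubTracker {
  val jobs = mutable.Map.empty[Int, CancellableJob]
  def activeJobIds: Seq[Int] = jobs.keys.toSeq                    // step 2: status check
  def cancelAll(): Unit = jobs.values.foreach(_.cancelled = true) // step 3: cancel request
}

object CancellationWalkthrough {
  // Invoked when SIGINT is captured (step 1).
  def onSigint(tracker: StubTracker): Boolean =
    if (tracker.activeJobIds.nonEmpty) {
      println("Cancelling all active jobs, this can take a while. " +
        "Press Ctrl+C again to exit now.")                        // step 4: notify user
      tracker.cancelAll()
      true                                                        // step 5: defer exit
    } else false
}
```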

Thread Safety

The signal handler runs on the JVM's signal-dispatch thread, concurrently with job execution:

  • Reads job status from the StatusTracker at the moment the interrupt arrives
  • Tolerates the race between job completion and cancellation: cancelling a job that has just finished is a harmless no-op
  • Behaves consistently regardless of which job types are running

Error Handling

Missing SparkContext

When no SparkContext is available:

SparkContext.getActive.map { ctx =>
  // Handle interrupts with context
}.getOrElse(false)  // No context - allow normal exit
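The same Option-based fallback can be run without Spark; Ctx here is a hypothetical stand-in for an active SparkContext:

```scala
// Stand-in for SparkContext.getActive: Some(ctx) when a context exists.
final case class Ctx(activeJobs: Seq[Int])

object MissingContextDemo {
  def handle(maybeCtx: Option[Ctx]): Boolean =
    maybeCtx.map { ctx =>
      ctx.activeJobs.nonEmpty  // handled only when there is work to cancel
    }.getOrElse(false)         // no context: fall through to normal exit
}
```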

Exception Handling

Signal handling includes proper exception management:

  • Logging of cancellation operations
  • Graceful handling of job cancellation failures
  • Safe fallback to normal exit behavior

REPL Integration

Startup Registration

Signal handling is automatically registered during REPL startup:

// In Main.scala
object Main extends Logging {
  initializeLogIfNecessary(true)
  Signaling.cancelOnInterrupt()  // Register on startup
  // ... rest of initialization
}

Interactive Experience

Provides a smooth interactive experience:

  • Long-running operations: Can be interrupted safely
  • Data exploration: Jobs can be cancelled without losing REPL session
  • Script execution: Runaway computations can be stopped
  • Development workflow: Quick iteration with safe job termination

User Feedback

Clear user communication during interruption:

Cancelling all active jobs, this can take a while. Press Ctrl+C again to exit now.

Platform Integration

Unix Signal Support

Uses Spark's SignalUtils.register() for signal handling:

  • Linux/macOS: Native SIGINT handling
  • Windows: SignalUtils skips handler registration on non-UNIX systems, so Ctrl+C falls back to the JVM's default interrupt behavior
  • Containers: Works in Docker/Kubernetes environments, provided SIGINT is delivered to the JVM process
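A platform guard in this spirit can be sketched as follows (the os.name check and the registerIfSupported helper are illustrative, not Spark's actual code):

```scala
object PlatformGuard {
  // Rough UNIX-vs-Windows check based on the JVM's os.name property.
  def isUnixLike: Boolean =
    !sys.props.getOrElse("os.name", "").toLowerCase.contains("windows")

  // Run the registration thunk only where POSIX-style signals are available;
  // returns whether registration was attempted.
  def registerIfSupported(register: () => Unit): Boolean =
    if (isUnixLike) { register(); true } else false
}
```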

JVM Integration

Integrates properly with JVM signal handling:

  • Registers handlers through the JVM's sun.misc.Signal facility for reliable delivery
  • Handles JVM shutdown hooks appropriately
  • Works with Scala/Java threading models

Advanced Usage

Custom Signal Handling

While Signaling is private[repl] (and SignalUtils itself is private[spark], so neither is usable from application code), the pair demonstrates a pattern for custom signal handling:

// Example pattern for custom signal handling
SignalUtils.register("INT") { /* custom handler */ }
SignalUtils.register("TERM") { /* termination handler */ }

Job Management Integration

Can be extended for more sophisticated job management:

  • Job Prioritization: Cancel lower-priority jobs first
  • Selective Cancellation: Cancel specific job types
  • Resource Management: Clean up resources during cancellation
  • State Preservation: Save computation state before cancellation
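For example, priority-based selective cancellation might look like this sketch; TrackedJob and its priority field are hypothetical (Spark's actual hook for selective cancellation is SparkContext.cancelJobGroup):

```scala
// Hypothetical priority-aware cancellation over a local job list.
final case class TrackedJob(id: Int, priority: Int) { var cancelled = false }

object SelectiveCancel {
  // Cancel every not-yet-cancelled job below the threshold; return their IDs.
  def cancelBelow(jobs: Seq[TrackedJob], threshold: Int): Seq[Int] = {
    val victims = jobs.filter(j => !j.cancelled && j.priority < threshold)
    victims.foreach(_.cancelled = true)
    victims.map(_.id)
  }
}
```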

Best Practices

REPL Usage

  1. Long Operations: Always allow for interruption in long-running operations
  2. Resource Cleanup: Ensure proper cleanup during job cancellation
  3. User Communication: Provide clear feedback during cancellation
  4. Recovery: Design computations to be resumable after interruption

Development Patterns

  1. Checkpointing: Save intermediate results for large computations
  2. Monitoring: Track job progress for better interruption timing
  3. Graceful Degradation: Handle partial results from cancelled jobs
  4. Testing: Test interrupt behavior in development workflows
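An interrupt-tolerant computation in that spirit checks a cancellation flag between batches and returns the partial result so the caller can checkpoint and resume (all names here are illustrative):

```scala
import java.util.concurrent.atomic.AtomicBoolean

object ResumableSum {
  // Sum batches until cancelled; return (partial total, batches completed)
  // so the caller can checkpoint and later resume from `completed`.
  def sumUntilCancelled(batches: Seq[Seq[Int]],
                        cancelled: AtomicBoolean): (Long, Int) = {
    var total = 0L
    var completed = 0
    val it = batches.iterator
    while (it.hasNext && !cancelled.get()) {
      total += it.next().map(_.toLong).sum
      completed += 1
    }
    (total, completed)
  }
}
```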

Logging Integration

Warning Messages

Cancellation events are logged with appropriate levels:

logWarning("Cancelling all active jobs, this can take a while. Press Ctrl+C again to exit now.")

Debug Information

Debug-level logging for signal handling events:

  • Signal registration success/failure
  • Job cancellation initiation
  • Context availability status
  • Handler execution timing

Install with Tessl CLI

npx tessl i tessl/maven-org-apache-spark--spark-repl-2-11
