
tessl/maven-co-cask-cdap--cdap-api

Core application programming interface for the Cask Data Application Platform enabling development of scalable data processing applications on Hadoop ecosystems.



Scheduling and Triggers

CDAP provides a flexible scheduling system that triggers programs based on time, data availability, or the status of other programs. Triggers can be combined, and each schedule can carry constraints that must hold before the program runs.

Capabilities

Schedule Builder

Creates configurable schedules with triggers and constraints for automated program execution.

/**
 * Builder for scheduling a program with triggers and constraints
 */
public interface ScheduleBuilder {
    ScheduleBuilder setDescription(String description);
    ScheduleBuilder setProperties(Map<String, String> properties);
    ScheduleBuilder setTimeout(long time, TimeUnit unit);
    
    // Constraints
    ConstraintProgramScheduleBuilder withConcurrency(int max);
    ScheduleBuilder withDelay(long delay, TimeUnit unit);
    ConstraintProgramScheduleBuilder withTimeWindow(String startTime, String endTime);
    ConstraintProgramScheduleBuilder withTimeWindow(String startTime, String endTime, TimeZone timeZone);
    ConstraintProgramScheduleBuilder withDurationSinceLastRun(long duration, TimeUnit unit);
    
    // Trigger types
    ScheduleCreationSpec triggerByTime(String cronExpression);
    ScheduleCreationSpec triggerOnPartitions(String datasetName, int numPartitions);
    ScheduleCreationSpec triggerOnPartitions(String datasetNamespace, String datasetName, int numPartitions);
    ScheduleCreationSpec triggerOnProgramStatus(String programNamespace, String application, String appVersion,
                                              ProgramType programType, String program, ProgramStatus... programStatuses);
    ScheduleCreationSpec triggerOnProgramStatus(String programNamespace, String application, ProgramType programType,
                                              String program, ProgramStatus... programStatuses);
    ScheduleCreationSpec triggerOnProgramStatus(String application, ProgramType programType,
                                              String program, ProgramStatus... programStatuses);
    ScheduleCreationSpec triggerOnProgramStatus(ProgramType programType, String program, ProgramStatus... programStatuses);
    ScheduleCreationSpec triggerOn(Trigger trigger);
}

Constraint Program Schedule Builder

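Extends ScheduleBuilder and is the type returned by the withConcurrency, withTimeWindow, and withDurationSinceLastRun constraint methods. Because it inherits every ScheduleBuilder method, further constraints and a trigger can be chained after it.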

public interface ConstraintProgramScheduleBuilder extends ScheduleBuilder {
    // Inherits all ScheduleBuilder methods with additional constraint capabilities
}

Trigger Factory

Provides factory methods for creating various types of triggers that can be combined in complex scheduling scenarios.

public interface TriggerFactory {
    // Factory methods for creating triggers
}

Schedule Creation Specification

Represents a complete schedule configuration ready for deployment.

public class ScheduleCreationSpec {
    // Complete schedule specification
}

Usage Examples:

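import java.util.concurrent.TimeUnit;

import co.cask.cdap.api.ProgramStatus;
import co.cask.cdap.api.schedule.ScheduleBuilder;
import co.cask.cdap.api.app.ProgramType;

// buildSchedule(...) is assumed to come from the enclosing application class
// (for example, one that extends AbstractApplication); it returns a ScheduleBuilder.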

// Time-based scheduling
ScheduleCreationSpec timeSchedule = buildSchedule("daily-job", ProgramType.WORKFLOW, "DataProcessing")
    .setDescription("Run data processing workflow daily at 2 AM")
    .withTimeWindow("02:00", "06:00")  // Only run between 2-6 AM
    .withConcurrency(1)  // Only one instance at a time
    .triggerByTime("0 0 2 * * ?");  // Cron: daily at 2 AM

// Partition-based scheduling
ScheduleCreationSpec partitionSchedule = buildSchedule("partition-processor", ProgramType.SPARK, "PartitionAnalysis")
    .setDescription("Process when new partitions are available")
    .withDelay(5, TimeUnit.MINUTES)  // Wait 5 minutes after trigger
    .triggerOnPartitions("input-dataset", 10);  // Trigger when 10+ new partitions

// Program status-based scheduling
ScheduleCreationSpec statusSchedule = buildSchedule("cleanup-job", ProgramType.MAPREDUCE, "CleanupJob")
    .setDescription("Run cleanup after main processing completes")
    .triggerOnProgramStatus(ProgramType.WORKFLOW, "MainWorkflow", ProgramStatus.COMPLETED);

Trigger Types

Time-based Triggers

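  • Cron expressions: Cron syntax passed to triggerByTime; the usage example uses a six-field, Quartz-style expression with a leading seconds field
  • Time windows: Restrict execution to specific time ranges (configured as a constraint via withTimeWindow)
  • Delays: Add a delay between the trigger firing and execution (configured via withDelay)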
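
To make the cron expression from the usage example easier to read, the following sketch labels each field of "0 0 2 * * ?". It assumes the six-field, Quartz-style order (seconds first); CronFields and describe are hypothetical names for illustration and are not part of the CDAP API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CronFields {
    // Assumed field order for a six-field, Quartz-style expression.
    private static final String[] NAMES = {
        "seconds", "minutes", "hours", "dayOfMonth", "month", "dayOfWeek"
    };

    /** Splits the expression on whitespace and pairs each field with its name. */
    static Map<String, String> describe(String expression) {
        String[] parts = expression.trim().split("\\s+");
        if (parts.length != NAMES.length) {
            throw new IllegalArgumentException(
                "Expected " + NAMES.length + " fields but got " + parts.length);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < NAMES.length; i++) {
            fields.put(NAMES[i], parts[i]);
        }
        return fields;
    }

    public static void main(String[] args) {
        // prints {seconds=0, minutes=0, hours=2, dayOfMonth=*, month=*, dayOfWeek=?}
        System.out.println(describe("0 0 2 * * ?"));
    }
}
```

Read this way, the expression fires at second 0, minute 0 of hour 2 on every day, which matches the "daily at 2 AM" comment in the usage example.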

Data-based Triggers

  • Partition triggers: Execute when new dataset partitions are available
  • Minimum partition count: Wait for a specific number of new partitions
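
The minimum partition count can be pictured as a counter that fires once enough new partitions have accumulated. The sketch below is a toy model of that idea only; PartitionThreshold and onNewPartitions are hypothetical names, and the reset-after-firing behavior is an assumption rather than a description of CDAP's implementation.

```java
class PartitionThreshold {
    private final int required;  // partitions needed before firing
    private int pending;         // new partitions seen since the last firing

    PartitionThreshold(int required) {
        if (required < 1) {
            throw new IllegalArgumentException("required must be at least 1");
        }
        this.required = required;
    }

    /**
     * Records newly added partitions. Returns true when the accumulated
     * count reaches the threshold, then resets the counter (assumed behavior).
     */
    boolean onNewPartitions(int count) {
        pending += count;
        if (pending >= required) {
            pending = 0;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        PartitionThreshold trigger = new PartitionThreshold(10);
        System.out.println(trigger.onNewPartitions(4));  // false, 4 pending
        System.out.println(trigger.onNewPartitions(6));  // true, threshold of 10 reached
    }
}
```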

Program Status Triggers

  • Status transitions: React to other program completions or failures
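  • Cross-application triggers: Schedule based on programs in other applications, using the triggerOnProgramStatus overloads that take an application name (and optionally a namespace and application version)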
  • Cascading workflows: Chain program executions based on status
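
A program status trigger pairs a program name with the set of statuses that should fire it. The sketch below models that matching with plain Java; StatusTriggerSketch and its nested Status enum are hypothetical stand-ins for illustration and are not CDAP's ProgramStatus or trigger classes.

```java
import java.util.EnumSet;
import java.util.Set;

class StatusTriggerSketch {
    // Stand-in enum for illustration; not CDAP's ProgramStatus.
    enum Status { COMPLETED, FAILED, KILLED }

    private final String program;
    private final Set<Status> statuses;

    StatusTriggerSketch(String program, Status first, Status... rest) {
        this.program = program;
        this.statuses = EnumSet.of(first, rest);
    }

    /** True when the named program ended in one of the watched statuses. */
    boolean matches(String finishedProgram, Status status) {
        return program.equals(finishedProgram) && statuses.contains(status);
    }

    public static void main(String[] args) {
        StatusTriggerSketch trigger =
            new StatusTriggerSketch("MainWorkflow", Status.COMPLETED);
        System.out.println(trigger.matches("MainWorkflow", Status.COMPLETED));  // true
        System.out.println(trigger.matches("MainWorkflow", Status.FAILED));     // false
    }
}
```

Passing several statuses, such as COMPLETED and FAILED together, mirrors the varargs ProgramStatus... parameter on triggerOnProgramStatus.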

Constraints

Schedules can include constraints that must be satisfied before program execution:

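  • Concurrency limits (withConcurrency): Cap the number of parallel executions
  • Time windows (withTimeWindow): Restrict execution to a time-of-day range, optionally in a given time zone
  • Duration since last run (withDurationSinceLastRun): Prevent too-frequent executions
  • Delays (withDelay): Wait a fixed period between the trigger firing and execution
  • Timeouts (setTimeout): Automatically cancel jobs that don't execute within the time limit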
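
As a rough illustration of what these constraints check, the sketch below evaluates a time window, a minimum duration since the last run, and a concurrency limit with plain java.time types. ConstraintChecks and its methods are hypothetical; the handling of windows that wrap past midnight and of a missing previous run are assumptions, not documented CDAP behavior.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalTime;

class ConstraintChecks {
    /** Window is start-inclusive, end-exclusive; times use "HH:mm" format. */
    static boolean inTimeWindow(String start, String end, LocalTime now) {
        LocalTime s = LocalTime.parse(start);
        LocalTime e = LocalTime.parse(end);
        if (s.isBefore(e)) {
            return !now.isBefore(s) && now.isBefore(e);
        }
        // Assumed: a window such as 22:00-02:00 wraps past midnight.
        return !now.isBefore(s) || now.isBefore(e);
    }

    /** Assumed: a program that has never run (lastRun == null) satisfies the constraint. */
    static boolean durationSinceLastRunMet(Instant lastRun, Instant now, Duration minimum) {
        return lastRun == null || Duration.between(lastRun, now).compareTo(minimum) >= 0;
    }

    /** A new run is allowed only while fewer than max runs are active. */
    static boolean concurrencyMet(int running, int max) {
        return running < max;
    }

    public static void main(String[] args) {
        System.out.println(inTimeWindow("02:00", "06:00", LocalTime.of(3, 0)));  // true
        System.out.println(inTimeWindow("02:00", "06:00", LocalTime.of(6, 0)));  // false
        System.out.println(concurrencyMet(1, 1));                                // false
    }
}
```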

Install with Tessl CLI

npx tessl i tessl/maven-co-cask-cdap--cdap-api
