tessl/pypi-toil

Pipeline management software for clusters.

Multi-Zone Cloud Workflow System

Overview { .overview }

Create a Python application that defines a computational workflow and configures it to run on cloud infrastructure across multiple availability zones using spot instances. The application should demonstrate workflow construction with per-job resource requirements, together with a cloud provisioning configuration aimed at cost-efficient distributed execution.

Requirements { .requirements }

Core Functionality

  1. Workflow Definition: Create a simple data processing workflow that includes:

    • A parent job that coordinates processing
    • At least 3 child jobs that can run in parallel
    • Each job should declare memory and CPU core requirements
  2. Preemptible Resource Configuration: Configure jobs to:

    • Mark jobs as suitable for preemptible (spot) instance execution
    • Specify appropriate resource requirements for each job (memory as a string such as "2G", cores as an integer)
  3. Multi-Zone Provisioning Setup: Create a provisioning configuration that:

    • Specifies multiple availability zones (at least 2 zones)
    • Enables spot instance usage for cost optimization
    • Configures the node types and resource pools for the cluster

Implementation Details

  • Define at least 4 jobs total (1 parent + 3 children)
  • Use appropriate memory specifications (e.g., "1G", "2G", "4G") and core counts (1-2 cores)
  • Jobs should be marked as preemptible where appropriate to leverage spot pricing
  • Availability zone configuration should include zone identifiers (e.g., "us-west-2a", "us-west-2b")
  • The workflow structure should properly express parent-child relationships
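
The hierarchy described above can be sketched with plain data before any Toil objects are involved. The following is a minimal sketch, not a reference solution: `JobSpec`, `build_workflow`, and `count_jobs` are hypothetical names, and keeping the coordinator non-preemptible is an assumption made here, not something the task requires.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class JobSpec:
    """Hypothetical plain-data description of one workflow job."""
    name: str
    memory: str               # human-readable size such as "2G"
    cores: int                # number of CPU cores requested
    preemptible: bool = True  # True if the job may run on a spot instance
    children: List["JobSpec"] = field(default_factory=list)

    def add_child(self, child: "JobSpec") -> "JobSpec":
        """Attach a child that runs after this job; returns the child."""
        self.children.append(child)
        return child


def build_workflow(num_children: int = 3,
                   child_memory: str = "2G",
                   child_cores: int = 2) -> JobSpec:
    """Build one parent with `num_children` independent children."""
    # Assumption: the coordinator stays on a non-preemptible node so that a
    # spot interruption cannot take down the job that tracks the others.
    parent = JobSpec("coordinator", memory="1G", cores=1, preemptible=False)
    for i in range(num_children):
        parent.add_child(
            JobSpec(f"process_chunk_{i}", memory=child_memory, cores=child_cores)
        )
    return parent


def count_jobs(root: JobSpec) -> int:
    """Count the root plus all of its descendants."""
    return 1 + sum(count_jobs(child) for child in root.children)
```

In a Toil-based solution, each spec would presumably become a `toil.job.Job` constructed with the same `memory`, `cores`, and `preemptible` values, with children attached through `addChild`. Because the children are siblings with no edges between them, they are eligible to run in parallel.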

Test Cases { .test-cases }

Test 1: Job Hierarchy @test { .test }

Input: Workflow with parent and child jobs

Expected Behavior:

  • Parent job has at least 3 child jobs attached
  • Child relationships are properly defined
  • Job hierarchy is correctly structured

Test File: test_workflow.py

Test 2: Preemptible Configuration @test { .test }

Input: Jobs with resource requirements and preemptibility settings

Expected Behavior:

  • At least one job is configured as preemptible
  • Each job specifies memory in string format (e.g., "2G")
  • Each job specifies CPU cores as numeric value

Test File: test_workflow.py
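
The checks in Test 2 could be backed by a small validator such as the one below. This is a sketch under stated assumptions: `memory_to_bytes` and `validate_job` are hypothetical helpers, only single-letter suffixes (`K`, `M`, `G`, `T`) are accepted, and the binary multipliers are a choice made for this example. Toil's own size parsing may interpret the suffixes differently.

```python
import re

# Assumption: a number followed by exactly one unit letter, e.g. "2G" or "1.5G".
_MEMORY_RE = re.compile(r"^(\d+(?:\.\d+)?)([KMGT])$")
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def memory_to_bytes(memory: str) -> int:
    """Convert a string such as "2G" to a byte count (binary units assumed)."""
    match = _MEMORY_RE.match(memory)
    if match is None:
        raise ValueError(f"unrecognised memory specification: {memory!r}")
    value, unit = match.groups()
    return int(float(value) * _UNITS[unit])


def validate_job(memory, cores, preemptible) -> None:
    """Raise if a job's requirements do not match the expected shapes."""
    if not isinstance(memory, str):
        raise TypeError("memory must be a string such as '2G'")
    memory_to_bytes(memory)  # raises ValueError on a malformed string
    # bool is a subclass of int, so it is rejected explicitly.
    if isinstance(cores, bool) or not isinstance(cores, (int, float)) or cores <= 0:
        raise ValueError("cores must be a positive number")
    if not isinstance(preemptible, bool):
        raise TypeError("preemptible must be a bool")
```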

Test 3: Multi-Zone Setup @test { .test }

Input: Provisioning configuration with zone specifications

Expected Behavior:

  • Configuration includes at least 2 availability zones
  • Zone identifiers are properly formatted
  • Spot instance usage is enabled in configuration

Test File: test_workflow.py
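
One way to hold the provisioning settings that Test 3 inspects is a validated dataclass. The sketch below is illustrative only: `ProvisioningConfig`, its field names, and its default values are hypothetical, and the zone pattern covers AWS-style identifiers such as "us-west-2a" only (Google Cloud zones such as "us-central1-a" would need a different pattern). The `type:bid` string is modelled on the form that Toil's `--nodeTypes` option is understood to accept for spot bids; this should be checked against the Toil documentation for the version in use.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumption: AWS-style zone names, i.e. a region followed by one zone letter.
_AWS_ZONE_RE = re.compile(r"^[a-z]{2}(?:-[a-z]+)+-\d[a-z]$")


@dataclass
class ProvisioningConfig:
    """Hypothetical container for cluster provisioning settings."""
    zones: List[str]
    node_type: str = "t3.medium"       # illustrative default instance type
    spot_bid: Optional[float] = 0.05   # bid in USD per hour; None disables spot
    max_nodes: int = 4

    def __post_init__(self) -> None:
        if len(set(self.zones)) < 2:
            raise ValueError("at least two distinct availability zones are required")
        bad = [zone for zone in self.zones if not _AWS_ZONE_RE.match(zone)]
        if bad:
            raise ValueError(f"malformed zone identifiers: {bad}")
        if self.spot_bid is not None and self.spot_bid <= 0:
            raise ValueError("spot_bid must be positive when set")

    @property
    def use_spot(self) -> bool:
        """True when a spot bid is configured."""
        return self.spot_bid is not None

    def node_type_spec(self) -> str:
        """Return "type:bid" for spot nodes, or the bare type otherwise."""
        if self.use_spot:
            return f"{self.node_type}:{self.spot_bid}"
        return self.node_type
```

Validating in `__post_init__` means a malformed configuration fails at construction time, before any cluster resources are requested.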

Dependencies { .dependencies }

toil { .dependency }

Provides scalable workflow management and cloud provisioning capabilities for distributed computational pipelines.

Constraints { .constraints }

  • Use Python 3.7 or higher
  • Solution should work with major cloud providers (AWS or Google Cloud)
  • Configuration should be parameterizable (not hardcoded)
  • Include basic logging to track workflow progress
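
The parameterization and logging constraints could be met with the standard library alone. The following is a minimal sketch: every flag name and default value is a hypothetical choice made for illustration, not something prescribed by the task or by Toil.

```python
import argparse
import logging

logger = logging.getLogger("workflow_manager")


def parse_args(argv=None) -> argparse.Namespace:
    """Parse workflow settings from the command line instead of hardcoding them."""
    parser = argparse.ArgumentParser(description="Multi-zone workflow settings")
    parser.add_argument("--zones", default="us-west-2a,us-west-2b",
                        help="comma-separated availability zones")
    parser.add_argument("--child-memory", default="2G",
                        help="memory per child job, e.g. 2G")
    parser.add_argument("--child-cores", type=int, default=1,
                        help="CPU cores per child job")
    parser.add_argument("--no-spot", dest="use_spot", action="store_false",
                        help="disable spot instances (enabled by default)")
    parser.add_argument("--log-level", default="INFO",
                        help="logging level name, e.g. DEBUG or INFO")
    args = parser.parse_args(argv)
    # Turn the comma-separated string into a clean list of zone names.
    args.zones = [zone.strip() for zone in args.zones.split(",") if zone.strip()]
    return args


def configure_logging(level_name: str) -> None:
    """Set up basic console logging; unknown level names fall back to INFO."""
    level = getattr(logging, level_name.upper(), logging.INFO)
    logging.basicConfig(level=level,
                        format="%(asctime)s %(levelname)s %(name)s: %(message)s")
```

A caller might run `args = parse_args()`, then `configure_logging(args.log_level)`, and log a line such as `logger.info("Provisioning in zones: %s", args.zones)` before building the workflow.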

Deliverables { .deliverables }

  1. workflow_manager.py: Main implementation containing workflow definition and provisioning configuration
  2. test_workflow.py: Test suite covering the three test cases