Pipeline management software for clusters.
```
npx @tessl/cli install tessl/pypi-toil@9.0.0
```

Toil is a comprehensive Python pipeline management and workflow execution system designed for distributed computing environments. It provides robust job scheduling, cloud provisioning, and container execution capabilities across various batch systems including Slurm, LSF, Kubernetes, and local execution. Toil supports multiple workflow languages including CWL (Common Workflow Language) and WDL (Workflow Description Language), making it a versatile solution for scientific computing and data processing pipelines.
Toil is published on PyPI as `toil` (the installed version is exposed via the `toil.version` module):

```
pip install toil
```

```python
# Essential workflow components
from toil.common import Toil, Config
from toil.job import Job, JobDescription, Promise, AcceleratorRequirement
from toil.fileStores import AbstractFileStore, FileID
# Exception handling
from toil.exceptions import FailedJobsException
# Utility functions
from toil.lib.conversions import human2bytes, bytes2human
from toil.lib.retry import retry
from toil import physicalMemory, physicalDisk, toilPackageDirPath
```

A minimal workflow with a single job:

```python
from toil.common import Toil, Config
from toil.job import Job

class HelloWorldJob(Job):
    def __init__(self, message):
        # 100MB memory, 1 core, 100MB disk
        super().__init__(memory=100*1024*1024, cores=1, disk=100*1024*1024)
        self.message = message

    def run(self, fileStore):
        fileStore.logToMaster(f"Hello {self.message}")
        return f"Processed: {self.message}"

# Create and run workflow
if __name__ == "__main__":
    config = Config()
    config.jobStore = "file:my-job-store"
    config.logLevel = "INFO"

    with Toil(config) as toil:
        root_job = HelloWorldJob("World")
        result = toil.start(root_job)
        print(f"Result: {result}")
```

Chaining jobs with wrapped job functions and promises:

```python
from toil.common import Toil, Config
from toil.job import Job

def process_data(job, input_data):
    # Job functions receive the wrapping Job as their first argument
    job.fileStore.logToMaster(f"Processing: {input_data}")
    return input_data.upper()

def combine_results(job, *results):
    combined = " + ".join(results)
    job.fileStore.logToMaster(f"Combined: {combined}")
    return combined

if __name__ == "__main__":
    config = Config()
    config.jobStore = "file:my-job-store"

    with Toil(config) as toil:
        # Create processing jobs; job2 is a child of job1 so both get scheduled
        job1 = Job.wrapJobFn(process_data, "hello")
        job2 = Job.wrapJobFn(process_data, "world")
        job1.addChild(job2)

        # Chain jobs together: the follow-on runs after job1 and all of its
        # children, consuming their promised return values
        final_job = Job.wrapJobFn(combine_results, job1.rv(), job2.rv())
        job1.addFollowOn(final_job)

        # start() returns the root job's return value; the combined string
        # is written to the workflow log by combine_results
        result = toil.start(job1)
        print(f"Root job result: {result}")
```

Toil's architecture consists of several key components that work together to provide scalable workflow execution:
- `Job`, `JobDescription`, and `Promise` classes handle job definition, scheduling, and result handling
- The `Config` class holds workflow configuration and batch system settings
- Workflows are defined as `Job` subclasses or wrapped job functions, executed through the `Toil` context manager, and pass results between jobs via `Promise` objects and return values
Basic job creation, execution, and chaining capabilities with resource management and promise-based result handling.
Key APIs:
- `Job(memory, cores, disk, accelerators, preemptible, checkpoint)` - Job definition with resource requirements
- `Job.addChild(childJob)` - Add dependent child jobs
- `Job.addFollowOn(followOnJob)` - Add sequential follow-on jobs
- `Job.rv(*path)` - Create promise for job return value
- `Toil(config).start(rootJob)` - Execute workflow with root job
- `Config()` - Workflow configuration and batch system settings
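A small sketch of how these calls fit together; the job names, job store locator, and resource strings below are illustrative rather than taken from the Toil documentation:

```python
from toil.common import Toil, Config
from toil.job import Job

class Step(Job):
    def __init__(self, name):
        # Resource requirements can be given as byte counts or human-readable strings
        super().__init__(memory="256M", cores=1, disk="512M")
        self.name = name

    def run(self, fileStore):
        fileStore.logToMaster(f"running {self.name}")
        return self.name

if __name__ == "__main__":
    config = Config()
    config.jobStore = "file:chaining-job-store"  # illustrative job store locator

    first = Step("first")
    second = Step("second")
    cleanup = Step("cleanup")
    first.addChild(second)       # second runs after first completes
    first.addFollowOn(cleanup)   # cleanup runs after first and all of its children

    with Toil(config) as toil:
        # start() returns the root job's return value ("first" here)
        print(toil.start(first))
```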
Support for multiple compute environments including local execution, HPC schedulers, and cloud services.
Key APIs:
- `AbstractBatchSystem.issueBatchJob(jobNode)` - Submit job to batch system
- `AbstractBatchSystem.getUpdatedBatchJob(maxWait)` - Monitor job status
- `AbstractScalableBatchSystem.nodeTypes()` - Query available node types
- `KubernetesBatchSystem`, `SlurmBatchSystem`, `LSFBatchSystem` - Concrete implementations
- `BatchJobExitReason` - Job completion status enumeration
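Switching batch systems is a configuration change rather than a code change. A minimal sketch, assuming `Config.batchSystem` accepts the same names as the `--batchSystem` command-line option (e.g. `single_machine`, `slurm`, `kubernetes`); the job store locator is a placeholder:

```python
from toil.common import Toil, Config
from toil.job import Job

def noop(job):
    job.fileStore.logToMaster("ran on the configured batch system")

if __name__ == "__main__":
    config = Config()
    config.jobStore = "file:batch-demo-store"  # illustrative job store locator
    config.batchSystem = "slurm"               # assumed value; the local default is single_machine

    with Toil(config) as toil:
        toil.start(Job.wrapJobFn(noop))
```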
Persistent storage backends for workflow metadata and state management across different storage systems.
Key APIs:
- `AbstractJobStore.create(jobDescription)` - Store job metadata
- `AbstractJobStore.load(jobStoreID)` - Retrieve job by ID
- `AbstractJobStore.writeFile(localFilePath)` - Store file in job store
- `AbstractJobStore.importFile(srcUrl, sharedFileName)` - Import external files
- `FileJobStore`, `AWSJobStore`, `GoogleJobStore` - Storage backend implementations
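Workflow code normally reaches the job store indirectly through the `Toil` context manager. A sketch of importing an input file before the workflow starts, assuming the camelCase `Toil.importFile` helper; the URL and job store locator are placeholders:

```python
import os

from toil.common import Toil, Config
from toil.job import Job

def stat_file(job, file_id):
    # Materialize the imported file locally and report its size
    local_path = job.fileStore.readGlobalFile(file_id)
    return os.path.getsize(local_path)

if __name__ == "__main__":
    config = Config()
    config.jobStore = "file:import-demo-store"  # illustrative job store locator

    with Toil(config) as toil:
        # Copy an external file into the job store; returns a FileID
        input_id = toil.importFile("file:///tmp/input.txt")  # placeholder URL
        size = toil.start(Job.wrapJobFn(stat_file, input_id))
        print(f"input size: {size} bytes")
```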
Comprehensive file handling for temporary files, shared data, and persistent storage during workflow execution.
Key APIs:
- `AbstractFileStore.writeGlobalFile(localFileName)` - Store globally accessible files
- `AbstractFileStore.readGlobalFile(fileStoreID, userPath, cache)` - Read shared files
- `AbstractFileStore.getLocalTempDir()` - Get temporary directory
- `AbstractFileStore.logToMaster(text, level)` - Send logs to workflow leader
- `FileID` - File identifier type for referencing stored files
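A sketch of typical file-store calls inside a job's `run()` method; file names and contents are illustrative:

```python
import os

from toil.job import Job

class WriteAndRead(Job):
    def run(self, fileStore):
        # Work inside a per-job temporary directory that Toil cleans up
        tmp_dir = fileStore.getLocalTempDir()
        scratch = os.path.join(tmp_dir, "data.txt")
        with open(scratch, "w") as f:
            f.write("intermediate result\n")

        # Publish the file so other jobs can reference it by FileID
        file_id = fileStore.writeGlobalFile(scratch)

        # Copy it back to a local path (typically done in a downstream job)
        local_copy = fileStore.readGlobalFile(file_id, cache=True)
        fileStore.logToMaster(f"stored {file_id}, read back {local_copy}")
        return file_id
```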
Native support for CWL and WDL workflow specifications with seamless translation to Toil execution.
Key APIs:
- `toil-cwl-runner` - Command-line CWL workflow execution
- `toil-wdl-runner` - Command-line WDL workflow execution
- `toil.cwl.cwltoil.main()` - Programmatic CWL execution
- `toil.wdl.wdltoil.main()` - Programmatic WDL execution
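A sketch of invoking the runners from the shell; the workflow and input file names are placeholders, the `--jobStore` flag is assumed to be available as with other Toil entry points, and the exact input-passing convention may differ between releases:

```
# Run a CWL workflow (file names are placeholders)
toil-cwl-runner --jobStore ./cwl-jobstore workflow.cwl inputs.yaml

# Run a WDL workflow
toil-wdl-runner --jobStore ./wdl-jobstore workflow.wdl inputs.json
```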
Automatic cloud resource provisioning and cluster management for scalable workflow execution.
Key APIs:
- `AbstractProvisioner` - Base provisioner interface
- `toil launch-cluster` - Cluster creation utility
- `toil destroy-cluster` - Cluster cleanup utility
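A sketch of the cluster lifecycle from the shell, assuming the AWS provisioner; the cluster name, zone, keypair, and node type are placeholders:

```
# Provision a leader node in AWS (all values are placeholders)
toil launch-cluster my-toil-cluster \
    --provisioner aws \
    --zone us-west-2a \
    --keyPairName my-keypair \
    --leaderNodeType t2.medium

# Tear the cluster down once the workflow has finished
toil destroy-cluster --provisioner aws --zone us-west-2a my-toil-cluster
```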
Comprehensive command-line tools and utilities for workflow management, debugging, and monitoring.
Key APIs:
- `toil` - Main CLI entry point for the workflow management utilities below
- `toil stats` - Statistics collection and analysis
- `toil status` - Workflow monitoring and status
- `toil clean` - Cleanup utilities and job store management
- `toil kill` - Workflow termination utilities
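Each utility takes the job store locator of the workflow it should act on; the locator below is a placeholder:

```
# Aggregate runtime and resource statistics for a workflow
toil stats file:my-job-store

# Check whether a workflow is running or has failed jobs
toil status file:my-job-store

# Terminate a running workflow, then remove its job store
toil kill file:my-job-store
toil clean file:my-job-store
```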