Pipeline management software for clusters.
Agent Success
Agent success rate when using this tile
67%
Improvement
Agent success rate improvement when using this tile compared to baseline
1.05x
Baseline
Agent success rate without this tile
64%
Build a pipeline that downloads data files from external sources, processes them, and exports the results to designated locations. The pipeline should handle temporary file management and coordinate data between jobs.
Create a workflow with three jobs that demonstrates proper file management:
Data Downloader Job: Downloads input files from external URLs and makes them available to downstream jobs
Data Processor Job: Processes the downloaded files and generates output
Data Exporter Job: Exports processed files to external destinations
The workflow should:
@generates
from toil.job import Job
class DownloaderJob(Job):
"""Downloads files from external sources into the job store."""
def __init__(self, source_urls):
"""
Args:
source_urls: List of source URLs to download
"""
super().__init__()
self.source_urls = source_urls
def run(self, fileStore):
"""
Import files and return their IDs.
Returns:
List of file IDs in the job store
"""
pass
class ProcessorJob(Job):
"""Processes files from the job store."""
def __init__(self, input_file_ids):
"""
Args:
input_file_ids: List of file IDs to process
"""
super().__init__()
self.input_file_ids = input_file_ids
def run(self, fileStore):
"""
Process input files and create output files.
Returns:
List of output file IDs
"""
pass
class ExporterJob(Job):
"""Exports files from the job store to external destinations."""
def __init__(self, file_ids, dest_urls):
"""
Args:
file_ids: List of file IDs to export
dest_urls: List of destination URLs
"""
super().__init__()
self.file_ids = file_ids
self.dest_urls = dest_urls
def run(self, fileStore):
"""Export files to destinations."""
pass
def create_pipeline(source_urls, dest_urls):
"""
Create the complete file processing pipeline.
Args:
source_urls: List of source file URLs
dest_urls: List of destination file URLs
Returns:
Root job for the pipeline
"""
passProvides workflow management and file handling capabilities.
tessl i tessl/pypi-toil@9.0.0docs
evals
scenario-1
scenario-2
scenario-3
scenario-4
scenario-5
scenario-6
scenario-7
scenario-8
scenario-9
scenario-10