Kedro helps you build production-ready data and analytics pipelines
Build a tool that analyzes failed Kedro pipeline runs and provides recommendations for resuming execution from the optimal points.
When data pipelines fail mid-execution, it's wasteful to re-run the entire pipeline from the beginning. Your task is to create a Python module that analyzes a failed pipeline execution and determines the optimal nodes from which to resume execution, minimizing redundant computation while ensuring data consistency.
Your module should accept:
- a Pipeline object containing multiple nodes with dependencies
- a DataCatalog object defining the datasets
- the name of the node where execution failed

It should return a list of node names (strings) representing the minimum set of nodes from which the pipeline should resume execution.
@generates
from kedro.pipeline import Pipeline
from kedro.io import DataCatalog


def analyze_resume_points(
    pipeline: Pipeline,
    catalog: DataCatalog,
    failed_node_name: str,
) -> list[str]:
    """
    Analyze a failed pipeline and determine optimal resume points.

    Args:
        pipeline: The Kedro pipeline that failed
        catalog: The data catalog containing dataset definitions
        failed_node_name: Name of the node where execution failed

    Returns:
        List of node names from which execution should resume

    Raises:
        ValueError: If failed_node_name is not in the pipeline
    """
    pass

Provides the pipeline and data catalog framework for analyzing pipeline execution failures and determining optimal resume points.
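One way to think about the problem: the nodes that must be re-run are the failed node plus everything downstream of it, and the optimal resume points are the roots of that sub-graph (re-run nodes with no predecessor that also needs re-running). The sketch below illustrates this logic using a hypothetical lightweight `Node` stand-in (name, inputs, outputs) instead of Kedro's real `Pipeline`/`Node` classes, so it runs standalone; it is one possible approach under those assumptions, not the required implementation.

```python
from collections import defaultdict, deque


class Node:
    """Hypothetical stand-in for a Kedro node: a name plus dataset I/O."""

    def __init__(self, name, inputs, outputs):
        self.name, self.inputs, self.outputs = name, inputs, outputs


def analyze_resume_points(nodes, failed_node_name):
    """Return the minimal set of node names from which to resume.

    Re-run set = failed node + all downstream nodes; resume points are
    the members of that set with no predecessor inside the set.
    """
    by_name = {n.name: n for n in nodes}
    if failed_node_name not in by_name:
        raise ValueError(f"Node {failed_node_name!r} is not in the pipeline")

    # Map each dataset to the node that produces it.
    producer = {out: n.name for n in nodes for out in n.outputs}

    # Build successor edges: producing node -> consuming node.
    successors = defaultdict(set)
    for n in nodes:
        for inp in n.inputs:
            if inp in producer:
                successors[producer[inp]].add(n.name)

    # Walk downstream from the failed node to collect everything to re-run.
    to_rerun = {failed_node_name}
    queue = deque([failed_node_name])
    while queue:
        for succ in successors[queue.popleft()]:
            if succ not in to_rerun:
                to_rerun.add(succ)
                queue.append(succ)

    def preds(name):
        # Upstream nodes feeding this node's inputs.
        return {producer[i] for i in by_name[name].inputs if i in producer}

    # Keep only re-run nodes whose predecessors all lie outside the re-run set.
    return sorted(n for n in to_rerun if not (preds(n) & to_rerun))
```

For a linear chain a -> b -> c with an independent node d, a failure at b marks {b, c} for re-running, and b is the single resume point since its predecessor a completed successfully.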
Install with Tessl CLI
npx tessl i tessl/pypi-kedro