Kedro helps you build production-ready data and analytics pipelines
A data processing system that executes computational tasks in parallel while safely sharing intermediate data between worker processes.
Build a data processing pipeline with three sequential nodes:

```python
def double_values(data: list) -> list:
    """
    Double each value in the input list.

    Args:
        data: List of numeric values

    Returns:
        List with each value doubled
    """
    pass

def filter_values(data: list) -> list:
    """
    Filter values greater than 5.

    Args:
        data: List of numeric values

    Returns:
        List containing only values > 5
    """
    pass

def calculate_stats(data: list) -> dict:
    """
    Calculate sum and mean of values.

    Args:
        data: List of numeric values

    Returns:
        Dictionary with 'sum' and 'mean' keys
    """
    pass

def create_pipeline():
    """
    Create a pipeline with three nodes for transformation, filtering, and aggregation.

    Returns:
        A pipeline object configured for the data processing workflow
    """
    pass
```

Execute the pipeline using parallel processing with memory sharing:

```python
def run_pipeline(input_data: list) -> dict:
    """
    Execute the pipeline in parallel mode with shared memory datasets.

    Args:
        input_data: List of numeric values to process

    Returns:
        Dictionary with summary statistics containing 'sum' and 'mean'
    """
    pass
```

Provides pipeline construction and parallel execution capabilities.
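The three node functions themselves are plain Python. A minimal sketch of what implementations satisfying the docstrings above might look like (the sample input is illustrative):

```python
def double_values(data: list) -> list:
    """Double each value in the input list."""
    return [x * 2 for x in data]

def filter_values(data: list) -> list:
    """Keep only values greater than 5."""
    return [x for x in data if x > 5]

def calculate_stats(data: list) -> dict:
    """Return the sum and mean of the values."""
    return {"sum": sum(data), "mean": sum(data) / len(data)}

# Chaining the nodes by hand mirrors what the pipeline will do:
# [1, 2, 3, 4] -> doubled [2, 4, 6, 8] -> filtered [6, 8] -> stats
stats = calculate_stats(filter_values(double_values([1, 2, 3, 4])))
print(stats)  # {'sum': 14, 'mean': 7.0}
```

Because each node is a pure function of its inputs, the pipeline framework can schedule them and pass intermediate results between processes without side effects.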
Install with Tessl CLI
npx tessl i tessl/pypi-kedro
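Once the package is installed, `create_pipeline` and `run_pipeline` could be wired with Kedro's standard API. This is a sketch, not the package's definitive implementation: it assumes Kedro 0.19-style imports (`MemoryDataset`, not the older `MemoryDataSet`), and the intermediate dataset names `doubled` and `filtered` are illustrative. `ParallelRunner` handles the shared-memory aspect itself, moving intermediate `MemoryDataset` data between worker processes:

```python
from kedro.io import DataCatalog, MemoryDataset
from kedro.pipeline import node, pipeline
from kedro.runner import ParallelRunner

def create_pipeline():
    """Wire the three nodes into a sequential Kedro pipeline."""
    return pipeline([
        node(double_values, inputs="input_data", outputs="doubled"),
        node(filter_values, inputs="doubled", outputs="filtered"),
        node(calculate_stats, inputs="filtered", outputs="stats"),
    ])

def run_pipeline(input_data: list) -> dict:
    """Run the pipeline with ParallelRunner and return the 'stats' output."""
    catalog = DataCatalog({"input_data": MemoryDataset(input_data)})
    results = ParallelRunner().run(create_pipeline(), catalog)
    # The runner returns the pipeline's free outputs keyed by dataset name;
    # the exact return shape can vary between Kedro versions.
    return results["stats"]
```

Note that these three nodes form a strictly sequential chain, so `ParallelRunner` cannot overlap their execution; parallelism pays off when a pipeline contains independent branches.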