Workspace: tessl
Visibility: Public
Describes: pkg:pypi/kedro@1.1.x

tessl/pypi-kedro

tessl install tessl/pypi-kedro@1.1.0

Kedro helps you build production-ready data and analytics pipelines

Agent Success

  • Agent success rate when using this tile: 98%
  • Improvement over baseline: 1.32x
  • Baseline (agent success rate without this tile): 74%

evals/scenario-14/task.md

Data Processing Pipeline Manager

A utility that manages data processing workflows using a centralized data catalog system. The manager should provide a unified interface for loading, saving, and checking the existence of datasets.

Capabilities

Dataset Management

  • Given a catalog configuration dictionary, it creates a data catalog instance and saves a pandas DataFrame to a dataset named "raw_data" @test
  • Given a catalog with a saved dataset "processed_data", it loads the dataset and returns the data @test
  • Given a catalog, it reports whether the dataset "metrics" exists @test

Custom Dataset Support

  • It creates and registers a custom in-memory dataset that stores Python dictionaries @test
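
The storage behaviour of such a dataset can be sketched without any dependencies, as below. This is an illustration only: the `exists()` method, the type check, and the deep copies are choices made for the sketch, not requirements stated in the task. To register it with a Kedro catalog, the class would presumably subclass `kedro.io.AbstractDataset` and add a description method; that step is left out here to keep the example self-contained.

```python
import copy
from typing import Any, Dict, Optional


class DictMemoryDataset:
    """Dependency-free sketch of an in-memory dataset that stores dictionaries."""

    def __init__(self, data: Optional[Dict[str, Any]] = None) -> None:
        self._data: Optional[Dict[str, Any]] = None
        if data is not None:
            self.save(data)

    def save(self, data: Dict[str, Any]) -> None:
        # Reject anything that is not a dict so misuse fails early.
        if not isinstance(data, dict):
            raise TypeError(f"expected dict, got {type(data).__name__}")
        # Store a copy so later changes by the caller do not leak in.
        self._data = copy.deepcopy(data)

    def load(self) -> Dict[str, Any]:
        if self._data is None:
            raise ValueError("no data has been saved yet")
        # Return a copy so callers cannot mutate the stored value.
        return copy.deepcopy(self._data)

    def exists(self) -> bool:
        # True once save() has been called with a dict.
        return self._data is not None


dataset = DictMemoryDataset()
dataset.save({"model": {"accuracy": 0.9}})
snapshot = dataset.load()
snapshot["model"]["accuracy"] = 0.0  # changes only the caller's copy
```

Copying on both `save` and `load` trades some speed for isolation between pipeline steps; dropping the copies would be a reasonable choice for large dictionaries.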

Implementation

@generates

API

from typing import Any, Dict

def create_catalog_from_config(config: Dict[str, Any]) -> Any:
    """
    Create a DataCatalog instance from a configuration dictionary.

    Args:
        config: Configuration dictionary for the catalog

    Returns:
        Data catalog instance
    """
    pass

def save_dataset(catalog: Any, dataset_name: str, data: Any) -> None:
    """
    Save data to a dataset in the catalog.

    Args:
        catalog: The data catalog instance
        dataset_name: Name of the dataset to save to
        data: Data to save
    """
    pass

def load_dataset(catalog: Any, dataset_name: str) -> Any:
    """
    Load data from a dataset in the catalog.

    Args:
        catalog: The data catalog instance
        dataset_name: Name of the dataset to load from

    Returns:
        The loaded data
    """
    pass

def dataset_exists(catalog: Any, dataset_name: str) -> bool:
    """
    Check if a dataset exists in the catalog.

    Args:
        catalog: The data catalog instance
        dataset_name: Name of the dataset to check

    Returns:
        True if dataset exists, False otherwise
    """
    pass

class DictMemoryDataset:
    """
    A custom in-memory dataset for storing Python dictionaries.
    Should implement load() and save() methods to work with the data catalog.
    """
    pass

Dependencies { .dependencies }

kedro { .dependency }

Provides data catalog management and dataset abstractions.