Kedro helps you build production-ready data and analytics pipelines
Overall score: 98%
A utility that manages data processing workflows using a centralized data catalog system. The manager should provide a unified interface for loading, saving, and checking the existence of datasets.
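As an illustration, the configuration dictionary passed to the catalog could mirror the structure of a Kedro `catalog.yml` expressed as a Python dict. The dataset names and type strings below are examples, not part of the spec:

```python
# A possible catalog configuration, modelled on Kedro's catalog.yml
# structure; entry names and dataset types here are illustrative.
catalog_config = {
    "reviews": {
        "type": "pandas.CSVDataset",          # file-backed dataset
        "filepath": "data/01_raw/reviews.csv",
    },
    "model_input": {
        "type": "MemoryDataset",              # transient, in-memory
    },
}
```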
@generates
from typing import Any, Dict


def create_catalog_from_config(config: Dict[str, Any]) -> Any:
    """
    Create a DataCatalog instance from a configuration dictionary.

    Args:
        config: Configuration dictionary for the catalog

    Returns:
        Data catalog instance
    """
    pass
def save_dataset(catalog: Any, dataset_name: str, data: Any) -> None:
    """
    Save data to a dataset in the catalog.

    Args:
        catalog: The data catalog instance
        dataset_name: Name of the dataset to save to
        data: Data to save
    """
    pass
def load_dataset(catalog: Any, dataset_name: str) -> Any:
    """
    Load data from a dataset in the catalog.

    Args:
        catalog: The data catalog instance
        dataset_name: Name of the dataset to load from

    Returns:
        The loaded data
    """
    pass
def dataset_exists(catalog: Any, dataset_name: str) -> bool:
    """
    Check if a dataset exists in the catalog.

    Args:
        catalog: The data catalog instance
        dataset_name: Name of the dataset to check

    Returns:
        True if dataset exists, False otherwise
    """
    pass
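One way the four functions above could be realised is sketched below, using a plain dict of in-memory datasets as the catalog; the `_InMemoryDataset` helper is hypothetical. With Kedro itself, `kedro.io.DataCatalog` already offers `from_config`, `save`, `load`, and `exists` methods that map one-to-one onto these functions:

```python
from typing import Any, Dict


class _InMemoryDataset:
    """Hypothetical stand-in for a catalog dataset; real Kedro would
    resolve each config entry's "type" key to a dataset class."""

    def __init__(self) -> None:
        self._data: Any = None

    def save(self, data: Any) -> None:
        self._data = data

    def load(self) -> Any:
        return self._data


def create_catalog_from_config(config: Dict[str, Any]) -> Dict[str, _InMemoryDataset]:
    # Register one dataset per configured name.
    return {name: _InMemoryDataset() for name in config}


def save_dataset(catalog: Dict[str, _InMemoryDataset], dataset_name: str, data: Any) -> None:
    catalog[dataset_name].save(data)


def load_dataset(catalog: Dict[str, _InMemoryDataset], dataset_name: str) -> Any:
    return catalog[dataset_name].load()


def dataset_exists(catalog: Dict[str, _InMemoryDataset], dataset_name: str) -> bool:
    # Kedro's DataCatalog.exists also consults the dataset itself;
    # here existence simply means the name is registered.
    return dataset_name in catalog
```

This keeps the unified load/save/exists interface the spec asks for while leaving the dataset type pluggable.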
class DictMemoryDataset:
    """
    A custom in-memory dataset for storing Python dictionaries.
    Should implement load() and save() methods to work with the data catalog.
    """
    pass

Provides data catalog management and dataset abstractions.
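A minimal sketch of `DictMemoryDataset`, satisfying the load()/save() contract above. Note that production Kedro custom datasets usually subclass `kedro.io.AbstractDataset` and implement `_load`/`_save`/`_describe` instead; the copy-on-access behaviour here is an assumption, chosen so callers cannot mutate stored state:

```python
import copy
from typing import Any, Dict, Optional


class DictMemoryDataset:
    """In-memory dataset restricted to Python dictionaries."""

    def __init__(self, data: Optional[Dict[str, Any]] = None) -> None:
        self._data = copy.deepcopy(data) if data is not None else None

    def save(self, data: Dict[str, Any]) -> None:
        if not isinstance(data, dict):
            raise TypeError("DictMemoryDataset only stores dictionaries")
        # Deep-copy so later mutation of the caller's dict is not visible.
        self._data = copy.deepcopy(data)

    def load(self) -> Dict[str, Any]:
        if self._data is None:
            raise ValueError("No data has been saved yet")
        return copy.deepcopy(self._data)
```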
Install with Tessl CLI
npx tessl i tessl/pypi-kedro