tessl install tessl/pypi-kedro@1.1.0
Kedro helps you build production-ready data and analytics pipelines.
Agent Success (agent success rate when using this tile): 98%
Improvement (agent success rate improvement when using this tile compared to baseline): 1.32x
Baseline (agent success rate without this tile): 74%
A utility that resolves dataset names to their configuration templates based on pattern matching with specificity ranking.
When working with large data pipelines, you often need to handle multiple datasets that follow similar naming conventions. Rather than configuring each dataset individually, pattern-based configuration allows you to define templates that match multiple dataset names. When multiple patterns could match a dataset name, the system needs to determine which pattern is the best match based on specificity.
Build a dataset pattern resolver that accepts a list of pattern definitions and resolves dataset names to their configurations:
Patterns use {} to denote placeholders (e.g., {namespace}, {name})
Patterns can combine literal text with placeholders (e.g., "data.int_{name}")
The pattern * serves as a catch-all wildcard
When multiple patterns match a dataset name:
A pattern with more literal characters is more specific
The catch-all * is always least specific
Return None when no pattern matches
Given patterns [("{namespace}.{name}", {"type": "A"}), ("data.{name}", {"type": "B"}), ("*", {"type": "C"})], resolving "data.sales" returns {"type": "B"} because "data.{name}" has more literal characters than "{namespace}.{name}" and is therefore more specific @test
Given patterns [("{namespace}.{dataset}", {"type": "A"}), ("{prefix}_{suffix}", {"type": "B"})], resolving "sales.revenue" returns {"type": "A"} because it matches the pattern with a dot separator @test
Given patterns [("raw.{name}", {"type": "A"}), ("*", {"type": "B"})], resolving "processed.data" returns {"type": "B"} because only the catch-all matches @test
Given patterns [("{namespace}.{name}", {"type": "A"})], resolving "sales.revenue" also extracts placeholders as {"namespace": "sales", "name": "revenue"} @test
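The matching and ranking rules exercised by the test cases above can be sketched with the standard library alone. This is a minimal illustration, not Kedro's implementation: the helper names `pattern_to_regex`, `literal_count`, and `resolve_best` are hypothetical, and the sketch assumes that a placeholder matches one or more characters of any kind, that placeholder names are unique within a pattern, and that patterns with equal literal counts keep their input order.

```python
import re

# Matches a "{placeholder}" token whose name is a valid identifier.
PLACEHOLDER = re.compile(r"\{([A-Za-z_]\w*)\}")


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a '{placeholder}' pattern into a regex with named groups."""
    if pattern == "*":
        # Catch-all: matches any dataset name and captures nothing.
        return re.compile(r".*", re.DOTALL)
    parts = []
    pos = 0
    for token in PLACEHOLDER.finditer(pattern):
        # Literal text between placeholders must match exactly.
        parts.append(re.escape(pattern[pos:token.start()]))
        # Assumption: a placeholder matches one or more characters.
        parts.append(f"(?P<{token.group(1)}>.+?)")
        pos = token.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("".join(parts))


def literal_count(pattern: str) -> int:
    """Count the characters of a pattern that are not part of a placeholder."""
    return len(PLACEHOLDER.sub("", pattern))


def resolve_best(patterns, dataset_name):
    """Return (config, placeholders) for the most specific match, or None."""
    # Catch-all sorts last; otherwise more literal characters sort first.
    # sorted() is stable, so ties keep their input order (an assumption).
    ranked = sorted(
        patterns,
        key=lambda item: (item[0] == "*", -literal_count(item[0])),
    )
    for pattern, config in ranked:
        match = pattern_to_regex(pattern).fullmatch(dataset_name)
        if match:
            return config, match.groupdict()
    return None


patterns = [
    ("{namespace}.{name}", {"type": "A"}),
    ("data.{name}", {"type": "B"}),
    ("*", {"type": "C"}),
]
print(resolve_best(patterns, "data.sales"))  # ({'type': 'B'}, {'name': 'sales'})
```

Ranking before matching keeps the lookup a single pass: the first pattern in ranked order that fully matches the name is the most specific one. A resolver class could sort once at construction time instead of on every call.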
@generates
class PatternResolver:
    """Resolves dataset names to configurations based on pattern matching with specificity ranking."""

    def __init__(self, patterns: list[tuple[str, dict]]):
        """
        Initialize the resolver with a list of patterns.

        Args:
            patterns: List of (pattern_string, config_dict) tuples
        """
        pass

    def resolve(self, dataset_name: str) -> dict | None:
        """
        Resolve a dataset name to its configuration.

        Args:
            dataset_name: The name of the dataset to resolve

        Returns:
            The configuration dict for the best matching pattern, or None if no match
        """
        pass

    def extract_placeholders(self, dataset_name: str) -> dict[str, str] | None:
        """
        Extract placeholder values from a dataset name using the best matching pattern.

        Args:
            dataset_name: The name of the dataset

        Returns:
            Dictionary mapping placeholder names to their values, or None if no match
        """
        pass

Provides data catalog and pattern matching support.