tessl install tessl/pypi-apache-airflow-backport-providers-apache-sqoop@2021.3.0

Apache Airflow backport provider package for Apache Sqoop integration, providing SqoopHook and SqoopOperator for data import/export between relational databases and Hadoop.
Agent Success: 92% (agent success rate when using this tile)
Improvement: 1.39x (agent success rate improvement when using this tile compared to baseline)
Baseline: 66% (agent success rate without this tile)
Create a small utility that pulls data from a relational source directly into a Hive/HCatalog table, using the package's HCatalog-aware import support and an opt-in table creation flag.
If create_table is true, the job creates the target table before loading; if create_table is false, a missing table yields a clear failure, and an existing table is left intact. @test If a partition is provided (e.g. {"dt": "2024-11-11"}), the import lands data under that static partition path within the Hive table without overwriting other partitions. @test @generates
Provides Sqoop-based HCatalog import capabilities and optional Hive table creation.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class HCatalogImportConfig:
    source_table: str
    warehouse_dir: str
    hcatalog_database: str
    hcatalog_table: str
    create_table: bool
    partition: Optional[Dict[str, str]] = None
    split_by: Optional[str] = None
    num_mappers: int = 1
    extra_options: Optional[Dict[str, str]] = None


@dataclass
class ImportResult:
    warehouse_path: str
    rows_imported: int


def run_hcatalog_import(config: HCatalogImportConfig) -> ImportResult:
    """Runs an HCatalog-targeted import job using the configured data movement provider."""
    ...
```