Workspace: tessl
Visibility: Public
Describes: pkg:pypi/apache-airflow-backport-providers-apache-sqoop@2021.3.x
Manifest: tile.json

tessl/pypi-apache-airflow-backport-providers-apache-sqoop

tessl install tessl/pypi-apache-airflow-backport-providers-apache-sqoop@2021.3.0

Apache Airflow backport provider package for Apache Sqoop integration, providing SqoopHook and SqoopOperator for data import/export between relational databases and Hadoop.

Agent Success: 92% (agent success rate when using this tile)

Improvement: 1.39x (agent success rate improvement over the no-tile baseline)

Baseline: 66% (agent success rate without this tile)

evals/scenario-9/task.md

Hive Export Pipeline

Design an Airflow DAG that exports curated Hive data into a relational database table through the dependency's Sqoop export capability. The DAG must reliably move data from a specific HDFS directory into the target table, with explicit handling for null sentinels, delimiters, and staging safety.

@generates

Capabilities

Build export DAG

  • build_export_dag returns a DAG containing exactly one export task that moves data from the provided source_dir in HDFS/Hive into the provided target_table using the provided connection id; the DAG id and schedule come from the function arguments. @test

Null and delimiter handling

  • The export task uses the provided null sentinels for string and non-string columns plus the provided field and line delimiters when writing into the database. @test

Staging and cleanup

  • The export task routes writes through the provided staging_table and clears that staging table before loading the final target. @test

Batch and isolation options

  • The export task enables batch execution and relaxed isolation when the corresponding flags are true, preserving all other task options. @test

API

from airflow import DAG

def build_export_dag(
    dag_id: str,
    conn_id: str,
    source_dir: str,
    target_table: str,
    staging_table: str,
    null_string: str,
    null_non_string: str,
    field_delimiter: str,
    line_delimiter: str,
    schedule: str = "@daily",
    clear_staging: bool = True,
    use_batch: bool = True,
    relaxed_isolation: bool = True,
) -> DAG:
    """Creates a DAG that performs the export as described."""

Dependencies { .dependencies }

apache-airflow-backport-providers-apache-sqoop { .dependency }

Provides Airflow support for Sqoop-based imports and exports between HDFS/Hive and relational databases. @satisfied-by
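
As a quick reference, the backport package exposes the hook and operator under the airflow.providers namespace even on Airflow 1.10:

```python
# Import paths provided by apache-airflow-backport-providers-apache-sqoop.
from airflow.providers.apache.sqoop.hooks.sqoop import SqoopHook
from airflow.providers.apache.sqoop.operators.sqoop import SqoopOperator
```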