tessl/pypi-apache-airflow-providers-dbt-cloud

tessl install tessl/pypi-apache-airflow-providers-dbt-cloud@4.4.0

Provider package for integrating Apache Airflow with dbt Cloud for data transformation workflow orchestration

dbt Cloud Artifact Batch Downloader

A utility that downloads multiple artifacts from a dbt Cloud job run concurrently using asynchronous I/O.

Background

When analyzing dbt Cloud job runs, you often need to retrieve multiple artifacts (like manifest.json, run_results.json, and catalog.json) from the same job run. Downloading these files sequentially can be slow, especially when network latency is high. This utility should download multiple artifacts concurrently to minimize total download time.

Requirements

Concurrent Download Capability

Your solution should download multiple artifacts from a single dbt Cloud job run simultaneously rather than sequentially. The solution should:

  • Accept a list of artifact paths to download (e.g., ["manifest.json", "run_results.json", "catalog.json"])
  • Download all artifacts concurrently using asynchronous I/O
  • Return all downloaded artifact contents
  • Handle an unavailable artifact gracefully, so that one failed download does not prevent the other artifacts from being returned
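
The requirements above can be sketched with `asyncio.gather`. This is a minimal illustration, not the reference solution: `gather_artifacts`, `Fetcher`, and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for a real dbt Cloud call so the example runs without network access.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical fetcher type: takes an artifact path, returns its text content.
Fetcher = Callable[[str], Awaitable[str]]


async def gather_artifacts(
    fetch: Fetcher, artifact_paths: list[str]
) -> tuple[dict[str, str], dict[str, BaseException]]:
    """Fetch every path concurrently and split successes from failures."""
    # With return_exceptions=True a failed download comes back as a value
    # instead of propagating out of gather and hiding the other results.
    results = await asyncio.gather(
        *(fetch(path) for path in artifact_paths), return_exceptions=True
    )
    contents: dict[str, str] = {}
    errors: dict[str, BaseException] = {}
    # gather preserves input order, so zip re-associates paths with results.
    for path, result in zip(artifact_paths, results):
        if isinstance(result, BaseException):
            errors[path] = result
        else:
            contents[path] = result
    return contents, errors


async def fake_fetch(path: str) -> str:
    """Stand-in for a real dbt Cloud request (illustrative only)."""
    await asyncio.sleep(0.01)  # simulated network latency
    if path == "missing.json":
        raise FileNotFoundError(path)
    return f"contents of {path}"


if __name__ == "__main__":
    ok, failed = asyncio.run(
        gather_artifacts(fake_fetch, ["manifest.json", "missing.json"])
    )
    print(ok)
    print(failed)
```

Returning the failures separately is only one possible reading of "gracefully"; omitting failed paths or mapping them to a sentinel value would satisfy the same requirement.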

Configuration

The utility should accept the following inputs:

  • Account ID for the dbt Cloud account
  • Run ID for the specific job run
  • List of artifact paths to retrieve
  • Optional step number for multi-step jobs

Output

The utility should return the downloaded artifacts in a usable format that maintains the association between artifact paths and their contents.

Test Cases

  • Given a valid run ID and a list of 3 artifact paths, all 3 artifacts are downloaded and their contents are returned. @test
  • Given a run ID and a list containing both valid and invalid artifact paths, the valid artifacts are downloaded successfully and the invalid paths do not cause the whole call to fail. @test
  • Given an empty list of artifact paths, the function completes without errors and returns an empty result. @test
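
The three cases above could be exercised along the following lines. This is a sketch under stated assumptions: `download_all` is a compact stand-in for the function under test, `fake_fetch` and `AVAILABLE` are hypothetical in-memory fakes, and the policy of leaving failed paths out of the result is one interpretation of "handled appropriately", not something the spec mandates.

```python
import asyncio


async def download_all(fetch, artifact_paths):
    """Minimal stand-in for the function under test (not the real implementation)."""
    results = await asyncio.gather(
        *(fetch(path) for path in artifact_paths), return_exceptions=True
    )
    # Assumed policy: paths that fail are simply left out of the result.
    return {
        path: result
        for path, result in zip(artifact_paths, results)
        if not isinstance(result, BaseException)
    }


# In-memory fake of the artifacts available for one run.
AVAILABLE = {
    "manifest.json": "{}",
    "run_results.json": '{"results": []}',
    "catalog.json": '{"nodes": {}}',
}


async def fake_fetch(path):
    """Return the fake artifact, or fail the way a missing artifact might."""
    if path not in AVAILABLE:
        raise FileNotFoundError(path)
    return AVAILABLE[path]


def test_three_valid_paths():
    result = asyncio.run(download_all(fake_fetch, list(AVAILABLE)))
    assert result == AVAILABLE


def test_mixed_valid_and_invalid_paths():
    result = asyncio.run(download_all(fake_fetch, ["manifest.json", "nope.json"]))
    assert result == {"manifest.json": "{}"}


def test_empty_list():
    assert asyncio.run(download_all(fake_fetch, [])) == {}
```

The functions follow the `test_*` naming convention, so a runner such as pytest would collect them as written.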

Implementation

@generates

API

async def download_artifacts_concurrently(
    account_id: int,
    run_id: int,
    artifact_paths: list[str],
    step_number: int | None = None
) -> dict[str, str]:
    """
    Download multiple artifacts from a dbt Cloud job run concurrently.

    Args:
        account_id: The dbt Cloud account ID
        run_id: The job run ID
        artifact_paths: List of artifact paths to download (e.g., ["manifest.json", "run_results.json"])
        step_number: Optional step number for multi-step jobs

    Returns:
        Dictionary mapping artifact paths to their contents

    Raises:
        ValueError: If account_id or run_id is invalid
    """
    pass

Dependencies { .dependencies }

apache-airflow-providers-dbt-cloud { .dependency }

Provides integration with dbt Cloud API for artifact retrieval.
