Workspace: tessl
Visibility: Public
Describes: pkg:pypi/azure-storage-file-datalake@12.21.x

tessl/pypi-azure-storage-file-datalake

tessl install tessl/pypi-azure-storage-file-datalake@12.21.0

Microsoft Azure File DataLake Storage Client Library for Python

Agent Success: 92% (agent success rate when using this tile)

Improvement: 0.99x (agent success rate improvement when using this tile compared to baseline)

Baseline: 93% (agent success rate without this tile)

evals/scenario-10/task.md

DataLake File Download Utility

Build a utility that downloads files from Azure DataLake Storage Gen2 with support for progress tracking and content validation.

Requirements

Your utility should provide functionality to:

  1. Basic File Download: Download a complete file from a DataLake file system to local storage
  2. Progress Tracking: Report download progress as bytes are transferred
  3. Content Validation: Verify the downloaded file's integrity using content MD5 hash validation
  4. Error Handling: Handle common error scenarios such as missing files or authentication failures
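
Progress tracking and content validation can share a single pass over the downloaded bytes. The following is a minimal sketch using only the standard library. `save_chunks` and `validate_md5` are hypothetical helpers, not part of the Azure SDK: the first writes any iterable of byte chunks to disk, reports progress after each chunk, and returns the MD5 digest of what was written; the second compares that digest with an expected one. In a real downloader the chunks would come from the SDK's download stream and the expected digest from the file's stored content MD5, assuming the service holds one.

```python
import hashlib
from typing import Callable, Iterable, Optional


def save_chunks(
    chunks: Iterable[bytes],
    total_bytes: int,
    local_destination: str,
    progress_callback: Optional[Callable[[int, int], None]] = None,
) -> bytes:
    """Write chunks to local_destination and return the MD5 digest of the data.

    Hypothetical helper for illustration; not part of azure-storage-file-datalake.
    """
    digest = hashlib.md5()
    bytes_downloaded = 0
    with open(local_destination, "wb") as handle:
        for chunk in chunks:
            handle.write(chunk)
            digest.update(chunk)
            bytes_downloaded += len(chunk)
            if progress_callback is not None:
                # Callback receives (bytes_downloaded, total_bytes), as in the spec.
                progress_callback(bytes_downloaded, total_bytes)
    return digest.digest()


def validate_md5(actual: bytes, expected: Optional[bytes]) -> None:
    """Raise ValueError if no expected digest exists or the digests differ."""
    if expected is None:
        raise ValueError("no stored content MD5 to validate against")
    if bytes(expected) != actual:
        raise ValueError("content MD5 mismatch")
```

Hashing while writing avoids re-reading the local file afterwards, at the cost of tying validation to the download loop.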

Implementation Details

Create a Python module datalake_downloader.py that implements a DataLakeDownloader class with the following methods:

  • download_file(file_system_name, file_path, local_destination): Downloads a file from the specified file system and path to a local destination
  • download_file_with_progress(file_system_name, file_path, local_destination, progress_callback): Downloads a file and reports progress via a callback function
  • download_file_with_validation(file_system_name, file_path, local_destination): Downloads a file and validates its content integrity

The class should be initialized with appropriate authentication credentials for Azure DataLake Storage.
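
One way the initialization might look is sketched below. `build_account_url` and `make_service_client` are hypothetical helper names; the sketch assumes the public Azure cloud DFS endpoint and that `DataLakeServiceClient` is constructed with an account URL and a `credential` keyword. The SDK import is deferred so the module still loads when the package is not installed.

```python
def build_account_url(account_name: str) -> str:
    """Return the DFS endpoint URL for a storage account in the public Azure cloud."""
    return f"https://{account_name}.dfs.core.windows.net"


def make_service_client(account_url: str, credential):
    """Create a DataLakeServiceClient; the import is deferred until first use."""
    from azure.storage.filedatalake import DataLakeServiceClient

    # credential may be a SAS token string, an account key string, or an
    # Azure AD token credential such as azure.identity.DefaultAzureCredential().
    return DataLakeServiceClient(account_url, credential=credential)
```

For example, `make_service_client(build_account_url("myaccount"), sas_token)` would target a hypothetical account named `myaccount`.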

Test Cases

  • Downloading a simple text file (100 bytes) completes successfully and the content matches the source file @test
  • Downloading a file with progress tracking invokes the progress callback with increasing byte counts @test
  • Downloading with content validation successfully validates MD5 hash of downloaded content @test
  • Attempting to download a non-existent file raises an appropriate exception @test
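
These cases can be exercised without a storage account by substituting an in-memory double for the service client. The sketch below is one possible shape: the `Fake*` classes and `MissingFileError` are hypothetical stand-ins that mimic only the call chain assumed here (`get_file_system_client`, `get_file_client`, `download_file`, then `chunks()` or `readall()` on the result). The tests drive the double directly; a real test suite would point `DataLakeDownloader` at it instead.

```python
class MissingFileError(Exception):
    """Stand-in for the SDK's not-found error; the real client raises its own type."""


class FakeDownload:
    """Mimics the parts of a download stream used here: size, chunks(), readall()."""

    def __init__(self, data: bytes, chunk_size: int = 32):
        self.size = len(data)
        self._data = data
        self._chunk_size = chunk_size

    def chunks(self):
        for start in range(0, len(self._data), self._chunk_size):
            yield self._data[start:start + self._chunk_size]

    def readall(self) -> bytes:
        return self._data


class FakeFileClient:
    def __init__(self, files: dict, key: tuple):
        self._files = files
        self._key = key

    def download_file(self, **kwargs) -> FakeDownload:
        if self._key not in self._files:
            raise MissingFileError("/".join(self._key))
        return FakeDownload(self._files[self._key])


class FakeFileSystemClient:
    def __init__(self, files: dict, name: str):
        self._files = files
        self._name = name

    def get_file_client(self, file_path: str) -> FakeFileClient:
        return FakeFileClient(self._files, (self._name, file_path))


class FakeServiceClient:
    """In-memory double keyed by (file_system_name, file_path)."""

    def __init__(self, files: dict):
        self._files = files

    def get_file_system_client(self, file_system_name: str) -> FakeFileSystemClient:
        return FakeFileSystemClient(self._files, file_system_name)


def test_simple_download_matches_source():
    source = bytes(range(100))  # the 100-byte case
    service = FakeServiceClient({("fs", "hello.txt"): source})
    stream = service.get_file_system_client("fs").get_file_client("hello.txt").download_file()
    assert stream.readall() == source


def test_progress_counts_increase():
    service = FakeServiceClient({("fs", "hello.txt"): bytes(100)})
    stream = service.get_file_system_client("fs").get_file_client("hello.txt").download_file()
    seen, done = [], 0
    for chunk in stream.chunks():
        done += len(chunk)
        seen.append(done)
    assert seen == sorted(set(seen)) and seen[-1] == stream.size


def test_missing_file_raises():
    service = FakeServiceClient({})
    try:
        service.get_file_system_client("fs").get_file_client("nope.txt").download_file()
    except MissingFileError:
        return
    raise AssertionError("expected MissingFileError")
```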

Implementation

@generates

Dependencies { .dependencies }

azure-storage-file-datalake { .dependency }

Provides Azure DataLake Storage Gen2 client functionality for file operations.

API

from typing import Callable, Optional

class DataLakeDownloader:
    """
    A utility for downloading files from Azure DataLake Storage Gen2.
    """

    def __init__(self, account_url: str, credential):
        """
        Initialize the downloader with Azure DataLake Storage credentials.

        Args:
            account_url: The URL to the DataLake storage account
            credential: Authentication credential (can be SAS token, shared key, or Azure AD credential)
        """
        pass

    def download_file(self, file_system_name: str, file_path: str, local_destination: str) -> None:
        """
        Download a file from DataLake Storage to a local destination.

        Args:
            file_system_name: Name of the file system (container) containing the file
            file_path: Path to the file within the file system
            local_destination: Local file path where the file should be saved

        Raises:
            Exception: If the file does not exist or download fails
        """
        pass

    def download_file_with_progress(
        self,
        file_system_name: str,
        file_path: str,
        local_destination: str,
        progress_callback: Callable[[int, int], None]
    ) -> None:
        """
        Download a file with progress tracking.

        Args:
            file_system_name: Name of the file system containing the file
            file_path: Path to the file within the file system
            local_destination: Local file path where the file should be saved
            progress_callback: Callback function that receives (bytes_downloaded, total_bytes)

        Raises:
            Exception: If the file does not exist or download fails
        """
        pass

    def download_file_with_validation(
        self,
        file_system_name: str,
        file_path: str,
        local_destination: str
    ) -> None:
        """
        Download a file and validate its content integrity using MD5 hash.

        Args:
            file_system_name: Name of the file system containing the file
            file_path: Path to the file within the file system
            local_destination: Local file path where the file should be saved

        Raises:
            Exception: If the file does not exist, download fails, or validation fails
        """
        pass
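
A minimal sketch of how the stubs above might be filled in follows. It is not a reference implementation. It assumes the SDK's download stream exposes `size`, `chunks()`, and `readinto()`, and that file properties expose `content_settings.content_md5`. The optional `service_client` parameter is a hypothetical addition to the specified signature, included only so that tests can inject a double. SDK exceptions (for example a not-found error for a missing file) are left to propagate.

```python
import hashlib
from typing import Callable


class DataLakeDownloader:
    """Sketch of the specified class; service_client is a hypothetical test seam."""

    def __init__(self, account_url: str, credential, service_client=None):
        if service_client is None:
            # Deferred import so this module loads even without the SDK installed.
            from azure.storage.filedatalake import DataLakeServiceClient

            service_client = DataLakeServiceClient(account_url, credential=credential)
        self._service = service_client

    def _file_client(self, file_system_name: str, file_path: str):
        file_system = self._service.get_file_system_client(file_system_name)
        return file_system.get_file_client(file_path)

    def download_file(self, file_system_name: str, file_path: str, local_destination: str) -> None:
        # Start the download before opening the local file, so a failure in
        # this call does not leave an empty local file behind.
        stream = self._file_client(file_system_name, file_path).download_file()
        with open(local_destination, "wb") as handle:
            stream.readinto(handle)

    def download_file_with_progress(
        self,
        file_system_name: str,
        file_path: str,
        local_destination: str,
        progress_callback: Callable[[int, int], None],
    ) -> None:
        stream = self._file_client(file_system_name, file_path).download_file()
        total_bytes = stream.size
        bytes_downloaded = 0
        with open(local_destination, "wb") as handle:
            for chunk in stream.chunks():
                handle.write(chunk)
                bytes_downloaded += len(chunk)
                progress_callback(bytes_downloaded, total_bytes)

    def download_file_with_validation(
        self, file_system_name: str, file_path: str, local_destination: str
    ) -> None:
        client = self._file_client(file_system_name, file_path)
        expected = client.get_file_properties().content_settings.content_md5
        if expected is None:
            raise ValueError("file has no stored content MD5 to validate against")
        stream = client.download_file()
        digest = hashlib.md5()
        with open(local_destination, "wb") as handle:
            for chunk in stream.chunks():
                handle.write(chunk)
                digest.update(chunk)
        if digest.digest() != bytes(expected):
            raise ValueError("downloaded content does not match stored MD5")
```

Comparing against the stored content MD5 only works when the file was uploaded with one; the sketch raises `ValueError` rather than silently skipping validation in that case.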