Describes: pkg:pypi/azure-storage-file-datalake@12.21.x

tessl/pypi-azure-storage-file-datalake

tessl install tessl/pypi-azure-storage-file-datalake@12.21.0

Microsoft Azure File DataLake Storage Client Library for Python

Agent success rate with this tile: 92%
Baseline (agent success rate without this tile): 93%
Improvement over baseline: 0.99x


DataLake Path Analyzer

A utility for analyzing and reporting on path structures in Azure Data Lake Storage Gen2 file systems. This tool helps users understand their storage organization by providing insights into directory structures, file distributions, and path hierarchies.

Capabilities

Recursive Path Listing

List all paths (files and directories) within a file system recursively, capturing complete hierarchy information.

  • Given a file system with nested directories and files, listing paths recursively returns all paths including subdirectories @test
  • When listing recursively from the root with no path filter, all paths in the entire file system are returned @test
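A minimal sketch of recursive listing, assuming fs_client is a FileSystemClient whose get_paths method accepts a recursive keyword and yields items with a name attribute. The helper name list_paths_recursively is hypothetical and not part of the API below.

```python
from typing import List


def list_paths_recursively(fs_client) -> List[str]:
    """Return the name of every path in the file system, at any depth."""
    # Calling get_paths() without a `path` argument starts at the root of
    # the file system; recursive=True asks for paths in all subdirectories.
    return [p.name for p in fs_client.get_paths(recursive=True)]
```

The returned names are full paths relative to the file system root, so nested entries such as reports/2024/q1.csv appear alongside their parent directories.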

Non-Recursive Path Listing

List only the immediate children of a specified directory, without traversing its subdirectories.

  • Given a directory with files and subdirectories, listing paths non-recursively returns only immediate children @test
  • When a directory has no contents, non-recursive listing returns an empty result @test
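Non-recursive listing can be sketched the same way, assuming get_paths takes the directory as its path argument and honors recursive=False. The helper name list_immediate_children is hypothetical.

```python
from typing import List


def list_immediate_children(fs_client, directory: str) -> List[str]:
    """Return the names of the direct children of `directory` only."""
    # recursive=False stops the listing at the first level below `directory`.
    # An existing but empty directory produces an empty iterator, so the
    # result is simply [].
    return [p.name for p in fs_client.get_paths(path=directory, recursive=False)]
```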

Path Filtering by Prefix

Filter paths by a specific prefix to narrow down results to a particular subdirectory or path pattern.

  • Given a file system with multiple directories, listing paths with a specific path prefix returns only matching paths @test
  • When the path prefix doesn't match any existing paths, listing returns an empty result @test
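One way to get the "empty result on no match" behavior is to filter client-side. The sketch below enumerates every path and keeps names that start with the prefix, rather than passing the prefix to get_paths as its path argument. This is a swapped-in technique, not necessarily what the tile intends: the server-side variant scopes the listing to a directory, and it may raise an error for a directory that does not exist instead of returning nothing (an assumption to check against the SDK). The helper name list_paths_with_prefix is hypothetical.

```python
from typing import List


def list_paths_with_prefix(fs_client, prefix: str) -> List[str]:
    """Return names of all paths whose full name starts with `prefix`."""
    # Client-side filter: list the whole file system, then keep matches.
    # A prefix that matches nothing yields an empty list, never an error.
    return [
        p.name
        for p in fs_client.get_paths(recursive=True)
        if p.name.startswith(prefix)
    ]
```

Plain string matching means the prefix logs also matches logs-old/app.txt. Pass logs/ to restrict the result to the contents of one directory. The trade-off is that the whole file system is enumerated on every call.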

Path Type Identification

Distinguish between directories and files in the listed results.

  • Listed paths correctly identify directories using the is_directory property @test
  • Listed paths correctly identify files: is_directory is false or unset for them @test
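The distinction can be sketched as a small partitioning helper. It assumes each listed item exposes name and, for directories, a truthy is_directory attribute. A missing or None flag is treated as "file". The helper name split_by_type is hypothetical.

```python
from typing import Dict, Iterable, List


def split_by_type(paths: Iterable) -> Dict[str, List[str]]:
    """Partition listed paths into directory names and file names."""
    result: Dict[str, List[str]] = {"directory": [], "file": []}
    for p in paths:
        # getattr with a default covers items that lack the flag entirely;
        # the truthiness check also maps None to "file".
        kind = "directory" if getattr(p, "is_directory", False) else "file"
        result[kind].append(p.name)
    return result
```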

Implementation

@generates

API

from typing import Any, Dict, List, Optional

def analyze_file_system(
    fs_client,
    path_prefix: Optional[str] = None,
    recursive: bool = True
) -> Dict[str, Any]:
    """
    Analyze path structure in a file system and return statistics.

    Args:
        fs_client: An initialized FileSystemClient instance
        path_prefix: Optional path prefix to filter results
        recursive: Whether to list paths recursively (default: True)

    Returns:
        dict: Dictionary containing:
            - total_paths: Total number of paths found
            - total_files: Number of files
            - total_directories: Number of directories
            - paths: List of path names
    """

def list_directory_contents(
    directory_client,
    recursive: bool = False,
    max_results: Optional[int] = None
) -> List[str]:
    """
    List contents of a specific directory.

    Args:
        directory_client: An initialized DataLakeDirectoryClient instance
        recursive: Whether to list recursively through subdirectories
        max_results: Optional maximum number of results per page

    Returns:
        list: List of path names within the directory
    """

def filter_paths_by_type(
    fs_client,
    path_type: str,
    path_prefix: Optional[str] = None
) -> List[str]:
    """
    Filter and return paths by their type (file or directory).

    Args:
        fs_client: An initialized FileSystemClient instance
        path_type: Type to filter by - either "file" or "directory"
        path_prefix: Optional path prefix to narrow results

    Returns:
        list: List of path names matching the specified type
    """
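As an illustration of how the first signature might be filled in, here is a minimal sketch of analyze_file_system. It assumes path_prefix maps directly onto the path argument of FileSystemClient.get_paths and that listed items expose name and is_directory. It is one possible reading of the spec, not the reference implementation.

```python
from typing import Any, Dict, List, Optional


def analyze_file_system(
    fs_client,
    path_prefix: Optional[str] = None,
    recursive: bool = True,
) -> Dict[str, Any]:
    """Count files and directories under an optional prefix."""
    names: List[str] = []
    directory_count = 0
    # Assumption: the prefix is passed through as the listing's start path.
    for p in fs_client.get_paths(path=path_prefix, recursive=recursive):
        names.append(p.name)
        # A missing or None flag is counted as a file.
        if getattr(p, "is_directory", False):
            directory_count += 1
    return {
        "total_paths": len(names),
        "total_files": len(names) - directory_count,
        "total_directories": directory_count,
        "paths": names,
    }
```

A single pass over the iterator keeps memory use proportional to the number of names returned and avoids listing the file system twice to compute the counts.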

Dependencies { .dependencies }

azure-storage-file-datalake { .dependency }

Provides the Azure Data Lake Storage Gen2 client library for Python, including path listing and enumeration capabilities.