Microsoft Azure File DataLake Storage Client Library for Python
Build a utility that processes data files stored in Azure Data Lake Storage Gen2 by executing SQL-like queries and converting between different data formats.
Your solution should implement a function that executes a SQL-like query against a file and returns the matching records as dictionaries.
The query processor should work with data files that contain structured records with fields like customer information, order details, or log entries.
@generates
from typing import Iterator, Dict, Any, Optional
from azure.storage.filedatalake import DataLakeFileClient

def query_file_to_json(
    file_client: DataLakeFileClient,
    query_expression: str,
    input_format: str = "csv",
    csv_delimiter: str = ",",
    csv_has_header: bool = True
) -> Iterator[Dict[str, Any]]:
    """
    Execute a SQL-like query on a data file in Azure Data Lake Storage.

    Args:
        file_client: The DataLakeFileClient pointing to the file to query
        query_expression: SQL SELECT statement to execute on the file
        input_format: Input format of the file ("csv" or "json")
        csv_delimiter: Delimiter character for CSV files
        csv_has_header: Whether CSV file has a header row

    Returns:
        Iterator of dictionaries containing the query results

    Raises:
        ValueError: If query_expression is invalid or input_format is unsupported
    """
    pass

Provides Azure Data Lake Storage Gen2 client functionality including file query capabilities.
@satisfied-by
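One possible way to fill in the stub is sketched below. It is not a reference solution: it assumes the SDK's `DataLakeFileClient.query_file` call with `DelimitedTextDialect` and `DelimitedJsonFormat` as the input and output dialects, and it asks the service to emit newline-delimited JSON so that each record can be decoded into a dictionary. The helper `parse_json_records` and the constant `SUPPORTED_FORMATS` are hypothetical names introduced for illustration. Only an empty query is rejected locally; how the service reports a malformed SQL statement is not covered here.

```python
import json
from typing import Any, Dict, Iterable, Iterator

# Hypothetical constant: the two input formats named in the stub's docstring.
SUPPORTED_FORMATS = ("csv", "json")


def parse_json_records(records: Iterable[bytes]) -> Iterator[Dict[str, Any]]:
    """Decode newline-delimited JSON records into dictionaries, skipping blanks."""
    for raw in records:
        line = raw.decode("utf-8").strip()
        if line:
            yield json.loads(line)


def query_file_to_json(
    file_client,
    query_expression: str,
    input_format: str = "csv",
    csv_delimiter: str = ",",
    csv_has_header: bool = True,
) -> Iterator[Dict[str, Any]]:
    # Validate eagerly (this is a plain function, not a generator), so the
    # ValueError is raised at call time rather than on the first next().
    if not query_expression or not query_expression.strip():
        raise ValueError("query_expression must be a non-empty string")
    fmt = input_format.lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported input_format: {input_format!r}")

    # Imported lazily so the validation and parsing helpers can be exercised
    # without the SDK installed.
    from azure.storage.filedatalake import DelimitedJsonFormat, DelimitedTextDialect

    if fmt == "csv":
        file_format = DelimitedTextDialect(
            delimiter=csv_delimiter, has_header=csv_has_header
        )
    else:
        file_format = DelimitedJsonFormat()

    # Assumption: requesting JSON output yields one JSON document per record.
    reader = file_client.query_file(
        query_expression,
        file_format=file_format,
        output_format=DelimitedJsonFormat(),
    )
    return parse_json_records(reader.records())
```

Keeping the record decoding in a separate helper means the parsing logic can be checked with plain byte strings, while the network call stays in one place. A call might look like `query_file_to_json(file_client, "SELECT * from BlobStorage")`; the table name in that query is an assumption based on the query acceleration documentation and should be checked against the service version in use.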
Install with Tessl CLI
npx tessl i tessl/pypi-azure-storage-file-datalake