Workspace: tessl
Visibility: Public
Describes: pkg:pypi/deeplake@4.3.x

tessl/pypi-deeplake

tessl install tessl/pypi-deeplake@4.3.0

Database for AI powered by a storage format optimized for deep-learning applications.

Agent Success

  • With this tile: 75% agent success rate
  • Baseline (without this tile): 47% agent success rate
  • Improvement over baseline: 1.6x

evals/scenario-6/task.md

Concurrent Dataset Manager

Build a dataset management service that handles multiple dataset operations concurrently.

Requirements

Create a Python module that provides a concurrent dataset manager with the following functionality:

  1. Async Dataset Creation: Create multiple datasets concurrently without blocking
  2. Async Query Execution: Execute queries on multiple datasets without blocking
  3. Result Management: Wait for async operations to complete and retrieve results
  4. Operation Cancellation: Cancel pending async operations when needed

The module should leverage async operations to manage multiple datasets efficiently.

Specifications

Input/Output

create_datasets_async(paths: list[str]) -> list

  • Takes a list of dataset paths
  • Returns a list of Future objects for each dataset creation operation
  • Each dataset should be created asynchronously
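
A minimal sketch of the fan-out shape this function needs, written against the standard library so that it runs without deeplake installed. The thread pool and the `_create_dataset` helper are assumptions made for illustration: the helper only creates a directory and stands in for the real creation call. In the actual module it would presumably give way to deeplake's own async creation entry point (the 4.x API reference documents a `create_async` function, but the exact signature should be checked against the installed version).

```python
import tempfile
from concurrent.futures import Future, ThreadPoolExecutor
from pathlib import Path

# Shared pool; a stand-in for the library's own background execution.
_executor = ThreadPoolExecutor(max_workers=4)


def _create_dataset(path: str) -> str:
    # Hypothetical placeholder for the real creation call: it only makes a directory.
    Path(path).mkdir(parents=True, exist_ok=True)
    return path


def create_datasets_async(paths: list[str]) -> list[Future]:
    # submit() returns immediately without waiting for the work to finish,
    # so the caller gets one Future per path right away.
    return [_executor.submit(_create_dataset, p) for p in paths]


root = tempfile.mkdtemp()
paths = [f"{root}/ds{i}" for i in (1, 2, 3)]
futures = create_datasets_async(paths)
created = [f.result() for f in futures]  # blocks until each creation finishes
_executor.shutdown()
```

Returning the futures in the same order as `paths` lets callers match each result back to the path that produced it.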

query_datasets_async(query: str, paths: list[str]) -> list

  • Takes a TQL query string and list of dataset paths
  • Returns a list of Future objects for each query execution
  • Queries should execute asynchronously without blocking
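
The non-blocking requirement can be illustrated with a stand-in that deliberately holds every query at a gate, which shows that the futures come back before any query has finished. Everything here is hypothetical scaffolding: `FAKE_DATASETS` replaces real storage, and `_run_query` ignores the query text and hard-codes the `id > 0` filter rather than parsing TQL. A real implementation would hand the query string to deeplake's async query call instead.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor

# Hypothetical in-memory stand-in for stored datasets: path -> rows.
FAKE_DATASETS = {
    "./ds1": [{"id": 0}, {"id": 1}],
    "./ds2": [{"id": 2}, {"id": 3}],
}

gate = threading.Event()  # held closed to simulate slow queries
executor = ThreadPoolExecutor(max_workers=2)


def _run_query(query: str, path: str) -> list[dict]:
    gate.wait()  # block until the caller opens the gate
    # Placeholder: the query text is not parsed; "id > 0" is hard-coded.
    return [row for row in FAKE_DATASETS[path] if row["id"] > 0]


def query_datasets_async(query: str, paths: list[str]) -> list[Future]:
    # One Future per dataset, in the same order as `paths`.
    return [executor.submit(_run_query, query, p) for p in paths]


futures = query_datasets_async("SELECT * WHERE id > 0", ["./ds1", "./ds2"])
done_at_return = [f.done() for f in futures]  # nothing has finished yet
gate.set()  # let the simulated queries proceed
results = [f.result() for f in futures]
executor.shutdown()
```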

wait_for_operations(futures: list) -> list

  • Takes a list of Future objects
  • Waits for all operations to complete
  • Returns a list of results from the completed operations
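
One way to satisfy this contract is to call the blocking `result()` method on each future in input order. The sketch below assumes only that each future object has such a method, and uses standard-library futures with a hypothetical `_slow_square` workload so that the tasks finish out of order.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def wait_for_operations(futures: list) -> list:
    # result() blocks until that one operation finishes; iterating in input
    # order keeps results[i] aligned with futures[i].
    return [f.result() for f in futures]


def _slow_square(n: int) -> int:
    # Hypothetical workload: smaller n sleeps longer, so tasks finish out of order.
    time.sleep(0.02 * (3 - n))
    return n * n


with ThreadPoolExecutor(max_workers=3) as pool:
    futures = [pool.submit(_slow_square, n) for n in (0, 1, 2)]
    results = wait_for_operations(futures)
```

With standard-library futures, `result()` re-raises any exception from the operation, so one failure aborts the whole wait. The specification does not say how failures should be reported, so that choice is left open here.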

cancel_operation(future) -> bool

  • Takes a Future object
  • Attempts to cancel the operation
  • Returns True if cancellation succeeded, False otherwise
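
The True/False contract matches how `concurrent.futures.Future.cancel()` behaves in the standard library, so the sketch below uses that type as a stand-in. This is an assumption: a library-specific future whose `cancel()` does not report success would need a different way to confirm the outcome. A single-worker pool makes the two cases deterministic, with one task running and one still queued.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor


def cancel_operation(future: Future) -> bool:
    # Standard-library cancel() returns False for a call that is already
    # running or finished, and True when the call was cancelled.
    return future.cancel()


started = threading.Event()
gate = threading.Event()


def _blocker() -> str:
    # Hypothetical long-running operation, held open until the gate is set.
    started.set()
    gate.wait()
    return "finished"


pool = ThreadPoolExecutor(max_workers=1)
running = pool.submit(_blocker)  # picked up by the only worker
queued = pool.submit(_blocker)   # waits in the queue behind `running`
started.wait()                   # make sure the first task is really running

cancelled_queued = cancel_operation(queued)    # not started yet: cancellable
cancelled_running = cancel_operation(running)  # already running: not cancellable

gate.set()
running_result = running.result()
pool.shutdown()
```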

Test Cases

  • Creating 3 datasets at paths ["./ds1", "./ds2", "./ds3"] asynchronously returns 3 Future objects @test
  • Querying 2 datasets with "SELECT * WHERE id > 0" returns 2 Future objects @test
  • Checking a Future's completion status before calling result() returns True once the operation has completed @test
  • Cancelling a Future operation that hasn't started yet succeeds @test
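
The completion-check case can be exercised as follows. This sketch uses the standard library's `done()` on a hypothetical gated operation. deeplake's Future type has its own completion check (documented as `is_completed()` in the 4.x API reference, which should be confirmed against the installed version), so a real test would call that instead.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, wait

gate = threading.Event()


def _operation() -> str:
    # Hypothetical operation that cannot finish until the gate opens.
    gate.wait()
    return "ok"


with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(_operation)
    done_before = future.done()  # operation is still held at the gate
    gate.set()
    wait([future])               # block until the operation finishes
    done_after = future.done()   # completion check, made before result()
    value = future.result()      # already complete, so this returns at once
```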

Implementation

@generates

API

def create_datasets_async(paths: list[str]) -> list:
    """
    Create multiple datasets asynchronously.

    Args:
        paths: List of dataset paths to create

    Returns:
        List of Future objects for each dataset creation
    """
    pass

def query_datasets_async(query: str, paths: list[str]) -> list:
    """
    Execute queries on multiple datasets asynchronously.

    Args:
        query: TQL query string to execute
        paths: List of dataset paths to query

    Returns:
        List of Future objects for each query execution
    """
    pass

def wait_for_operations(futures: list) -> list:
    """
    Wait for all async operations to complete.

    Args:
        futures: List of Future objects to wait for

    Returns:
        List of results from completed operations
    """
    pass

def cancel_operation(future) -> bool:
    """
    Attempt to cancel an async operation.

    Args:
        future: Future object to cancel

    Returns:
        True if cancellation succeeded, False otherwise
    """
    pass

Dependencies { .dependencies }

deeplake { .dependency }

Provides async dataset operations and concurrency support.