
Workspace: tessl
Visibility: Public
Describes: pkg:pypi/deeplake@4.3.x (tile.json)

tessl/pypi-deeplake

tessl install tessl/pypi-deeplake@4.3.0

Database for AI powered by a storage format optimized for deep-learning applications.

Agent Success: 75% (agent success rate when using this tile)

Improvement: 1.6x (agent success rate improvement when using this tile compared to baseline)

Baseline: 47% (agent success rate without this tile)

evals/scenario-5/task.md

Multi-Environment Dataset Manager

Build a dataset management system that works seamlessly across different storage environments, including the local filesystem and cloud storage.

Requirements

Your system must support the following operations:

  1. Dataset Initialization: Create datasets in different storage locations based on URL patterns:

    • Local filesystem paths (e.g., ./local_data, /tmp/datasets)
    • Cloud storage paths (e.g., s3://bucket/path, gcs://bucket/path, azure://container/path)
  2. Dataset Existence Check: Before creating a dataset, verify if it already exists at a given location to avoid overwriting existing data.

  3. Dataset Operations: Implement basic operations that work consistently regardless of storage backend:

    • Create a new dataset with a simple schema (at least 2 columns: one for text and one for numeric data)
    • Add sample data to the dataset (at least 3 rows)
    • Save/commit changes to persistent storage
  4. Cross-Storage Migration: Copy an existing dataset from one storage location to another (e.g., from local to cloud or vice versa).

  5. Dataset Cleanup: Remove datasets from storage when no longer needed.
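Requirement 1's URL-based dispatch can be sketched with a small helper. This helper (`storage_backend`) is not part of the spec's API; it is a hypothetical illustration of classifying a path by its URL scheme so the manager can pick a storage backend:

```python
# Hypothetical helper (not part of the spec's API): classify a dataset
# path by its URL scheme so the manager can choose a storage backend.
from urllib.parse import urlparse

# Schemes named in requirement 1 as cloud storage.
CLOUD_SCHEMES = {"s3", "gcs", "azure"}

def storage_backend(path: str) -> str:
    """Return 'cloud' for s3://, gcs://, azure:// URLs, else 'local'."""
    scheme = urlparse(path).scheme
    return "cloud" if scheme in CLOUD_SCHEMES else "local"
```

Plain filesystem paths such as `./local_data` or `/tmp/datasets` have no URL scheme, so they fall through to `local`.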

Test Cases

  • Given a local path ./test_dataset, create a dataset with columns text (text type) and value (integer type), add 3 sample rows with data, and verify the dataset exists at that location @test

  • Given a dataset exists at ./source_dataset, copy it to ./destination_dataset and verify both datasets exist and contain the same data @test

  • Given a dataset path ./cleanup_test, create a dataset, verify it exists, then delete it and verify it no longer exists @test

Implementation

@generates

API

class DatasetManager:
    """Manages datasets across different storage backends."""

    def dataset_exists(self, path: str) -> bool:
        """Check if a dataset exists at the given path."""
        pass

    def create_dataset(self, path: str) -> None:
        """Create a new dataset with text and value columns at the specified path."""
        pass

    def add_sample_data(self, path: str) -> None:
        """Add 3 sample rows to the dataset."""
        pass

    def copy_dataset(self, source_path: str, destination_path: str) -> None:
        """Copy dataset from source to destination."""
        pass

    def delete_dataset(self, path: str) -> None:
        """Delete the dataset at the given path."""
        pass

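The actual implementation is generated against deeplake. Purely as an illustration of the semantics the API above expects, here is a minimal, self-contained stand-in backed by the local filesystem and JSON (the class name `LocalDatasetManager` and the `data.json` layout are ours, not deeplake's, and a real implementation would delegate these operations to deeplake):

```python
# Hypothetical local-filesystem stand-in for DatasetManager, used only to
# illustrate the expected behavior of each operation. Not a deeplake API.
import json
import os
import shutil

class LocalDatasetManager:
    def dataset_exists(self, path: str) -> bool:
        # A dataset "exists" if its data file is present at the path.
        return os.path.isfile(os.path.join(path, "data.json"))

    def create_dataset(self, path: str) -> None:
        # Simple schema: a text column and a numeric column, no rows yet.
        os.makedirs(path, exist_ok=True)
        with open(os.path.join(path, "data.json"), "w") as f:
            json.dump({"columns": ["text", "value"], "rows": []}, f)

    def add_sample_data(self, path: str) -> None:
        # Append 3 sample rows and persist the change.
        data_file = os.path.join(path, "data.json")
        with open(data_file) as f:
            data = json.load(f)
        data["rows"] += [{"text": f"sample {i}", "value": i} for i in range(3)]
        with open(data_file, "w") as f:
            json.dump(data, f)

    def copy_dataset(self, source_path: str, destination_path: str) -> None:
        # Copy the whole dataset directory to the new location.
        shutil.copytree(source_path, destination_path)

    def delete_dataset(self, path: str) -> None:
        shutil.rmtree(path)
```

Each method takes the dataset path as an argument, matching the API above, so the same call sequence exercises the create/exists/copy/delete test cases regardless of backend.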
Dependencies { .dependencies }

deeplake { .dependency }

Provides dataset storage and management with multi-cloud support.