or run

tessl search
Log in

Version

Workspace
tessl
Visibility
Public
Created
Last updated
Describes
pypipkg:pypi/deeplake@4.3.x
tile.json

tessl/pypi-deeplake

tessl install tessl/pypi-deeplake@4.3.0

Database for AI powered by a storage format optimized for deep-learning applications.

Agent Success

Agent success rate when using this tile

75%

Improvement

Agent success rate improvement when using this tile compared to baseline

1.6x

Baseline

Agent success rate without this tile

47%

task.mdevals/scenario-10/

Dataset Version Control System

Build a dataset version control system that manages multiple branches and tags for experimental ML workflows.

Requirements

Your system should:

  1. Create a new dataset with sample data (embeddings and labels)
  2. Create multiple feature branches for parallel experiments
  3. Commit changes on different branches
  4. Tag important milestones
  5. Merge changes from a feature branch back to the main branch
  6. List all branches and tags with their metadata

Specifications

Dataset Structure

  • Create a dataset with two columns:
    • embedding: vector embeddings (arrays of 128 float values)
    • label: text labels
  • Initialize with at least 5 sample records

Branch Management

  • Create a main branch as the default
  • Create at least two feature branches: experiment-1 and experiment-2
  • Each branch should have independent commits

Version Operations

  • Commit changes with descriptive messages on each branch
  • Create tags for important versions (e.g., v1.0, baseline)
  • Merge one feature branch back to main
  • Retrieve and display commit history

Output Requirements

Your program should output:

  1. List of all branches with their names
  2. List of all tags with their names
  3. Confirmation of successful merge operation
  4. Count of commits in the main branch after merge

Test Cases

  • Creating dataset with embeddings and labels, then creating a branch named "experiment-1" succeeds @test
  • Committing changes on a branch stores the commit with the provided message @test
  • Creating a tag named "v1.0" and retrieving it returns the correct tag name @test
  • Merging a feature branch into main combines the commits from both branches @test

Implementation

@generates

API

def setup_dataset(path: str) -> object:
    """
    Create and initialize a dataset with embeddings and labels.

    Args:
        path: Storage path for the dataset

    Returns:
        Dataset object
    """
    pass

def create_branches(dataset: object, branch_names: list) -> None:
    """
    Create multiple feature branches.

    Args:
        dataset: The dataset object
        branch_names: List of branch names to create
    """
    pass

def commit_on_branch(dataset: object, branch_name: str, message: str) -> None:
    """
    Make changes and commit on a specific branch.

    Args:
        dataset: The dataset object
        branch_name: Name of the branch to commit on
        message: Commit message
    """
    pass

def create_tag(dataset: object, tag_name: str) -> None:
    """
    Create a tag for the current version.

    Args:
        dataset: The dataset object
        tag_name: Name for the tag
    """
    pass

def merge_branch(dataset: object, source_branch: str, target_branch: str) -> None:
    """
    Merge source branch into target branch.

    Args:
        dataset: The dataset object
        source_branch: Branch to merge from
        target_branch: Branch to merge into
    """
    pass

def list_branches(dataset: object) -> list:
    """
    List all branches in the dataset.

    Args:
        dataset: The dataset object

    Returns:
        List of branch names
    """
    pass

def list_tags(dataset: object) -> list:
    """
    List all tags in the dataset.

    Args:
        dataset: The dataset object

    Returns:
        List of tag names
    """
    pass

def get_commit_count(dataset: object) -> int:
    """
    Get the number of commits in the current branch.

    Args:
        dataset: The dataset object

    Returns:
        Number of commits
    """
    pass

Dependencies { .dependencies }

deeplake { .dependency }

Provides dataset storage and version control capabilities.