tessl/pypi-langchain-chroma

An integration package connecting Chroma and LangChain for vector database operations.


Document Management

Name: tessl/pypi-langchain-chroma
Author: tessl

Core operations for managing text and image documents in a Chroma vector store: adding, updating, and deleting documents, with optional metadata and automatic ID generation when no IDs are supplied.

Capabilities

Adding Text Documents

Add text documents to the vector store with optional metadata and custom IDs. Documents are automatically embedded using the configured embedding function.

def add_texts(
    texts: Iterable[str], 
    metadatas: Optional[list[dict]] = None, 
    ids: Optional[list[str]] = None, 
    **kwargs: Any
) -> list[str]:
    """
    Add texts to the vector store.
    
    Parameters:
    - texts: Iterable of text strings to add
    - metadatas: Optional list of metadata dictionaries for each text
    - ids: Optional list of custom IDs (UUIDs generated if not provided)
    - **kwargs: Additional keyword arguments
    
    Returns:
    List of document IDs that were added
    
    Raises:
    ValueError: When metadata format is incorrect
    """

def add_documents(
    documents: list[Document], 
    ids: Optional[list[str]] = None, 
    **kwargs: Any
) -> list[str]:
    """
    Add Document objects to the vector store.
    
    Parameters:
    - documents: List of Document objects to add
    - ids: Optional list of custom IDs (uses document.id or generates UUIDs)
    - **kwargs: Additional keyword arguments
    
    Returns:
    List of document IDs that were added
    """

Usage Example:

# Assumes `vector_store` is an already-constructed Chroma instance
from langchain_core.documents import Document

# Add texts with metadata (one metadata dict per text, in the same order)
texts = ["Hello world", "Python is great", "AI is fascinating"]
metadatas = [
    {"source": "greeting", "category": "social"},
    {"source": "programming", "category": "tech"},
    {"source": "ai", "category": "tech"}
]
ids = vector_store.add_texts(texts, metadatas=metadatas)

# Add Document objects
documents = [
    Document(page_content="Machine Learning", metadata={"topic": "AI"}),
    Document(page_content="Deep Learning", metadata={"topic": "AI"})
]
doc_ids = vector_store.add_documents(documents)
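
Because `update_documents` and `delete` both address documents by ID, it can help to supply your own IDs rather than rely on generated UUIDs. The following is a minimal sketch, not part of the package: `stable_id` and `EXAMPLE_NAMESPACE` are hypothetical names, and the approach assumes that deriving an ID from a source label plus the text is acceptable for your data.

```python
import uuid

# Hypothetical namespace for this example; any fixed UUID works,
# as long as it never changes between runs.
EXAMPLE_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")


def stable_id(text: str, source: str = "") -> str:
    """Return the same UUID string every time for the same (source, text) pair."""
    # uuid5 is a deterministic, name-based UUID; the separator keeps
    # ("ab", "c") and ("a", "bc") from producing the same name.
    return str(uuid.uuid5(EXAMPLE_NAMESPACE, f"{source}\x1f{text}"))


texts = ["Hello world", "Python is great", "AI is fascinating"]
ids = [stable_id(t, source="demo") for t in texts]
```

The resulting list could then be passed as `vector_store.add_texts(texts, ids=ids)` and recomputed later to target the same documents in an update or delete call.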

Adding Image Documents

Add images to the vector store using file URIs. Requires an embedding function that supports image embeddings.

def add_images(
    uris: list[str], 
    metadatas: Optional[list[dict]] = None, 
    ids: Optional[list[str]] = None
) -> list[str]:
    """
    Add images to the vector store.
    
    Parameters:
    - uris: List of file paths to images
    - metadatas: Optional list of metadata dictionaries for each image
    - ids: Optional list of custom IDs (UUIDs generated if not provided)
    
    Returns:
    List of document IDs that were added
    
    Raises:
    ValueError: When metadata format is incorrect or embedding function doesn't support images
    """

Usage Example:

# Add images (requires embedding function with image support)
image_paths = ["/path/to/image1.jpg", "/path/to/image2.png"]
metadatas = [{"type": "photo"}, {"type": "diagram"}]
image_ids = vector_store.add_images(image_paths, metadatas=metadatas)
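
Since `add_images` takes parallel lists of paths and metadata, a small helper can build both from a folder. This is a sketch under stated assumptions: `collect_images` and `IMAGE_SUFFIXES` are hypothetical names, and the list of extensions is a guess at what an image-capable embedding function might accept.

```python
from pathlib import Path

# Assumed set of extensions for this example; adjust it to whatever
# your embedding function can actually read.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def collect_images(folder: str) -> tuple[list[str], list[dict]]:
    """Return parallel lists of image paths and per-image metadata for a folder."""
    paths = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    uris = [str(p) for p in paths]
    # One metadata dict per image, in the same order as `uris`.
    metadatas = [
        {"filename": p.name, "format": p.suffix.lower().lstrip(".")}
        for p in paths
    ]
    return uris, metadatas
```

The two return values line up with the `uris` and `metadatas` parameters, so they could be passed as `vector_store.add_images(uris, metadatas=metadatas)`.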

Updating Documents

Update existing documents in the vector store by their IDs. The replacement content is re-embedded, so the store must have an embedding function configured.

def update_document(document_id: str, document: Document) -> None:
    """
    Update a single document in the collection.
    
    Parameters:
    - document_id: ID of the document to update
    - document: New Document object to replace the existing one
    
    Raises:
    ValueError: If embedding function is not provided
    """

def update_documents(ids: list[str], documents: list[Document]) -> None:
    """
    Update multiple documents in the collection.
    
    Parameters:
    - ids: List of document IDs to update
    - documents: List of new Document objects
    
    Raises:
    ValueError: If embedding function is not provided
    """

Usage Example:

from langchain_core.documents import Document

# Update a single document
updated_doc = Document(
    page_content="Updated content", 
    metadata={"status": "revised"}
)
vector_store.update_document("doc_id_123", updated_doc)

# Update multiple documents (IDs and documents are matched by position)
updated_docs = [
    Document(page_content="New content 1", metadata={"version": 2}),
    Document(page_content="New content 2", metadata={"version": 2})
]
vector_store.update_documents(["id_1", "id_2"], updated_docs)

Deleting Documents

Remove documents from the vector store by their IDs.

def delete(ids: Optional[list[str]] = None, **kwargs: Any) -> None:
    """
    Delete documents from the vector store.
    
    Parameters:
    - ids: List of document IDs to delete
    - **kwargs: Additional keyword arguments passed to ChromaDB
    """

Usage Example:

# Delete specific documents
vector_store.delete(ids=["doc_id_1", "doc_id_2"])

# Delete with additional ChromaDB parameters
vector_store.delete(ids=["doc_id_3"], where={"category": "obsolete"})
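
When many IDs need to be removed, splitting them into smaller batches keeps each call to `delete` bounded. The sketch below is an illustration only: `chunked` is a hypothetical helper, and whether batching is needed at all depends on limits of the underlying Chroma deployment, which this page does not state.

```python
from collections.abc import Iterator, Sequence


def chunked(ids: Sequence[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of `ids`, each with at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])
```

A caller might then loop with `for batch in chunked(stale_ids, 500): vector_store.delete(ids=batch)`, where `stale_ids` and the batch size of 500 are placeholders chosen for the example.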

Utility Functions

Image Encoding

Static method for encoding images to base64 strings.

@staticmethod
def encode_image(uri: str) -> str:
    """
    Encode an image file to base64 string.
    
    Parameters:
    - uri: File path to the image
    
    Returns:
    Base64 encoded string representation of the image
    """

Usage Example:

from langchain_chroma import Chroma

# Encode image for manual processing
encoded_image = Chroma.encode_image("/path/to/image.jpg")
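
For readers who want to see what base64-encoding an image file involves, here is a standard-library sketch that matches the documented contract (file path in, base64 string out). `encode_image_sketch` is a hypothetical name, and the code is an approximation of the behavior, not the package's source.

```python
import base64


def encode_image_sketch(uri: str) -> str:
    """Read a file as raw bytes and return its base64 text representation."""
    with open(uri, "rb") as image_file:
        raw_bytes = image_file.read()
    # b64encode returns bytes; decode to get a plain str.
    return base64.b64encode(raw_bytes).decode("utf-8")
```

The original bytes can be recovered from the returned string with `base64.b64decode`.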

Install with Tessl CLI

npx tessl i tessl/pypi-langchain-chroma
