# LangChain Chroma

An integration package connecting Chroma and LangChain for vector database operations. This package provides a LangChain-compatible interface to ChromaDB, enabling developers to use ChromaDB as a vector store for embedding-based search and retrieval in AI applications, particularly for semantic search, question-answering systems, and retrieval-augmented generation (RAG) pipelines.

## Package Information

- **Package Name**: langchain-chroma
- **Language**: Python
- **Installation**: `pip install langchain-chroma`
- **Dependencies**: `chromadb>=1.0.9`, `langchain-core>=0.3.70`, `numpy>=1.26.0`

## Core Imports

```python
from langchain_chroma import Chroma
```

## Basic Usage

```python
from langchain_chroma import Chroma
from langchain_core.documents import Document
from langchain_openai import OpenAIEmbeddings

# Initialize the vector store
embeddings = OpenAIEmbeddings()
vector_store = Chroma(
    collection_name="my_collection",
    embedding_function=embeddings,
    persist_directory="./chroma_db",
)

# Add documents
documents = [
    Document(page_content="Hello world", metadata={"source": "greeting"}),
    Document(page_content="Python is great", metadata={"source": "programming"}),
]
vector_store.add_documents(documents)

# Perform similarity search
results = vector_store.similarity_search("programming language", k=2)
for doc in results:
    print(f"Content: {doc.page_content}")
    print(f"Metadata: {doc.metadata}")
```

## Architecture

The langchain-chroma package implements the LangChain VectorStore interface with ChromaDB as the backend:

- **Chroma Class**: Main vector store class that handles all vector operations
- **Client Management**: Supports multiple ChromaDB client types (local, persistent, HTTP, cloud)
- **Embedding Integration**: Works with any LangChain-compatible embedding function
- **Document Management**: Full CRUD operations for documents with metadata support
- **Search Operations**: Multiple search modes including similarity search, MMR, and image search
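
To make the "Embedding Integration" point concrete, here is a minimal sketch of the two methods an embedding object is expected to expose (`embed_documents` and `embed_query`, as defined by the `langchain_core.embeddings.Embeddings` interface). The class name `ToyHashEmbeddings` and its hashing scheme are hypothetical and carry no semantic meaning; in real use you would subclass `Embeddings` and call an actual model.

```python
import hashlib


class ToyHashEmbeddings:
    """Hypothetical stand-in for an embedding model (illustration only)."""

    def __init__(self, dim: int = 8) -> None:
        # A SHA-256 digest has 32 bytes, so this sketch supports dim <= 32.
        self.dim = dim

    def _embed(self, text: str) -> list[float]:
        # Derive a deterministic pseudo-vector from the text's digest.
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return [byte / 255.0 for byte in digest[: self.dim]]

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        # One vector per input text, in the same order.
        return [self._embed(text) for text in texts]

    def embed_query(self, text: str) -> list[float]:
        # Queries use the same mapping as documents in this sketch.
        return self._embed(text)


embedder = ToyHashEmbeddings(dim=4)
vectors = embedder.embed_documents(["Hello world", "Python is great"])
```

An object of this shape could presumably be passed as `embedding_function`, although the vectors it produces are only useful for wiring tests, not for meaningful search.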

## Capabilities

### Document Management

Core document operations including adding, updating, and deleting documents in the vector store. Supports batch operations and automatic ID generation.

```python { .api }
def add_texts(texts: Iterable[str], metadatas: Optional[list[dict]] = None, ids: Optional[list[str]] = None, **kwargs: Any) -> list[str]
def add_documents(documents: list[Document], ids: Optional[list[str]] = None, **kwargs: Any) -> list[str]
def add_images(uris: list[str], metadatas: Optional[list[dict]] = None, ids: Optional[list[str]] = None) -> list[str]
def update_document(document_id: str, document: Document) -> None
def update_documents(ids: list[str], documents: list[Document]) -> None
def delete(ids: Optional[list[str]] = None, **kwargs: Any) -> None
```

[Document Management](./document-management.md)
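
The idea of automatic ID generation and batching can be sketched in plain Python. This is not the package's internal code: `assign_ids` and `batched` are hypothetical helpers, and the use of random UUIDs is an assumption about how missing IDs might be filled in.

```python
import uuid
from typing import Iterator, Optional


def assign_ids(texts: list[str], ids: Optional[list[str]] = None) -> list[str]:
    """Return caller-supplied IDs, or generate one random UUID per text."""
    if ids is None:
        return [str(uuid.uuid4()) for _ in texts]
    if len(ids) != len(texts):
        raise ValueError("ids and texts must have the same length")
    return ids


def batched(items: list, batch_size: int) -> Iterator[list]:
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


texts = ["Hello world", "Python is great", "Chroma stores vectors"]
ids = assign_ids(texts)
batches = list(batched(list(zip(ids, texts)), batch_size=2))
```

Supplying your own stable IDs instead of relying on generated ones makes later calls to `update_document` and `delete` easier, since both address documents by ID.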

### Search Operations

Comprehensive search functionality including similarity search, vector search, and relevance scoring. Supports metadata filtering and document content filtering.

```python { .api }
def similarity_search(query: str, k: int = 4, filter: Optional[dict[str, str]] = None, **kwargs: Any) -> list[Document]
def similarity_search_with_score(query: str, k: int = 4, filter: Optional[dict[str, str]] = None, where_document: Optional[dict[str, str]] = None, **kwargs: Any) -> list[tuple[Document, float]]
def similarity_search_by_vector(embedding: list[float], k: int = 4, filter: Optional[dict[str, str]] = None, where_document: Optional[dict[str, str]] = None, **kwargs: Any) -> list[Document]
def similarity_search_by_image(uri: str, k: int = 4, filter: Optional[dict[str, str]] = None, **kwargs: Any) -> list[Document]
```

[Search Operations](./search-operations.md)
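
The two filter arguments narrow results in different ways: `filter` matches on metadata, while `where_document` matches on document text. The sketch below imitates only the simplest cases (metadata equality and the `$contains` operator) in plain Python so the semantics are visible; the helper functions and sample records are hypothetical, and the real filtering is performed by ChromaDB, which supports more operators than shown here.

```python
from typing import Any


def matches_metadata(metadata: dict[str, Any], where: dict[str, Any]) -> bool:
    """True when every key in `where` equals the same key in `metadata`."""
    return all(metadata.get(key) == value for key, value in where.items())


def matches_document(text: str, where_document: dict[str, str]) -> bool:
    """Supports only the `$contains` operator in this sketch."""
    needle = where_document.get("$contains")
    return needle is None or needle in text


records = [
    {"text": "Hello world", "metadata": {"source": "greeting"}},
    {"text": "Python is great", "metadata": {"source": "programming"}},
    {"text": "Rust is fast", "metadata": {"source": "programming"}},
]

metadata_filter = {"source": "programming"}   # shape of the `filter` argument
document_filter = {"$contains": "Python"}     # shape of `where_document`

hits = [
    record["text"]
    for record in records
    if matches_metadata(record["metadata"], metadata_filter)
    and matches_document(record["text"], document_filter)
]
```

With a real store, the same two dictionaries would be passed as `filter=metadata_filter` and `where_document=document_filter` to a method that accepts both, such as `similarity_search_with_score`.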

### Maximum Marginal Relevance

Advanced search algorithms that optimize for both similarity to query and diversity among results, reducing redundancy in search results.

```python { .api }
def max_marginal_relevance_search(query: str, k: int = 4, fetch_k: int = 20, lambda_mult: float = 0.5, filter: Optional[dict[str, str]] = None, where_document: Optional[dict[str, str]] = None, **kwargs: Any) -> list[Document]
def max_marginal_relevance_search_by_vector(embedding: list[float], k: int = 4, fetch_k: int = 20, lambda_mult: float = 0.5, filter: Optional[dict[str, str]] = None, where_document: Optional[dict[str, str]] = None, **kwargs: Any) -> list[Document]
```

[Maximum Marginal Relevance](./mmr.md)
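
The trade-off controlled by `lambda_mult` can be illustrated with a small greedy selection loop. This is a pure-Python sketch of the general MMR idea, not the package's implementation: `cosine` and `mmr_select` are hypothetical names, and `candidates` stands in for the `fetch_k` vectors that would be retrieved before re-ranking down to `k`.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(
    query: list[float],
    candidates: list[list[float]],
    k: int = 4,
    lambda_mult: float = 0.5,
) -> list[int]:
    """Greedily pick up to k candidate indices, balancing relevance and diversity."""
    selected: list[int] = []
    remaining = list(range(len(candidates)))

    def score(i: int) -> float:
        relevance = cosine(query, candidates[i])
        # Redundancy is the highest similarity to anything already chosen.
        redundancy = max(
            (cosine(candidates[i], candidates[j]) for j in selected),
            default=0.0,
        )
        return lambda_mult * relevance - (1.0 - lambda_mult) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
candidates = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]

by_relevance = mmr_select(query, candidates, k=2, lambda_mult=1.0)
by_diversity = mmr_select(query, candidates, k=2, lambda_mult=0.3)
```

With `lambda_mult=1.0` the second pick is the near-duplicate at index 1; lowering it to `0.3` penalizes that redundancy and selects the more distinct vector at index 2 instead.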

### Collection Management

Collection-level operations for managing the underlying ChromaDB collections, including retrieval, resetting, and deletion.

```python { .api }
def get(ids: Optional[Union[str, list[str]]] = None, where: Optional[Where] = None, limit: Optional[int] = None, offset: Optional[int] = None, where_document: Optional[WhereDocument] = None, include: Optional[list[str]] = None) -> dict[str, Any]
def get_by_ids(ids: Sequence[str], /) -> list[Document]
def reset_collection() -> None
def delete_collection() -> None
```

[Collection Management](./collection-management.md)
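
Because `get` returns a dictionary rather than `Document` objects, callers usually need to zip its parallel lists back together. The sketch below assumes the result contains `ids`, `documents`, and `metadatas` keys (which keys are present depends on the `include` argument); `sample_result` is made-up data and `to_records` is a hypothetical helper.

```python
from typing import Any

# Made-up data in the shape of a get() result that includes
# documents and metadatas.
sample_result: dict[str, Any] = {
    "ids": ["id-1", "id-2"],
    "documents": ["Hello world", "Python is great"],
    "metadatas": [{"source": "greeting"}, {"source": "programming"}],
}


def to_records(result: dict[str, Any]) -> list[dict[str, Any]]:
    """Combine the parallel lists into one dictionary per stored item."""
    return [
        {"id": item_id, "text": text, "metadata": metadata or {}}
        for item_id, text, metadata in zip(
            result["ids"], result["documents"], result["metadatas"]
        )
    ]


records = to_records(sample_result)
```

When `Document` objects are what you need, `get_by_ids` avoids this step.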

### Vector Store Construction

Class methods and utilities for creating Chroma instances from various data sources and configurations.

```python { .api }
@classmethod
def from_texts(cls: type[Chroma], texts: list[str], embedding: Optional[Embeddings] = None, metadatas: Optional[list[dict]] = None, ids: Optional[list[str]] = None, collection_name: str = "langchain", **kwargs: Any) -> Chroma

@classmethod
def from_documents(cls: type[Chroma], documents: list[Document], embedding: Optional[Embeddings] = None, ids: Optional[list[str]] = None, collection_name: str = "langchain", **kwargs: Any) -> Chroma

@staticmethod
def encode_image(uri: str) -> str
```

[Vector Store Construction](./construction.md)
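
The `encode_image` signature takes a URI and returns a string. A plausible reading, which the text above does not confirm, is that it base64-encodes the file's bytes. The sketch below shows that technique with the standard library only; `encode_file_as_base64` is a hypothetical name and the temporary file holds placeholder bytes rather than a real image.

```python
import base64
import os
import tempfile


def encode_file_as_base64(path: str) -> str:
    """Read a file as bytes and return its base64 text representation."""
    with open(path, "rb") as handle:
        return base64.b64encode(handle.read()).decode("utf-8")


# Write placeholder bytes to a temporary file so the sketch is self-contained.
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as tmp:
    tmp.write(b"\x89PNG fake image bytes")
    tmp_path = tmp.name

try:
    encoded = encode_file_as_base64(tmp_path)
finally:
    os.remove(tmp_path)
```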

## Types

```python { .api }
from typing import Union, Optional, Any, Callable, Iterable
from collections.abc import Sequence
import numpy as np
from langchain_core.documents import Document
from langchain_core.embeddings import Embeddings
from langchain_core.vectorstores import VectorStore
from chromadb.api.types import Where, WhereDocument
from chromadb.api import CreateCollectionConfiguration
import chromadb

Matrix = Union[list[list[float]], list[np.ndarray], np.ndarray]

class Chroma(VectorStore):
    """
    Chroma vector store integration for LangChain.

    Provides a LangChain-compatible interface to ChromaDB for vector storage,
    similarity search, and document retrieval operations.
    """

    def __init__(
        self,
        collection_name: str = "langchain",
        embedding_function: Optional[Embeddings] = None,
        persist_directory: Optional[str] = None,
        host: Optional[str] = None,
        port: Optional[int] = None,
        headers: Optional[dict[str, str]] = None,
        chroma_cloud_api_key: Optional[str] = None,
        tenant: Optional[str] = None,
        database: Optional[str] = None,
        client_settings: Optional[chromadb.config.Settings] = None,
        collection_metadata: Optional[dict] = None,
        collection_configuration: Optional[CreateCollectionConfiguration] = None,
        client: Optional[chromadb.ClientAPI] = None,
        relevance_score_fn: Optional[Callable[[float], float]] = None,
        create_collection_if_not_exists: Optional[bool] = True,
        *,
        ssl: bool = False,
    ) -> None
```
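
The `relevance_score_fn` parameter accepts any callable that maps a raw float to another float. As a hedged illustration, the function below converts a cosine distance into a score between 0 and 1. It assumes the collection is configured to use cosine distance, which ranges from 0 to 2; the function name is hypothetical, and other distance metrics would need a different mapping.

```python
def cosine_distance_to_score(distance: float) -> float:
    """Map a cosine distance in [0, 2] to a relevance score in [0, 1].

    A distance of 0 (identical direction) becomes 1.0 and a distance
    of 2 (opposite direction) becomes 0.0. Out-of-range inputs are clamped.
    """
    score = 1.0 - distance / 2.0
    return max(0.0, min(1.0, score))


# Intended use (not executed here, since it needs a configured store):
# Chroma(collection_name="my_collection",
#        embedding_function=embeddings,
#        relevance_score_fn=cosine_distance_to_score)

scores = [cosine_distance_to_score(d) for d in (0.0, 1.0, 2.0)]
```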