# Document Management

Core document operations for managing text and image documents in the Chroma vector store. Supports adding, updating, and deleting documents with metadata and automatic ID generation.

## Capabilities

### Adding Text Documents

Add text documents to the vector store with optional metadata and custom IDs. Documents are automatically embedded using the configured embedding function.

```python { .api }
def add_texts(
    texts: Iterable[str],
    metadatas: Optional[list[dict]] = None,
    ids: Optional[list[str]] = None,
    **kwargs: Any
) -> list[str]:
    """
    Add texts to the vector store.

    Parameters:
    - texts: Iterable of text strings to add
    - metadatas: Optional list of metadata dictionaries for each text
    - ids: Optional list of custom IDs (UUIDs generated if not provided)
    - **kwargs: Additional keyword arguments

    Returns:
    List of document IDs that were added

    Raises:
    ValueError: When metadata format is incorrect
    """

def add_documents(
    documents: list[Document],
    ids: Optional[list[str]] = None,
    **kwargs: Any
) -> list[str]:
    """
    Add Document objects to the vector store.

    Parameters:
    - documents: List of Document objects to add
    - ids: Optional list of custom IDs (uses document.id or generates UUIDs)
    - **kwargs: Additional keyword arguments

    Returns:
    List of document IDs that were added
    """
```

**Usage Example:**

```python
from langchain_core.documents import Document

# `vector_store` is assumed to be an already constructed Chroma vector store

# Add texts with metadata (one metadata dict per text, in the same order)
texts = ["Hello world", "Python is great", "AI is fascinating"]
metadatas = [
    {"source": "greeting", "category": "social"},
    {"source": "programming", "category": "tech"},
    {"source": "ai", "category": "tech"},
]
ids = vector_store.add_texts(texts, metadatas=metadatas)

# Add Document objects
documents = [
    Document(page_content="Machine Learning", metadata={"topic": "AI"}),
    Document(page_content="Deep Learning", metadata={"topic": "AI"}),
]
doc_ids = vector_store.add_documents(documents)
```
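
Per the docstrings above, `ids` is optional and UUIDs are generated when it is omitted. The following is a minimal standard-library sketch of that defaulting behavior. `resolve_ids` is a hypothetical helper written for illustration; it is not part of the vector store API and does not claim to mirror the library's internals.

```python
import uuid
from typing import Optional


def resolve_ids(texts: list[str], ids: Optional[list[str]] = None) -> list[str]:
    """Return caller-supplied IDs, or one fresh UUID string per text.

    Hypothetical helper: illustrates the documented default only.
    """
    if ids is None:
        return [str(uuid.uuid4()) for _ in texts]
    if len(ids) != len(texts):
        raise ValueError(f"Got {len(ids)} ids for {len(texts)} texts")
    return list(ids)


texts = ["Hello world", "Python is great"]
generated = resolve_ids(texts)                     # two random UUID strings
custom = resolve_ids(texts, ["greet-1", "py-1"])   # passed through unchanged
```

Supplying stable IDs of your own can make later update and delete calls simpler, since the generated values do not have to be stored separately.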

### Adding Image Documents

Add images to the vector store using file URIs. Requires an embedding function that supports image embeddings.

```python { .api }
def add_images(
    uris: list[str],
    metadatas: Optional[list[dict]] = None,
    ids: Optional[list[str]] = None
) -> list[str]:
    """
    Add images to the vector store.

    Parameters:
    - uris: List of file paths to images
    - metadatas: Optional list of metadata dictionaries for each image
    - ids: Optional list of custom IDs (UUIDs generated if not provided)

    Returns:
    List of document IDs that were added

    Raises:
    ValueError: When metadata format is incorrect or embedding function doesn't support images
    """
```

**Usage Example:**

```python
# Add images (requires embedding function with image support)
image_paths = ["/path/to/image1.jpg", "/path/to/image2.png"]
metadatas = [{"type": "photo"}, {"type": "diagram"}]
image_ids = vector_store.add_images(image_paths, metadatas=metadatas)
```
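
Because `uris` and `metadatas` are parallel lists, one metadata dictionary is needed per image. A small sketch of deriving those dictionaries from the file paths follows. `image_metadata` and its keys (`filename`, `format`) are hypothetical and chosen only for illustration.

```python
from pathlib import Path


def image_metadata(path: str) -> dict:
    """Build a flat metadata dict from a file path (hypothetical helper)."""
    p = Path(path)
    return {"filename": p.name, "format": p.suffix.lstrip(".").lower()}


image_paths = ["/path/to/image1.jpg", "/path/to/image2.PNG"]
# One dict per path, in the same order as image_paths
metadatas = [image_metadata(p) for p in image_paths]
```

The resulting list could then be passed as `metadatas=` alongside `image_paths`.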

### Updating Documents

Update existing documents in the vector store by their IDs.

```python { .api }
def update_document(document_id: str, document: Document) -> None:
    """
    Update a single document in the collection.

    Parameters:
    - document_id: ID of the document to update
    - document: New Document object to replace the existing one

    Raises:
    ValueError: If embedding function is not provided
    """

def update_documents(ids: list[str], documents: list[Document]) -> None:
    """
    Update multiple documents in the collection.

    Parameters:
    - ids: List of document IDs to update
    - documents: List of new Document objects

    Raises:
    ValueError: If embedding function is not provided
    """
```

**Usage Example:**

```python
from langchain_core.documents import Document

# Update a single document
updated_doc = Document(
    page_content="Updated content",
    metadata={"status": "revised"},
)
vector_store.update_document("doc_id_123", updated_doc)

# Update multiple documents (ids and documents are matched by position)
updated_docs = [
    Document(page_content="New content 1", metadata={"version": 2}),
    Document(page_content="New content 2", metadata={"version": 2}),
]
vector_store.update_documents(["id_1", "id_2"], updated_docs)
```
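
`update_documents` takes two parallel lists, so a length mismatch or a repeated ID is easy to introduce by accident. The sketch below shows one possible pre-flight check. `check_update_args` is a hypothetical helper, and whether the vector store performs a similar check itself is not stated in this document.

```python
from collections import Counter


def check_update_args(ids: list[str], documents: list) -> None:
    """Raise ValueError unless ids and documents pair up one-to-one.

    Hypothetical pre-flight check; not part of the vector store API.
    """
    if len(ids) != len(documents):
        raise ValueError(
            f"{len(ids)} ids but {len(documents)} documents; expected equal counts"
        )
    repeated = [doc_id for doc_id, n in Counter(ids).items() if n > 1]
    if repeated:
        raise ValueError(f"Duplicate ids in update: {repeated}")


# Plain strings stand in for Document objects in this sketch
check_update_args(["id_1", "id_2"], ["New content 1", "New content 2"])
```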

### Deleting Documents

Remove documents from the vector store by their IDs.

```python { .api }
def delete(ids: Optional[list[str]] = None, **kwargs: Any) -> None:
    """
    Delete documents from the vector store.

    Parameters:
    - ids: List of document IDs to delete
    - **kwargs: Additional keyword arguments passed to ChromaDB
    """
```

**Usage Example:**

```python
# Delete specific documents
vector_store.delete(ids=["doc_id_1", "doc_id_2"])

# Delete with additional ChromaDB parameters
vector_store.delete(ids=["doc_id_3"], where={"category": "obsolete"})
```
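
When the list of IDs is long, it may be convenient to delete in smaller batches. Whether batching is actually required depends on the backend and is not stated here, so treat the following as an optional sketch. `chunked` and the batch size of 3 are hypothetical choices for illustration.

```python
from typing import Iterator


def chunked(ids: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of `ids`, each with at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


stale_ids = [f"doc_id_{n}" for n in range(1, 8)]
batches = list(chunked(stale_ids, 3))
# Each batch could then be passed on, e.g. vector_store.delete(ids=batch)
```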

## Utility Functions

### Image Encoding

Static method for encoding images to base64 strings.

```python { .api }
@staticmethod
def encode_image(uri: str) -> str:
    """
    Encode an image file to base64 string.

    Parameters:
    - uri: File path to the image

    Returns:
    Base64 encoded string representation of the image
    """
```

**Usage Example:**

```python
# Encode image for manual processing
encoded_image = Chroma.encode_image("/path/to/image.jpg")
```
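
To make the return value concrete, here is a standard-library sketch of base64-encoding a file and decoding it back. `encode_file_base64` is a hypothetical stand-in and is only assumed to behave like `encode_image`; the temporary file and its bytes are made up for the demonstration.

```python
import base64
import os
import tempfile


def encode_file_base64(path: str) -> str:
    """Read a file as bytes and return its base64 text (hypothetical stand-in)."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")


# Round-trip demo using a throwaway file that holds the 8-byte PNG signature
raw = b"\x89PNG\r\n\x1a\n"
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
    tmp.write(raw)
    tmp_path = tmp.name

encoded = encode_file_base64(tmp_path)
os.remove(tmp_path)

# Decoding the string recovers the original bytes
decoded = base64.b64decode(encoded)
```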