tessl/pypi-haystack-ai

LLM framework to build customizable, production-ready LLM applications.

Workspace: tessl
Visibility: Public
Describes: pypipkg:pypi/haystack-ai@2.17.x

To install, run

npx @tessl/cli install tessl/pypi-haystack-ai@2.17.0

# Haystack-AI

A comprehensive end-to-end LLM framework for building production-ready applications powered by large language models, transformer models, and vector search capabilities. Haystack enables developers to perform retrieval-augmented generation (RAG), document search, question answering, and answer generation by orchestrating state-of-the-art embedding models and LLMs into flexible pipelines.

## Package Information

- **Package Name**: haystack-ai
- **Language**: Python
- **Installation**: `pip install haystack-ai`

## Core Imports

```python
import haystack
```

Main components:

```python
from haystack import Pipeline, Document, component
from haystack.components.generators import OpenAIGenerator
from haystack.components.embedders import OpenAITextEmbedder
from haystack.components.retrievers import InMemoryEmbeddingRetriever
```

## Basic Usage

```python
from haystack import Pipeline, Document
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders import PromptBuilder
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers import InMemoryEmbeddingRetriever
from haystack.components.embedders import OpenAITextEmbedder, OpenAIDocumentEmbedder

# Documents for a simple RAG pipeline
documents = [
    Document(content="Python is a programming language."),
    Document(content="Berlin is the capital of Germany."),
    Document(content="Pipelines connect components in Haystack.")
]

# Initialize the document store
document_store = InMemoryDocumentStore()

# Create the pipeline
rag_pipeline = Pipeline()

# Add components
rag_pipeline.add_component("text_embedder", OpenAITextEmbedder())
rag_pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store))
rag_pipeline.add_component("prompt_builder", PromptBuilder(
    template="Answer the question based on the context.\n"
             "Context: {% for doc in documents %}{{ doc.content }} {% endfor %}\n"
             "Question: {{query}}"
))
rag_pipeline.add_component("generator", OpenAIGenerator())

# Connect components: output socket -> input socket
rag_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
rag_pipeline.connect("retriever.documents", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder.prompt", "generator.prompt")

# Embed and store the documents before querying
doc_embedder = OpenAIDocumentEmbedder()
embedded_docs = doc_embedder.run(documents=documents)
document_store.write_documents(embedded_docs["documents"])

# Run the pipeline
response = rag_pipeline.run({
    "text_embedder": {"text": "What is Python?"},
    "prompt_builder": {"query": "What is Python?"}
})

print(response["generator"]["replies"][0])
```

## Architecture

Haystack follows a modular, component-based architecture:

- **Pipeline**: Orchestrates the flow of data between components using a directed acyclic graph (DAG)
- **Components**: Modular building blocks that perform specific tasks (embedding, generation, retrieval, etc.)
- **Document Stores**: Storage systems for documents and embeddings
- **Data Classes**: Structured data types (Document, Answer, ChatMessage, etc.) that flow between components

This design enables flexible composition of AI workflows, from simple Q&A systems to complex multi-step reasoning chains and autonomous agents.
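The DAG orchestration idea can be illustrated with a stdlib-only toy (this is not Haystack code): components become graph nodes, each `connect()` call an edge, and a run visits components in topological order.

```python
from graphlib import TopologicalSorter

# Toy model of Pipeline orchestration: keys depend on the components
# in their value set, mirroring the connect() calls in Basic Usage.
graph = {
    "retriever": {"text_embedder"},    # retriever consumes the query embedding
    "prompt_builder": {"retriever"},   # prompt_builder consumes retrieved documents
    "generator": {"prompt_builder"},   # generator consumes the built prompt
}

# A DAG run executes components in dependency order.
order = list(TopologicalSorter(graph).static_order())
print(order)  # text_embedder first, generator last
```

Because the graph is acyclic, every component runs exactly once, after all of its inputs are available; this is the property the Pipeline relies on.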

## Capabilities

### Core Framework

Essential framework components for building pipelines, managing data flow, and creating custom components.

```python { .api }
class Pipeline:
    def add_component(self, name: str, instance: Any) -> None: ...
    def connect(self, sender: str, receiver: str) -> None: ...
    def run(self, inputs: Dict[str, Any]) -> Dict[str, Any]: ...

class AsyncPipeline:
    async def run(self, inputs: Dict[str, Any]) -> Dict[str, Any]: ...

@component
class MyComponent:
    def run(self) -> None: ...

class Document:
    def __init__(self, content: str, meta: Dict[str, Any] = None): ...
```

[Core Framework](./core-framework.md)

### Text Generation

Large language model integrations for text generation, chat completions, and answer synthesis.

```python { .api }
class OpenAIGenerator:
    def run(self, prompt: str, **kwargs) -> Dict[str, Any]: ...

class OpenAIChatGenerator:
    def run(self, messages: List[ChatMessage], **kwargs) -> Dict[str, Any]: ...

class HuggingFaceLocalGenerator:
    def run(self, prompt: str, **kwargs) -> Dict[str, Any]: ...
```

[Text Generation](./text-generation.md)

### Text Embeddings

Convert text and documents into vector embeddings for semantic search and retrieval.

```python { .api }
class OpenAITextEmbedder:
    def run(self, text: str) -> Dict[str, List[float]]: ...

class OpenAIDocumentEmbedder:
    def run(self, documents: List[Document]) -> Dict[str, List[Document]]: ...

class SentenceTransformersTextEmbedder:
    def run(self, text: str) -> Dict[str, List[float]]: ...
```

[Text Embeddings](./text-embeddings.md)

### Document Processing

Convert various file formats to Haystack Document objects and preprocess text for optimal retrieval.

```python { .api }
class PyPDFToDocument:
    def run(self, sources: List[str]) -> Dict[str, List[Document]]: ...

class HTMLToDocument:
    def run(self, sources: List[str]) -> Dict[str, List[Document]]: ...

class DocumentSplitter:
    def run(self, documents: List[Document]) -> Dict[str, List[Document]]: ...
```

[Document Processing](./document-processing.md)

### Retrieval

Search and retrieve relevant documents using various retrieval strategies.

```python { .api }
class InMemoryEmbeddingRetriever:
    def run(self, query_embedding: List[float], top_k: int = 10) -> Dict[str, List[Document]]: ...

class InMemoryBM25Retriever:
    def run(self, query: str, top_k: int = 10) -> Dict[str, List[Document]]: ...

class FilterRetriever:
    def run(self, filters: Dict[str, Any]) -> Dict[str, List[Document]]: ...
```

[Retrieval](./retrieval.md)
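Embedding retrieval ranks stored documents by vector similarity to the query embedding. A stdlib-only sketch of the idea, using hand-made 2-d vectors and cosine similarity (real retrievers score model-produced embeddings, but the ranking step looks like this):

```python
import math
from typing import List, Tuple

def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Stored "documents" with toy embeddings.
docs: List[Tuple[str, List[float]]] = [
    ("Python is a programming language.", [1.0, 0.1]),
    ("Berlin is the capital of Germany.", [0.1, 1.0]),
]

def retrieve(query_embedding: List[float], top_k: int = 10) -> List[str]:
    """Return the top_k document contents, most similar first."""
    ranked = sorted(docs, key=lambda d: cosine(query_embedding, d[1]), reverse=True)
    return [content for content, _ in ranked[:top_k]]

print(retrieve([0.9, 0.2], top_k=1))
```

A query embedding close to a document's embedding in direction scores near 1.0, so the "Python" document wins here; `top_k` simply truncates the ranked list.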

### Prompt Building

Create and format prompts for language models with dynamic content injection.

```python { .api }
class PromptBuilder:
    def run(self, **kwargs) -> Dict[str, str]: ...

class ChatPromptBuilder:
    def run(self, **kwargs) -> Dict[str, List[ChatMessage]]: ...
```

[Prompt Building](./prompt-building.md)
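PromptBuilder templates use Jinja-style `{{ variable }}` placeholders filled from the keyword arguments passed to `run`. A minimal stdlib stand-in for that substitution step (the hypothetical `render` below is only an illustration; the real component uses the full Jinja2 engine, including loops and filters):

```python
import re

def render(template: str, **variables) -> str:
    """Replace {{ name }} placeholders with keyword-argument values."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: str(variables[m.group(1)]),
        template,
    )

prompt = render(
    "Answer the question based on the context: {{query}} Context: {{documents}}",
    query="What is Python?",
    documents="Python is a programming language.",
)
print(prompt)
```

In a pipeline, some of these variables arrive through connections (e.g. `documents` from a retriever) and others from the `run` inputs (e.g. `query`), but both paths end in the same substitution.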

### Document Stores

Storage backends for documents and embeddings with filtering and search capabilities.

```python { .api }
class InMemoryDocumentStore:
    def write_documents(self, documents: List[Document]) -> int: ...
    def filter_documents(self, filters: Dict[str, Any]) -> List[Document]: ...
    def count_documents(self) -> int: ...
```

[Document Stores](./document-stores.md)
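The write/filter/count contract can be mimicked with a stdlib-only toy (the `ToyDocumentStore` below is hypothetical, not the Haystack class, and its exact-match filtering is far simpler than Haystack's filter syntax):

```python
from typing import Any, Dict, List

class ToyDocumentStore:
    """Stdlib stand-in for the document-store interface; documents are plain dicts."""

    def __init__(self) -> None:
        self._docs: List[Dict[str, Any]] = []

    def write_documents(self, documents: List[Dict[str, Any]]) -> int:
        self._docs.extend(documents)
        return len(documents)  # number of documents written

    def filter_documents(self, filters: Dict[str, Any]) -> List[Dict[str, Any]]:
        # Exact-match filtering on metadata keys only.
        return [
            d for d in self._docs
            if all(d.get("meta", {}).get(k) == v for k, v in filters.items())
        ]

    def count_documents(self) -> int:
        return len(self._docs)

store = ToyDocumentStore()
store.write_documents([
    {"content": "Python is a programming language.", "meta": {"topic": "code"}},
    {"content": "Berlin is the capital of Germany.", "meta": {"topic": "geo"}},
])
print(store.count_documents())
print(store.filter_documents({"topic": "geo"})[0]["content"])
```

Retrievers are built on top of exactly this interface: they filter and score whatever the store holds, which is why swapping storage backends does not change pipeline wiring.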

### Evaluation

Metrics and evaluation components for assessing pipeline performance and answer quality.

```python { .api }
class ContextRelevanceEvaluator:
    def run(self, questions: List[str], contexts: List[List[str]]) -> Dict[str, List[float]]: ...

class FaithfulnessEvaluator:
    def run(self, questions: List[str], contexts: List[List[str]], responses: List[str]) -> Dict[str, List[float]]: ...
```

[Evaluation](./evaluation.md)

### Agent Framework

Build autonomous agents that can use tools and maintain conversation state.

```python { .api }
class Agent:
    def run(self, messages: List[ChatMessage]) -> Dict[str, List[ChatMessage]]: ...

class ToolInvoker:
    def run(self, tool_calls: List[ToolCall]) -> Dict[str, List[ToolCallResult]]: ...
```

[Agent Framework](./agent-framework.md)
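At its core, tool invocation is dispatching a named call with keyword arguments to a registered function and returning the result as text for the model. A stdlib-only sketch of that dispatch step (hypothetical names; the real ToolInvoker also validates schemas and captures errors):

```python
from typing import Any, Callable, Dict

# Registry of callable tools, keyed by tool name.
tools: Dict[str, Callable[..., Any]] = {
    "add": lambda a, b: a + b,
    "upper": lambda text: text.upper(),
}

def invoke(tool_name: str, arguments: Dict[str, Any]) -> str:
    """Look up the named tool and call it with the model-supplied arguments."""
    result = tools[tool_name](**arguments)
    return str(result)  # tool results flow back to the model as text

print(invoke("add", {"a": 2, "b": 3}))
print(invoke("upper", {"text": "haystack"}))
```

The `ToolCall` and `ToolCallResult` types in the next section carry exactly these pieces: a tool name, an arguments dict, and a stringified result.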

## Types

```python { .api }
class Document:
    content: str
    meta: Dict[str, Any]
    id: str
    score: Optional[float]
    embedding: Optional[List[float]]

class ChatMessage:
    content: str
    role: ChatRole
    name: Optional[str]
    tool_calls: Optional[List[ToolCall]]
    tool_call_result: Optional[ToolCallResult]

class ChatRole(Enum):
    USER = "user"
    ASSISTANT = "assistant"
    SYSTEM = "system"
    TOOL = "tool"

class GeneratedAnswer:
    data: str
    query: str
    documents: List[Document]
    meta: Dict[str, Any]

class ExtractedAnswer:
    query: str
    score: Optional[float]
    data: str
    document: Optional[Document]
    context: Optional[str]
    offsets_in_document: List[Span]
    offsets_in_context: List[Span]
    meta: Dict[str, Any]

class ToolCall:
    tool_name: str
    arguments: Dict[str, Any]
    id: Optional[str]

class ToolCallResult:
    result: str
    origin: ToolCall
    error: bool
```