tessl/pypi-haystack-ai

LLM framework to build customizable, production-ready LLM applications.

Describes: pkg:pypi/haystack-ai@2.17.x (PyPI)

To install, run

npx @tessl/cli install tessl/pypi-haystack-ai@2.17.0


Haystack-AI

A comprehensive, end-to-end LLM framework for building production-ready applications powered by large language models, transformer models, and vector search. Haystack lets developers build retrieval-augmented generation (RAG), document search, and question-answering systems by orchestrating state-of-the-art embedding models and LLMs into flexible pipelines.

Package Information

  • Package Name: haystack-ai
  • Language: Python
  • Installation: pip install haystack-ai

Core Imports

import haystack

Main components:

from haystack import Pipeline, Document, component
from haystack.components.generators import OpenAIGenerator
from haystack.components.embedders import OpenAITextEmbedder
from haystack.components.retrievers import InMemoryEmbeddingRetriever

Basic Usage

from haystack import Pipeline, Document, component
from haystack.components.generators import OpenAIGenerator
from haystack.components.builders import PromptBuilder
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers import InMemoryEmbeddingRetriever
from haystack.components.embedders import OpenAITextEmbedder, OpenAIDocumentEmbedder

# Sample documents for a simple RAG pipeline (the OpenAI components below require OPENAI_API_KEY to be set)
documents = [
    Document(content="Python is a programming language."),
    Document(content="Berlin is the capital of Germany."),
    Document(content="Pipelines connect components in Haystack.")
]

# Initialize document store and components
document_store = InMemoryDocumentStore()

# Create pipeline
rag_pipeline = Pipeline()

# Add components
rag_pipeline.add_component("text_embedder", OpenAITextEmbedder())
rag_pipeline.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store))
rag_pipeline.add_component("prompt_builder", PromptBuilder(template="Answer the question based on the context: {{query}} Context: {{documents}}"))
rag_pipeline.add_component("generator", OpenAIGenerator())

# Connect components
rag_pipeline.connect("text_embedder.embedding", "retriever.query_embedding")
rag_pipeline.connect("retriever.documents", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder.prompt", "generator.prompt")

# Embed and store documents
doc_embedder = OpenAIDocumentEmbedder()
embedded_docs = doc_embedder.run(documents=documents)
document_store.write_documents(embedded_docs["documents"])

# Run the pipeline
response = rag_pipeline.run({
    "text_embedder": {"text": "What is Python?"},
    "prompt_builder": {"query": "What is Python?"}
})

print(response["generator"]["replies"][0])

Architecture

Haystack follows a modular, component-based architecture:

  • Pipeline: Orchestrates the flow of data between components using a directed acyclic graph (DAG)
  • Components: Modular building blocks that perform specific tasks (embedding, generation, retrieval, etc.)
  • Document Stores: Storage systems for documents and embeddings
  • Data Classes: Structured data types (Document, Answer, ChatMessage, etc.) that flow between components

This design enables flexible composition of AI workflows, from simple Q&A systems to complex multi-step reasoning chains and autonomous agents.
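
As a sketch of how these pieces fit together, the hypothetical component below (UppercaseText is an illustrative name, not part of Haystack) is registered on a pipeline and run directly:

from typing import Any, Dict

from haystack import Pipeline, component


@component
class UppercaseText:
    """Toy component that upper-cases its input text."""

    @component.output_types(text=str)
    def run(self, text: str) -> Dict[str, Any]:
        return {"text": text.upper()}


pipeline = Pipeline()
pipeline.add_component("upper", UppercaseText())

# Inputs are keyed by component name; unconsumed outputs are returned the same way.
result = pipeline.run({"upper": {"text": "hello haystack"}})
print(result["upper"]["text"])  # HELLO HAYSTACK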

Capabilities

Core Framework

Essential framework components for building pipelines, managing data flow, and creating custom components.

class Pipeline:
    def add_component(self, name: str, instance: Any) -> None: ...
    def connect(self, sender: str, receiver: str) -> None: ...
    def run(self, inputs: Dict[str, Any]) -> Dict[str, Any]: ...

class AsyncPipeline:
    async def run(self, inputs: Dict[str, Any]) -> Dict[str, Any]: ...

@component
class MyComponent:
    @component.output_types(result=str)
    def run(self, text: str) -> Dict[str, Any]: ...

class Document:
    def __init__(self, content: str, meta: Dict[str, Any] = None): ...


Text Generation

Large language model integrations for text generation, chat completions, and answer synthesis.

class OpenAIGenerator:
    def run(self, prompt: str, **kwargs) -> Dict[str, Any]: ...

class OpenAIChatGenerator:
    def run(self, messages: List[ChatMessage], **kwargs) -> Dict[str, Any]: ...

class HuggingFaceLocalGenerator:
    def run(self, prompt: str, **kwargs) -> Dict[str, Any]: ...
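
A minimal usage sketch; it assumes OPENAI_API_KEY is set in the environment, and the model name is illustrative:

from haystack.components.generators import OpenAIGenerator
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage

# Plain prompt-in, strings-out generation
generator = OpenAIGenerator(model="gpt-4o-mini")
result = generator.run(prompt="Summarize what a Haystack pipeline does in one sentence.")
print(result["replies"][0])

# Chat-style generation with ChatMessage objects
chat_generator = OpenAIChatGenerator(model="gpt-4o-mini")
chat_result = chat_generator.run(messages=[ChatMessage.from_user("What is RAG?")])
print(chat_result["replies"][0].text)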


Text Embeddings

Convert text and documents into vector embeddings for semantic search and retrieval.

class OpenAITextEmbedder:
    def run(self, text: str) -> Dict[str, List[float]]: ...

class OpenAIDocumentEmbedder:
    def run(self, documents: List[Document]) -> Dict[str, List[Document]]: ...

class SentenceTransformersTextEmbedder:
    def run(self, text: str) -> Dict[str, List[float]]: ...
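
A short sketch using the local Sentence Transformers embedder (the model name is illustrative and is downloaded on first use):

from haystack.components.embedders import SentenceTransformersTextEmbedder

embedder = SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L6-v2")
embedder.warm_up()  # loads the model before the first run
result = embedder.run(text="What is Haystack?")
print(len(result["embedding"]))  # dimensionality of the embedding vector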


Document Processing

Convert various file formats to Haystack Document objects and preprocess text for optimal retrieval.

class PyPDFToDocument:
    def run(self, sources: List[str]) -> Dict[str, List[Document]]: ...

class HTMLToDocument:
    def run(self, sources: List[str]) -> Dict[str, List[Document]]: ...

class DocumentSplitter:
    def run(self, documents: List[Document]) -> Dict[str, List[Document]]: ...
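
A sketch of a convert-then-split flow; the file path is illustrative and PyPDFToDocument assumes the pypdf dependency is installed:

from haystack.components.converters import PyPDFToDocument
from haystack.components.preprocessors import DocumentSplitter

converter = PyPDFToDocument()
docs = converter.run(sources=["report.pdf"])["documents"]

# Split long documents into overlapping word-based chunks for retrieval
splitter = DocumentSplitter(split_by="word", split_length=200, split_overlap=20)
chunks = splitter.run(documents=docs)["documents"]
print(f"{len(docs)} document(s) split into {len(chunks)} chunks")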


Retrieval

Search and retrieve relevant documents using various retrieval strategies.

class InMemoryEmbeddingRetriever:
    def run(self, query_embedding: List[float], top_k: int = 10) -> Dict[str, List[Document]]: ...

class InMemoryBM25Retriever:
    def run(self, query: str, top_k: int = 10) -> Dict[str, List[Document]]: ...

class FilterRetriever:
    def run(self, filters: Dict[str, Any]) -> Dict[str, List[Document]]: ...
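
A minimal keyword-retrieval sketch with BM25, which needs no embeddings or API keys:

from haystack import Document
from haystack.components.retrievers import InMemoryBM25Retriever
from haystack.document_stores.in_memory import InMemoryDocumentStore

store = InMemoryDocumentStore()
store.write_documents([
    Document(content="Paris is the capital of France."),
    Document(content="Berlin is the capital of Germany."),
])

retriever = InMemoryBM25Retriever(document_store=store)
result = retriever.run(query="capital of Germany", top_k=1)
print(result["documents"][0].content)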


Prompt Building

Create and format prompts for language models with dynamic content injection.

class PromptBuilder:
    def run(self, **kwargs) -> Dict[str, str]: ...

class ChatPromptBuilder:
    def run(self, **kwargs) -> Dict[str, List[ChatMessage]]: ...
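
A sketch of standalone prompt rendering; templates are Jinja2 and variables are supplied at run time:

from haystack import Document
from haystack.components.builders import PromptBuilder

builder = PromptBuilder(
    template="Context:\n{% for doc in documents %}- {{ doc.content }}\n{% endfor %}Question: {{ query }}"
)
result = builder.run(
    documents=[Document(content="Haystack pipelines are DAGs of components.")],
    query="What is a Haystack pipeline?",
)
print(result["prompt"])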


Document Stores

Storage backends for documents and embeddings with filtering and search capabilities.

class InMemoryDocumentStore:
    def write_documents(self, documents: List[Document]) -> int: ...
    def filter_documents(self, filters: Dict[str, Any]) -> List[Document]: ...
    def count_documents(self) -> int: ...
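
A sketch of writing, counting, and filtering documents; the filter syntax follows Haystack's field/operator/value format:

from haystack import Document
from haystack.document_stores.in_memory import InMemoryDocumentStore

store = InMemoryDocumentStore()
store.write_documents([
    Document(content="Haystack 2.x uses component-based pipelines.", meta={"topic": "framework"}),
    Document(content="BM25 is a keyword ranking function.", meta={"topic": "retrieval"}),
])

print(store.count_documents())  # 2

hits = store.filter_documents(filters={"field": "meta.topic", "operator": "==", "value": "retrieval"})
print(hits[0].content)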


Evaluation

Metrics and evaluation components for assessing pipeline performance and answer quality.

class ContextRelevanceEvaluator:
    def run(self, questions: List[str], contexts: List[List[str]]) -> Dict[str, List[float]]: ...

class FaithfulnessEvaluator:
    def run(self, questions: List[str], contexts: List[List[str]], predicted_answers: List[str]) -> Dict[str, List[float]]: ...
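
A sketch of running the LLM-based evaluators on a single question; OPENAI_API_KEY is assumed to be set, and the parameter names reflect recent 2.x releases:

from haystack.components.evaluators import ContextRelevanceEvaluator, FaithfulnessEvaluator

questions = ["What is the capital of Germany?"]
contexts = [["Berlin is the capital of Germany."]]
answers = ["Berlin."]

context_eval = ContextRelevanceEvaluator()
print(context_eval.run(questions=questions, contexts=contexts)["score"])

faithfulness_eval = FaithfulnessEvaluator()
print(faithfulness_eval.run(questions=questions, contexts=contexts, predicted_answers=answers)["score"])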


Agent Framework

Build autonomous agents that can use tools and maintain conversation state.

class Agent:
    def run(self, messages: List[ChatMessage]) -> Dict[str, List[ChatMessage]]: ...

class ToolInvoker:
    def run(self, messages: List[ChatMessage]) -> Dict[str, List[ChatMessage]]: ...
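
A sketch of an agent with a single hypothetical tool; the constructor arguments shown reflect recent 2.x releases, OPENAI_API_KEY is assumed, and the add tool exists only for illustration:

from haystack.components.agents import Agent
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.tools import Tool

def add(a: int, b: int) -> int:
    return a + b

add_tool = Tool(
    name="add",
    description="Add two integers.",
    parameters={
        "type": "object",
        "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
        "required": ["a", "b"],
    },
    function=add,
)

agent = Agent(chat_generator=OpenAIChatGenerator(model="gpt-4o-mini"), tools=[add_tool])
agent.warm_up()
result = agent.run(messages=[ChatMessage.from_user("What is 2 + 3? Use the add tool.")])
print(result["messages"][-1].text)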


Types

class Document:
    content: str
    meta: Dict[str, Any]
    id: str
    score: Optional[float]
    embedding: Optional[List[float]]

class ChatMessage:
    content: str
    role: ChatRole
    name: Optional[str]
    tool_calls: Optional[List[ToolCall]]
    tool_call_result: Optional[ToolCallResult]

class ChatRole(Enum):
    USER = "user"
    ASSISTANT = "assistant"
    SYSTEM = "system"
    TOOL = "tool"

class GeneratedAnswer:
    data: str
    query: str
    documents: List[Document]
    meta: Dict[str, Any]

class ExtractedAnswer:
    query: str
    score: Optional[float]
    data: str
    document: Optional[Document]
    context: Optional[str]
    offsets_in_document: List[Span]
    offsets_in_context: List[Span]
    meta: Dict[str, Any]

class ToolCall:
    tool_name: str
    arguments: Dict[str, Any]
    id: Optional[str]

class ToolCallResult:
    result: str
    origin: ToolCall
    error: bool
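
A short sketch of how these data classes are typically constructed (the meta values are illustrative):

from haystack import Document
from haystack.dataclasses import ChatMessage

doc = Document(content="Berlin is the capital of Germany.", meta={"source": "geo-notes"})
print(doc.id, doc.content)

# ChatMessage instances are usually built with the role-specific factory methods
messages = [
    ChatMessage.from_system("You are a helpful assistant."),
    ChatMessage.from_user("What is the capital of Germany?"),
]
print(messages[1].role)  # ChatRole.USER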