Interface between LLMs and your data for building retrieval-augmented generation (RAG) applications
—
LlamaIndex is a comprehensive data framework designed for building Large Language Model (LLM) applications, serving as the bridge between LLMs and various data sources. It provides a unified interface for building Retrieval-Augmented Generation (RAG) systems, enabling developers to connect their LLMs to structured and unstructured data including documents, APIs, databases, and knowledge bases.
The framework offers a modular architecture with over 300 integration packages for different LLM providers, embedding models, and vector stores, allowing users to build customized solutions with their preferred technology stack. LlamaIndex simplifies indexing, querying, and retrieving relevant information for LLM applications, supporting various data ingestion methods, advanced retrieval strategies, and sophisticated query engines that can handle complex multi-step reasoning tasks across diverse data sources.
pip install llama-index

import llama_index

Common usage patterns for core functionality:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.core.llms import LLM
from llama_index.core.embeddings import BaseEmbedding

Integration imports follow the pattern:
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
# Configure global settings
Settings.llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")
# Load documents
documents = SimpleDirectoryReader("data").load_data()
# Create index
index = VectorStoreIndex.from_documents(documents)
# Create query engine
query_engine = index.as_query_engine()
# Query your data
response = query_engine.query("What are the main topics in the documents?")
print(response)

LlamaIndex's modular architecture enables flexible RAG application development:
This design allows every component to be customized and provides the foundation for LlamaIndex's role as a comprehensive RAG framework, supporting everything from simple document Q&A to sophisticated multi-agent workflows.
Core data structures for organizing and retrieving information, including vector-based semantic search, keyword extraction, hierarchical trees, and knowledge graphs.
class VectorStoreIndex:
    @classmethod
    def from_documents(cls, documents, **kwargs): ...
    def as_query_engine(self, **kwargs): ...
    def as_retriever(self, **kwargs): ...

class SummaryIndex:
    @classmethod
    def from_documents(cls, documents, **kwargs): ...

class TreeIndex:
    @classmethod
    def from_documents(cls, documents, **kwargs): ...

class KnowledgeGraphIndex:
    @classmethod
    def from_documents(cls, documents, **kwargs): ...

Query engines that orchestrate retrieval and response generation with support for various strategies including basic retrieval, sub-question decomposition, routing, and multi-step reasoning.
class RetrieverQueryEngine:
    def __init__(self, retriever, response_synthesizer=None): ...
    def query(self, query_bundle): ...

class RouterQueryEngine:
    def __init__(self, selector, query_engines): ...

class SubQuestionQueryEngine:
    def __init__(self, question_gen, query_engines): ...

Document loading, parsing, and chunking functionality for various file formats with intelligent text splitting strategies.
class SimpleDirectoryReader:
    def __init__(self, input_dir=None, input_files=None, **kwargs): ...
    def load_data(self): ...

class Document:
    def __init__(self, text, metadata=None, **kwargs): ...

class SentenceSplitter:
    def __init__(self, chunk_size=1024, chunk_overlap=200, **kwargs): ...
    def split_text(self, text): ...

Unified interface for various language models with support for completion, chat, and function calling APIs.
class LLM:
    def complete(self, prompt, **kwargs): ...
    def chat(self, messages, **kwargs): ...

class OpenAI(LLM):
    def __init__(self, model="gpt-3.5-turbo", **kwargs): ...

Text embedding functionality supporting various providers for semantic similarity and vector search operations.
class BaseEmbedding:
    def get_text_embedding(self, text): ...
    def get_query_embedding(self, query): ...

class OpenAIEmbedding(BaseEmbedding):
    def __init__(self, model="text-embedding-ada-002", **kwargs): ...

Advanced retrieval strategies including fusion, hierarchical, and routing approaches for sophisticated document retrieval patterns.
class AutoMergingRetriever:
    def __init__(self, vector_retriever, storage_context, **kwargs): ...

class QueryFusionRetriever:
    def __init__(self, retrievers, similarity_top_k=None, **kwargs): ...

class RouterRetriever:
    def __init__(self, selector, retriever_tools, **kwargs): ...

Response generation strategies for combining retrieved context into coherent answers with various summarization approaches.
class TreeSummarize:
    def __init__(self, **kwargs): ...

class Refine:
    def __init__(self, **kwargs): ...

def get_response_synthesizer(response_mode="compact", **kwargs): ...

Multi-agent systems and workflow orchestration for complex reasoning tasks, tool usage, and multi-step problem solving.
class ReActAgent:
    def __init__(self, tools, llm, **kwargs): ...
    def chat(self, message): ...

class Workflow:
    def __init__(self, **kwargs): ...
    def add_step(self, step_fn): ...

@step
def custom_step(ctx: Context, ev: Event) -> Event: ...

Storage backends and global configuration for persisting indices, managing contexts, and configuring system-wide settings.
class StorageContext:
    @classmethod
    def from_defaults(cls, persist_dir=None, **kwargs): ...
    def persist(self, persist_dir=None): ...

class Settings:
    llm: LLM = None
    embed_model: BaseEmbedding = None
    node_parser: NodeParser = None

Template system for customizing LLM prompts with support for various formatting options and conditional logic.
class PromptTemplate:
    def __init__(self, template, **kwargs): ...
    def format(self, **kwargs): ...

class ChatPromptTemplate:
    def __init__(self, message_templates, **kwargs): ...

Install with Tessl CLI
npx tessl i tessl/pypi-llama-index