Data framework for your LLM application
Conversational interfaces that maintain context and enable back-and-forth interactions with your data in LlamaIndex.TS.
import { VectorStoreIndex } from "llamaindex";
// Or from specific submodules
import { ContextChatEngine, SimpleChatEngine } from "llamaindex/engines";

Chat engines in LlamaIndex.TS provide conversational interfaces that maintain context across multiple turns of conversation. Unlike query engines, which handle single standalone queries, chat engines are designed for interactive, multi-turn conversations while still leveraging your indexed data.
All chat engines implement the BaseChatEngine interface:
interface BaseChatEngine {
chat(message: string, options?: ChatOptions): Promise<EngineResponse>;
achat(message: string, options?: ChatOptions): AsyncIterable<EngineResponse>;
reset(): void;
chatHistory: ChatMessage[];
}
interface ChatOptions {
stream?: boolean;
chatHistory?: ChatMessage[];
}
interface ChatMessage {
role: MessageType;
content: string;
}
type MessageType = "system" | "user" | "assistant";

ContextChatEngine is a chat engine that uses retrieval to provide context-aware responses while maintaining conversation history.
class ContextChatEngine implements BaseChatEngine {
constructor(args: {
retriever: BaseRetriever;
memory?: BaseMemory;
systemPrompt?: string;
nodePostprocessors?: BasePostprocessor[];
contextRole?: string;
});
chat(message: string, options?: ChatOptions): Promise<EngineResponse>;
achat(message: string, options?: ChatOptions): AsyncIterable<EngineResponse>;
reset(): void;
chatHistory: ChatMessage[];
retriever: BaseRetriever;
memory: BaseMemory;
systemPrompt?: string;
}

SimpleChatEngine is a basic chat engine that maintains conversation history without retrieval.
class SimpleChatEngine implements BaseChatEngine {
constructor(args: {
llm: LLM;
memory?: BaseMemory;
systemPrompt?: string;
});
chat(message: string, options?: ChatOptions): Promise<EngineResponse>;
achat(message: string, options?: ChatOptions): AsyncIterable<EngineResponse>;
reset(): void;
chatHistory: ChatMessage[];
llm: LLM;
memory: BaseMemory;
}

CondenseQuestionChatEngine condenses the conversation history and the current question into a standalone question for better retrieval.
class CondenseQuestionChatEngine implements BaseChatEngine {
constructor(args: {
queryEngine: BaseQueryEngine;
memory?: BaseMemory;
systemPrompt?: string;
condenseQuestionPrompt?: string;
});
chat(message: string, options?: ChatOptions): Promise<EngineResponse>;
achat(message: string, options?: ChatOptions): AsyncIterable<EngineResponse>;
reset(): void;
chatHistory: ChatMessage[];
queryEngine: BaseQueryEngine;
memory: BaseMemory;
}

BaseMemory is the interface for chat memory implementations.
interface BaseMemory {
get(initialTokenCount?: number): ChatMessage[];
getAll(): ChatMessage[];
put(message: ChatMessage): void;
set(messages: ChatMessage[]): void;
reset(): void;
}

ChatMemoryBuffer is a simple in-memory buffer for storing chat history.
class ChatMemoryBuffer implements BaseMemory {
constructor(args?: {
tokenLimit?: number;
chatHistory?: ChatMessage[];
});
get(initialTokenCount?: number): ChatMessage[];
getAll(): ChatMessage[];
put(message: ChatMessage): void;
set(messages: ChatMessage[]): void;
reset(): void;
tokenLimit?: number;
chatHistory: ChatMessage[];
}

import { VectorStoreIndex, Document } from "llamaindex";
// Create knowledge base
const documents = [
new Document({ text: "LlamaIndex is a data framework for LLM applications." }),
new Document({ text: "It supports various document types and vector stores." }),
new Document({ text: "You can build chatbots and Q&A systems with it." }),
];
const index = await VectorStoreIndex.fromDocuments(documents);
// Create context chat engine
const chatEngine = index.asChatEngine({
chatMode: "context", // Use context-aware chat
systemPrompt: "You are a helpful assistant that answers questions about LlamaIndex.",
});
// Start conversation
const response1 = await chatEngine.chat("What is LlamaIndex?");
console.log("Assistant:", response1.toString());
// Continue conversation with context
const response2 = await chatEngine.chat("What can I build with it?");
console.log("Assistant:", response2.toString());
// Check conversation history
console.log("Chat history:", chatEngine.chatHistory);

import { SimpleChatEngine, OpenAI } from "llamaindex";
// Create simple chat engine
const simpleChatEngine = new SimpleChatEngine({
llm: new OpenAI({ model: "gpt-3.5-turbo" }),
systemPrompt: "You are a helpful assistant.",
});
// Have a conversation
const response = await simpleChatEngine.chat("Hello! How are you?");
console.log("Response:", response.toString());

// Enable streaming for real-time responses
const response = await chatEngine.chat("Explain vector databases", {
stream: true
});
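When streaming is enabled, the response arrives as an async iterable of partial chunks rather than a single object. The consumption pattern looks like the sketch below; note that StreamChunk and fakeStream are illustrative stand-ins used so the sketch is self-contained, not llamaindex APIs.

```typescript
// Illustrative chunk shape; real chunks come from the chat engine.
interface StreamChunk {
  response: string;
}

// Stand-in for a streamed chat response.
async function* fakeStream(parts: string[]): AsyncIterable<StreamChunk> {
  for (const part of parts) {
    yield { response: part };
  }
}

// Render chunks as they arrive while accumulating the full text.
async function collectStream(stream: AsyncIterable<StreamChunk>): Promise<string> {
  let full = "";
  for await (const chunk of stream) {
    process.stdout.write(chunk.response); // incremental rendering
    full += chunk.response;
  }
  return full;
}
```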
// For streaming, use achat
for await (const chunk of chatEngine.achat("Tell me about embeddings")) {
process.stdout.write(chunk.response);
}

import { ContextChatEngine, ChatMemoryBuffer } from "llamaindex";
// Create chat engine with custom memory
const customMemory = new ChatMemoryBuffer({
tokenLimit: 4000, // Limit context window
chatHistory: [
{ role: "system", content: "You are an expert on AI and machine learning." }
],
});
const chatEngine = new ContextChatEngine({
retriever: index.asRetriever(),
memory: customMemory,
systemPrompt: "Answer questions about AI using the provided context.",
});

import { CondenseQuestionChatEngine } from "llamaindex/engines";
// Create condense question chat engine for better multi-turn conversations
const condenseEngine = new CondenseQuestionChatEngine({
queryEngine: index.asQueryEngine(),
condenseQuestionPrompt: `
Given the conversation history and a follow-up question,
rephrase the follow-up question to be a standalone question.
Chat History: {chat_history}
Follow-up Input: {question}
Standalone Question:
`,
});
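For intuition, the condense step amounts to filling the {chat_history} and {question} placeholders in the template before asking the LLM for a standalone question. A minimal sketch of that substitution (fillCondensePrompt is a hypothetical helper written for illustration, not part of llamaindex):

```typescript
type Turn = { role: "user" | "assistant"; content: string };

// Fill the condense-question template the way the engine would before sending
// it to the LLM. Naive single-occurrence replacement, for illustration only.
function fillCondensePrompt(template: string, history: Turn[], question: string): string {
  const chatHistory = history.map((m) => `${m.role}: ${m.content}`).join("\n");
  return template.replace("{chat_history}", chatHistory).replace("{question}", question);
}
```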
// Multi-turn conversation
await condenseEngine.chat("What is machine learning?");
await condenseEngine.chat("How does it differ from deep learning?"); // Will be condensed to a standalone question

const chatEngine = index.asChatEngine({
chatMode: "context",
systemPrompt: `
You are an expert technical documentation assistant.
Guidelines:
- Always provide accurate, technical information
- Include code examples when relevant
- Cite your sources when using retrieved context
- If you don't know something, say so clearly
- Keep responses concise but comprehensive
`,
});

// Access full conversation history
const history = chatEngine.chatHistory;
console.log("Conversation turns:", history.length);
// Filter by role
const userMessages = history.filter(msg => msg.role === "user");
const assistantMessages = history.filter(msg => msg.role === "assistant");
// Reset conversation
chatEngine.reset();
console.log("History after reset:", chatEngine.chatHistory.length); // 0

// Save conversation to storage
const saveConversation = (chatEngine: BaseChatEngine, filename: string) => {
const conversation = {
history: chatEngine.chatHistory,
timestamp: new Date().toISOString(),
};
// Save to file or database
// fs.writeFileSync(filename, JSON.stringify(conversation, null, 2));
};
// Load conversation from storage
const loadConversation = (chatEngine: BaseChatEngine, conversationData: any) => {
chatEngine.chatHistory = conversationData.history;
};

import { ChatMemoryBuffer } from "llamaindex";
// Create memory with token limit to manage context window
const limitedMemory = new ChatMemoryBuffer({
tokenLimit: 3000, // Adjust based on your model's context window
});
const chatEngine = new ContextChatEngine({
retriever: index.asRetriever(),
memory: limitedMemory,
});
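Conceptually, a token-limited buffer keeps the most recent messages whose combined size fits the limit and drops older turns. A rough self-contained sketch of that idea is below; the whitespace-based token count is a naive stand-in, and the real ChatMemoryBuffer uses a proper tokenizer.

```typescript
type Msg = { role: string; content: string };

// Naive token estimate: count whitespace-separated words (illustration only).
const countTokens = (text: string): number => text.split(/\s+/).filter(Boolean).length;

// Keep the most recent messages whose combined token count fits the limit,
// walking backward from the newest message and dropping older ones.
function truncateToLimit(messages: Msg[], tokenLimit: number): Msg[] {
  const kept: Msg[] = [];
  let used = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = countTokens(messages[i].content);
    if (used + cost > tokenLimit) break;
    kept.unshift(messages[i]);
    used += cost;
  }
  return kept;
}
```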
// The memory will automatically truncate old messages when the limit is reached

import { QueryEngineTool, ReActAgent } from "llamaindex";
// Convert chat engine to tool for use with agents
const chatTool = new QueryEngineTool({
queryEngine: chatEngine, // Chat engines implement BaseQueryEngine
metadata: {
name: "knowledge_chat",
description: "Have a conversation about the knowledge base",
},
});
// Use with agent
const agent = new ReActAgent({
tools: [chatTool],
llm: /* your LLM */,
});

// For multi-modal conversations (requires a compatible LLM)
const multiModalResponse = await chatEngine.chat("What's in this image?", {
chatHistory: [
{
role: "user",
content: [
{ type: "text", text: "Analyze this image:" },
{ type: "image_url", image_url: { url: "data:image/jpeg;base64,..." } }
]
}
]
});

// Handle multiple chat sessions concurrently
const handleMultipleChats = async (sessions: Array<{chatEngine: BaseChatEngine, message: string}>) => {
const responses = await Promise.all(
sessions.map(session => session.chatEngine.chat(session.message))
);
return responses;
};

// Simple response caching for common questions
class CachedChatEngine {
private cache = new Map<string, EngineResponse>();
constructor(private chatEngine: BaseChatEngine) {}
async chat(message: string): Promise<EngineResponse> {
const cacheKey = message.toLowerCase().trim();
if (this.cache.has(cacheKey)) {
return this.cache.get(cacheKey)!;
}
const response = await this.chatEngine.chat(message);
this.cache.set(cacheKey, response);
return response;
}
}

const safeChat = async (chatEngine: BaseChatEngine, message: string): Promise<EngineResponse | null> => {
try {
// Validate input
if (!message || message.trim().length === 0) {
console.warn("Empty message provided");
return null;
}
const response = await chatEngine.chat(message);
// Validate response
if (!response.response || response.response.trim().length === 0) {
console.warn("Empty response received");
return null;
}
return response;
} catch (error) {
console.error("Chat error:", error);
// Handle specific errors
if (error instanceof Error && error.message.includes("context window")) {
console.error("Context window exceeded - consider resetting conversation");
chatEngine.reset();
}
return null;
}
};

// Choose the right chat engine for your use case
const createChatEngine = (useCase: string, index: VectorStoreIndex) => {
switch (useCase) {
case "simple":
// Basic conversation without knowledge base
return new SimpleChatEngine({
llm: /* your LLM */,
});
case "knowledge":
// Conversations with knowledge base access
return index.asChatEngine({ chatMode: "context" });
case "complex":
// Multi-turn conversations with better context handling
return new CondenseQuestionChatEngine({
queryEngine: index.asQueryEngine(),
});
default:
return index.asChatEngine();
}
};

// Configure for high-quality conversations
const highQualityChatEngine = new ContextChatEngine({
retriever: index.asRetriever({
similarityTopK: 3, // Focused context
}),
memory: new ChatMemoryBuffer({
tokenLimit: 4000, // Manage context window
}),
systemPrompt: `
You are a knowledgeable assistant. Use the provided context to give accurate answers.
If the context doesn't contain relevant information, say so clearly.
Always be helpful and conversational while staying factual.
`,
});

// Add logging and monitoring
const monitoredChat = async (chatEngine: BaseChatEngine, message: string) => {
const startTime = Date.now();
try {
const response = await chatEngine.chat(message);
const duration = Date.now() - startTime;
console.log({
timestamp: new Date().toISOString(),
message: message.substring(0, 100),
responseLength: response.response.length,
sourceCount: response.sourceNodes?.length || 0,
duration: `${duration}ms`,
historyLength: chatEngine.chatHistory.length,
});
return response;
} catch (error) {
console.error({
timestamp: new Date().toISOString(),
message: message.substring(0, 100),
error: error instanceof Error ? error.message : String(error),
duration: `${Date.now() - startTime}ms`,
});
throw error;
}
};
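The monitoring pattern above can be exercised without a live LLM by swapping in a mock engine, which is handy for unit tests. The sketch below is self-contained; MockChatEngine and makeMockEngine are illustrative stand-ins, not llamaindex types, and the returned stats mirror the log fields in the example above.

```typescript
type MockResponse = { response: string; sourceNodes?: unknown[] };

interface MockChatEngine {
  chatHistory: { role: string; content: string }[];
  chat(message: string): Promise<MockResponse>;
}

// A fake engine that echoes the message and records the turn in history.
const makeMockEngine = (): MockChatEngine => ({
  chatHistory: [],
  async chat(message: string) {
    this.chatHistory.push({ role: "user", content: message });
    const response = { response: `echo: ${message}`, sourceNodes: [] };
    this.chatHistory.push({ role: "assistant", content: response.response });
    return response;
  },
});

// Gather the same stats the monitored wrapper logs, for assertions in tests.
async function monitored(engine: MockChatEngine, message: string) {
  const start = Date.now();
  const response = await engine.chat(message);
  return {
    responseLength: response.response.length,
    sourceCount: response.sourceNodes?.length ?? 0,
    durationMs: Date.now() - start,
    historyLength: engine.chatHistory.length,
  };
}
```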