Building applications with LLMs through composability
This document contains all critical implementation details, gotchas, and common mistakes from the LangChain documentation. Read this carefully to avoid common pitfalls.
CRITICAL: Agents must be invoked with messages in a dictionary with a "messages" key.
# ✅ CORRECT: Pass messages in dict with "messages" key
agent.invoke({"messages": [HumanMessage(content="Hello")]})
# ❌ WRONG: Don't pass messages directly
agent.invoke([HumanMessage(content="Hello")]) # Will fail!

Why: The agent expects a state dictionary, not a list of messages. The state dictionary contains the messages field along with other optional state fields.
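For reference, a minimal end-to-end sketch (assuming langchain v1's create_agent and an API key in the environment; the model string and empty tool list are placeholders):
from langchain.agents import create_agent
from langchain.messages import HumanMessage

agent = create_agent(model="openai:gpt-4o", tools=[])  # no tools for this sketch

result = agent.invoke({"messages": [HumanMessage(content="Hello")]})
print(result["messages"][-1].content)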
CRITICAL: Agent results are returned as a dictionary containing the full state. The AI response is the last message in the messages list.
result = agent.invoke({"messages": [HumanMessage(content="Hello")]})
# ✅ CORRECT: Access response content from last message
response = result["messages"][-1].content
# ✅ CORRECT: Access full messages list
all_messages = result["messages"]
# ❌ WRONG: Don't treat result as a message
response = result.content # Will fail!
# ❌ WRONG: Don't try to access result directly
response = result[-1].content # Will fail!

Why: The result is a state dictionary with a messages field. The messages list contains the entire conversation including the AI's response as the last element.
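To inspect the whole conversation, iterate the list; message objects expose a type attribute ("human", "ai", "tool"):
# Print each message in the conversation with its role
for message in result["messages"]:
    print(f"{message.type}: {message.content}")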
CRITICAL: Tool docstrings are REQUIRED and sent to the LLM to determine when to use the tool.
# ✅ CORRECT: Has docstring
@tool
def my_tool(param: str) -> str:
"""This docstring is REQUIRED and sent to the LLM.
Without it, the LLM won't understand when to use this tool.
Args:
param: Description of parameter
Returns:
Description of return value
"""
return result
# ❌ WRONG: Missing docstring
@tool
def my_tool(param: str) -> str:
    return result # LLM won't know when/how to use this!

Why: The docstring is used to generate the tool schema that's sent to the LLM. Without it, the LLM cannot understand the tool's purpose or parameters.
Best Practice: Include clear descriptions of:
- What the tool does and when to use it
- Each parameter (the Args section above)
- The return value (the Returns section above)
CRITICAL: Model strings must include the provider in the format "provider:model-name".
# ✅ CORRECT: Include provider
model = init_chat_model("openai:gpt-4o")
model = init_chat_model("anthropic:claude-3-5-sonnet-20241022")
model = init_chat_model("google_vertexai:gemini-1.5-pro")
# ❌ WRONG: Missing provider
model = init_chat_model("gpt-4o") # Will fail!
model = init_chat_model("claude-3-5") # Will fail!Why: LangChain needs to know which provider to use to instantiate the correct model class. The provider prefix is required for proper model initialization.
Format: "provider:model-name" where:
- provider: One of the supported providers (openai, anthropic, google_vertexai, bedrock, etc.)
- model-name: The specific model identifier for that provider

CRITICAL: Custom state schemas must extend AgentState, not replace it.
# ✅ CORRECT: Extend AgentState
from langchain.agents import AgentState
class CustomState(AgentState):
user_name: str
conversation_count: int
agent = create_agent(
model="openai:gpt-4o",
state_schema=CustomState
)
# ❌ WRONG: Create independent TypedDict
from typing import TypedDict
class CustomState(TypedDict):
messages: list # Missing proper annotations
user_name: str # This won't work properly!
agent = create_agent(
model="openai:gpt-4o",
state_schema=CustomState
)

Why: AgentState includes required fields like messages with proper annotations and reducers. Custom schemas must inherit from AgentState to maintain compatibility with the agent execution framework.
Required Fields in AgentState:
- messages: list[AnyMessage] - Required, with special reducer for message accumulation
- structured_response: Any - Optional, present when using response_format
- jump_to: str - Optional, ephemeral field for middleware control flow
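Custom fields defined on the state schema can be seeded through the input dict and read back from the result (a sketch using the CustomState agent above; assumes extra state keys are passed alongside messages, as in LangGraph):
result = agent.invoke({
    "messages": [HumanMessage(content="Hi, I'm Alice")],
    "user_name": "Alice",
    "conversation_count": 1,
})
print(result["user_name"])  # Custom fields come back as part of the returned state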
CRITICAL: When using checkpointers for persistence, you must provide a thread_id in the config.
from langgraph.checkpoint.memory import MemorySaver
checkpointer = MemorySaver()
agent = create_agent(
model="openai:gpt-4o",
checkpointer=checkpointer
)
# ✅ CORRECT: Provide thread_id in config
config = {"configurable": {"thread_id": "conversation-1"}}
agent.invoke({"messages": [...]}, config=config)
# ❌ WRONG: No config provided
agent.invoke({"messages": [...]}) # Each call is independent!
# ❌ WRONG: Missing thread_id
config = {"configurable": {}}
agent.invoke({"messages": [...]}, config=config) # Won't persist!Why: The checkpointer uses thread_id to identify which conversation to load/save. Without it, each invocation is treated as a new conversation with no memory of previous messages.
Best Practice: Use meaningful thread IDs like user IDs or session IDs to organize conversations.
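To see persistence in action, reuse the same thread_id across calls (a sketch; the second turn can reference the first):
config = {"configurable": {"thread_id": "user-42"}}
agent.invoke({"messages": [HumanMessage(content="My name is Alice.")]}, config=config)
result = agent.invoke({"messages": [HumanMessage(content="What is my name?")]}, config=config)
# Because the thread's earlier messages are loaded, the model can answer "Alice"
print(result["messages"][-1].content)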
CRITICAL: Tools should return JSON-serializable types. Complex objects may not be properly passed to the LLM.
# ✅ CORRECT: Return serializable types
@tool
def get_data() -> dict:
"""Get data."""
return {"name": "John", "age": 30}
@tool
def get_list() -> list[str]:
"""Get list."""
return ["item1", "item2", "item3"]
@tool
def get_string() -> str:
"""Get string."""
return "some result"
# ❌ WRONG: Return complex objects
@tool
def get_user() -> User:
"""Get user."""
return User(name="John") # Custom object may not serialize!
# ✅ CORRECT: Convert complex objects to dicts
@tool
def get_user() -> dict:
"""Get user."""
user = User(name="John")
return {"name": user.name, "email": user.email}Why: Tool results are converted to ToolMessage content and sent to the LLM. Only JSON-serializable types can be properly transmitted.
Recommended Return Types:
- str - Text responses
- int, float - Numeric values
- dict - Structured data
- list - Collections
- bool - Boolean flags

CRITICAL: Understand how to access different message types in the result.
result = agent.invoke({"messages": [HumanMessage(content="Hello")]})
# ✅ Access all messages
all_messages = result["messages"]
# ✅ Access last message (AI response)
last_message = result["messages"][-1]
response_content = result["messages"][-1].content
# ✅ Filter by message type
from langchain.messages import AIMessage, HumanMessage, ToolMessage
human_messages = [m for m in result["messages"] if isinstance(m, HumanMessage)]
ai_messages = [m for m in result["messages"] if isinstance(m, AIMessage)]
tool_messages = [m for m in result["messages"] if isinstance(m, ToolMessage)]
# ✅ Check for tool calls in AI messages
last_ai_msg = result["messages"][-1]
if isinstance(last_ai_msg, AIMessage) and last_ai_msg.tool_calls:
for tool_call in last_ai_msg.tool_calls:
print(f"Tool: {tool_call['name']}, Args: {tool_call['args']}")
# ✅ Access usage metadata
if hasattr(last_message, 'usage_metadata') and last_message.usage_metadata:
tokens = last_message.usage_metadata['total_tokens']
print(f"Used {tokens} tokens")CRITICAL: Tool parameters must have type hints for proper schema generation.
# ✅ CORRECT: Full type hints
@tool
def search(query: str, limit: int = 10) -> list[dict]:
"""Search for items."""
return []
# ❌ WRONG: Missing type hints
@tool
def search(query, limit=10):
"""Search for items."""
    return [] # Schema generation will fail or be incomplete!

Why: Type hints are used to generate the JSON schema that describes the tool to the LLM. Without them, the LLM won't know what parameters to pass.
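Richer parameter descriptions can be attached with a Pydantic schema (a sketch; the @tool decorator accepts an args_schema argument, and the field descriptions here are illustrative):
from pydantic import BaseModel, Field

class SearchInput(BaseModel):
    query: str = Field(description="Text to search for")
    limit: int = Field(default=10, description="Maximum number of results")

@tool(args_schema=SearchInput)
def search(query: str, limit: int = 10) -> list[dict]:
    """Search for items."""
    return []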
CRITICAL: Async tools are automatically detected, but you must use async invocation methods.
import asyncio

@tool
async def async_tool(param: str) -> str:
"""Async tool."""
await asyncio.sleep(1)
return "result"
agent = create_agent(
model="openai:gpt-4o",
tools=[async_tool]
)
# ✅ CORRECT: Use async invocation
result = await agent.ainvoke({"messages": [...]})
# ❌ WRONG: Use sync invocation with async tools
result = agent.invoke({"messages": [...]}) # May not work correctly!
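From a plain script, the async invocation needs an event loop (standard asyncio sketch):
import asyncio

async def main():
    result = await agent.ainvoke({"messages": [HumanMessage(content="Hello")]})
    print(result["messages"][-1].content)

asyncio.run(main())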
CRITICAL: Use ToolException for tool errors to allow the LLM to see and handle errors.
from langchain.tools import tool, ToolException
@tool
def divide(a: float, b: float) -> float:
"""Divide two numbers."""
if b == 0:
# ✅ CORRECT: Raise ToolException with helpful message
raise ToolException(
"Cannot divide by zero. Please provide a non-zero denominator."
)
return a / b
@tool
def divide_bad(a: float, b: float) -> float:
"""Divide two numbers."""
if b == 0:
# ❌ WRONG: Raise generic exception
raise ValueError("Division by zero") # LLM won't see this properly!
    return a / b

Why: ToolException creates a ToolMessage with the error that's sent to the LLM, allowing it to understand what went wrong and potentially retry with different parameters. Generic exceptions may cause the agent to fail without useful feedback.
IMPORTANT: Temperature controls randomness. Choose appropriately for your use case.
# Deterministic (good for factual tasks, math, structured output)
model = init_chat_model("openai:gpt-4o", temperature=0)
# Balanced (good for general conversation)
model = init_chat_model("openai:gpt-4o", temperature=0.7)
# Creative (good for creative writing, brainstorming)
model = init_chat_model("openai:gpt-4o", temperature=1.5)CRITICAL: Streaming yields chunks, not complete messages. Handle accordingly.
# ✅ CORRECT: Handle chunks properly
for chunk in agent.stream({"messages": [...]}):
if "messages" in chunk:
for message in chunk["messages"]:
if hasattr(message, 'content') and message.content:
print(message.content, end="", flush=True)
# ❌ WRONG: Assume chunk has the same structure as invoke result
for chunk in agent.stream({"messages": [...]}):
print(chunk["messages"][-1].content) # May not exist in every chunk!Why: Streaming returns incremental updates (chunks), not the full state on every iteration. Chunks may be partial or contain only specific fields.
IMPORTANT: Use embed_query() for queries and embed_documents() for documents. They may be optimized differently by the provider.
from langchain.embeddings import init_embeddings
embeddings = init_embeddings("openai:text-embedding-3-small")
# ✅ CORRECT: Use embed_query for search queries
query_vector = embeddings.embed_query("What is machine learning?")
# ✅ CORRECT: Use embed_documents for document corpus
doc_vectors = embeddings.embed_documents([
"Machine learning is...",
"Deep learning is..."
])
# ❌ WRONG: Use embed_documents for queries
query_vector = embeddings.embed_documents(["What is machine learning?"])[0]

Why: Some embedding models (especially retrieval-focused ones) optimize query and document embeddings differently for better search performance.
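The returned vectors can be compared directly, e.g. cosine similarity for a toy ranking (pure-Python sketch using the query_vector and doc_vectors above):
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Rank documents against the query embedded above
scores = [cosine_similarity(query_vector, dv) for dv in doc_vectors]
best = max(range(len(scores)), key=scores.__getitem__)
print(f"Best match: document {best} (score {scores[best]:.3f})")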
CRITICAL: When subclassing BaseTool, implement _run() or _arun(), not invoke().
from langchain.tools import BaseTool
# ✅ CORRECT: Implement _run
class MyTool(BaseTool):
name: str = "my_tool"
description: str = "Description"
def _run(self, param: str) -> str:
return "result"
async def _arun(self, param: str) -> str:
return self._run(param)
# ❌ WRONG: Override invoke
class MyTool(BaseTool):
name: str = "my_tool"
description: str = "Description"
def invoke(self, param: str) -> str: # Don't override this!
return "result"Why: The invoke() method handles schema validation, error wrapping, and other framework concerns. The _run() and _arun() methods are the extension points for custom tool logic.
IMPORTANT: Making fields configurable allows runtime parameter overrides.
# Make temperature configurable
model = init_chat_model(
"openai:gpt-4o",
configurable_fields=["temperature"],
temperature=0.5 # default
)
# ✅ CORRECT: Override at runtime
response = model.invoke(
[...],
config={"configurable": {"temperature": 0.9}}
)
# ❌ WRONG: Try to pass parameter directly
response = model.invoke([...], temperature=0.9) # Won't work!

IMPORTANT: Middleware executes in the order it's provided.
from langchain.agents.middleware import (
before_agent, after_agent, before_model, after_model
)
agent = create_agent(
model="openai:gpt-4o",
middleware=[
before_agent, # Runs first
before_model, # Runs second (before model call)
after_model, # Runs third (after model call)
after_agent # Runs last
]
)
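In langchain v1 these hook names are also exposed as decorators for defining custom middleware; a sketch (the exact hook signature is an assumption):
from langchain.agents.middleware import before_model

@before_model
def log_before_model(state):  # assumed: receives the agent state dict
    print(f"Calling model with {len(state['messages'])} messages")
    return None  # returning None leaves the state unchanged

agent = create_agent(model="openai:gpt-4o", middleware=[log_before_model])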
CRITICAL: When using response_format, access structured output via structured_response field.
from pydantic import BaseModel
class WeatherReport(BaseModel):
location: str
temperature: float
agent = create_agent(
model="openai:gpt-4o",
response_format=WeatherReport
)
result = agent.invoke({"messages": [...]})
# ✅ CORRECT: Access structured_response field
weather = result["structured_response"]
print(f"{weather.location}: {weather.temperature}°F")
# ❌ WRONG: Try to parse from message content
weather = WeatherReport.parse_raw(result["messages"][-1].content)

IMPORTANT: Most providers use environment variables for authentication. Set them before initializing models.
# OpenAI
export OPENAI_API_KEY="sk-..."
# Anthropic
export ANTHROPIC_API_KEY="sk-ant-..."
# Google
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/credentials.json"
# AWS Bedrock (uses standard AWS credentials)
export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_REGION="us-east-1"

# Then initialize without explicit credentials
model = init_chat_model("openai:gpt-4o") # Uses OPENAI_API_KEY
model = init_chat_model("anthropic:claude-3-5-sonnet-20241022") # Uses ANTHROPIC_API_KEYIMPORTANT: Choose the right trimming strategy for your use case.
IMPORTANT: Choose the right trimming strategy for your use case.
from langchain.messages import trim_messages
# Keep most recent messages (default for conversational agents)
trimmed = trim_messages(messages, max_tokens=1000, strategy="last")
# Keep oldest messages (useful for preserving initial context)
trimmed = trim_messages(messages, max_tokens=1000, strategy="first")
# Always include system messages regardless of trimming
trimmed = trim_messages(
messages,
max_tokens=1000,
strategy="last",
include_system=True # System messages won't be trimmed
)
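Depending on your version, trim_messages may require an explicit token_counter; a sketch passing one (a chat model works, or a callable such as len for a crude per-message count):
trimmed = trim_messages(
    messages,
    max_tokens=1000,
    strategy="last",
    token_counter=model,  # counts tokens with the model's own tokenizer
)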
IMPORTANT: Use batch methods for processing multiple inputs efficiently.
# ✅ CORRECT: Use batch for multiple inputs
results = agent.batch([
{"messages": [HumanMessage(content="Query 1")]},
{"messages": [HumanMessage(content="Query 2")]},
{"messages": [HumanMessage(content="Query 3")]}
])
# ❌ WRONG: Loop with invoke (slower, no parallelization)
results = []
for query in queries:
result = agent.invoke({"messages": [HumanMessage(content=query)]})
    results.append(result)
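Concurrency can be bounded through the standard runnable config (max_concurrency is a RunnableConfig field):
# Process many inputs while capping parallel requests at 5
results = agent.batch(
    [{"messages": [HumanMessage(content=q)]} for q in queries],
    config={"max_concurrency": 5},
)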
IMPORTANT: AWS Bedrock and some other providers require full model IDs including version.
# ✅ CORRECT: Full Bedrock model ID with version
model = init_chat_model("bedrock:anthropic.claude-3-sonnet-20240229-v1:0")
# ❌ WRONG: Shortened Bedrock model ID
model = init_chat_model("bedrock:claude-3-sonnet") # Won't work!
# ✅ CORRECT: Standard OpenAI format
model = init_chat_model("openai:gpt-4o")
# ✅ CORRECT: Standard Anthropic format
model = init_chat_model("anthropic:claude-3-5-sonnet-20241022"){"messages": [...]} dictresult.content instead of result["messages"][-1].content"gpt-4o" instead of "openai:gpt-4o")thread_id in config when using persistenceToolExceptionAgentStateinvoke() instead of ainvoke() with async toolsembed_documents() for queries instead of embed_query()Install with Tessl CLI
npx tessl i tessl/pypi-langchain@1.2.1