Middleware

The middleware system provides powerful customization of agent behavior through lifecycle hooks and execution wrappers. Middleware allows you to intercept and modify agent execution at key points: before/after agent execution, before/after model calls, and with full control over model and tool call execution.

Middleware is composable - you can combine multiple middleware plugins to build sophisticated agent behaviors like retry logic, fallback models, human-in-the-loop workflows, and more.

Capabilities

Lifecycle Hooks

Lifecycle hooks allow you to run code at specific points in the agent execution lifecycle. Hooks receive the current state or request/response objects and can modify them before returning.

Before Agent Execution

Run code once at the start of agent execution, before any model calls:

def before_agent(func: Callable[[AgentState], AgentState]) -> AgentMiddleware: ...

Decorator Usage:

from langchain.agents.middleware import before_agent, AgentState

@before_agent
def log_start(state: AgentState) -> AgentState:
    print(f"Starting agent with {len(state['messages'])} messages")
    return state

Async Support:

The before_agent decorator automatically detects and supports async functions. Simply define your function as async:

from langchain.agents.middleware import before_agent, AgentState

@before_agent
async def async_log_start(state: AgentState) -> AgentState:
    print("Starting agent execution")
    return state

Before Model Call

Run code before each model invocation. Useful for modifying prompts, logging, or controlling flow:

def before_model(func: Callable[[ModelRequest], ModelRequest]) -> AgentMiddleware: ...

Decorator Usage:

from langchain.agents.middleware import before_model, ModelRequest

@before_model
def log_model_call(request: ModelRequest) -> ModelRequest:
    print(f"Calling model with {len(request.state['messages'])} messages")
    return request

With Flow Control:

from langchain.agents.middleware import before_model, hook_config, ModelRequest

@before_model
@hook_config(can_jump_to=["tools", "model", "end"])
def conditional_skip(request: ModelRequest) -> ModelRequest:
    # Skip the model call if the conversation has grown too long
    if len(request.state['messages']) > 100:
        request.state['jump_to'] = "end"
    return request

Async Support:

The before_model decorator automatically detects and supports async functions. Simply define your function as async:

from langchain.agents.middleware import before_model, ModelRequest

@before_model
async def async_before(request: ModelRequest) -> ModelRequest:
    return request

After Model Call

Run code after each model invocation. Useful for logging responses, modifying output, or controlling flow:

def after_model(func: Callable[[ModelResponse], ModelResponse]) -> AgentMiddleware: ...

Decorator Usage:

from langchain.agents.middleware import after_model, ModelResponse

@after_model
def log_model_response(response: ModelResponse) -> ModelResponse:
    print(f"Model returned: {response}")
    return response

With Flow Control:

from langchain.agents.middleware import after_model, hook_config, ModelResponse

@after_model
@hook_config(can_jump_to=["tools", "model", "end"])
def force_retry(response: ModelResponse) -> ModelResponse:
    # Retry the model call if the last message came back empty
    if not response.result or not response.result[-1].content:
        response.jump_to = "model"  # ephemeral flow-control target (see Hook Configuration)
    return response

Async Support:

The after_model decorator automatically detects and supports async functions. Simply define your function as async:

from langchain.agents.middleware import after_model, ModelResponse

@after_model
async def async_after(response: ModelResponse) -> ModelResponse:
    return response

After Agent Execution

Run code once at the end of agent execution, after all processing is complete:

def after_agent(func: Callable[[AgentState], AgentState]) -> AgentMiddleware: ...

Decorator Usage:

from langchain.agents.middleware import after_agent, AgentState

@after_agent
def log_completion(state: AgentState) -> AgentState:
    print(f"Agent completed with {len(state['messages'])} messages")
    return state

Async Support:

The after_agent decorator automatically detects and supports async functions. Simply define your function as async:

from langchain.agents.middleware import after_agent, AgentState

@after_agent
async def async_log_completion(state: AgentState) -> AgentState:
    return state

Execution Wrappers

Execution wrappers provide complete control over model and tool execution. Unlike hooks, wrappers receive a handler callback that performs the actual execution, allowing you to implement retry logic, fallbacks, caching, and more.

Model Call Wrapper

Wrap model execution with custom logic:

def wrap_model_call(func: Callable[[Callable, ModelRequest], ModelResponse]) -> AgentMiddleware: ...

Decorator Usage:

from typing import Callable

from langchain.agents.middleware import wrap_model_call, ModelRequest, ModelResponse

@wrap_model_call
def retry_model(handler: Callable, request: ModelRequest) -> ModelResponse:
    """Retry model call up to 3 times on failure."""
    for attempt in range(3):
        try:
            return handler(request)
        except Exception as e:
            if attempt == 2:
                raise
            print(f"Retry {attempt + 1} after error: {e}")

Async Support:

The wrap_model_call decorator automatically detects and supports async functions. Simply define your function as async:

from typing import Callable

from langchain.agents.middleware import wrap_model_call, ModelRequest, ModelResponse

@wrap_model_call
async def async_retry_model(handler: Callable, request: ModelRequest) -> ModelResponse:
    try:
        return await handler(request)
    except Exception:
        return await handler(request)  # Retry once

Use Cases:

  • Retry logic on failure
  • Fallback to different models
  • Response rewriting or filtering
  • Caching model responses
  • Rate limiting

Tool Call Wrapper

Wrap tool execution with custom logic:

def wrap_tool_call(func: Callable[[Callable, ToolCallRequest], Any]) -> AgentMiddleware: ...

Decorator Usage:

from typing import Any, Callable

from langchain.agents.middleware import wrap_tool_call, ToolCallRequest

cache: dict[str, Any] = {}

@wrap_tool_call
def cache_tool_calls(handler: Callable, request: ToolCallRequest) -> Any:
    """Cache tool call results."""
    cache_key = f"{request['tool_name']}:{request['tool_args']}"
    if cache_key in cache:
        return cache[cache_key]

    result = handler(request)
    cache[cache_key] = result
    return result

Async Support:

The wrap_tool_call decorator automatically detects and supports async functions. Simply define your function as async:

from typing import Any, Callable

from langchain.agents.middleware import wrap_tool_call, ToolCallRequest

@wrap_tool_call
async def async_wrap_tool(handler: Callable, request: ToolCallRequest) -> Any:
    return await handler(request)

Use Cases:

  • Tool retry on failure
  • Modifying tool inputs or outputs
  • Caching tool results
  • Access control for tools
  • Tool call logging

Dynamic Prompts

Generate system prompts dynamically based on the request context:

def dynamic_prompt(func: Callable[[ModelRequest], str]) -> AgentMiddleware: ...

Decorator Usage:

from langchain.agents.middleware import dynamic_prompt, ModelRequest

@dynamic_prompt
def time_aware_prompt(request: ModelRequest) -> str:
    """Add current time to system prompt."""
    from datetime import datetime
    current_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    return f"You are a helpful assistant. Current time: {current_time}"

Async Support:

The dynamic_prompt decorator automatically detects and supports async functions. Simply define your function as async:

from langchain.agents.middleware import dynamic_prompt, ModelRequest

@dynamic_prompt
async def async_dynamic_prompt(request: ModelRequest) -> str:
    return "Dynamic system prompt"

Hook Configuration

The hook_config decorator marks valid jump destinations for flow control:

def hook_config(can_jump_to: list[str]) -> Callable: ...

Usage:

from langchain.agents.middleware import before_model, hook_config, ModelRequest

@before_model
@hook_config(can_jump_to=["tools", "model", "end"])
def conditional_jump(request: ModelRequest) -> ModelRequest:
    # End the run early once the conversation exceeds 100 messages
    if len(request.state['messages']) > 100:
        request.state['jump_to'] = "end"
    return request

Valid Jump Targets:

  • "tools" - Jump to tool execution
  • "model" - Jump to model call (useful for retries)
  • "end" - Jump to end of execution

Middleware Base Class

All middleware inherits from the AgentMiddleware base class:

class AgentMiddleware:
    """
    Base class for middleware plugins.

    Middleware can be created by subclassing this class or by using
    the decorator functions (before_model, after_model, etc.).
    """
    pass

Custom Middleware Class:

from langchain.agents.middleware import AgentMiddleware, ModelRequest, ModelResponse

class CustomMiddleware(AgentMiddleware):
    def __init__(self, config: dict):
        self.config = config

    def before_model(self, request: ModelRequest) -> ModelRequest:
        # Custom logic
        return request

    def after_model(self, response: ModelResponse) -> ModelResponse:
        # Custom logic
        return response

Built-in Middleware

LangChain provides several pre-built middleware classes for common use cases:

Model Retry Middleware

Automatically retry model calls on failure:

class ModelRetryMiddleware(AgentMiddleware):
    """
    Retry model calls on failure with configurable attempts and backoff.

    Parameters:
        max_retries: Maximum number of retry attempts
        backoff_factor: Exponential backoff multiplier
        retry_on: Exception types to retry on
    """
    def __init__(
        self,
        max_retries: int = 3,
        backoff_factor: float = 2.0,
        retry_on: tuple[type[Exception], ...] = (Exception,)
    ): ...

Usage:

from langchain.agents import create_agent
from langchain.agents.middleware import ModelRetryMiddleware

agent = create_agent(
    model="openai:gpt-4o",
    middleware=[ModelRetryMiddleware(max_retries=3)]
)

Model Fallback Middleware

Switch to fallback model on error:

class ModelFallbackMiddleware(AgentMiddleware):
    """
    Use fallback model if primary model fails.

    Parameters:
        fallback_models: List of fallback model identifiers to try in order
    """
    def __init__(self, fallback_models: list[str]): ...

Usage:

from langchain.agents import create_agent
from langchain.agents.middleware import ModelFallbackMiddleware

agent = create_agent(
    model="openai:gpt-4o",
    middleware=[ModelFallbackMiddleware(
        fallback_models=["anthropic:claude-3-5-sonnet-20241022", "openai:gpt-3.5-turbo"]
    )]
)

Tool Call Limit Middleware

Limit the number of tool calls per execution:

class ToolCallLimitMiddleware(AgentMiddleware):
    """
    Limit the number of tool calls per agent execution.

    Parameters:
        max_tool_calls: Maximum number of tool calls allowed
    """
    def __init__(self, max_tool_calls: int): ...

Usage:

from langchain.agents import create_agent
from langchain.agents.middleware import ToolCallLimitMiddleware

agent = create_agent(
    model="openai:gpt-4o",
    tools=[...],
    middleware=[ToolCallLimitMiddleware(max_tool_calls=10)]
)

Tool Retry Middleware

Retry tool calls on failure:

class ToolRetryMiddleware(AgentMiddleware):
    """
    Retry tool calls on failure.

    Parameters:
        max_retries: Maximum number of retry attempts per tool call
    """
    def __init__(self, max_retries: int = 3): ...

Usage:

from langchain.agents import create_agent
from langchain.agents.middleware import ToolRetryMiddleware

agent = create_agent(
    model="openai:gpt-4o",
    tools=[...],
    middleware=[ToolRetryMiddleware(max_retries=2)]
)

Human in the Loop Middleware

Pause execution for human confirmation or input:

class HumanInTheLoopMiddleware(AgentMiddleware):
    """
    Pause agent execution for human review and approval.

    Parameters:
        interrupt_on: Configuration for when to interrupt
    """
    def __init__(self, interrupt_on: InterruptOnConfig): ...

class InterruptOnConfig:
    """Configuration for human-in-the-loop interruptions."""
    pass

Usage:

from langchain.agents import create_agent
from langchain.agents.middleware import HumanInTheLoopMiddleware

agent = create_agent(
    model="openai:gpt-4o",
    middleware=[HumanInTheLoopMiddleware(interrupt_on=...)]
)

LLM Tool Emulator

Emulate tool calls using LLM when tools are not available:

class LLMToolEmulator(AgentMiddleware):
    """
    Emulate tool execution using LLM calls instead of actual tool execution.

    Useful for simulation or when tools are unavailable.
    """
    def __init__(self): ...

LLM Tool Selector Middleware

Use LLM to intelligently select which tools to use:

class LLMToolSelectorMiddleware(AgentMiddleware):
    """
    Use LLM to select relevant tools before execution.

    Useful when agent has many tools available.
    """
    def __init__(self): ...

Filesystem File Search Middleware

Search filesystem for files:

class FilesystemFileSearchMiddleware(AgentMiddleware):
    """
    Provide file search capabilities to the agent.

    Parameters:
        search_paths: Directories to search
        file_patterns: File patterns to match
    """
    def __init__(
        self,
        search_paths: list[str],
        file_patterns: list[str] = ["*"]
    ): ...

Shell Tool Middleware

Execute shell commands with security policies:

class ShellToolMiddleware(AgentMiddleware):
    """
    Allow agent to execute shell commands with execution policy controls.

    Parameters:
        execution_policy: Policy controlling what commands can be executed
        redaction_rules: Rules for redacting sensitive output
    """
    def __init__(
        self,
        execution_policy: HostExecutionPolicy | DockerExecutionPolicy | CodexSandboxExecutionPolicy,
        redaction_rules: list[RedactionRule] = []
    ): ...

class HostExecutionPolicy:
    """Execute commands on host system."""
    pass

class DockerExecutionPolicy:
    """Execute commands in Docker container."""
    pass

class CodexSandboxExecutionPolicy:
    """Execute commands in Codex sandbox."""
    pass

class RedactionRule:
    """Rule for redacting sensitive output."""
    pass

Summarization Middleware

Summarize long conversations to manage context length:

class SummarizationMiddleware(AgentMiddleware):
    """
    Automatically summarize conversation history when it becomes too long.

    Parameters:
        max_tokens: Maximum tokens before summarization
        summary_prompt: Prompt template for summarization
    """
    def __init__(
        self,
        max_tokens: int = 4000,
        summary_prompt: str | None = None
    ): ...

PII Middleware

Detect and redact personally identifiable information:

class PIIMiddleware(AgentMiddleware):
    """
    Detect and redact PII from messages.

    Parameters:
        pii_types: Types of PII to detect (email, phone, ssn, etc.)
        redact: Whether to redact or raise error
    """
    def __init__(
        self,
        pii_types: list[str],
        redact: bool = True
    ): ...

class PIIDetectionError(Exception):
    """Raised when PII is detected and redact=False."""
    pass

Todo List Middleware

Manage todo lists within agent execution:

class TodoListMiddleware(AgentMiddleware):
    """
    Track and manage todo items during agent execution.
    """
    def __init__(self): ...

Model Call Limit Middleware

Limit total number of model calls:

class ModelCallLimitMiddleware(AgentMiddleware):
    """
    Limit total number of model calls in agent execution.

    Parameters:
        max_calls: Maximum number of model calls allowed
    """
    def __init__(self, max_calls: int): ...

Context Editing Middleware

Edit message context during execution:

class ContextEditingMiddleware(AgentMiddleware):
    """
    Edit and manipulate message context during execution.

    Parameters:
        edits: List of edit operations to apply
    """
    def __init__(self, edits: list): ...

class ClearToolUsesEdit:
    """Edit operation to clear tool usage from context."""
    pass

Types

from typing import TypedDict, Callable, Any
from dataclasses import dataclass

from langchain_core.messages import AnyMessage, BaseMessage

@dataclass
class ModelRequest:
    """
    Request object passed to before_model and wrap_model_call.

    Attributes:
        state: Current agent state
        runtime: Execution runtime context
        model_settings: Model configuration settings
    """
    state: AgentState
    runtime: Any
    model_settings: Any

@dataclass
class ModelResponse:
    """
    Response object from model call, passed to after_model.

    Attributes:
        result: List of messages returned from the model
        structured_response: Structured output data (if using response_format)
    """
    result: list[BaseMessage]
    structured_response: Any = None

class ToolCallRequest(TypedDict):
    """
    Request object passed to wrap_tool_call.

    Attributes:
        tool_name: Name of tool being called
        tool_args: Arguments for tool call
        tool_call_id: Unique identifier for tool call
    """
    tool_name: str
    tool_args: dict
    tool_call_id: str

class AgentState(TypedDict):
    """
    Base state schema for agent execution.

    Attributes:
        messages: Conversation history
        structured_response: Structured output (if using response_format)
        jump_to: Control flow target (ephemeral)
    """
    messages: list[AnyMessage]
    structured_response: Any
    jump_to: str

Usage Patterns

Combining Multiple Middleware

Middleware is composable - pass a list to create_agent():

from langchain.agents import create_agent
from langchain.agents.middleware import (
    ModelRetryMiddleware,
    ToolCallLimitMiddleware,
    SummarizationMiddleware
)

agent = create_agent(
    model="openai:gpt-4o",
    tools=[...],
    middleware=[
        ModelRetryMiddleware(max_retries=3),
        ToolCallLimitMiddleware(max_tool_calls=10),
        SummarizationMiddleware(max_tokens=4000)
    ]
)

Custom Middleware with State

from langchain.agents.middleware import before_model, ModelRequest
from langchain_core.messages import SystemMessage

@before_model
def add_context(request: ModelRequest) -> ModelRequest:
    state = request.state
    # Access custom state fields
    user_name = state.get('user_name', 'User')

    # Prepend a system message carrying the extra context
    state['messages'].insert(0, SystemMessage(
        content=f"The user's name is {user_name}"
    ))

    return request

Conditional Execution Flow

from langchain.agents.middleware import after_model, hook_config, ModelResponse

@after_model
@hook_config(can_jump_to=["tools", "model", "end"])
def quality_check(response: ModelResponse) -> ModelResponse:
    last = response.result[-1] if response.result else None

    # Force a retry if the response is missing or too short
    if last is None or len(last.content) < 10:
        response.jump_to = "model"
    # Otherwise end early when the model requested no tool calls
    elif not getattr(last, "tool_calls", None):
        response.jump_to = "end"

    return response
