Building applications with LLMs through composability
The middleware system provides powerful customization of agent behavior through lifecycle hooks and execution wrappers. Middleware allows you to intercept and modify agent execution at key points: before/after agent execution, before/after model calls, and with full control over model and tool call execution.
Middleware is composable: you can combine multiple middleware plugins to build sophisticated agent behaviors such as retry logic, fallback models, and human-in-the-loop workflows.
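Conceptually, composed middleware wrap agent execution like layers of an onion: the first middleware in the list runs outermost. The following is an illustrative pure-Python sketch of that idea, not the actual LangChain implementation (all names here are made up):

```python
def make_wrapped(middleware, inner):
    """Bind one middleware function around an inner handler."""
    def wrapped(request):
        return middleware(inner, request)
    return wrapped

def compose(middlewares, handler):
    """Wrap a handler so the first middleware in the list runs outermost."""
    for middleware in reversed(middlewares):
        handler = make_wrapped(middleware, handler)
    return handler

def tag_request(inner, request):
    # Example middleware: annotate the request before delegating
    return inner(request + " [tagged]")

def passthrough(inner, request):
    # Example middleware: delegate unchanged (a real one might retry or cache)
    return inner(request)

pipeline = compose([tag_request, passthrough], lambda r: f"model({r})")
print(pipeline("hi"))  # -> model(hi [tagged])
```

Each layer decides whether and how to call the next; the execution wrappers described later give you the same handler-callback contract.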
Lifecycle hooks allow you to run code at specific points in the agent execution lifecycle. Hooks receive the current state or request/response objects and can modify them before returning.
Run code once at the start of agent execution, before any model calls:
```python
def before_agent(func: Callable[[AgentState], AgentState]) -> AgentMiddleware: ...
```

Decorator Usage:

```python
from langchain.agents.middleware import before_agent, AgentState

@before_agent
def log_start(state: AgentState) -> AgentState:
    print(f"Starting agent with {len(state['messages'])} messages")
    return state
```

Async Support:
The before_agent decorator automatically detects and supports async functions. Simply define your function as async:
```python
from langchain.agents.middleware import before_agent, AgentState

@before_agent
async def async_log_start(state: AgentState) -> AgentState:
    print("Starting agent execution")
    return state
```

Run code before each model invocation. Useful for modifying prompts, logging, or controlling flow:
```python
def before_model(func: Callable[[ModelRequest], ModelRequest]) -> AgentMiddleware: ...
```

Decorator Usage:

```python
from langchain.agents.middleware import before_model, ModelRequest

@before_model
def log_model_call(request: ModelRequest) -> ModelRequest:
    print(f"Calling model with {len(request.state['messages'])} messages")
    return request
```

With Flow Control:
```python
from langchain.agents.middleware import before_model, hook_config, ModelRequest

@before_model
@hook_config(can_jump_to=["tools", "model", "end"])
def conditional_skip(request: ModelRequest) -> ModelRequest:
    # Skip model call if too many messages
    if len(request.state['messages']) > 100:
        request.state['jump_to'] = "end"
    return request
```

Async Support:
The before_model decorator automatically detects and supports async functions. Simply define your function as async:
```python
from langchain.agents.middleware import before_model, ModelRequest

@before_model
async def async_before(request: ModelRequest) -> ModelRequest:
    return request
```

Run code after each model invocation. Useful for logging responses, modifying output, or controlling flow:
```python
def after_model(func: Callable[[ModelResponse], ModelResponse]) -> AgentMiddleware: ...
```

Decorator Usage:

```python
from langchain.agents.middleware import after_model, ModelResponse

@after_model
def log_model_response(response: ModelResponse) -> ModelResponse:
    print(f"Model returned: {response}")
    return response
```

With Flow Control:
```python
from langchain.agents.middleware import after_model, hook_config, ModelResponse

@after_model
@hook_config(can_jump_to=["tools", "model", "end"])
def force_retry(response: ModelResponse) -> ModelResponse:
    # Retry model call if response is empty
    if not response.get("content"):
        response['state']['jump_to'] = "model"
    return response
```

Async Support:
The after_model decorator automatically detects and supports async functions. Simply define your function as async:
```python
from langchain.agents.middleware import after_model, ModelResponse

@after_model
async def async_after(response: ModelResponse) -> ModelResponse:
    return response
```

Run code once at the end of agent execution, after all processing is complete:
```python
def after_agent(func: Callable[[AgentState], AgentState]) -> AgentMiddleware: ...
```

Decorator Usage:

```python
from langchain.agents.middleware import after_agent, AgentState

@after_agent
def log_completion(state: AgentState) -> AgentState:
    print(f"Agent completed with {len(state['messages'])} messages")
    return state
```

Async Support:
The after_agent decorator automatically detects and supports async functions. Simply define your function as async:
```python
from langchain.agents.middleware import after_agent, AgentState

@after_agent
async def async_log_completion(state: AgentState) -> AgentState:
    return state
```

Execution wrappers provide complete control over model and tool execution. Unlike hooks, wrappers receive a handler callback that performs the actual execution, allowing you to implement retry logic, fallbacks, caching, and more.
Wrap model execution with custom logic:
```python
def wrap_model_call(func: Callable[[Callable, ModelRequest], ModelResponse]) -> AgentMiddleware: ...
```

Decorator Usage:

```python
from typing import Callable

from langchain.agents.middleware import wrap_model_call, ModelRequest, ModelResponse

@wrap_model_call
def retry_model(handler: Callable, request: ModelRequest) -> ModelResponse:
    """Retry model call up to 3 times on failure."""
    for attempt in range(3):
        try:
            return handler(request)
        except Exception as e:
            if attempt == 2:
                raise
            print(f"Retry {attempt + 1} after error: {e}")
```

Async Support:
The wrap_model_call decorator automatically detects and supports async functions. Simply define your function as async:
```python
from typing import Callable

from langchain.agents.middleware import wrap_model_call, ModelRequest, ModelResponse

@wrap_model_call
async def async_retry_model(handler: Callable, request: ModelRequest) -> ModelResponse:
    try:
        return await handler(request)
    except Exception:
        return await handler(request)  # Retry once
```

Use Cases: retry logic, fallback models, and response caching.
Wrap tool execution with custom logic:
```python
def wrap_tool_call(func: Callable[[Callable, ToolCallRequest], Any]) -> AgentMiddleware: ...
```

Decorator Usage:

```python
from typing import Any, Callable

from langchain.agents.middleware import wrap_tool_call, ToolCallRequest

cache: dict[str, Any] = {}

@wrap_tool_call
def cache_tool_calls(handler: Callable, request: ToolCallRequest) -> Any:
    """Cache tool call results."""
    cache_key = f"{request['tool_name']}:{request['tool_args']}"
    if cache_key in cache:
        return cache[cache_key]
    result = handler(request)
    cache[cache_key] = result
    return result
```

Async Support:
The wrap_tool_call decorator automatically detects and supports async functions. Simply define your function as async:
```python
from typing import Any, Callable

from langchain.agents.middleware import wrap_tool_call, ToolCallRequest

@wrap_tool_call
async def async_wrap_tool(handler: Callable, request: ToolCallRequest) -> Any:
    return await handler(request)
```

Use Cases: caching tool results, retrying failed tool calls, and logging tool usage.
Generate system prompts dynamically based on the request context:
```python
def dynamic_prompt(func: Callable[[ModelRequest], str]) -> AgentMiddleware: ...
```

Decorator Usage:

```python
from datetime import datetime

from langchain.agents.middleware import dynamic_prompt, ModelRequest

@dynamic_prompt
def time_aware_prompt(request: ModelRequest) -> str:
    """Add current time to system prompt."""
    current_time = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    return f"You are a helpful assistant. Current time: {current_time}"
```

Async Support:
The dynamic_prompt decorator automatically detects and supports async functions. Simply define your function as async:
```python
from langchain.agents.middleware import dynamic_prompt, ModelRequest

@dynamic_prompt
async def async_dynamic_prompt(request: ModelRequest) -> str:
    return "Dynamic system prompt"
```

The hook_config decorator marks valid jump destinations for flow control:
```python
def hook_config(can_jump_to: list[str]) -> Callable: ...
```

Usage:

```python
from langchain.agents.middleware import before_model, hook_config, ModelRequest

@before_model
@hook_config(can_jump_to=["tools", "model", "end"])
def conditional_jump(request: ModelRequest) -> ModelRequest:
    if some_condition:  # replace with your own check
        request.state['jump_to'] = "end"
    return request
```

Valid Jump Targets:

- `"tools"`: Jump to tool execution
- `"model"`: Jump to model call (useful for retries)
- `"end"`: Jump to end of execution

All middleware inherits from the AgentMiddleware base class:
```python
class AgentMiddleware:
    """
    Base class for middleware plugins.

    Middleware can be created by subclassing this class or by using
    the decorator functions (before_model, after_model, etc.).
    """
    pass
```

Custom Middleware Class:

```python
from langchain.agents.middleware import AgentMiddleware, ModelRequest, ModelResponse

class CustomMiddleware(AgentMiddleware):
    def __init__(self, config: dict):
        self.config = config

    def before_model(self, request: ModelRequest) -> ModelRequest:
        # Custom logic
        return request

    def after_model(self, response: ModelResponse) -> ModelResponse:
        # Custom logic
        return response
```

LangChain provides several pre-built middleware classes for common use cases:
Automatically retry model calls on failure:
```python
class ModelRetryMiddleware(AgentMiddleware):
    """
    Retry model calls on failure with configurable attempts and backoff.

    Parameters:
        max_retries: Maximum number of retry attempts
        backoff_factor: Exponential backoff multiplier
        retry_on: Exception types to retry on
    """
    def __init__(
        self,
        max_retries: int = 3,
        backoff_factor: float = 2.0,
        retry_on: tuple[type[Exception], ...] = (Exception,)
    ): ...
```

Usage:
```python
from langchain.agents import create_agent
from langchain.agents.middleware import ModelRetryMiddleware

agent = create_agent(
    model="openai:gpt-4o",
    middleware=[ModelRetryMiddleware(max_retries=3)]
)
```

Switch to a fallback model on error:
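To make the fallback semantics concrete, here is a hedged pure-Python sketch of trying models in order; `call_with_fallbacks` and `flaky_caller` are illustrative placeholders, not part of the LangChain API:

```python
def call_with_fallbacks(call_model, primary, fallbacks):
    """Try the primary model, then each fallback in order (illustrative sketch)."""
    last_error = None
    for model in [primary, *fallbacks]:
        try:
            return call_model(model)
        except Exception as e:
            last_error = e  # remember the failure and move on to the next model
    raise last_error

def flaky_caller(model):
    # Stand-in for a real model call: the primary always fails here
    if model == "primary":
        raise RuntimeError("primary unavailable")
    return f"answer from {model}"

result = call_with_fallbacks(flaky_caller, "primary", ["backup-a", "backup-b"])
print(result)  # -> answer from backup-a
```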
```python
class ModelFallbackMiddleware(AgentMiddleware):
    """
    Use a fallback model if the primary model fails.

    Parameters:
        fallback_models: List of fallback model identifiers to try in order
    """
    def __init__(self, fallback_models: list[str]): ...
```

Usage:
```python
from langchain.agents import create_agent
from langchain.agents.middleware import ModelFallbackMiddleware

agent = create_agent(
    model="openai:gpt-4o",
    middleware=[ModelFallbackMiddleware(
        fallback_models=["anthropic:claude-3-5-sonnet-20241022", "openai:gpt-3.5-turbo"]
    )]
)
```

Limit the number of tool calls per execution:
```python
class ToolCallLimitMiddleware(AgentMiddleware):
    """
    Limit the number of tool calls per agent execution.

    Parameters:
        max_tool_calls: Maximum number of tool calls allowed
    """
    def __init__(self, max_tool_calls: int): ...
```

Usage:
```python
from langchain.agents import create_agent
from langchain.agents.middleware import ToolCallLimitMiddleware

agent = create_agent(
    model="openai:gpt-4o",
    tools=[...],
    middleware=[ToolCallLimitMiddleware(max_tool_calls=10)]
)
```

Retry tool calls on failure:
```python
class ToolRetryMiddleware(AgentMiddleware):
    """
    Retry tool calls on failure.

    Parameters:
        max_retries: Maximum number of retry attempts per tool call
    """
    def __init__(self, max_retries: int = 3): ...
```

Usage:
```python
from langchain.agents import create_agent
from langchain.agents.middleware import ToolRetryMiddleware

agent = create_agent(
    model="openai:gpt-4o",
    tools=[...],
    middleware=[ToolRetryMiddleware(max_retries=2)]
)
```

Pause execution for human confirmation or input:
```python
class HumanInTheLoopMiddleware(AgentMiddleware):
    """
    Pause agent execution for human review and approval.

    Parameters:
        interrupt_on: Configuration for when to interrupt
    """
    def __init__(self, interrupt_on: InterruptOnConfig): ...

class InterruptOnConfig:
    """Configuration for human-in-the-loop interruptions."""
    pass
```

Usage:
```python
from langchain.agents import create_agent
from langchain.agents.middleware import HumanInTheLoopMiddleware

agent = create_agent(
    model="openai:gpt-4o",
    middleware=[HumanInTheLoopMiddleware(interrupt_on=...)]
)
```

Emulate tool calls using an LLM when tools are not available:
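As an illustration of the emulation idea (not the middleware's actual implementation), a tool call can be rerouted to a model prompt; `emulate_tool_call` and `fake_llm` below are hypothetical names:

```python
def emulate_tool_call(llm, tool_name, tool_args):
    """Ask the model to fabricate a plausible tool result instead of executing it."""
    prompt = f"Simulate the output of tool '{tool_name}' called with {tool_args}."
    return llm(prompt)

# Stand-in LLM that just echoes the prompt it was given
fake_llm = lambda prompt: f"(emulated) {prompt}"
result = emulate_tool_call(fake_llm, "get_weather", {"city": "Paris"})
```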
```python
class LLMToolEmulator(AgentMiddleware):
    """
    Emulate tool execution using LLM calls instead of actual tool execution.
    Useful for simulation or when tools are unavailable.
    """
    def __init__(self): ...
```

Use an LLM to intelligently select which tools to use:
```python
class LLMToolSelectorMiddleware(AgentMiddleware):
    """
    Use an LLM to select relevant tools before execution.
    Useful when an agent has many tools available.
    """
    def __init__(self): ...
```

Search the filesystem for files:
```python
class FilesystemFileSearchMiddleware(AgentMiddleware):
    """
    Provide file search capabilities to the agent.

    Parameters:
        search_paths: Directories to search
        file_patterns: File patterns to match
    """
    def __init__(
        self,
        search_paths: list[str],
        file_patterns: list[str] = ["*"]
    ): ...
```

Execute shell commands with security policies:
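Output redaction, one of the controls listed below, can be sketched with plain regex substitution; `redact_output` is a hypothetical helper, not part of the API:

```python
import re

def redact_output(output: str, patterns: list[str]) -> str:
    """Mask any substring matching the given regex patterns (illustrative sketch)."""
    for pattern in patterns:
        output = re.sub(pattern, "[REDACTED]", output)
    return output

print(redact_output("export API_KEY=abc123", [r"API_KEY=\S+"]))  # -> export [REDACTED]
```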
```python
class ShellToolMiddleware(AgentMiddleware):
    """
    Allow the agent to execute shell commands with execution policy controls.

    Parameters:
        execution_policy: Policy controlling what commands can be executed
        redaction_rules: Rules for redacting sensitive output
    """
    def __init__(
        self,
        execution_policy: HostExecutionPolicy | DockerExecutionPolicy | CodexSandboxExecutionPolicy,
        redaction_rules: list[RedactionRule] = []
    ): ...

class HostExecutionPolicy:
    """Execute commands on the host system."""
    pass

class DockerExecutionPolicy:
    """Execute commands in a Docker container."""
    pass

class CodexSandboxExecutionPolicy:
    """Execute commands in a Codex sandbox."""
    pass

class RedactionRule:
    """Rule for redacting sensitive output."""
    pass
```

Summarize long conversations to manage context length:
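The max_tokens threshold can be understood with a rough sketch. The real middleware's token counting is not specified here, so this illustration assumes a crude four-characters-per-token estimate; `needs_summarization` is a hypothetical helper:

```python
def needs_summarization(messages: list[str], max_tokens: int = 4000) -> bool:
    """Trigger check: estimate tokens as ~4 characters each (illustrative sketch)."""
    estimated_tokens = sum(len(m) for m in messages) // 4
    return estimated_tokens > max_tokens
```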
```python
class SummarizationMiddleware(AgentMiddleware):
    """
    Automatically summarize conversation history when it becomes too long.

    Parameters:
        max_tokens: Maximum tokens before summarization
        summary_prompt: Prompt template for summarization
    """
    def __init__(
        self,
        max_tokens: int = 4000,
        summary_prompt: str | None = None
    ): ...
```

Detect and redact personally identifiable information:
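As an illustration of redaction for one PII type (email), not the middleware's actual detection logic; the regex and helper name are assumptions:

```python
import re

# Hypothetical email pattern for illustration only
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def redact_emails(text: str) -> str:
    """Replace email addresses with a redaction marker."""
    return EMAIL_RE.sub("[EMAIL]", text)

print(redact_emails("contact alice@example.com today"))  # -> contact [EMAIL] today
```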
```python
class PIIMiddleware(AgentMiddleware):
    """
    Detect and redact PII from messages.

    Parameters:
        pii_types: Types of PII to detect (email, phone, ssn, etc.)
        redact: Whether to redact or raise an error
    """
    def __init__(
        self,
        pii_types: list[str],
        redact: bool = True
    ): ...

class PIIDetectionError(Exception):
    """Raised when PII is detected and redact=False."""
    pass
```

Manage todo lists within agent execution:
```python
class TodoListMiddleware(AgentMiddleware):
    """
    Track and manage todo items during agent execution.
    """
    def __init__(self): ...
```

Limit the total number of model calls:
```python
class ModelCallLimitMiddleware(AgentMiddleware):
    """
    Limit the total number of model calls in an agent execution.

    Parameters:
        max_calls: Maximum number of model calls allowed
    """
    def __init__(self, max_calls: int): ...
```

Edit message context during execution:
```python
class ContextEditingMiddleware(AgentMiddleware):
    """
    Edit and manipulate message context during execution.

    Parameters:
        edits: List of edit operations to apply
    """
    def __init__(self, edits: list): ...

class ClearToolUsesEdit:
    """Edit operation to clear tool usage from context."""
    pass
```

The middleware system uses the following type definitions:

```python
from dataclasses import dataclass
from typing import TypedDict, Callable, Any

from langchain_core.messages import AnyMessage, BaseMessage

@dataclass
class ModelRequest:
    """
    Request object passed to before_model and wrap_model_call.

    Attributes:
        state: Current agent state
        runtime: Execution runtime context
        model_settings: Model configuration settings
    """
    state: "AgentState"
    runtime: Any
    model_settings: Any

@dataclass
class ModelResponse:
    """
    Response object from a model call, passed to after_model.

    Attributes:
        result: List of messages returned from the model
        structured_response: Structured output data (if using response_format)
    """
    result: list[BaseMessage]
    structured_response: Any = None

class ToolCallRequest(TypedDict):
    """
    Request object passed to wrap_tool_call.

    Attributes:
        tool_name: Name of the tool being called
        tool_args: Arguments for the tool call
        tool_call_id: Unique identifier for the tool call
    """
    tool_name: str
    tool_args: dict
    tool_call_id: str

class AgentState(TypedDict):
    """
    Base state schema for agent execution.

    Attributes:
        messages: Conversation history
        structured_response: Structured output (if using response_format)
        jump_to: Control flow target (ephemeral)
    """
    messages: list[AnyMessage]
    structured_response: Any
    jump_to: str
```

Middleware is composable; pass a list to create_agent():
```python
from langchain.agents import create_agent
from langchain.agents.middleware import (
    ModelRetryMiddleware,
    ToolCallLimitMiddleware,
    SummarizationMiddleware
)

agent = create_agent(
    model="openai:gpt-4o",
    tools=[...],
    middleware=[
        ModelRetryMiddleware(max_retries=3),
        ToolCallLimitMiddleware(max_tool_calls=10),
        SummarizationMiddleware(max_tokens=4000)
    ]
)
```

Modify agent state from a before_model hook:

```python
from langchain.agents.middleware import before_model, ModelRequest
from langchain_core.messages import SystemMessage

@before_model
def add_context(request: ModelRequest) -> ModelRequest:
    state = request.state
    # Access custom state fields
    user_name = state.get('user_name', 'User')
    # Modify messages
    state['messages'].insert(0, SystemMessage(
        content=f"The user's name is {user_name}"
    ))
    return request
```

Control flow from an after_model hook:

```python
from langchain.agents.middleware import after_model, hook_config, ModelResponse

@after_model
@hook_config(can_jump_to=["tools", "model", "end"])
def quality_check(response: ModelResponse) -> ModelResponse:
    content = response.get('content', '')
    # Force a retry if the response is too short
    if len(content) < 10:
        response['state']['jump_to'] = "model"
    # Otherwise, skip tools and end if no tool calls are needed
    elif not response.get('tool_calls'):
        response['state']['jump_to'] = "end"
    return response
```

Install with the Tessl CLI:
```shell
npx tessl i tessl/pypi-langchain
```