tessl/pypi-langchain

Building applications with LLMs through composability

docs/core/chat-models.md

Chat Models

LangChain provides a unified interface for initializing and using chat models from 20+ providers. Instead of importing provider-specific classes, you can use string identifiers like "openai:gpt-4o" or "anthropic:claude-3-5-sonnet-20241022" to initialize models. This approach simplifies switching between providers and makes code more portable.

Chat models are language models optimized for conversational interactions. They generate text responses based on message inputs and support features like tool calling, structured output, and streaming.

Initialization

Initialize chat models using the init_chat_model() factory function with string identifiers:

def init_chat_model(
    model: str | None = None,
    *,
    model_provider: str | None = None,
    configurable_fields: Literal["any"] | list[str] | tuple[str, ...] | None = None,
    config_prefix: str | None = None,
    **kwargs: Any
) -> BaseChatModel

Parameters:

  • model (str | None): Model identifier in format "provider:model-name". Examples: "openai:gpt-4o", "anthropic:claude-3-5-sonnet-20241022". Optional if provider can be inferred.
  • model_provider (str | None): Override provider detection. Useful when the provider cannot be automatically detected from the model string. Optional.
  • configurable_fields (Literal["any"] | list[str] | tuple[str, ...] | None): Which parameters can be set at runtime via config["configurable"]. Use "any" to allow all fields. Optional.
  • config_prefix (str | None): Prefix for configurable parameter names. Optional.
  • **kwargs: Provider-specific parameters (see Common Parameters below). All extra keyword arguments are passed to the provider's model class.

Returns: BaseChatModel instance or configurable model wrapper

Common Chat Model Parameters

These parameters are passed as **kwargs to init_chat_model():

  • temperature (float): Controls randomness (0.0 = most deterministic; higher values increase randomness). The supported range and default vary by provider (e.g., 0.0–2.0 for OpenAI, 0.0–1.0 for Anthropic).
  • max_tokens (int): Maximum number of tokens to generate. Default varies by provider.
  • timeout (float): Request timeout in seconds.
  • max_retries (int): Maximum number of automatic retry attempts on failure.
  • base_url (str): Custom API endpoint URL. Useful for proxies or self-hosted models.
  • rate_limiter (BaseRateLimiter): Rate limiter instance to control request rate.
  • API keys and authentication (provider-specific, e.g., api_key, openai_api_key, anthropic_api_key)

Basic Chat Model Usage

from langchain.chat_models import init_chat_model
from langchain.messages import HumanMessage

# Initialize OpenAI model
model = init_chat_model("openai:gpt-4o")

# Generate response
response = model.invoke([
    HumanMessage(content="What is the capital of France?")
])

print(response.content)  # "The capital of France is Paris."

Chat Model with Configuration

from langchain.chat_models import init_chat_model

# Initialize with custom parameters
model = init_chat_model(
    "openai:gpt-4o",
    temperature=0.7,
    max_tokens=1000,
    timeout=30.0,
    max_retries=3
)

Configurable Fields

Make parameters configurable at runtime:

from langchain.chat_models import init_chat_model
from langchain.messages import HumanMessage

# Make temperature configurable at runtime
model = init_chat_model(
    "openai:gpt-4o",
    configurable_fields=["temperature"],
    temperature=0.5  # default value
)

# Override at runtime
response = model.invoke(
    [HumanMessage(content="Hello")],
    config={"configurable": {"temperature": 0.9}}
)

Rate Limiting

Control request rate with rate limiters:

from langchain.chat_models import init_chat_model
from langchain.rate_limiters import InMemoryRateLimiter

# Create rate limiter (10 requests per minute)
rate_limiter = InMemoryRateLimiter(
    requests_per_second=10/60
)

model = init_chat_model(
    "openai:gpt-4o",
    rate_limiter=rate_limiter
)

Chat Model Providers

LangChain supports 20+ chat model providers. The provider is automatically detected from the model string format "provider:model-name".

Major Providers:

  • OpenAI (openai): GPT-4o, GPT-4, GPT-3.5, and o1/o3 reasoning models

    • Examples: "openai:gpt-4o", "openai:gpt-4-turbo", "openai:gpt-3.5-turbo", "openai:o1-preview"
  • Anthropic (anthropic): Claude models

    • Examples: "anthropic:claude-3-5-sonnet-20241022", "anthropic:claude-3-opus-20240229"
  • Google Vertex AI (google_vertexai): Gemini models via Google Cloud

    • Examples: "google_vertexai:gemini-1.5-pro", "google_vertexai:gemini-1.5-flash"
  • Google Generative AI (google_genai): Gemini models via Google AI Studio

    • Examples: "google_genai:gemini-1.5-pro", "google_genai:gemini-1.5-flash"
  • AWS Bedrock (bedrock, bedrock_converse): Models on AWS Bedrock

    • Examples: "bedrock:anthropic.claude-3-sonnet-20240229-v1:0", "bedrock:meta.llama3-70b-instruct-v1:0"
  • Azure OpenAI (azure_openai): OpenAI models hosted on Azure

    • Examples: "azure_openai:gpt-4o", "azure_openai:gpt-35-turbo"

Additional Providers:

  • Cohere (cohere): Command models
  • Mistral AI (mistralai): Mistral and Mixtral models
  • Groq (groq): Fast inference API
  • Ollama (ollama): Local model serving
  • HuggingFace (huggingface): HuggingFace models
  • Together AI (together): Together API
  • Fireworks (fireworks): Fireworks API
  • DeepSeek (deepseek): DeepSeek models
  • xAI (xai): Grok models
  • Perplexity (perplexity): Perplexity API
  • Upstage (upstage): Upstage models
  • IBM Watson (ibm): IBM Watson models
  • NVIDIA (nvidia): NVIDIA AI endpoints
  • Azure AI (azure_ai): Azure AI services
  • Google Anthropic Vertex (google_anthropic_vertex): Anthropic models via Vertex AI

See Provider Reference for the complete list of supported providers.

Provider Examples

Each provider has its own model naming convention. The general format is "provider:model-name", but the exact model name varies:

# OpenAI
model = init_chat_model("openai:gpt-4o")
model = init_chat_model("openai:gpt-4-turbo")
model = init_chat_model("openai:o1-preview")

# Anthropic
model = init_chat_model("anthropic:claude-3-5-sonnet-20241022")
model = init_chat_model("anthropic:claude-3-opus-20240229")

# Google
model = init_chat_model("google_vertexai:gemini-1.5-pro")
model = init_chat_model("google_genai:gemini-1.5-flash")

# AWS Bedrock (uses provider's full model ID)
model = init_chat_model("bedrock:anthropic.claude-3-sonnet-20240229-v1:0")
model = init_chat_model("bedrock:meta.llama3-70b-instruct-v1:0")

# Local models
model = init_chat_model("ollama:llama2")
model = init_chat_model("ollama:mistral")

BaseChatModel Interface

The BaseChatModel class is the base interface for all chat models. All models returned by init_chat_model() implement this interface.

class BaseChatModel:
    """
    Base class for chat models.

    All chat models support synchronous and asynchronous execution,
    streaming, and batch processing.
    """

    def invoke(
        self,
        messages: list[AnyMessage],
        **kwargs: Any
    ) -> AIMessage: ...

    async def ainvoke(
        self,
        messages: list[AnyMessage],
        **kwargs: Any
    ) -> AIMessage: ...

    def stream(
        self,
        messages: list[AnyMessage],
        **kwargs: Any
    ) -> Iterator[AIMessageChunk]: ...

    async def astream(
        self,
        messages: list[AnyMessage],
        **kwargs: Any
    ) -> AsyncIterator[AIMessageChunk]: ...

    def batch(
        self,
        messages: list[list[AnyMessage]],
        **kwargs: Any
    ) -> list[AIMessage]: ...

    async def abatch(
        self,
        messages: list[list[AnyMessage]],
        **kwargs: Any
    ) -> list[AIMessage]: ...

Methods:

  • invoke(messages, **kwargs) - Execute model synchronously and return complete response
  • ainvoke(messages, **kwargs) - Execute model asynchronously and return complete response
  • stream(messages, **kwargs) - Stream model response synchronously as chunks
  • astream(messages, **kwargs) - Stream model response asynchronously as chunks
  • batch(messages, **kwargs) - Execute multiple requests synchronously in batch
  • abatch(messages, **kwargs) - Execute multiple requests asynchronously in batch

Synchronous Execution

Generate a complete response synchronously:

from langchain.chat_models import init_chat_model
from langchain.messages import HumanMessage, SystemMessage

model = init_chat_model("openai:gpt-4o")

response = model.invoke([
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="What is 2 + 2?")
])

print(response.content)  # "2 + 2 equals 4."
print(response.usage_metadata)  # Token usage information

Async Execution

Generate a complete response asynchronously:

from langchain.chat_models import init_chat_model
from langchain.messages import HumanMessage

model = init_chat_model("openai:gpt-4o")

response = await model.ainvoke([
    HumanMessage(content="Hello!")
])

print(response.content)

Streaming Execution

Stream response as it's generated:

from langchain.chat_models import init_chat_model
from langchain.messages import HumanMessage

model = init_chat_model("openai:gpt-4o")

# Synchronous streaming
for chunk in model.stream([HumanMessage(content="Write a poem")]):
    print(chunk.content, end="", flush=True)

# Async streaming
async for chunk in model.astream([HumanMessage(content="Write a poem")]):
    print(chunk.content, end="", flush=True)

Batch Execution

Execute multiple requests in a single batch:

from langchain.chat_models import init_chat_model
from langchain.messages import HumanMessage

model = init_chat_model("openai:gpt-4o")

# Synchronous batch
responses = model.batch([
    [HumanMessage(content="What is 2+2?")],
    [HumanMessage(content="What is 3+3?")],
    [HumanMessage(content="What is 4+4?")]
])

for response in responses:
    print(response.content)

# Async batch
responses = await model.abatch([
    [HumanMessage(content="What is 2+2?")],
    [HumanMessage(content="What is 3+3?")]
])

Tool Calling

Many models support tool calling (function calling):

from langchain.chat_models import init_chat_model
from langchain.messages import HumanMessage
from langchain.tools import tool

@tool
def get_weather(location: str) -> str:
    """Get weather for a location."""
    return f"Sunny, 72°F in {location}"

model = init_chat_model("openai:gpt-4o")

# Bind tools to model
model_with_tools = model.bind_tools([get_weather])

# Model will return tool calls
response = model_with_tools.invoke([
    HumanMessage(content="What's the weather in Paris?")
])

# Check for tool calls
if response.tool_calls:
    tool_call = response.tool_calls[0]
    print(f"Tool: {tool_call['name']}")
    print(f"Args: {tool_call['args']}")

Structured Output

Request structured output from models that support it:

from langchain.chat_models import init_chat_model
from langchain.messages import HumanMessage
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int
    email: str

model = init_chat_model("openai:gpt-4o")
structured_model = model.with_structured_output(Person)

response = structured_model.invoke([
    HumanMessage(content="Extract: John Doe, 30 years old, john@example.com")
])

print(response)  # Person(name="John Doe", age=30, email="john@example.com")

Model Configuration

Pass runtime configuration to model calls:

from langchain.chat_models import init_chat_model
from langchain.messages import HumanMessage

model = init_chat_model("openai:gpt-4o")

# Pass configuration
response = model.invoke(
    [HumanMessage(content="Hello")],
    config={
        "run_name": "my_run",
        "tags": ["production"],
        "metadata": {"user_id": "123"}
    }
)

Authentication for Chat Models

Different providers require different authentication methods:

# OpenAI (uses OPENAI_API_KEY environment variable or parameter)
model = init_chat_model("openai:gpt-4o", openai_api_key="sk-...")

# Anthropic (uses ANTHROPIC_API_KEY environment variable or parameter)
model = init_chat_model("anthropic:claude-3-5-sonnet-20241022", anthropic_api_key="sk-ant-...")

# AWS Bedrock (uses AWS credentials from environment/IAM)
model = init_chat_model("bedrock:anthropic.claude-3-sonnet-20240229-v1:0")

# Azure OpenAI (requires deployment name and endpoint)
model = init_chat_model(
    "azure_openai:gpt-4o",
    azure_deployment="my-gpt4-deployment",
    azure_endpoint="https://my-resource.openai.azure.com/",
    api_key="..."
)

# Ollama (local, no authentication required)
model = init_chat_model("ollama:llama2")

Temperature and Sampling

Control randomness and creativity:

# Deterministic (good for factual tasks)
model = init_chat_model("openai:gpt-4o", temperature=0)

# Balanced
model = init_chat_model("openai:gpt-4o", temperature=0.7)

# Creative (good for creative writing)
model = init_chat_model("openai:gpt-4o", temperature=1.5)

Custom Base URL

Use custom endpoints for proxies or self-hosted models:

model = init_chat_model(
    "openai:gpt-4o",
    base_url="https://my-proxy.example.com/v1"
)

Switching Between Providers

The string-based initialization makes it easy to switch providers:

from langchain.chat_models import init_chat_model
import os

# Get provider from environment or default to OpenAI
provider = os.getenv("LLM_PROVIDER", "openai")
model_name = os.getenv("MODEL_NAME", "gpt-4o")

model = init_chat_model(f"{provider}:{model_name}")

Types

from typing import Any, Iterator, AsyncIterator, Literal
from langchain_core.language_models import BaseChatModel
from langchain_core.messages import AnyMessage, AIMessage, AIMessageChunk
from langchain_core.rate_limiters import BaseRateLimiter

Install with Tessl CLI

npx tessl i tessl/pypi-langchain@1.2.1
