tessl/pypi-langchain-google-genai

An integration package connecting Google's genai package and LangChain

docs/chat-models.md

Chat Models

Advanced conversational AI interface providing access to Google's Gemini chat models with comprehensive support for tool calling, structured outputs, multimodal inputs, safety controls, and streaming responses.

Capabilities

ChatGoogleGenerativeAI

Primary chat model interface that extends LangChain's BaseChatModel to provide seamless integration with Google's Gemini models.

class ChatGoogleGenerativeAI:
    def __init__(
        self,
        *,
        model: str,
        google_api_key: Optional[SecretStr] = None,
        credentials: Any = None,
        temperature: float = 0.7,
        top_p: Optional[float] = None,
        top_k: Optional[int] = None,
        max_output_tokens: Optional[int] = None,
        n: int = 1,
        max_retries: int = 6,
        timeout: Optional[float] = None,
        client_options: Optional[Dict] = None,
        transport: Optional[str] = None,
        additional_headers: Optional[Dict[str, str]] = None,
        response_modalities: Optional[List[Modality]] = None,
        thinking_budget: Optional[int] = None,
        include_thoughts: Optional[bool] = None,
        safety_settings: Optional[Dict[HarmCategory, HarmBlockThreshold]] = None,
        convert_system_message_to_human: bool = False,
        response_mime_type: Optional[str] = None,
        response_schema: Optional[Dict[str, Any]] = None,
        cached_content: Optional[str] = None,
        model_kwargs: Optional[Dict[str, Any]] = None,
        default_metadata: Optional[Sequence[Tuple[str, str]]] = None
    )

Parameters:

  • model (str): Model name (e.g., "gemini-2.5-pro", "gemini-2.0-flash")
  • google_api_key (Optional[SecretStr]): Google API key (defaults to GOOGLE_API_KEY env var)
  • credentials (Any): Google authentication credentials object
  • temperature (float): Generation temperature [0.0, 2.0], controls randomness
  • top_p (Optional[float]): Nucleus sampling parameter [0.0, 1.0]
  • top_k (Optional[int]): Top-k sampling parameter for vocabulary selection
  • max_output_tokens (Optional[int]): Maximum tokens in response
  • n (int): Number of completions to generate (default: 1)
  • max_retries (int): Maximum retry attempts for failed requests (default: 6)
  • timeout (Optional[float]): Request timeout in seconds
  • client_options (Optional[Dict]): API client configuration options
  • transport (Optional[str]): Transport method ["rest", "grpc", "grpc_asyncio"]
  • additional_headers (Optional[Dict[str, str]]): Additional HTTP headers
  • response_modalities (Optional[List[Modality]]): Response output modalities
  • thinking_budget (Optional[int]): Thinking budget in tokens for reasoning
  • include_thoughts (Optional[bool]): Include reasoning thoughts in response
  • safety_settings (Optional[Dict[HarmCategory, HarmBlockThreshold]]): Content safety configuration
  • convert_system_message_to_human (bool): Convert system messages to human messages
  • response_mime_type (Optional[str]): Expected response MIME type
  • response_schema (Optional[Dict[str, Any]]): JSON schema for structured responses
  • cached_content (Optional[str]): Cached content name for context reuse
  • model_kwargs (Optional[Dict[str, Any]]): Additional model parameters to pass to the API
  • default_metadata (Optional[Sequence[Tuple[str, str]]]): Default metadata headers for requests

Core Methods

Message Generation

def invoke(
    self,
    input: LanguageModelInput,
    config: Optional[RunnableConfig] = None,
    *,
    stop: Optional[List[str]] = None,
    code_execution: Optional[bool] = None,
    **kwargs: Any
) -> BaseMessage

Generate a single response message.

Parameters:

  • input: Input messages, text, or prompt
  • config: Optional run configuration
  • stop: List of stop sequences
  • code_execution: Enable code execution capabilities
  • **kwargs: Additional generation parameters

Returns: Generated AI message

async def ainvoke(
    self,
    input: LanguageModelInput,
    config: Optional[RunnableConfig] = None,
    **kwargs: Any
) -> BaseMessage

Async version of invoke().

Streaming

def stream(
    self,
    input: LanguageModelInput,
    config: Optional[RunnableConfig] = None,
    *,
    stop: Optional[List[str]] = None,
    **kwargs: Any
) -> Iterator[ChatGenerationChunk]

Stream response chunks as they're generated.

Returns: Iterator of chat generation chunks

async def astream(
    self,
    input: LanguageModelInput,
    config: Optional[RunnableConfig] = None,
    **kwargs: Any
) -> AsyncIterator[ChatGenerationChunk]

Async version of stream().

Tool Calling

Bind Tools

def bind_tools(
    self,
    tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]],
    *,
    tool_config: Optional[Dict] = None,
    tool_choice: Optional[Union[str, Literal["auto", "required"]]] = None,
    **kwargs: Any
) -> Runnable

Bind tools to the model for function calling capabilities.

Parameters:

  • tools: Sequence of tools (functions, Pydantic models, or tool objects)
  • tool_config: Tool configuration options
  • tool_choice: Tool selection strategy ("auto", "required", or specific tool name)

Returns: Runnable model with bound tools

Structured Output

def with_structured_output(
    self,
    schema: Union[Dict, Type[BaseModel]],
    *,
    method: Literal["function_calling", "json_mode"] = "function_calling",
    include_raw: bool = False,
    **kwargs: Any
) -> Runnable

Configure the model to return structured output matching a schema.

Parameters:

  • schema: Output schema (dict or Pydantic model)
  • method: Output method ("function_calling" or "json_mode")
  • include_raw: Include raw response alongside structured output

Returns: Runnable model configured for structured output

Utility Methods

def get_num_tokens(self, text: str) -> int

Estimate token count for input text.

Parameters:

  • text (str): Input text to count tokens for

Returns: Estimated token count
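Note that get_num_tokens issues a request to Google's token-counting endpoint, so it requires valid credentials. For a rough offline estimate before calling the API, a simple character-based heuristic can be used (roughly four characters per token for English text; this is our assumption, not the API's exact count):

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough offline token estimate (~4 chars/token for English text).

    Heuristic approximation only; use
    ChatGoogleGenerativeAI.get_num_tokens() for the exact count.
    """
    return max(1, round(len(text) / chars_per_token))

print(estimate_tokens("Explain the concept of machine learning"))  # 10
```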

Usage Examples

Basic Chat

from langchain_google_genai import ChatGoogleGenerativeAI

# Initialize model
llm = ChatGoogleGenerativeAI(model="gemini-2.5-pro")

# Simple text generation
response = llm.invoke("Explain the concept of machine learning")
print(response.content)

Streaming Responses

# Stream response chunks
for chunk in llm.stream("Write a creative story about robots"):
    print(chunk.content, end="", flush=True)

Tool Calling

from pydantic import BaseModel, Field

class WeatherTool(BaseModel):
    """Get weather information for a location."""
    location: str = Field(description="The city and state")
    
def get_weather(location: str) -> str:
    return f"Weather in {location}: 72°F, sunny"

# Bind tools to model
llm_with_tools = llm.bind_tools([WeatherTool])

# Use tools in conversation
response = llm_with_tools.invoke("What's the weather like in San Francisco?")

# Process tool calls
if response.tool_calls:
    for tool_call in response.tool_calls:
        if tool_call["name"] == "WeatherTool":
            result = get_weather(tool_call["args"]["location"])
            print(result)
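The if/elif dispatch above can be generalized. Assuming each entry in tool_calls is a dict with "name" and "args" keys (LangChain's tool-call format), a small registry maps tool names to plain Python callables; the registry and helper below are illustrative sketches, not part of the package:

```python
from typing import Any, Callable, Dict, List

# Hypothetical registry mapping tool names to plain Python callables
TOOL_REGISTRY: Dict[str, Callable[..., str]] = {
    "WeatherTool": lambda location: f"Weather in {location}: 72°F, sunny",
}

def dispatch_tool_calls(tool_calls: List[Dict[str, Any]]) -> List[str]:
    """Run each tool call through the registry, skipping unknown tools."""
    results = []
    for call in tool_calls:
        handler = TOOL_REGISTRY.get(call["name"])
        if handler is not None:
            results.append(handler(**call["args"]))
    return results

# Simulated payload, shaped like AIMessage.tool_calls
calls = [{"name": "WeatherTool", "args": {"location": "San Francisco"}}]
print(dispatch_tool_calls(calls))  # ['Weather in San Francisco: 72°F, sunny']
```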

Structured Output

from pydantic import BaseModel

class PersonInfo(BaseModel):
    name: str
    age: int
    occupation: str

# Configure for structured output
structured_llm = llm.with_structured_output(PersonInfo)

# Get structured response
result = structured_llm.invoke("Tell me about a fictional character")
print(f"Name: {result.name}, Age: {result.age}, Job: {result.occupation}")

Safety Settings

from langchain_google_genai import HarmCategory, HarmBlockThreshold

# Configure safety settings
safe_llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-pro",
    safety_settings={
        HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
        HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
    }
)

response = safe_llm.invoke("Generate content with safety controls")

Advanced Features

# Enable reasoning mode
reasoning_llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-pro",
    thinking_budget=8192,  # Budget for internal reasoning
    include_thoughts=True   # Include reasoning in response
)

response = reasoning_llm.invoke("Solve this complex math problem step by step")
print("Reasoning:", response.response_metadata.get("thoughts"))
print("Answer:", response.content)

Multimodal Inputs

from langchain_core.messages import HumanMessage

# Image analysis
message = HumanMessage(content=[
    {"type": "text", "text": "What do you see in this image?"},
    {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}}
])

response = llm.invoke([message])
print(response.content)
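The truncated data URL above would in practice be built from raw image bytes. A small stdlib-only helper (the function name is ours, not the package's) produces the expected "data:image/jpeg;base64,..." form:

```python
import base64

def to_data_url(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a data URL usable in image_url content."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"

# Placeholder bytes for illustration; in practice read them from a file:
# image_bytes = open("photo.jpg", "rb").read()
url = to_data_url(b"\xff\xd8\xff\xe0", "image/jpeg")
print(url)  # data:image/jpeg;base64,/9j/4A==
```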

Error Handling

Handle errors appropriately:

from langchain_google_genai import ChatGoogleGenerativeAI

try:
    llm = ChatGoogleGenerativeAI(model="gemini-2.5-pro")
    response = llm.invoke("Your prompt here")
except Exception as e:
    if "safety" in str(e).lower():
        print(f"Safety filter blocked content: {e}")
    elif "rate" in str(e).lower():
        print(f"Rate limit exceeded: {e}")
    else:
        print(f"Generation error: {e}")
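For transient failures such as rate limits, the model's built-in max_retries already retries API calls. The same idea can be sketched as a generic, stdlib-only backoff wrapper (this helper is illustrative, not part of the package):

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def with_backoff(fn: Callable[[], T], max_attempts: int = 4,
                 base_delay: float = 0.5) -> T:
    """Call fn, retrying with exponential backoff on any exception."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError("unreachable")

# Demonstration with a function that fails twice before succeeding:
attempts = {"n": 0}
def flaky() -> str:
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("rate limit")
    return "ok"

print(with_backoff(flaky, base_delay=0.01))  # ok
```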

Install with Tessl CLI

npx tessl i tessl/pypi-langchain-google-genai
