tessl/maven-dev-langchain4j--langchain4j-google-ai-gemini

LangChain4j integration for Google AI Gemini models providing chat, streaming, embeddings, image generation, and batch processing capabilities

docs/chat-synchronous.md
Chat Models - Synchronous

Synchronous chat model for Google AI Gemini supporting multimodal inputs (text, images, video, audio, PDFs), function calling with parallel execution, structured outputs with JSON schema validation, code execution in sandboxed Python environments, and extended reasoning through thinking mode.

Capabilities

GoogleAiGeminiChatModel Class

Main synchronous chat model class providing blocking chat interactions with comprehensive configuration options.

/**
 * Synchronous chat model for Google AI Gemini.
 * Supports multimodal inputs, function calling, structured outputs, and advanced features.
 */
public class GoogleAiGeminiChatModel {
    /**
     * Creates a new builder for configuring the chat model.
     * @return GoogleAiGeminiChatModelBuilder instance
     */
    public static GoogleAiGeminiChatModelBuilder builder();

    /**
     * Gets the default request parameters configured for this model.
     * @return ChatRequestParameters containing default configuration
     */
    public ChatRequestParameters defaultRequestParameters();

    /**
     * Sends a chat request and blocks until response is received.
     * @param chatRequest The chat request containing messages and configuration
     * @return ChatResponse with the model's response
     */
    public ChatResponse doChat(ChatRequest chatRequest);

    /**
     * Returns the set of capabilities supported by this model.
     * @return Set of Capability enums
     */
    public Set<Capability> supportedCapabilities();

    /**
     * Returns the list of registered chat model listeners.
     * @return List of ChatModelListener instances
     */
    public List<ChatModelListener> listeners();

    /**
     * Returns the model provider (GOOGLE).
     * @return ModelProvider enum value
     */
    public ModelProvider provider();
}
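Callers normally go through the `chat(...)` convenience methods inherited from `ChatModel`, which delegate to `doChat`. A request-level call might look like this (a sketch, assuming the LangChain4j 1.x `ChatRequest`/`ChatResponse` API; requires a valid API key at runtime):

```java
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.model.googleai.GoogleAiGeminiChatModel;

GoogleAiGeminiChatModel model = GoogleAiGeminiChatModel.builder()
    .apiKey(System.getenv("GOOGLE_AI_API_KEY"))
    .modelName("gemini-2.5-flash")
    .build();

// Build an explicit request instead of using the String convenience overload
ChatRequest request = ChatRequest.builder()
    .messages(UserMessage.from("Summarize the plot of Hamlet in one sentence."))
    .build();

ChatResponse response = model.chat(request);
System.out.println(response.aiMessage().text());
System.out.println(response.tokenUsage());    // input/output token counts
System.out.println(response.finishReason());
```

The explicit form is useful when you need per-request parameters or tool specifications rather than builder-level defaults.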

GoogleAiGeminiChatModelBuilder

Builder class for constructing GoogleAiGeminiChatModel with extensive configuration options.

/**
 * Builder for GoogleAiGeminiChatModel.
 * Extends base builder with synchronous-specific configuration.
 */
public static class GoogleAiGeminiChatModelBuilder
    extends GoogleAiGeminiChatModelBaseBuilder<GoogleAiGeminiChatModelBuilder> {

    /**
     * Sets the maximum number of retry attempts for failed requests.
     * @param maxRetries Number of retries (default: 3)
     * @return Builder instance for chaining
     */
    public GoogleAiGeminiChatModelBuilder maxRetries(Integer maxRetries);

    /**
     * Sets the supported capabilities for this model.
     * @param capabilities Set of Capability enums
     * @return Builder instance for chaining
     */
    public GoogleAiGeminiChatModelBuilder supportedCapabilities(Set<Capability> capabilities);

    /**
     * Sets the supported capabilities for this model using varargs.
     * @param capabilities Capability enums as varargs
     * @return Builder instance for chaining
     */
    public GoogleAiGeminiChatModelBuilder supportedCapabilities(Capability... capabilities);

    /**
     * Builds the GoogleAiGeminiChatModel instance.
     * @return Configured GoogleAiGeminiChatModel
     * @throws IllegalArgumentException if required fields are missing
     */
    public GoogleAiGeminiChatModel build();
}

Base Builder Configuration

Configuration methods inherited from GoogleAiGeminiChatModelBaseBuilder, available for all chat model types.

/**
 * Base builder class providing common configuration for all Gemini chat models.
 * @param <B> The builder type for fluent chaining
 */
public abstract class GoogleAiGeminiChatModelBaseBuilder<B> {
    // Authentication and Connection
    public B httpClientBuilder(HttpClientBuilder httpClientBuilder);
    public B apiKey(String apiKey); // Required
    public B baseUrl(String baseUrl);
    public B timeout(Duration timeout);

    // Model Selection and Parameters
    public B modelName(String modelName);
    public B defaultRequestParameters(ChatRequestParameters defaultRequestParameters);
    public B temperature(Double temperature); // 0.0 to 2.0
    public B topK(Integer topK);
    public B topP(Double topP);
    public B seed(Integer seed);
    public B frequencyPenalty(Double frequencyPenalty);
    public B presencePenalty(Double presencePenalty);
    public B maxOutputTokens(Integer maxOutputTokens);
    public B stopSequences(List<String> stopSequences);

    // Logging and Monitoring
    public B listeners(List<ChatModelListener> listeners);
    public B logRequestsAndResponses(Boolean logRequestsAndResponses);
    public B logRequests(Boolean logRequests);
    public B logResponses(Boolean logResponses);
    public B logger(Logger logger);

    // Function Calling
    public B toolConfig(GeminiFunctionCallingConfig toolConfig);
    public B toolConfig(GeminiMode mode, String... allowedFunctionNames);

    // Safety Settings
    public B safetySettings(Map<GeminiHarmCategory, GeminiHarmBlockThreshold> safetySettings);
    public B safetySettings(List<GeminiSafetySetting> safetySettings);

    // Response Format and Structure
    public B responseFormat(ResponseFormat responseFormat); // For structured outputs
    public B responseLogprobs(Boolean responseLogprobs);
    public B logprobs(Integer logprobs); // Number of log probabilities to return

    // Code Execution
    public B allowCodeExecution(Boolean allowCodeExecution); // Enable Python sandbox
    public B includeCodeExecutionOutput(Boolean includeCodeExecutionOutput);

    // Grounding and Search
    public B allowGoogleSearch(Boolean allowGoogleSearch);
    public B allowGoogleMaps(Boolean allowGoogleMaps);
    public B retrieveGoogleMapsWidgetToken(Boolean retrieveGoogleMapsWidgetToken);
    public B allowUrlContext(Boolean allowUrlContext);

    // Thinking Mode (Extended Reasoning)
    public B thinkingConfig(GeminiThinkingConfig thinkingConfig);
    public B returnThinking(Boolean returnThinking);
    public B sendThinking(Boolean sendThinking);

    // Media Processing
    public B mediaResolution(GeminiMediaResolutionLevel mediaResolution);
    public B mediaResolutionPerPartEnabled(Boolean mediaResolutionPerPartEnabled);

    // Other
    public B enableEnhancedCivicAnswers(Boolean enableEnhancedCivicAnswers);
}
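The `listeners(...)` hook accepts LangChain4j's `ChatModelListener`, whose callbacks all have default no-op implementations. A minimal timing listener might look like this (a sketch; the `attributes()` map is the per-call channel for passing state between callbacks):

```java
import dev.langchain4j.model.chat.listener.ChatModelErrorContext;
import dev.langchain4j.model.chat.listener.ChatModelListener;
import dev.langchain4j.model.chat.listener.ChatModelRequestContext;
import dev.langchain4j.model.chat.listener.ChatModelResponseContext;
import dev.langchain4j.model.googleai.GoogleAiGeminiChatModel;
import java.util.List;

ChatModelListener timingListener = new ChatModelListener() {
    @Override
    public void onRequest(ChatModelRequestContext ctx) {
        ctx.attributes().put("startNanos", System.nanoTime());
    }

    @Override
    public void onResponse(ChatModelResponseContext ctx) {
        long start = (long) ctx.attributes().get("startNanos");
        System.out.printf("Gemini call took %d ms%n", (System.nanoTime() - start) / 1_000_000);
    }

    @Override
    public void onError(ChatModelErrorContext ctx) {
        System.err.println("Gemini call failed: " + ctx.error().getMessage());
    }
};

GoogleAiGeminiChatModel model = GoogleAiGeminiChatModel.builder()
    .apiKey(System.getenv("GOOGLE_AI_API_KEY"))
    .modelName("gemini-2.5-flash")
    .listeners(List.of(timingListener))
    .build();
```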

Usage Examples

Basic Chat Interaction

import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.googleai.GoogleAiGeminiChatModel;

// Create a basic chat model
ChatModel model = GoogleAiGeminiChatModel.builder()
    .apiKey(System.getenv("GOOGLE_AI_API_KEY"))
    .modelName("gemini-2.5-pro")
    .build();

// Send a message and block until the response arrives
String response = model.chat("Explain quantum computing in simple terms");
System.out.println(response);

Multimodal Chat with Images

import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.data.message.ImageContent;
import dev.langchain4j.data.message.TextContent;
import dev.langchain4j.model.chat.response.ChatResponse;

GoogleAiGeminiChatModel model = GoogleAiGeminiChatModel.builder()
    .apiKey(System.getenv("GOOGLE_AI_API_KEY"))
    .modelName("gemini-2.5-flash")
    .mediaResolution(GeminiMediaResolutionLevel.MEDIA_RESOLUTION_HIGH)
    .build();

// Create a multimodal message combining text and an image URL
UserMessage message = UserMessage.from(
    TextContent.from("What does this chart show?"),
    ImageContent.from("https://example.com/chart.png")
);

ChatResponse response = model.chat(message);
System.out.println(response.aiMessage().text());

Function Calling

import dev.langchain4j.agent.tool.Tool;
import dev.langchain4j.service.AiServices;

class WeatherService {
    @Tool("Get the weather for a location")
    String getWeather(String location) {
        return "Sunny, 22°C in " + location;
    }
}

GoogleAiGeminiChatModel model = GoogleAiGeminiChatModel.builder()
    .apiKey(System.getenv("GOOGLE_AI_API_KEY"))
    .modelName("gemini-2.5-pro")
    .toolConfig(GeminiMode.AUTO) // Automatically choose when to call functions
    .build();

interface Assistant {
    String chat(String userMessage);
}

Assistant assistant = AiServices.builder(Assistant.class)
    .chatModel(model)
    .tools(new WeatherService())
    .build();

String response = assistant.chat("What's the weather in Paris?");
System.out.println(response); // The model calls getWeather behind the scenes
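AiServices runs the tool-execution loop automatically. At the lower level, tool calls come back on the `AiMessage` itself; a sketch of declaring a tool and inspecting the calls directly (assuming the 1.x `ToolSpecification`/`ChatRequest` API):

```java
import dev.langchain4j.agent.tool.ToolExecutionRequest;
import dev.langchain4j.agent.tool.ToolSpecification;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.request.json.JsonObjectSchema;
import dev.langchain4j.model.chat.response.ChatResponse;

ToolSpecification weatherTool = ToolSpecification.builder()
    .name("getWeather")
    .description("Get the weather for a location")
    .parameters(JsonObjectSchema.builder()
        .addStringProperty("location", "City name, e.g. Paris")
        .required("location")
        .build())
    .build();

ChatRequest request = ChatRequest.builder()
    .messages(UserMessage.from("What's the weather in Paris?"))
    .toolSpecifications(weatherTool)
    .build();

ChatResponse response = model.chat(request);
if (response.aiMessage().hasToolExecutionRequests()) {
    for (ToolExecutionRequest call : response.aiMessage().toolExecutionRequests()) {
        System.out.println(call.name() + " <- " + call.arguments()); // JSON arguments
    }
}
```

After executing a tool, you would append a `ToolExecutionResultMessage` to the conversation and call the model again, which is exactly the loop AiServices automates.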

Structured Output with JSON Schema

import com.google.gson.Gson;
import dev.langchain4j.model.chat.request.ResponseFormat;
import dev.langchain4j.model.chat.request.ResponseFormatType;
import dev.langchain4j.model.output.structured.Description;
import dev.langchain4j.service.output.JsonSchemas;

record Person(
    @Description("The person's full name") String name,
    @Description("The person's age in years") int age,
    @Description("The person's occupation") String occupation
) {}

GoogleAiGeminiChatModel model = GoogleAiGeminiChatModel.builder()
    .apiKey(System.getenv("GOOGLE_AI_API_KEY"))
    .modelName("gemini-2.5-pro")
    .responseFormat(ResponseFormat.builder()
        .type(ResponseFormatType.JSON)
        .jsonSchema(JsonSchemas.jsonSchemaFrom(Person.class).orElseThrow())
        .build())
    .build();

String prompt = "Extract information: John Smith is a 35-year-old software engineer.";
String json = model.chat(prompt);

// The response is valid JSON matching the Person schema
Person person = new Gson().fromJson(json, Person.class);
System.out.println(person.name()); // "John Smith"
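With AiServices, the same extraction can be written as a typed interface; LangChain4j derives the JSON schema from the return type when the model advertises the JSON-schema capability. A sketch reusing the `Person` record above:

```java
import dev.langchain4j.model.chat.Capability;
import dev.langchain4j.service.AiServices;

interface PersonExtractor {
    Person extractPersonFrom(String text);
}

GoogleAiGeminiChatModel model = GoogleAiGeminiChatModel.builder()
    .apiKey(System.getenv("GOOGLE_AI_API_KEY"))
    .modelName("gemini-2.5-pro")
    .supportedCapabilities(Capability.RESPONSE_FORMAT_JSON_SCHEMA)
    .build();

PersonExtractor extractor = AiServices.create(PersonExtractor.class, model);
Person person = extractor.extractPersonFrom(
    "John Smith is a 35-year-old software engineer.");
System.out.println(person.name());
```

This avoids manual JSON parsing entirely; the service deserializes the structured response into the record for you.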

Thinking Mode for Complex Reasoning

import dev.langchain4j.model.googleai.GeminiThinkingConfig;

GoogleAiGeminiChatModel model = GoogleAiGeminiChatModel.builder()
    .apiKey(System.getenv("GOOGLE_AI_API_KEY"))
    .modelName("gemini-2.0-flash-thinking-exp")
    .thinkingConfig(GeminiThinkingConfig.builder()
        .includeThoughts(true) // return the thinking process
        .thinkingBudget(8192)  // token budget for internal reasoning
        .build())
    .returnThinking(true)      // expose thoughts on the AiMessage
    .build();

String problem = "If a train leaves NYC at 2pm going 80mph, and another leaves " +
                 "Boston at 3pm going 100mph, when and where do they meet?";

ChatResponse response = model.chat(UserMessage.from(problem));
System.out.println(response.aiMessage().thinking()); // extended reasoning trace
System.out.println(response.aiMessage().text());     // final answer

Code Execution

GoogleAiGeminiChatModel model = GoogleAiGeminiChatModel.builder()
    .apiKey(System.getenv("GOOGLE_AI_API_KEY"))
    .modelName("gemini-2.5-pro")
    .allowCodeExecution(true)         // enable the Python sandbox
    .includeCodeExecutionOutput(true) // include generated code and its output
    .build();

String prompt = "Calculate the factorial of 20 using Python";
String response = model.chat(prompt);
// The model writes and executes Python in a sandboxed environment
System.out.println(response);
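As a sanity check on the sandbox's arithmetic, 20! is small enough to verify locally with plain Java (no model involved):

```java
import java.math.BigInteger;

public class FactorialCheck {
    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // 20! still fits in a long, but BigInteger works for any n
        System.out.println(factorial(20)); // 2432902008176640000
    }
}
```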

Safety Settings

import dev.langchain4j.model.googleai.GeminiSafetySetting;
import dev.langchain4j.model.googleai.GeminiHarmCategory;
import dev.langchain4j.model.googleai.GeminiHarmBlockThreshold;

GoogleAiGeminiChatModel model = GoogleAiGeminiChatModel.builder()
    .apiKey(System.getenv("GOOGLE_AI_API_KEY"))
    .modelName("gemini-2.5-pro")
    .safetySettings(List.of(
        new GeminiSafetySetting(
            GeminiHarmCategory.HARM_CATEGORY_HARASSMENT,
            GeminiHarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE
        ),
        new GeminiSafetySetting(
            GeminiHarmCategory.HARM_CATEGORY_HATE_SPEECH,
            GeminiHarmBlockThreshold.BLOCK_LOW_AND_ABOVE
        )
    ))
    .build();

// Requests that trip a configured threshold are blocked by the API
String response = model.chat("Your message here");

Advanced Configuration

import java.time.Duration;

GoogleAiGeminiChatModel model = GoogleAiGeminiChatModel.builder()
    .apiKey(System.getenv("GOOGLE_AI_API_KEY"))
    .modelName("gemini-2.5-pro")
    .temperature(0.9)
    .topK(40)
    .topP(0.95)
    .maxOutputTokens(2048)
    .stopSequences(List.of("END", "STOP"))
    .timeout(Duration.ofSeconds(60))
    .maxRetries(3)
    .allowGoogleSearch(true) // Enable web search grounding
    .mediaResolution(GeminiMediaResolutionLevel.MEDIA_RESOLUTION_HIGH)
    .logRequestsAndResponses(true)
    .build();

String response = model.chat("Your complex query here");

Common Model Names

  • gemini-2.5-pro - Latest flagship model with advanced capabilities
  • gemini-2.5-flash - Fast, efficient model for high-throughput tasks
  • gemini-2.5-flash-lite - Lightweight, low-latency model for simple tasks
  • gemini-2.0-flash-exp - Experimental flash model
  • gemini-2.0-flash-thinking-exp-01-21 - Experimental thinking model
  • gemini-3-pro-preview - Preview of next-generation model

Error Handling

The chat model surfaces failures as standard Java exceptions:

  • IllegalArgumentException - invalid configuration (missing API key, invalid parameters)
  • RuntimeException - API errors, network failures, exceeded timeouts

Content blocked by safety filters does not raise an exception; inspect the response's finish reason and metadata instead.
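Beyond the builder's `maxRetries`, a caller-side guard for transient `RuntimeException`s might look like this (a generic sketch, not part of the library):

```java
import java.util.function.Supplier;

public class Retry {
    /** Retries a call up to maxAttempts times, doubling the backoff between attempts. */
    static <T> T withRetry(Supplier<T> call, int maxAttempts, long initialBackoffMs) {
        RuntimeException last = null;
        long backoff = initialBackoffMs;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(backoff);
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        throw last;
                    }
                    backoff *= 2; // exponential backoff
                }
            }
        }
        throw last;
    }
}

// Usage (hypothetical):
// String answer = Retry.withRetry(() -> model.chat("Hello"), 3, 500);
```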

Integration with LangChain4j

GoogleAiGeminiChatModel implements the ChatModel interface and can be used anywhere a LangChain4j chat model is expected:

import dev.langchain4j.service.AiServices;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;
import dev.langchain4j.model.chat.ChatModel;

interface Assistant {
    String chat(String message);
}

ChatModel model = GoogleAiGeminiChatModel.builder()
    .apiKey(System.getenv("GOOGLE_AI_API_KEY"))
    .modelName("gemini-2.5-pro")
    .build();

Assistant assistant = AiServices.builder(Assistant.class)
    .chatModel(model)
    .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
    .build();

String response = assistant.chat("Hello!");

Install with Tessl CLI

npx tessl i tessl/maven-dev-langchain4j--langchain4j-google-ai-gemini@1.11.0
