tessl/maven-io-quarkiverse-langchain4j--quarkus-langchain4j-gemini-common

Common shared infrastructure for integrating Google Gemini AI models with Quarkus applications through the LangChain4j framework, providing base chat model functionality, schema mapping, and embedding model support.

docs/chat-models.md

Chat Model Implementation

Abstract base classes for implementing Gemini chat models with support for both synchronous and streaming responses. These classes provide the foundation for integrating Gemini's language models with LangChain4j's chat model interfaces.

Capabilities

Base Gemini Chat Model

Provides common functionality for all Gemini chat models, including schema detection, MIME type computation, and configuration management. This is the foundation class for both synchronous and streaming implementations.

public abstract class BaseGeminiChatModel {
    /**
     * Creates a base Gemini chat model with the specified configuration.
     *
     * @param modelId The Gemini model identifier (e.g., "gemini-1.5-pro")
     * @param temperature Controls randomness (0.0-1.0), higher = more random
     * @param maxOutputTokens Maximum tokens in the response
     * @param topK Number of highest probability tokens to consider
     * @param topP Nucleus sampling probability threshold
     * @param responseFormat Expected response format for structured output
     * @param listeners Chat model event listeners
     * @param thinkingBudget Token budget for reasoning/thinking
     * @param includeThoughts Whether to include reasoning in responses
     * @param useGoogleSearch Whether to enable Google Search integration
     */
    public BaseGeminiChatModel(
        String modelId,
        Double temperature,
        Integer maxOutputTokens,
        Integer topK,
        Double topP,
        ResponseFormat responseFormat,
        List<ChatModelListener> listeners,
        Long thinkingBudget,
        boolean includeThoughts,
        boolean useGoogleSearch
    );

    /**
     * Detects and extracts schema from the response format.
     *
     * @param effectiveResponseFormat The response format to analyze
     * @return The detected Schema object, or null if no schema
     */
    protected Schema detectSchema(ResponseFormat effectiveResponseFormat);

    /**
     * Detects and extracts raw schema map from the response format.
     *
     * @param effectiveResponseFormat The response format to analyze
     * @return Map representation of the schema, or null if no schema
     */
    protected Map<String, Object> detectRawSchema(ResponseFormat effectiveResponseFormat);

    /**
     * Computes the appropriate MIME type for the response based on format and schema.
     *
     * @param responseFormat The response format
     * @param schema The schema object
     * @param rawSchema The raw schema map
     * @return The computed MIME type string
     */
    protected String computeMimeType(
        ResponseFormat responseFormat,
        Schema schema,
        Map<String, Object> rawSchema
    );
}

Usage Example:

// Extend BaseGeminiChatModel for custom implementations
public class CustomGeminiModel extends BaseGeminiChatModel {
    public CustomGeminiModel() {
        super(
            "gemini-1.5-flash",
            0.7,                      // temperature
            2048,                     // maxOutputTokens
            40,                       // topK
            0.95,                     // topP
            null,                     // responseFormat
            new ArrayList<>(),        // listeners
            null,                     // thinkingBudget
            false,                    // includeThoughts
            false                     // useGoogleSearch
        );
    }
}
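The exact decision rules live in `computeMimeType`, but the general idea is that a structured-output request switches the response MIME type to JSON. The following is a hedged, self-contained sketch of that decision, an assumption for illustration only; `FormatType` is a local stand-in, not the library's actual `ResponseFormatType`:

```java
import java.util.Map;

public class MimeTypeSketch {

    // Local stand-in for LangChain4j's response format type (assumption).
    public enum FormatType { TEXT, JSON }

    // Plausible sketch: JSON output (or any present schema) is requested as
    // application/json; everything else falls back to plain text.
    public static String computeMimeType(FormatType type, Map<String, Object> rawSchema) {
        if (type == FormatType.JSON || rawSchema != null) {
            return "application/json";
        }
        return "text/plain";
    }

    public static void main(String[] args) {
        System.out.println(computeMimeType(FormatType.TEXT, null)); // text/plain
        System.out.println(computeMimeType(FormatType.JSON, null)); // application/json
    }
}
```

The real implementation also considers the detected `Schema` object; this sketch only captures the text-versus-JSON split.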

Synchronous Chat Model

Abstract implementation of LangChain4j's ChatModel interface for Gemini. Provides synchronous request-response chat functionality with support for structured output, function calling, and thinking/reasoning capabilities.

public abstract class GeminiChatLanguageModel extends BaseGeminiChatModel implements ChatModel {
    /**
     * Creates a synchronous Gemini chat model.
     *
     * @param modelId The Gemini model identifier
     * @param temperature Controls randomness (0.0-1.0)
     * @param maxOutputTokens Maximum tokens in the response
     * @param topK Number of highest probability tokens to consider
     * @param topP Nucleus sampling probability threshold
     * @param responseFormat Expected response format for structured output
     * @param listeners Chat model event listeners
     * @param thinkingBudget Token budget for reasoning/thinking
     * @param includeThoughts Whether to include reasoning in responses
     * @param useGoogleSearch Whether to enable Google Search integration
     */
    public GeminiChatLanguageModel(
        String modelId,
        Double temperature,
        Integer maxOutputTokens,
        Integer topK,
        Double topP,
        ResponseFormat responseFormat,
        List<ChatModelListener> listeners,
        Long thinkingBudget,
        boolean includeThoughts,
        boolean useGoogleSearch
    );

    /**
     * Returns the set of capabilities supported by this model.
     *
     * @return Set of Capability enums (e.g., RESPONSE_FORMAT_JSON_SCHEMA, TOOL_CALLING)
     */
    @Override
    public Set<Capability> supportedCapabilities();

    /**
     * Processes a chat request and returns a response (deprecated method).
     *
     * @param chatRequest The chat request containing messages and configuration
     * @return ChatResponse with the model's reply
     * @deprecated Use doChat instead
     */
    @Override
    @Deprecated
    public ChatResponse chat(ChatRequest chatRequest);

    /**
     * Processes a chat request and returns a response (preferred method).
     *
     * @param chatRequest The chat request containing messages and configuration
     * @return ChatResponse with the model's reply, token usage, and finish reason
     */
    @Override
    public ChatResponse doChat(ChatRequest chatRequest);

    /**
     * Subclass-specific implementation for calling the Gemini API endpoint.
     * Must be implemented by concrete subclasses to handle the actual API call.
     *
     * @param request The Gemini-formatted content generation request
     * @return The response from the Gemini API
     */
    protected abstract GenerateContentResponse generateContext(GenerateContentRequest request);
}

Usage Example:

public class VertexAIGeminiChatModel extends GeminiChatLanguageModel {

    private final GeminiRestApi restApi;

    public VertexAIGeminiChatModel(String projectId, String location, String apiKey) {
        super(
            "gemini-1.5-pro",
            0.7,                      // temperature
            4096,                     // maxOutputTokens
            40,                       // topK
            0.9,                      // topP
            null,                     // responseFormat
            Collections.emptyList(),  // listeners
            1000L,                    // thinkingBudget
            true,                     // includeThoughts
            false                     // useGoogleSearch
        );
        this.restApi = createRestApi(projectId, location, apiKey);
    }

    @Override
    protected GenerateContentResponse generateContext(GenerateContentRequest request) {
        // Call Vertex AI endpoint
        return restApi.generateContent(request);
    }
}

// Using the model
VertexAIGeminiChatModel model = new VertexAIGeminiChatModel(projectId, location, apiKey);

ChatRequest request = ChatRequest.builder()
    .messages(List.of(
        UserMessage.from("What is the capital of France?")
    ))
    .build();

ChatResponse response = model.doChat(request);
System.out.println(response.aiMessage().text());

Streaming Chat Model

Abstract implementation of LangChain4j's StreamingChatModel interface for Gemini. Provides streaming chat functionality where responses are delivered incrementally via Server-Sent Events (SSE).

public abstract class GeminiStreamingChatLanguageModel extends BaseGeminiChatModel implements StreamingChatModel {
    /**
     * Creates a streaming Gemini chat model.
     *
     * @param modelId The Gemini model identifier
     * @param temperature Controls randomness (0.0-1.0)
     * @param maxOutputTokens Maximum tokens in the response
     * @param topK Number of highest probability tokens to consider
     * @param topP Nucleus sampling probability threshold
     * @param responseFormat Expected response format for structured output
     * @param listeners Chat model event listeners
     * @param useGoogleSearch Whether to enable Google Search integration
     */
    public GeminiStreamingChatLanguageModel(
        String modelId,
        Double temperature,
        Integer maxOutputTokens,
        Integer topK,
        Double topP,
        ResponseFormat responseFormat,
        List<ChatModelListener> listeners,
        boolean useGoogleSearch
    );

    /**
     * Returns the set of capabilities supported by this streaming model.
     *
     * @return Set of Capability enums
     */
    @Override
    public Set<Capability> supportedCapabilities();

    /**
     * Processes a chat request with streaming response delivery.
     *
     * @param chatRequest The chat request containing messages and configuration
     * @param handler Handler for streaming response events (onNext, onComplete, onError)
     */
    @Override
    public void doChat(ChatRequest chatRequest, StreamingChatResponseHandler handler);

    /**
     * Subclass-specific implementation for calling the Gemini streaming API endpoint.
     * Must return a Multi (reactive stream) of Server-Sent Events containing responses.
     *
     * @param request The Gemini-formatted content generation request
     * @return Reactive stream of SSE events with incremental responses
     */
    protected abstract Multi<SseEvent<GenerateContentResponse>> generateStreamContext(GenerateContentRequest request);
}

Usage Example:

public class AIStudioGeminiStreamingModel extends GeminiStreamingChatLanguageModel {

    private final GeminiStreamingRestApi restApi;

    public AIStudioGeminiStreamingModel(String apiKey) {
        super(
            "gemini-1.5-flash",
            0.9,                      // temperature
            2048,                     // maxOutputTokens
            null,                     // topK
            null,                     // topP
            null,                     // responseFormat
            Collections.emptyList(),  // listeners
            false                     // useGoogleSearch
        );
        this.restApi = createStreamingRestApi(apiKey);
    }

    @Override
    protected Multi<SseEvent<GenerateContentResponse>> generateStreamContext(GenerateContentRequest request) {
        // Call AI Studio streaming endpoint
        return restApi.generateContentStream(request);
    }
}

// Using the streaming model
AIStudioGeminiStreamingModel model = new AIStudioGeminiStreamingModel(apiKey);

ChatRequest request = ChatRequest.builder()
    .messages(List.of(
        UserMessage.from("Write a short story about a robot.")
    ))
    .build();

model.doChat(request, new StreamingChatResponseHandler() {
    @Override
    public void onNext(String token) {
        System.out.print(token);
    }

    @Override
    public void onComplete(ChatResponse response) {
        System.out.println("\n\nStreaming complete!");
        System.out.println("Total tokens: " + response.tokenUsage().totalTokenCount());
    }

    @Override
    public void onError(Throwable error) {
        System.err.println("Error: " + error.getMessage());
    }
});

Supported Capabilities

Both synchronous and streaming chat models support the following LangChain4j capabilities:

  • RESPONSE_FORMAT_JSON_SCHEMA: Structured JSON output with schema validation
  • TOOL_CALLING: Function calling capabilities

These capabilities are reported via the supportedCapabilities() method and enable advanced features like structured output and tool use in LangChain4j applications.
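Callers can branch on the reported capabilities before relying on them. A minimal self-contained sketch of that guard follows; the `Capability` enum here is a local stand-in that mirrors the two values named above, not LangChain4j's actual enum:

```java
import java.util.EnumSet;
import java.util.Set;

public class CapabilityCheck {

    // Local stand-in for LangChain4j's Capability enum (assumption for illustration).
    public enum Capability { RESPONSE_FORMAT_JSON_SCHEMA, TOOL_CALLING }

    // What the Gemini chat models report, per the list above.
    public static Set<Capability> supportedCapabilities() {
        return EnumSet.of(Capability.RESPONSE_FORMAT_JSON_SCHEMA, Capability.TOOL_CALLING);
    }

    // Only request schema-validated JSON output when the model advertises support.
    public static boolean canUseJsonSchema() {
        return supportedCapabilities().contains(Capability.RESPONSE_FORMAT_JSON_SCHEMA);
    }

    public static void main(String[] args) {
        System.out.println("JSON schema supported: " + canUseJsonSchema());
    }
}
```

Against the real models, the same check reads `model.supportedCapabilities().contains(...)`.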

Configuration Notes

  • Temperature: Controls randomness. 0.0 = deterministic, 1.0 = maximum randomness
  • TopK: Limits vocabulary to K most likely tokens. Null = no limit
  • TopP: Nucleus sampling threshold. 0.9 = consider tokens covering 90% probability mass
  • ThinkingBudget: Token allocation for model's internal reasoning (Gemini 2.0+ models)
  • IncludeThoughts: Whether to return the model's reasoning process in the response
  • UseGoogleSearch: Enables real-time web search grounding for responses
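To make the TopK and TopP notes concrete, here is a toy, self-contained sketch of how the two filters narrow a sampling distribution. This illustrates the general technique only; it is not Gemini's internal implementation:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SamplingFilters {

    // Top-K: keep only the k most probable entries.
    public static List<Double> topK(List<Double> probs, int k) {
        List<Double> sorted = new ArrayList<>(probs);
        sorted.sort(Comparator.reverseOrder());
        return sorted.subList(0, Math.min(k, sorted.size()));
    }

    // Top-P (nucleus): keep the smallest set of most-probable entries
    // whose cumulative probability reaches p.
    public static List<Double> topP(List<Double> probs, double p) {
        List<Double> sorted = new ArrayList<>(probs);
        sorted.sort(Comparator.reverseOrder());
        List<Double> kept = new ArrayList<>();
        double cumulative = 0.0;
        for (double prob : sorted) {
            kept.add(prob);
            cumulative += prob;
            if (cumulative >= p) break;
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Double> probs = List.of(0.4, 0.3, 0.2, 0.1);
        System.out.println(topK(probs, 2));    // [0.4, 0.3]
        System.out.println(topP(probs, 0.85)); // [0.4, 0.3, 0.2]
    }
}
```

In the real models these are server-side parameters: the constructor's `topK`/`topP` values are sent in the request, and the filtering happens inside the Gemini service.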

Integration Points

These abstract classes integrate with:

  • LangChain4j Core: Implements standard ChatModel and StreamingChatModel interfaces
  • Quarkus REST Client: Used by subclasses to call Gemini API endpoints
  • ContentMapper: Converts LangChain4j messages to Gemini request format
  • GenerateContentResponseHandler: Extracts text, tokens, and tool calls from responses
