
tessl/maven-io-quarkiverse-langchain4j--quarkus-langchain4j-gemini-common

Common shared infrastructure for integrating Google Gemini AI models with Quarkus applications through the LangChain4j framework, providing base chat model functionality, schema mapping, and embedding model support.


docs/requests-responses.md

Request and Response Types

Core request and response types for the Gemini API. These types define the structure of API calls for content generation and embeddings, including configuration options, system instructions, tool definitions, and response metadata.

Capabilities

GenerateContentRequest

The primary request type for generating content from Gemini models. Contains conversation history, system instructions, available tools, and generation configuration.

/**
 * Request for generating content from the model.
 *
 * @param contents List of conversation messages (user and model turns)
 * @param systemInstruction Optional system-level instructions for the model
 * @param tools Optional list of tools (functions, Google Search) available to the model
 * @param generationConfig Optional configuration for generation parameters
 */
public record GenerateContentRequest(
    List<Content> contents,
    SystemInstruction systemInstruction,
    List<Tool> tools,
    GenerationConfig generationConfig
);

SystemInstruction

System-level instructions that guide the model's behavior throughout the conversation. Unlike regular content messages, system instructions are not part of the conversational back-and-forth.

/**
 * System instruction for the model.
 *
 * @param parts List of instruction parts (typically text)
 */
public record SystemInstruction(List<Part> parts) {
    /**
     * Creates a system instruction from text strings.
     *
     * @param instructions List of instruction strings
     * @return SystemInstruction with text parts
     */
    public static SystemInstruction ofContent(List<String> instructions);
}

/**
 * Individual part of a system instruction.
 *
 * @param text The instruction text
 */
public record Part(String text);

Tool

Represents tools available to the model, including function declarations and Google Search capabilities.

/**
 * Tool definition for the model.
 *
 * @param functionDeclarations List of function declarations available to call
 * @param googleSearch Marker for enabling standard Google Search
 * @param googleSearchRetrieval Marker for enabling Google Search retrieval
 */
public record Tool(
    List<FunctionDeclaration> functionDeclarations,
    GoogleSearch googleSearch,
    GoogleSearchRetrieval googleSearchRetrieval
) {
    /**
     * Creates a tool from function declarations.
     *
     * @param declarations List of available functions
     * @return Tool with function declarations
     */
    public static Tool ofFunctionDeclarations(List<FunctionDeclaration> declarations);

    /**
     * Creates a Google Search tool.
     *
     * @return Tool with Google Search enabled
     */
    public static Tool ofGoogleSearch();

    /**
     * Creates a Google Search retrieval tool.
     *
     * @return Tool with Google Search retrieval enabled
     */
    public static Tool ofGoogleSearchRetrieval();
}

/**
 * Marker record for Google Search capability.
 */
public record GoogleSearch();

/**
 * Marker record for Google Search retrieval capability.
 */
public record GoogleSearchRetrieval();

GenerateContentResponse

Response from the model containing generated candidates, token usage metadata, and model version information.

/**
 * Response from content generation.
 *
 * @param candidates List of generated response candidates
 * @param usageMetadata Token usage information
 * @param modelVersion Version of the model that generated the response
 * @param responseId Unique identifier for this response
 */
public record GenerateContentResponse(
    List<Candidate> candidates,
    UsageMetadata usageMetadata,
    String modelVersion,
    String responseId
);

Candidate

A single response candidate containing the generated content and metadata about why generation stopped.

/**
 * A response candidate from the model.
 *
 * @param content The generated content
 * @param finishReason The reason generation stopped
 */
public record Candidate(Content content, FinishReason finishReason);

/**
 * Content within a request or candidate.
 *
 * @param role Role of the content producer ("user" or "model")
 * @param parts List of content parts
 */
public record Content(String role, List<Part> parts);

/**
 * Individual part within candidate content.
 *
 * @param text Generated text content
 * @param functionCall Function call requested by the model
 * @param thought Whether this part is a thought/reasoning (true) or regular content (false)
 * @param thoughtSignature Signature for the thought
 */
public record Part(
    String text,
    FunctionCall functionCall,
    Boolean thought,
    String thoughtSignature
) {
    /**
     * Creates a text-only part, as used throughout the examples below.
     *
     * @param text The text content
     * @return Part containing only the given text
     */
    public static Part ofText(String text);
}

UsageMetadata

Token usage information for a generation request and response.

/**
 * Token usage metadata for the request and response.
 *
 * @param promptTokenCount Number of tokens in the prompt
 * @param candidatesTokenCount Number of tokens in all candidates
 * @param totalTokenCount Total tokens (prompt + candidates)
 */
public record UsageMetadata(
    int promptTokenCount,
    int candidatesTokenCount,
    int totalTokenCount
);

FinishReason

Enumeration of reasons why the model stopped generating content.

/**
 * Reason why the model stopped generating content.
 */
public enum FinishReason {
    /** Finish reason not specified */
    FINISH_REASON_UNSPECIFIED,

    /** Natural stopping point (model completed its response) */
    STOP,

    /** Reached maximum token limit */
    MAX_TOKENS,

    /** Stopped due to safety filters */
    SAFETY,

    /** Stopped due to recitation detection */
    RECITATION,

    /** Stopped due to language issues */
    LANGUAGE,

    /** Stopped for other reasons */
    OTHER,

    /** Stopped due to blocklist match */
    BLOCKLIST,

    /** Stopped due to prohibited content */
    PROHIBITED_CONTENT,

    /** Stopped due to sensitive personally identifiable information */
    SPII,

    /** Stopped due to malformed function call */
    MALFORMED_FUNCTION_CALL,

    /** Stopped due to image safety concerns */
    IMAGE_SAFETY,

    /** Stopped due to unexpected tool call */
    UNEXPECTED_TOOL_CALL,

    /** Stopped due to too many tool calls */
    TOO_MANY_TOOL_CALLS
}

EmbedContentRequest

Request for generating embeddings from text content.

/**
 * Request for embedding content.
 *
 * @param model Model identifier (e.g., "text-embedding-004")
 * @param content Content to embed
 * @param taskType Type of embedding task
 * @param title Optional title for document embeddings
 * @param outputDimensionality Optional output dimension size
 */
public record EmbedContentRequest(
    String model,
    Content content,
    TaskType taskType,
    String title,
    Integer outputDimensionality
);

TaskType

Enumeration of embedding task types that optimize the embedding for specific use cases.

/**
 * Type of embedding task.
 */
public enum TaskType {
    /** Task type not specified */
    TASK_TYPE_UNSPECIFIED,

    /** Embedding will be used for search queries */
    RETRIEVAL_QUERY,

    /** Embedding will be used for documents in a search corpus */
    RETRIEVAL_DOCUMENT,

    /** Embedding will be used for semantic similarity comparison */
    SEMANTIC_SIMILARITY,

    /** Embedding will be used for classification tasks */
    CLASSIFICATION,

    /** Embedding will be used for clustering tasks */
    CLUSTERING,

    /** Embedding will be used for question answering */
    QUESTION_ANSWERING,

    /** Embedding will be used for fact verification */
    FACT_VERIFICATION
}

EmbedContentResponse

Response containing the generated embedding vector.

/**
 * Response from embedding content.
 *
 * @param embedding The generated embedding
 */
public record EmbedContentResponse(Embedding embedding);

/**
 * Embedding vector.
 *
 * @param values Array of floating point values representing the embedding
 */
public record Embedding(float[] values);

EmbedContentRequests

Batch request for embedding multiple pieces of content in a single API call.

/**
 * Batch request for embedding multiple contents.
 *
 * @param requests List of individual embedding requests
 */
public record EmbedContentRequests(List<EmbedContentRequest> requests);

EmbedContentResponses

Batch response containing embeddings for multiple pieces of content.

/**
 * Batch response with multiple embeddings.
 *
 * @param embeddings List of embeddings corresponding to the batch requests
 */
public record EmbedContentResponses(List<Embedding> embeddings);

Usage Examples

Basic Content Generation Request

// Simple text-only request
GenerateContentRequest request = new GenerateContentRequest(
    List.of(
        new Content("user", List.of(Content.Part.ofText("What is quantum computing?")))
    ),
    null,  // no system instruction
    null,  // no tools
    null   // default generation config
);

GenerateContentResponse response = chatModel.generateContent(request);

// Extract the text response
String text = GenerateContentResponseHandler.getText(response);
System.out.println(text);

// Check token usage
UsageMetadata usage = response.usageMetadata();
System.out.println("Tokens used: " + usage.totalTokenCount());

Request with System Instruction

// Create system instruction
SystemInstruction systemInstruction = GenerateContentRequest.SystemInstruction.ofContent(
    List.of(
        "You are a helpful astronomy expert.",
        "Always provide accurate scientific information.",
        "Use analogies to make complex concepts accessible."
    )
);

// Create request with system instruction
GenerateContentRequest request = new GenerateContentRequest(
    List.of(
        new Content("user", List.of(Content.Part.ofText("Explain black holes.")))
    ),
    systemInstruction,
    null,
    null
);

GenerateContentResponse response = chatModel.generateContent(request);

Request with Generation Config

// Configure generation parameters
GenerationConfig config = GenerationConfig.builder()
    .temperature(0.7)
    .maxOutputTokens(1024)
    .topK(40)
    .topP(0.95)
    .stopSequences(List.of("\n\n", "END"))
    .build();

GenerateContentRequest request = new GenerateContentRequest(
    List.of(
        new Content("user", List.of(Content.Part.ofText("Write a haiku about coding.")))
    ),
    null,
    null,
    config
);

GenerateContentResponse response = chatModel.generateContent(request);

Request with Function Calling

// Declare a function
FunctionDeclaration getWeather = new FunctionDeclaration(
    "get_weather",
    "Get current weather for a location",
    FunctionDeclaration.Parameters.objectType(
        Map.of(
            "location", Map.of("type", "string", "description", "City name")
        ),
        List.of("location")
    )
);

// Create tool with function
GenerateContentRequest.Tool weatherTool =
    GenerateContentRequest.Tool.ofFunctionDeclarations(List.of(getWeather));

// Request with tool
GenerateContentRequest request = new GenerateContentRequest(
    List.of(
        new Content("user", List.of(Content.Part.ofText("What's the weather in Tokyo?")))
    ),
    null,
    List.of(weatherTool),
    null
);

GenerateContentResponse response = chatModel.generateContent(request);

// Check if model wants to call the function
if (!response.candidates().isEmpty()) {
    GenerateContentResponse.Candidate candidate = response.candidates().get(0);

    if (candidate.finishReason() == FinishReason.STOP) {
        // Normal text response
        String text = candidate.content().parts().get(0).text();
        System.out.println(text);
    } else {
        // Check for function call
        for (GenerateContentResponse.Candidate.Part part : candidate.content().parts()) {
            if (part.functionCall() != null) {
                FunctionCall call = part.functionCall();
                System.out.println("Function: " + call.name());
                System.out.println("Args: " + call.args());

                // Execute function and send response back
                // (See function-calling.md for complete flow)
            }
        }
    }
}

Request with Google Search

// Enable Google Search for real-time information
GenerateContentRequest.Tool searchTool = GenerateContentRequest.Tool.ofGoogleSearch();

GenerateContentRequest request = new GenerateContentRequest(
    List.of(
        new Content("user", List.of(
            Content.Part.ofText("What are the latest developments in fusion energy?")
        ))
    ),
    null,
    List.of(searchTool),
    null
);

GenerateContentResponse response = chatModel.generateContent(request);
String answer = GenerateContentResponseHandler.getText(response);

Multi-Turn Conversation

List<Content> conversation = new ArrayList<>();

// First turn
conversation.add(new Content(
    "user",
    List.of(Content.Part.ofText("I'm learning Java. Where should I start?"))
));

GenerateContentRequest request1 = new GenerateContentRequest(conversation, null, null, null);
GenerateContentResponse response1 = chatModel.generateContent(request1);

// Add model's response to conversation
String modelReply1 = response1.candidates().get(0).content().parts().get(0).text();
conversation.add(new Content(
    "model",
    List.of(Content.Part.ofText(modelReply1))
));

// Second turn
conversation.add(new Content(
    "user",
    List.of(Content.Part.ofText("Can you explain object-oriented programming?"))
));

GenerateContentRequest request2 = new GenerateContentRequest(conversation, null, null, null);
GenerateContentResponse response2 = chatModel.generateContent(request2);

Handling Different Finish Reasons

GenerateContentResponse response = chatModel.generateContent(request);

for (GenerateContentResponse.Candidate candidate : response.candidates()) {
    switch (candidate.finishReason()) {
        case STOP:
            // Normal completion
            String text = candidate.content().parts().get(0).text();
            System.out.println("Response: " + text);
            break;

        case MAX_TOKENS:
            // Response truncated due to token limit
            System.out.println("Response truncated. Consider increasing maxOutputTokens.");
            break;

        case SAFETY:
            // Blocked by safety filters
            System.out.println("Response blocked by safety filters.");
            break;

        case MALFORMED_FUNCTION_CALL:
            // Function calling error
            System.out.println("Model made a malformed function call.");
            break;

        default:
            System.out.println("Generation stopped: " + candidate.finishReason());
    }
}

Embedding Single Text

// Create embedding request
EmbedContentRequest embedRequest = new EmbedContentRequest(
    "text-embedding-004",
    new Content(null, List.of(Content.Part.ofText("Machine learning is fascinating."))),
    EmbedContentRequest.TaskType.SEMANTIC_SIMILARITY,
    null,  // no title
    null   // default dimensionality
);

EmbedContentResponse embedResponse = embeddingModel.embedContent(embedRequest);

// Access embedding vector
float[] vector = embedResponse.embedding().values();
System.out.println("Embedding dimension: " + vector.length);

Embedding with Task Type Optimization

// For search query
EmbedContentRequest queryRequest = new EmbedContentRequest(
    "text-embedding-004",
    new Content(null, List.of(Content.Part.ofText("best restaurants in Paris"))),
    EmbedContentRequest.TaskType.RETRIEVAL_QUERY,
    null,
    null
);

// For document in search corpus
EmbedContentRequest docRequest = new EmbedContentRequest(
    "text-embedding-004",
    new Content(null, List.of(Content.Part.ofText(
        "Le Bernardin is an acclaimed French seafood restaurant..."
    ))),
    EmbedContentRequest.TaskType.RETRIEVAL_DOCUMENT,
    "Le Bernardin Restaurant Review",  // title for document
    null
);

Batch Embedding

// Create multiple embedding requests
List<EmbedContentRequest> requests = List.of(
    new EmbedContentRequest(
        "text-embedding-004",
        new Content(null, List.of(Content.Part.ofText("First document text"))),
        EmbedContentRequest.TaskType.CLUSTERING,
        null,
        null
    ),
    new EmbedContentRequest(
        "text-embedding-004",
        new Content(null, List.of(Content.Part.ofText("Second document text"))),
        EmbedContentRequest.TaskType.CLUSTERING,
        null,
        null
    ),
    new EmbedContentRequest(
        "text-embedding-004",
        new Content(null, List.of(Content.Part.ofText("Third document text"))),
        EmbedContentRequest.TaskType.CLUSTERING,
        null,
        null
    )
);

// Batch request
EmbedContentRequests batchRequest = new EmbedContentRequests(requests);
EmbedContentResponses batchResponse = embeddingModel.batchEmbedContents(batchRequest);

// Process all embeddings
for (int i = 0; i < batchResponse.embeddings().size(); i++) {
    float[] vector = batchResponse.embeddings().get(i).values();
    System.out.println("Embedding " + i + ": " + vector.length + " dimensions");
}

Embedding with Custom Dimensionality

// Request smaller embedding dimension for efficiency
EmbedContentRequest request = new EmbedContentRequest(
    "text-embedding-004",
    new Content(null, List.of(Content.Part.ofText("Sample text for embedding"))),
    EmbedContentRequest.TaskType.SEMANTIC_SIMILARITY,
    null,
    256  // reduce from default 768 to 256 dimensions
);

EmbedContentResponse response = embeddingModel.embedContent(request);
System.out.println("Dimension: " + response.embedding().values().length);

Complete Request with All Options

// System instruction
SystemInstruction systemInstruction = GenerateContentRequest.SystemInstruction.ofContent(
    List.of("You are a helpful assistant specialized in software development.")
);

// Tools
FunctionDeclaration searchDocs = new FunctionDeclaration(
    "search_documentation",
    "Search technical documentation",
    FunctionDeclaration.Parameters.objectType(
        Map.of("query", Map.of("type", "string")),
        List.of("query")
    )
);
GenerateContentRequest.Tool docTool =
    GenerateContentRequest.Tool.ofFunctionDeclarations(List.of(searchDocs));
GenerateContentRequest.Tool searchTool = GenerateContentRequest.Tool.ofGoogleSearch();

// Generation config
GenerationConfig config = GenerationConfig.builder()
    .temperature(0.7)
    .maxOutputTokens(2048)
    .topK(40)
    .topP(0.95)
    .thinkingConfig(new ThinkingConfig(1000L, true))
    .build();

// Conversation
List<Content> conversation = List.of(
    new Content("user", List.of(
        Content.Part.ofText("How do I implement authentication in a REST API?")
    ))
);

// Complete request
GenerateContentRequest request = new GenerateContentRequest(
    conversation,
    systemInstruction,
    List.of(docTool, searchTool),
    config
);

GenerateContentResponse response = chatModel.generateContent(request);

// Handle response with thoughts (if model supports thinking)
String mainText = GenerateContentResponseHandler.getText(response);
String thoughts = GenerateContentResponseHandler.getThoughts(response);

if (thoughts != null && !thoughts.isEmpty()) {
    System.out.println("Model's reasoning: " + thoughts);
}
System.out.println("Response: " + mainText);

Finish Reason Handling

Common Finish Reasons

  • STOP: Normal completion - model finished its response naturally
  • MAX_TOKENS: Token limit reached - consider increasing maxOutputTokens
  • SAFETY: Safety filter triggered - review content guidelines
  • RECITATION: Detected verbatim content from training data
  • MALFORMED_FUNCTION_CALL: Function call syntax error

Handling Incomplete Responses

GenerateContentResponse response = chatModel.generateContent(request);
GenerateContentResponse.Candidate candidate = response.candidates().get(0);

if (candidate.finishReason() == FinishReason.MAX_TOKENS) {
    // Response was cut off, might be incomplete
    String partialText = candidate.content().parts().get(0).text();

    // Option 1: Increase token limit and retry
    GenerationConfig newConfig = GenerationConfig.builder()
        .maxOutputTokens(4096)  // increased from default
        .build();

    // Option 2: Ask model to continue
    List<Content> continuation = new ArrayList<>(request.contents());
    continuation.add(candidate.content());  // Add partial response
    continuation.add(new Content("user", List.of(
        Content.Part.ofText("Please continue.")
    )));

    GenerateContentRequest continueRequest = new GenerateContentRequest(
        continuation,
        request.systemInstruction(),
        request.tools(),
        newConfig
    );
}

Task Type Best Practices

Retrieval/Search Use Cases

// Use RETRIEVAL_QUERY for search queries
EmbedContentRequest queryEmbed = new EmbedContentRequest(
    "text-embedding-004",
    new Content(null, List.of(Content.Part.ofText("python pandas tutorial"))),
    EmbedContentRequest.TaskType.RETRIEVAL_QUERY,
    null,
    null
);

// Use RETRIEVAL_DOCUMENT for documents to be searched
EmbedContentRequest docEmbed = new EmbedContentRequest(
    "text-embedding-004",
    new Content(null, List.of(Content.Part.ofText(documentText))),
    EmbedContentRequest.TaskType.RETRIEVAL_DOCUMENT,
    documentTitle,  // Include title for documents
    null
);

Semantic Similarity

// Compare similarity between texts
EmbedContentRequest embed1 = new EmbedContentRequest(
    "text-embedding-004",
    new Content(null, List.of(Content.Part.ofText("The weather is sunny today."))),
    EmbedContentRequest.TaskType.SEMANTIC_SIMILARITY,
    null,
    null
);

EmbedContentRequest embed2 = new EmbedContentRequest(
    "text-embedding-004",
    new Content(null, List.of(Content.Part.ofText("It's a beautiful day outside."))),
    EmbedContentRequest.TaskType.SEMANTIC_SIMILARITY,
    null,
    null
);

// Calculate cosine similarity between embeddings
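The similarity computation itself is not part of this library; it is plain vector math over the two `float[]` arrays returned by `embedResponse.embedding().values()`. A minimal self-contained sketch (the `CosineSimilarity` class name is illustrative):

```java
public class CosineSimilarity {

    // Cosine similarity: dot(a, b) / (|a| * |b|).
    // Close to 1.0 for semantically similar texts, close to 0.0 for unrelated ones.
    static double cosineSimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        float[] v1 = {1.0f, 0.0f, 1.0f};
        float[] v2 = {1.0f, 0.0f, 1.0f};
        float[] v3 = {0.0f, 1.0f, 0.0f};
        System.out.println(cosineSimilarity(v1, v2)); // approximately 1.0 (identical)
        System.out.println(cosineSimilarity(v1, v3)); // 0.0 (orthogonal)
    }
}
```

In practice `v1` and `v2` would be the vectors obtained by sending `embed1` and `embed2` above to the embedding model.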

Classification and Clustering

// For training a classifier
EmbedContentRequest classifyEmbed = new EmbedContentRequest(
    "text-embedding-004",
    new Content(null, List.of(Content.Part.ofText(textSample))),
    EmbedContentRequest.TaskType.CLASSIFICATION,
    null,
    null
);

// For grouping similar documents
EmbedContentRequest clusterEmbed = new EmbedContentRequest(
    "text-embedding-004",
    new Content(null, List.of(Content.Part.ofText(textSample))),
    EmbedContentRequest.TaskType.CLUSTERING,
    null,
    null
);

Token Usage Monitoring

GenerateContentResponse response = chatModel.generateContent(request);
UsageMetadata usage = response.usageMetadata();

// Track token consumption
System.out.println("Prompt tokens: " + usage.promptTokenCount());
System.out.println("Response tokens: " + usage.candidatesTokenCount());
System.out.println("Total tokens: " + usage.totalTokenCount());

// Cost calculation (example rates)
double promptCost = usage.promptTokenCount() * 0.00025 / 1000;  // $0.25 per 1M tokens
double responseCost = usage.candidatesTokenCount() * 0.001 / 1000;  // $1.00 per 1M tokens
double totalCost = promptCost + responseCost;

System.out.printf("Estimated cost: $%.6f%n", totalCost);

// Budget checking
int maxTokensAllowed = 10000;
if (usage.totalTokenCount() > maxTokensAllowed) {
    System.out.println("Warning: Token usage exceeded budget");
}

Integration Points

These request/response types integrate with:

  • Chat Models: GeminiChatLanguageModel uses GenerateContentRequest/Response
  • Streaming Models: GeminiStreamingChatLanguageModel uses the same types with SSE
  • Embedding Models: GeminiEmbeddingModel uses EmbedContent types
  • ContentMapper: Converts LangChain4j types to GenerateContentRequest
  • GenerateContentResponseHandler: Extracts data from GenerateContentResponse
  • FinishReasonMapper: Maps FinishReason to LangChain4j types

Best Practices

  1. System Instructions: Use system instructions for persistent behavioral guidance rather than user messages
  2. Token Limits: Set appropriate maxOutputTokens based on use case to control costs and latency
  3. Temperature Control: Use lower temperature (0.0-0.3) for factual tasks, higher (0.7-1.0) for creative tasks
  4. Finish Reason Handling: Always check finishReason to detect truncated or blocked responses
  5. Batch Embeddings: Use batch API for multiple embeddings to reduce overhead
  6. Task Type Selection: Choose appropriate TaskType for embeddings to optimize for your use case
  7. Token Monitoring: Track usageMetadata to monitor costs and optimize prompts
  8. Conversation History: Include relevant history for context but trim old messages to manage token usage
  9. Tool Combinations: Combine function declarations with Google Search for comprehensive capabilities
  10. Error Recovery: Handle different finish reasons gracefully with retry or continuation logic
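For item 8, one simple trimming policy is to keep only the most recent turns before building the next request. The helper below is a hypothetical sketch, not part of the library; it is written generically so it applies to a `List<Content>` conversation, and a real policy might instead preserve the first turn or summarize the dropped ones:

```java
import java.util.ArrayList;
import java.util.List;

public class HistoryTrimmer {

    // Keep at most maxItems of the most recent entries; older turns are dropped.
    // Returns a fresh mutable list so callers can keep appending turns.
    static <T> List<T> trimHistory(List<T> history, int maxItems) {
        if (history.size() <= maxItems) {
            return new ArrayList<>(history);
        }
        return new ArrayList<>(history.subList(history.size() - maxItems, history.size()));
    }

    public static void main(String[] args) {
        List<String> turns = List.of("t1", "t2", "t3", "t4", "t5");
        System.out.println(trimHistory(turns, 3)); // [t3, t4, t5]
    }
}
```

The trimmed list would then be passed as the `contents` argument of the next `GenerateContentRequest`.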

Install with Tessl CLI

npx tessl i tessl/maven-io-quarkiverse-langchain4j--quarkus-langchain4j-gemini-common@1.7.0

docs

  • chat-models.md
  • configuration.md
  • content-types.md
  • embedding-models.md
  • function-calling.md
  • index.md
  • requests-responses.md
  • utilities.md

tile.json