Shared infrastructure for integrating Google Gemini AI models with Quarkus applications through the LangChain4j framework, providing base chat model functionality, schema mapping, and embedding model support.
Core request and response types for the Gemini API. These types define the structure of API calls for content generation and embeddings, including configuration options, system instructions, tool definitions, and response metadata.
The primary request type for generating content from Gemini models. Contains conversation history, system instructions, available tools, and generation configuration.
/**
* Request for generating content from the model.
*
* @param contents List of conversation messages (user and model turns)
* @param systemInstruction Optional system-level instructions for the model
* @param tools Optional list of tools (functions, Google Search) available to the model
* @param generationConfig Optional configuration for generation parameters
*/
public record GenerateContentRequest(
List<Content> contents,
SystemInstruction systemInstruction,
List<Tool> tools,
GenerationConfig generationConfig
);

System-level instructions that guide the model's behavior throughout the conversation. Unlike regular content messages, system instructions are not part of the conversational back-and-forth.
/**
* System instruction for the model.
*
* @param parts List of instruction parts (typically text)
*/
public record SystemInstruction(List<Part> parts) {
/**
* Creates a system instruction from text strings.
*
* @param instructions List of instruction strings
* @return SystemInstruction with text parts
*/
public static SystemInstruction ofContent(List<String> instructions);
}
/**
* Individual part of a system instruction.
*
* @param text The instruction text
*/
public record Part(String text);

Represents tools available to the model, including function declarations and Google Search capabilities.
/**
* Tool definition for the model.
*
* @param functionDeclarations List of function declarations available to call
* @param googleSearch Marker for enabling standard Google Search
* @param googleSearchRetrieval Marker for enabling Google Search retrieval
*/
public record Tool(
List<FunctionDeclaration> functionDeclarations,
GoogleSearch googleSearch,
GoogleSearchRetrieval googleSearchRetrieval
) {
/**
* Creates a tool from function declarations.
*
* @param declarations List of available functions
* @return Tool with function declarations
*/
public static Tool ofFunctionDeclarations(List<FunctionDeclaration> declarations);
/**
* Creates a Google Search tool.
*
* @return Tool with Google Search enabled
*/
public static Tool ofGoogleSearch();
/**
* Creates a Google Search retrieval tool.
*
* @return Tool with Google Search retrieval enabled
*/
public static Tool ofGoogleSearchRetrieval();
}
/**
* Marker record for Google Search capability.
*/
public record GoogleSearch();
/**
* Marker record for Google Search retrieval capability.
*/
public record GoogleSearchRetrieval();

Response from the model containing generated candidates, token usage metadata, and model version information.
/**
* Response from content generation.
*
* @param candidates List of generated response candidates
* @param usageMetadata Token usage information
* @param modelVersion Version of the model that generated the response
* @param responseId Unique identifier for this response
*/
public record GenerateContentResponse(
List<Candidate> candidates,
UsageMetadata usageMetadata,
String modelVersion,
String responseId
);

A single response candidate containing the generated content and metadata about why generation stopped.
/**
* A response candidate from the model.
*
* @param content The generated content
* @param finishReason The reason generation stopped
*/
public record Candidate(Content content, FinishReason finishReason);
/**
* Content within a candidate.
*
* @param parts List of content parts
*/
public record Content(List<Part> parts);
/**
* Individual part within candidate content.
*
* @param text Generated text content
* @param functionCall Function call requested by the model
* @param thought Whether this part is a thought/reasoning (true) or regular content (false)
* @param thoughtSignature Signature for the thought
*/
public record Part(
String text,
FunctionCall functionCall,
Boolean thought,
String thoughtSignature
);

Token usage information for a generation request and response.
/**
* Token usage metadata for the request and response.
*
* @param promptTokenCount Number of tokens in the prompt
* @param candidatesTokenCount Number of tokens in all candidates
* @param totalTokenCount Total tokens (prompt + candidates)
*/
public record UsageMetadata(
int promptTokenCount,
int candidatesTokenCount,
int totalTokenCount
);

Enumeration of reasons why the model stopped generating content.
/**
* Reason why the model stopped generating content.
*/
public enum FinishReason {
/** Finish reason not specified */
FINISH_REASON_UNSPECIFIED,
/** Natural stopping point (model completed its response) */
STOP,
/** Reached maximum token limit */
MAX_TOKENS,
/** Stopped due to safety filters */
SAFETY,
/** Stopped due to recitation detection */
RECITATION,
/** Stopped due to language issues */
LANGUAGE,
/** Stopped for other reasons */
OTHER,
/** Stopped due to blocklist match */
BLOCKLIST,
/** Stopped due to prohibited content */
PROHIBITED_CONTENT,
/** Stopped due to sensitive personally identifiable information */
SPII,
/** Stopped due to malformed function call */
MALFORMED_FUNCTION_CALL,
/** Stopped due to image safety concerns */
IMAGE_SAFETY,
/** Stopped due to unexpected tool call */
UNEXPECTED_TOOL_CALL,
/** Stopped due to too many tool calls */
TOO_MANY_TOOL_CALLS
}

Request for generating embeddings from text content.
/**
* Request for embedding content.
*
* @param model Model identifier (e.g., "text-embedding-004")
* @param content Content to embed
* @param taskType Type of embedding task
* @param title Optional title for document embeddings
* @param outputDimensionality Optional output dimension size
*/
public record EmbedContentRequest(
String model,
Content content,
TaskType taskType,
String title,
Integer outputDimensionality
);

Enumeration of embedding task types that optimize the embedding for specific use cases.
/**
* Type of embedding task.
*/
public enum TaskType {
/** Task type not specified */
TASK_TYPE_UNSPECIFIED,
/** Embedding will be used for search queries */
RETRIEVAL_QUERY,
/** Embedding will be used for documents in a search corpus */
RETRIEVAL_DOCUMENT,
/** Embedding will be used for semantic similarity comparison */
SEMANTIC_SIMILARITY,
/** Embedding will be used for classification tasks */
CLASSIFICATION,
/** Embedding will be used for clustering tasks */
CLUSTERING,
/** Embedding will be used for question answering */
QUESTION_ANSWERING,
/** Embedding will be used for fact verification */
FACT_VERIFICATION
}

Response containing the generated embedding vector.
/**
* Response from embedding content.
*
* @param embedding The generated embedding
*/
public record EmbedContentResponse(Embedding embedding);
/**
* Embedding vector.
*
* @param values Array of floating point values representing the embedding
*/
public record Embedding(float[] values);

Batch request for embedding multiple pieces of content in a single API call.
/**
* Batch request for embedding multiple contents.
*
* @param requests List of individual embedding requests
*/
public record EmbedContentRequests(List<EmbedContentRequest> requests);

Batch response containing embeddings for multiple pieces of content.
/**
* Batch response with multiple embeddings.
*
* @param embeddings List of embeddings corresponding to the batch requests
*/
public record EmbedContentResponses(List<Embedding> embeddings);

// Simple text-only request
GenerateContentRequest request = new GenerateContentRequest(
List.of(
new Content("user", List.of(Content.Part.ofText("What is quantum computing?")))
),
null, // no system instruction
null, // no tools
null // default generation config
);
GenerateContentResponse response = chatModel.generateContext(request);
// Extract the text response
String text = GenerateContentResponseHandler.getText(response);
System.out.println(text);
// Check token usage
UsageMetadata usage = response.usageMetadata();
System.out.println("Tokens used: " + usage.totalTokenCount());

// Create system instruction
SystemInstruction systemInstruction = GenerateContentRequest.SystemInstruction.ofContent(
List.of(
"You are a helpful astronomy expert.",
"Always provide accurate scientific information.",
"Use analogies to make complex concepts accessible."
)
);
// Create request with system instruction
GenerateContentRequest request = new GenerateContentRequest(
List.of(
new Content("user", List.of(Content.Part.ofText("Explain black holes.")))
),
systemInstruction,
null,
null
);
GenerateContentResponse response = chatModel.generateContext(request);

// Configure generation parameters
GenerationConfig config = GenerationConfig.builder()
.temperature(0.7)
.maxOutputTokens(1024)
.topK(40)
.topP(0.95)
.stopSequences(List.of("\n\n", "END"))
.build();
GenerateContentRequest request = new GenerateContentRequest(
List.of(
new Content("user", List.of(Content.Part.ofText("Write a haiku about coding.")))
),
null,
null,
config
);
GenerateContentResponse response = chatModel.generateContext(request);

// Declare a function
FunctionDeclaration getWeather = new FunctionDeclaration(
"get_weather",
"Get current weather for a location",
FunctionDeclaration.Parameters.objectType(
Map.of(
"location", Map.of("type", "string", "description", "City name")
),
List.of("location")
)
);
// Create tool with function
GenerateContentRequest.Tool weatherTool =
GenerateContentRequest.Tool.ofFunctionDeclarations(List.of(getWeather));
// Request with tool
GenerateContentRequest request = new GenerateContentRequest(
List.of(
new Content("user", List.of(Content.Part.ofText("What's the weather in Tokyo?")))
),
null,
List.of(weatherTool),
null
);
GenerateContentResponse response = chatModel.generateContext(request);
// Check if model wants to call the function
if (!response.candidates().isEmpty()) {
GenerateContentResponse.Candidate candidate = response.candidates().get(0);
if (candidate.finishReason() == FinishReason.STOP) {
// Normal text response
String text = candidate.content().parts().get(0).text();
System.out.println(text);
} else {
// Check for function call
for (GenerateContentResponse.Candidate.Part part : candidate.content().parts()) {
if (part.functionCall() != null) {
FunctionCall call = part.functionCall();
System.out.println("Function: " + call.name());
System.out.println("Args: " + call.args());
// Execute function and send response back
// (See function-calling.md for complete flow)
}
}
}
}

// Enable Google Search for real-time information
GenerateContentRequest.Tool searchTool = GenerateContentRequest.Tool.ofGoogleSearch();
GenerateContentRequest request = new GenerateContentRequest(
List.of(
new Content("user", List.of(
Content.Part.ofText("What are the latest developments in fusion energy?")
))
),
null,
List.of(searchTool),
null
);
GenerateContentResponse response = chatModel.generateContext(request);
String answer = GenerateContentResponseHandler.getText(response);

// Multi-turn conversation: accumulate user and model turns
List<Content> conversation = new ArrayList<>();
// First turn
conversation.add(new Content(
"user",
List.of(Content.Part.ofText("I'm learning Java. Where should I start?"))
));
GenerateContentRequest request1 = new GenerateContentRequest(conversation, null, null, null);
GenerateContentResponse response1 = chatModel.generateContext(request1);
// Add model's response to conversation
String modelReply1 = response1.candidates().get(0).content().parts().get(0).text();
conversation.add(new Content(
"model",
List.of(Content.Part.ofText(modelReply1))
));
// Second turn
conversation.add(new Content(
"user",
List.of(Content.Part.ofText("Can you explain object-oriented programming?"))
));
GenerateContentRequest request2 = new GenerateContentRequest(conversation, null, null, null);
GenerateContentResponse response2 = chatModel.generateContext(request2);

// Handle the different finish reasons
GenerateContentResponse response = chatModel.generateContext(request);
for (GenerateContentResponse.Candidate candidate : response.candidates()) {
switch (candidate.finishReason()) {
case STOP:
// Normal completion
String text = candidate.content().parts().get(0).text();
System.out.println("Response: " + text);
break;
case MAX_TOKENS:
// Response truncated due to token limit
System.out.println("Response truncated. Consider increasing maxOutputTokens.");
break;
case SAFETY:
// Blocked by safety filters
System.out.println("Response blocked by safety filters.");
break;
case MALFORMED_FUNCTION_CALL:
// Function calling error
System.out.println("Model made a malformed function call.");
break;
default:
System.out.println("Generation stopped: " + candidate.finishReason());
}
}

// Create embedding request
EmbedContentRequest embedRequest = new EmbedContentRequest(
"text-embedding-004",
new Content(null, List.of(Content.Part.ofText("Machine learning is fascinating."))),
EmbedContentRequest.TaskType.SEMANTIC_SIMILARITY,
null, // no title
null // default dimensionality
);
EmbedContentResponse embedResponse = embeddingModel.embedContent(embedRequest);
// Access embedding vector
float[] vector = embedResponse.embedding().values();
System.out.println("Embedding dimension: " + vector.length);

// For search query
EmbedContentRequest queryRequest = new EmbedContentRequest(
"text-embedding-004",
new Content(null, List.of(Content.Part.ofText("best restaurants in Paris"))),
EmbedContentRequest.TaskType.RETRIEVAL_QUERY,
null,
null
);
// For document in search corpus
EmbedContentRequest docRequest = new EmbedContentRequest(
"text-embedding-004",
new Content(null, List.of(Content.Part.ofText(
"Le Bernardin is an acclaimed French seafood restaurant..."
))),
EmbedContentRequest.TaskType.RETRIEVAL_DOCUMENT,
"Le Bernardin Restaurant Review", // title for document
null
);

// Create multiple embedding requests
List<EmbedContentRequest> requests = List.of(
new EmbedContentRequest(
"text-embedding-004",
new Content(null, List.of(Content.Part.ofText("First document text"))),
EmbedContentRequest.TaskType.CLUSTERING,
null,
null
),
new EmbedContentRequest(
"text-embedding-004",
new Content(null, List.of(Content.Part.ofText("Second document text"))),
EmbedContentRequest.TaskType.CLUSTERING,
null,
null
),
new EmbedContentRequest(
"text-embedding-004",
new Content(null, List.of(Content.Part.ofText("Third document text"))),
EmbedContentRequest.TaskType.CLUSTERING,
null,
null
)
);
// Batch request
EmbedContentRequests batchRequest = new EmbedContentRequests(requests);
EmbedContentResponses batchResponse = embeddingModel.batchEmbedContents(batchRequest);
// Process all embeddings
for (int i = 0; i < batchResponse.embeddings().size(); i++) {
float[] vector = batchResponse.embeddings().get(i).values();
System.out.println("Embedding " + i + ": " + vector.length + " dimensions");
}

// Request smaller embedding dimension for efficiency
EmbedContentRequest request = new EmbedContentRequest(
"text-embedding-004",
new Content(null, List.of(Content.Part.ofText("Sample text for embedding"))),
EmbedContentRequest.TaskType.SEMANTIC_SIMILARITY,
null,
256 // reduce from default 768 to 256 dimensions
);
EmbedContentResponse response = embeddingModel.embedContent(request);
System.out.println("Dimension: " + response.embedding().values().length);

// System instruction
SystemInstruction systemInstruction = GenerateContentRequest.SystemInstruction.ofContent(
List.of("You are a helpful assistant specialized in software development.")
);
// Tools
FunctionDeclaration searchDocs = new FunctionDeclaration(
"search_documentation",
"Search technical documentation",
FunctionDeclaration.Parameters.objectType(
Map.of("query", Map.of("type", "string")),
List.of("query")
)
);
GenerateContentRequest.Tool docTool =
GenerateContentRequest.Tool.ofFunctionDeclarations(List.of(searchDocs));
GenerateContentRequest.Tool searchTool = GenerateContentRequest.Tool.ofGoogleSearch();
// Generation config
GenerationConfig config = GenerationConfig.builder()
.temperature(0.7)
.maxOutputTokens(2048)
.topK(40)
.topP(0.95)
.thinkingConfig(new ThinkingConfig(1000L, true))
.build();
// Conversation
List<Content> conversation = List.of(
new Content("user", List.of(
Content.Part.ofText("How do I implement authentication in a REST API?")
))
);
// Complete request
GenerateContentRequest request = new GenerateContentRequest(
conversation,
systemInstruction,
List.of(docTool, searchTool),
config
);
GenerateContentResponse response = chatModel.generateContext(request);
// Handle response with thoughts (if model supports thinking)
String mainText = GenerateContentResponseHandler.getText(response);
String thoughts = GenerateContentResponseHandler.getThoughts(response);
if (thoughts != null && !thoughts.isEmpty()) {
System.out.println("Model's reasoning: " + thoughts);
}
System.out.println("Response: " + mainText);

// Handling a maxOutputTokens cutoff
GenerateContentResponse response = chatModel.generateContext(request);
GenerateContentResponse.Candidate candidate = response.candidates().get(0);
if (candidate.finishReason() == FinishReason.MAX_TOKENS) {
// Response was cut off, might be incomplete
String partialText = candidate.content().parts().get(0).text();
// Option 1: Increase token limit and retry
GenerationConfig newConfig = GenerationConfig.builder()
.maxOutputTokens(4096) // increased from default
.build();
// Option 2: Ask model to continue
List<Content> continuation = new ArrayList<>(request.contents());
continuation.add(candidate.content()); // Add partial response
continuation.add(new Content("user", List.of(
Content.Part.ofText("Please continue.")
)));
GenerateContentRequest continueRequest = new GenerateContentRequest(
continuation,
request.systemInstruction(),
request.tools(),
newConfig
);
GenerateContentResponse continued = chatModel.generateContext(continueRequest);
}

// Use RETRIEVAL_QUERY for search queries
EmbedContentRequest queryEmbed = new EmbedContentRequest(
"text-embedding-004",
new Content(null, List.of(Content.Part.ofText("python pandas tutorial"))),
EmbedContentRequest.TaskType.RETRIEVAL_QUERY,
null,
null
);
// Use RETRIEVAL_DOCUMENT for documents to be searched
EmbedContentRequest docEmbed = new EmbedContentRequest(
"text-embedding-004",
new Content(null, List.of(Content.Part.ofText(documentText))),
EmbedContentRequest.TaskType.RETRIEVAL_DOCUMENT,
documentTitle, // Include title for documents
null
);

// Compare similarity between texts
EmbedContentRequest embed1 = new EmbedContentRequest(
"text-embedding-004",
new Content(null, List.of(Content.Part.ofText("The weather is sunny today."))),
EmbedContentRequest.TaskType.SEMANTIC_SIMILARITY,
null,
null
);
EmbedContentRequest embed2 = new EmbedContentRequest(
"text-embedding-004",
new Content(null, List.of(Content.Part.ofText("It's a beautiful day outside."))),
EmbedContentRequest.TaskType.SEMANTIC_SIMILARITY,
null,
null
);
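The two embeddings above can be compared with cosine similarity over the returned `float[]` vectors. A minimal helper for this (a sketch — `EmbeddingMath` is not part of the library API):

```java
// Hypothetical helper for comparing embedding vectors; not provided by
// the quarkus-langchain4j Gemini types themselves.
final class EmbeddingMath {

    // Cosine similarity: dot(a, b) / (|a| * |b|), in the range [-1, 1]
    static double cosineSimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Embedding dimensions differ");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}

// Usage with the two requests above (sketch):
// double similarity = EmbeddingMath.cosineSimilarity(
//         embeddingModel.embedContent(embed1).embedding().values(),
//         embeddingModel.embedContent(embed2).embedding().values());
```

Values close to 1.0 indicate semantically similar texts; values near 0 indicate unrelated texts.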
// Calculate cosine similarity between the embeddings

// For training a classifier
EmbedContentRequest classifyEmbed = new EmbedContentRequest(
"text-embedding-004",
new Content(null, List.of(Content.Part.ofText(textSample))),
EmbedContentRequest.TaskType.CLASSIFICATION,
null,
null
);
// For grouping similar documents
EmbedContentRequest clusterEmbed = new EmbedContentRequest(
"text-embedding-004",
new Content(null, List.of(Content.Part.ofText(textSample))),
EmbedContentRequest.TaskType.CLUSTERING,
null,
null
);

// Monitor token usage
GenerateContentResponse response = chatModel.generateContext(request);
UsageMetadata usage = response.usageMetadata();
// Track token consumption
System.out.println("Prompt tokens: " + usage.promptTokenCount());
System.out.println("Response tokens: " + usage.candidatesTokenCount());
System.out.println("Total tokens: " + usage.totalTokenCount());
// Cost calculation (example rates)
double promptCost = usage.promptTokenCount() * 0.00025 / 1000; // $0.25 per 1M tokens
double responseCost = usage.candidatesTokenCount() * 0.001 / 1000; // $1.00 per 1M tokens
double totalCost = promptCost + responseCost;
System.out.printf("Estimated cost: $%.6f%n", totalCost);
// Budget checking
int maxTokensAllowed = 10000;
if (usage.totalTokenCount() > maxTokensAllowed) {
System.out.println("Warning: Token usage exceeded budget");
}

These request/response types integrate with the module's chat and embedding model support described in the overview above. Best practices:

- Set maxOutputTokens based on your use case to control costs and latency
- Check finishReason to detect truncated or blocked responses
- Choose the embedding TaskType that matches how the embedding will be used
- Monitor usageMetadata to track costs and optimize prompts

Install with Tessl CLI:
npx tessl i tessl/maven-io-quarkiverse-langchain4j--quarkus-langchain4j-gemini-common@1.7.0