tessl/maven-dev-langchain4j--langchain4j-vertex-ai

LangChain4j integration for Google Vertex AI models including chat, language, embedding, image, and scoring capabilities

docs/models/chat/api.md

Chat Model API Reference

Class Definition

```java
public class VertexAiChatModel implements ChatModel {
    public ChatResponse chat(ChatRequest chatRequest);
    public static Builder builder();
}
```

Methods

chat

```java
/**
 * Generate a chat completion from a chat request.
 *
 * @param chatRequest The chat request containing messages and configuration
 * @return ChatResponse containing the generated response
 */
public ChatResponse chat(ChatRequest chatRequest);
```

builder

```java
/**
 * Create a new builder for configuring a VertexAiChatModel.
 *
 * @return A new Builder instance
 */
public static Builder builder();
```

Builder Methods

Required Configuration

```java
/**
 * Set the GCP API endpoint URL.
 *
 * @param endpoint The API endpoint (e.g., "https://us-central1-aiplatform.googleapis.com/v1/")
 * @return This builder
 */
public Builder endpoint(String endpoint);

/**
 * Set the Google Cloud Project ID.
 *
 * @param project The GCP project ID
 * @return This builder
 */
public Builder project(String project);

/**
 * Set the GCP region/location.
 *
 * @param location The region (e.g., "us-central1", "europe-west1")
 * @return This builder
 */
public Builder location(String location);

/**
 * Set the model publisher.
 *
 * @param publisher The publisher name, typically "google"
 * @return This builder
 */
public Builder publisher(String publisher);

/**
 * Set the model name/version.
 *
 * @param modelName The model name (e.g., "chat-bison@001", "gemini-pro")
 * @return This builder
 */
public Builder modelName(String modelName);
```

Optional Configuration

```java
/**
 * Set the sampling temperature for randomness control.
 *
 * @param temperature Temperature value between 0.0 (deterministic) and 1.0 (random)
 * @return This builder
 */
public Builder temperature(Double temperature);

/**
 * Set the maximum number of output tokens.
 *
 * @param maxOutputTokens Maximum tokens in response (default: 200)
 * @return This builder
 */
public Builder maxOutputTokens(Integer maxOutputTokens);

/**
 * Set the top-K sampling parameter.
 *
 * @param topK Number of highest probability tokens to consider
 * @return This builder
 */
public Builder topK(Integer topK);

/**
 * Set the top-P (nucleus) sampling parameter.
 *
 * @param topP Cumulative probability threshold for token selection
 * @return This builder
 */
public Builder topP(Double topP);

/**
 * Set the maximum number of retry attempts on API failures.
 *
 * @param maxRetries Maximum retry attempts (default: 2)
 * @return This builder
 */
public Builder maxRetries(Integer maxRetries);

/**
 * Set custom Google Cloud credentials.
 *
 * @param credentials GoogleCredentials instance for authentication
 * @return This builder
 */
public Builder credentials(GoogleCredentials credentials);

/**
 * Build the VertexAiChatModel instance.
 *
 * @return Configured VertexAiChatModel instance
 */
public VertexAiChatModel build();
```
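
Putting the builder methods together, a typical configuration might look like the following sketch. The project ID, region, and parameter values are placeholders, not recommendations:

```java
import dev.langchain4j.model.vertexai.VertexAiChatModel;

// Minimal sketch; "my-project-123" and "us-central1" are placeholder values.
VertexAiChatModel model = VertexAiChatModel.builder()
        .endpoint("https://us-central1-aiplatform.googleapis.com/v1/")
        .project("my-project-123")
        .location("us-central1")       // must match the endpoint region
        .publisher("google")
        .modelName("chat-bison@001")
        .temperature(0.2)              // mostly deterministic output
        .maxOutputTokens(300)
        .maxRetries(2)
        .build();
```

Note that `location` and the region embedded in `endpoint` must agree, as described under Parameter Details below.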

Deprecated Constructor

```java
/**
 * @deprecated Since version 1.2.0, use builder() instead
 */
@Deprecated
public VertexAiChatModel(
    String endpoint,
    String project,
    String location,
    String publisher,
    String modelName,
    Double temperature,
    Integer maxOutputTokens,
    Integer topK,
    Double topP,
    Integer maxRetries
);
```

Parameter Details

endpoint

  • Type: String
  • Required: Yes
  • Format: https://{region}-aiplatform.googleapis.com/v1/
  • Examples:
    • https://us-central1-aiplatform.googleapis.com/v1/
    • https://europe-west1-aiplatform.googleapis.com/v1/
    • https://asia-northeast1-aiplatform.googleapis.com/v1/

project

  • Type: String
  • Required: Yes
  • Description: Google Cloud Project ID
  • Example: my-project-123

location

  • Type: String
  • Required: Yes
  • Description: GCP region where model is hosted
  • Examples: us-central1, europe-west1, asia-northeast1
  • Note: Must match endpoint region

publisher

  • Type: String
  • Required: Yes
  • Value: "google" for Vertex AI models

modelName

  • Type: String
  • Required: Yes
  • Values:
    • chat-bison@001 - PaLM 2 chat model
    • chat-bison@002 - PaLM 2 chat model (updated)
    • gemini-pro - Gemini Pro
    • gemini-ultra - Gemini Ultra

temperature

  • Type: Double
  • Required: No
  • Range: 0.0 to 1.0
  • Default: Varies by model
  • Behavior:
    • 0.0 - Deterministic, focused responses
    • 0.5 - Balanced
    • 1.0 - Creative, varied responses

maxOutputTokens

  • Type: Integer
  • Required: No
  • Default: 200
  • Description: Maximum length of generated response in tokens
  • Note: Higher values allow longer responses but may increase latency

topK

  • Type: Integer
  • Required: No
  • Description: Limits token sampling to top K highest probability tokens
  • Behavior:
    • Lower values (e.g., 10) - More focused responses
    • Higher values (e.g., 40) - More diverse responses

topP

  • Type: Double
  • Required: No
  • Range: 0.0 to 1.0
  • Description: Nucleus sampling threshold
  • Behavior:
    • Lower values (e.g., 0.8) - More focused responses
    • Higher values (e.g., 0.95) - More diverse responses

maxRetries

  • Type: Integer
  • Required: No
  • Default: 2
  • Description: Number of retry attempts for failed API calls
  • Retries: Transient errors (network issues, rate limits)
  • No retry: Non-retryable errors (invalid params, auth errors)

credentials

  • Type: GoogleCredentials
  • Required: No
  • Default: Uses Application Default Credentials
  • Description: Custom Google Cloud credentials for authentication
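
When Application Default Credentials are not suitable, credentials can be loaded explicitly with the `google-auth-library` and passed to the builder. A sketch, assuming a service-account key file at a placeholder path:

```java
import com.google.auth.oauth2.GoogleCredentials;
import java.io.FileInputStream;

// Load a service-account key explicitly instead of relying on ADC.
// The file path is a placeholder.
GoogleCredentials credentials = GoogleCredentials
        .fromStream(new FileInputStream("/path/to/service-account.json"))
        .createScoped("https://www.googleapis.com/auth/cloud-platform");

VertexAiChatModel model = VertexAiChatModel.builder()
        // ... required endpoint/project/location/publisher/modelName ...
        .credentials(credentials)
        .build();
```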

Response Types

ChatResponse

```java
public class ChatResponse {
    public AiMessage aiMessage();
    public TokenUsage tokenUsage();
    public FinishReason finishReason();
}
```

AiMessage

```java
public class AiMessage {
    public String text();
}
```

TokenUsage

```java
public class TokenUsage {
    public Integer inputTokenCount();
    public Integer outputTokenCount();
    public Integer totalTokenCount();
}
```

FinishReason

```java
public enum FinishReason {
    STOP,           // Natural completion
    LENGTH,         // Max tokens reached
    SAFETY,         // Safety filter triggered
    RECITATION,     // Recitation detected
    OTHER           // Other reason
}
```
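
A response exposes all three pieces of information. A sketch of inspecting them, assuming `model` and `request` are already configured:

```java
ChatResponse response = model.chat(request);

String text = response.aiMessage().text();        // generated text
TokenUsage usage = response.tokenUsage();         // token accounting
FinishReason reason = response.finishReason();    // why generation stopped

if (reason == FinishReason.LENGTH) {
    // Response was truncated; consider raising maxOutputTokens.
}
System.out.printf("Total tokens used: %d%n", usage.totalTokenCount());
```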

Request Types

ChatRequest

```java
public class ChatRequest {
    public static Builder builder();
}
```

ChatRequest.Builder

```java
public Builder messages(ChatMessage... messages);
public Builder messages(List<ChatMessage> messages);
public ChatRequest build();
```

ChatMessage Types

```java
// User message
public class UserMessage implements ChatMessage {
    public static UserMessage from(String text);
}

// AI message
public class AiMessage implements ChatMessage {
    public static AiMessage from(String text);
}

// System message
public class SystemMessage implements ChatMessage {
    public static SystemMessage from(String text);
}
```
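
Combining the message types with `ChatRequest.Builder`, a multi-turn request might be assembled as follows. This is a sketch with illustrative message contents, assuming a configured `model`:

```java
// Earlier turns (AiMessage) are included to give the model conversation history.
ChatRequest request = ChatRequest.builder()
        .messages(
                SystemMessage.from("You are a concise assistant."),
                UserMessage.from("What is Vertex AI?"),
                AiMessage.from("Vertex AI is Google Cloud's managed ML platform."),
                UserMessage.from("Which regions host its chat models?"))
        .build();

ChatResponse response = model.chat(request);
System.out.println(response.aiMessage().text());
```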

Error Handling

Automatic Retry

The model automatically retries these errors up to maxRetries times:

  • Network timeouts
  • Rate limiting (429 errors)
  • Server errors (5xx)

Non-Retryable Errors

These errors are not retried and throw exceptions immediately:

  • Invalid parameters (400 errors)
  • Authentication errors (401/403 errors)
  • Malformed requests

Exception Types

```java
try {
    ChatResponse response = model.chat(request);
} catch (RuntimeException e) {
    // Thrown immediately for non-retryable errors, or once maxRetries
    // is exhausted for retryable ones.
}
```

Thread Safety

VertexAiChatModel instances are thread-safe and can be shared across multiple threads. The Builder, however, is not thread-safe: configure and build the model on a single thread, then share the resulting instance.

Performance Considerations

  • Reuse model instances: Creating models is expensive (authentication, connection setup)
  • Batch conversations: Send multiple turns in one request when possible
  • Adjust maxOutputTokens: Lower values reduce latency and cost
  • Monitor token usage: Use response.tokenUsage() to track consumption
  • Set appropriate maxRetries: Balance resilience vs latency

Install with Tessl CLI

npx tessl i tessl/maven-dev-langchain4j--langchain4j-vertex-ai@1.11.0
