tessl/maven-dev-langchain4j--langchain4j-vertex-ai

LangChain4j integration for Google Vertex AI models including chat, language, embedding, image, and scoring capabilities

docs/models/chat/api.md

Chat Model API Reference

Class Definition

```java
public class VertexAiChatModel implements ChatModel {
    public ChatResponse chat(ChatRequest chatRequest);
    public static Builder builder();
}
```

Methods

chat

```java
/**
 * Generate a chat completion from a chat request.
 *
 * @param chatRequest The chat request containing messages and configuration
 * @return ChatResponse containing the generated response
 */
public ChatResponse chat(ChatRequest chatRequest);
```

builder

```java
/**
 * Create a new builder for configuring a VertexAiChatModel.
 *
 * @return A new Builder instance
 */
public static Builder builder();
```

Builder Methods

Required Configuration

```java
/**
 * Set the GCP API endpoint URL.
 *
 * @param endpoint The API endpoint (e.g., "https://us-central1-aiplatform.googleapis.com/v1/")
 * @return This builder
 */
public Builder endpoint(String endpoint);

/**
 * Set the Google Cloud Project ID.
 *
 * @param project The GCP project ID
 * @return This builder
 */
public Builder project(String project);

/**
 * Set the GCP region/location.
 *
 * @param location The region (e.g., "us-central1", "europe-west1")
 * @return This builder
 */
public Builder location(String location);

/**
 * Set the model publisher.
 *
 * @param publisher The publisher name, typically "google"
 * @return This builder
 */
public Builder publisher(String publisher);

/**
 * Set the model name/version.
 *
 * @param modelName The model name (e.g., "chat-bison@001", "gemini-pro")
 * @return This builder
 */
public Builder modelName(String modelName);
```

Optional Configuration

```java
/**
 * Set the sampling temperature for randomness control.
 *
 * @param temperature Temperature value between 0.0 (deterministic) and 1.0 (random)
 * @return This builder
 */
public Builder temperature(Double temperature);

/**
 * Set the maximum number of output tokens.
 *
 * @param maxOutputTokens Maximum tokens in response (default: 200)
 * @return This builder
 */
public Builder maxOutputTokens(Integer maxOutputTokens);

/**
 * Set the top-K sampling parameter.
 *
 * @param topK Number of highest probability tokens to consider
 * @return This builder
 */
public Builder topK(Integer topK);

/**
 * Set the top-P (nucleus) sampling parameter.
 *
 * @param topP Cumulative probability threshold for token selection
 * @return This builder
 */
public Builder topP(Double topP);

/**
 * Set the maximum number of retry attempts on API failures.
 *
 * @param maxRetries Maximum retry attempts (default: 2)
 * @return This builder
 */
public Builder maxRetries(Integer maxRetries);

/**
 * Set custom Google Cloud credentials.
 *
 * @param credentials GoogleCredentials instance for authentication
 * @return This builder
 */
public Builder credentials(GoogleCredentials credentials);

/**
 * Build the VertexAiChatModel instance.
 *
 * @return Configured VertexAiChatModel instance
 */
public VertexAiChatModel build();
```
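
Putting the builder methods together, a typical configuration might look like the following sketch. The project ID, region, and parameter values are placeholders, not recommendations:

```java
import dev.langchain4j.model.vertexai.VertexAiChatModel;

// Minimal sketch; "my-project-123" and "us-central1" are placeholder values.
VertexAiChatModel model = VertexAiChatModel.builder()
        .endpoint("https://us-central1-aiplatform.googleapis.com/v1/")
        .project("my-project-123")
        .location("us-central1")       // must match the endpoint region
        .publisher("google")
        .modelName("chat-bison@001")
        .temperature(0.2)              // mostly deterministic output
        .maxOutputTokens(300)
        .maxRetries(2)
        .build();
```

Note that `location` and the region embedded in `endpoint` must agree, as described under Parameter Details below.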

Deprecated Constructor

```java
/**
 * @deprecated Since version 1.2.0, use builder() instead
 */
@Deprecated
public VertexAiChatModel(
    String endpoint,
    String project,
    String location,
    String publisher,
    String modelName,
    Double temperature,
    Integer maxOutputTokens,
    Integer topK,
    Double topP,
    Integer maxRetries
);
```

Parameter Details

endpoint

  • Type: String
  • Required: Yes
  • Format: https://{region}-aiplatform.googleapis.com/v1/
  • Examples:
    • https://us-central1-aiplatform.googleapis.com/v1/
    • https://europe-west1-aiplatform.googleapis.com/v1/
    • https://asia-northeast1-aiplatform.googleapis.com/v1/

project

  • Type: String
  • Required: Yes
  • Description: Google Cloud Project ID
  • Example: my-project-123

location

  • Type: String
  • Required: Yes
  • Description: GCP region where model is hosted
  • Examples: us-central1, europe-west1, asia-northeast1
  • Note: Must match endpoint region

publisher

  • Type: String
  • Required: Yes
  • Value: "google" for Vertex AI models

modelName

  • Type: String
  • Required: Yes
  • Values:
    • chat-bison@001 - PaLM 2 chat model
    • chat-bison@002 - PaLM 2 chat model (updated)
    • gemini-pro - Gemini Pro
    • gemini-ultra - Gemini Ultra

temperature

  • Type: Double
  • Required: No
  • Range: 0.0 to 1.0
  • Default: Varies by model
  • Behavior:
    • 0.0 - Deterministic, focused responses
    • 0.5 - Balanced
    • 1.0 - Creative, varied responses

maxOutputTokens

  • Type: Integer
  • Required: No
  • Default: 200
  • Description: Maximum length of generated response in tokens
  • Note: Higher values allow longer responses but may increase latency

topK

  • Type: Integer
  • Required: No
  • Description: Limits token sampling to top K highest probability tokens
  • Behavior:
    • Lower values (e.g., 10) - More focused responses
    • Higher values (e.g., 40) - More diverse responses

topP

  • Type: Double
  • Required: No
  • Range: 0.0 to 1.0
  • Description: Nucleus sampling threshold
  • Behavior:
    • Lower values (e.g., 0.8) - More focused responses
    • Higher values (e.g., 0.95) - More diverse responses

maxRetries

  • Type: Integer
  • Required: No
  • Default: 2
  • Description: Number of retry attempts for failed API calls
  • Retries: Transient errors (network issues, rate limits)
  • No retry: Non-retryable errors (invalid params, auth errors)

credentials

  • Type: GoogleCredentials
  • Required: No
  • Default: Uses Application Default Credentials
  • Description: Custom Google Cloud credentials for authentication
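
When Application Default Credentials are not suitable, credentials can be loaded explicitly with the `google-auth-library` and passed to the builder. A sketch, assuming a service-account key file at a placeholder path:

```java
import com.google.auth.oauth2.GoogleCredentials;
import java.io.FileInputStream;

// Load a service-account key explicitly instead of relying on ADC.
// The file path is a placeholder.
GoogleCredentials credentials = GoogleCredentials
        .fromStream(new FileInputStream("/path/to/service-account.json"))
        .createScoped("https://www.googleapis.com/auth/cloud-platform");

VertexAiChatModel model = VertexAiChatModel.builder()
        // ... required endpoint/project/location/publisher/modelName ...
        .credentials(credentials)
        .build();
```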

Response Types

ChatResponse

```java
public class ChatResponse {
    public AiMessage aiMessage();
    public TokenUsage tokenUsage();
    public FinishReason finishReason();
}
```

AiMessage

```java
public class AiMessage {
    public String text();
}
```

TokenUsage

```java
public class TokenUsage {
    public Integer inputTokenCount();
    public Integer outputTokenCount();
    public Integer totalTokenCount();
}
```

FinishReason

```java
public enum FinishReason {
    STOP,           // Natural completion
    LENGTH,         // Max tokens reached
    SAFETY,         // Safety filter triggered
    RECITATION,     // Recitation detected
    OTHER           // Other reason
}
```
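
A response exposes all three pieces of information. A sketch of inspecting them, assuming `model` and `request` are already configured:

```java
ChatResponse response = model.chat(request);

String text = response.aiMessage().text();        // generated text
TokenUsage usage = response.tokenUsage();         // token accounting
FinishReason reason = response.finishReason();    // why generation stopped

if (reason == FinishReason.LENGTH) {
    // Response was truncated; consider raising maxOutputTokens.
}
System.out.printf("Total tokens used: %d%n", usage.totalTokenCount());
```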

Request Types

ChatRequest

```java
public class ChatRequest {
    public static Builder builder();
}
```

ChatRequest.Builder

```java
public Builder messages(ChatMessage... messages);
public Builder messages(List<ChatMessage> messages);
public ChatRequest build();
```

ChatMessage Types

```java
// User message
public class UserMessage implements ChatMessage {
    public static UserMessage from(String text);
}

// AI message
public class AiMessage implements ChatMessage {
    public static AiMessage from(String text);
}

// System message
public class SystemMessage implements ChatMessage {
    public static SystemMessage from(String text);
}
```
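
Combining the message types with `ChatRequest.Builder`, a multi-turn request might be assembled as follows. This is a sketch with illustrative message contents, assuming a configured `model`:

```java
// Earlier turns (AiMessage) are included to give the model conversation history.
ChatRequest request = ChatRequest.builder()
        .messages(
                SystemMessage.from("You are a concise assistant."),
                UserMessage.from("What is Vertex AI?"),
                AiMessage.from("Vertex AI is Google Cloud's managed ML platform."),
                UserMessage.from("Which regions host its chat models?"))
        .build();

ChatResponse response = model.chat(request);
System.out.println(response.aiMessage().text());
```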

Error Handling

Automatic Retry

The model automatically retries these errors up to maxRetries times:

  • Network timeouts
  • Rate limiting (429 errors)
  • Server errors (5xx)

Non-Retryable Errors

These errors are not retried and throw exceptions immediately:

  • Invalid parameters (400 errors)
  • Authentication errors (401/403 errors)
  • Malformed requests

Exception Types

```java
try {
    ChatResponse response = model.chat(request);
} catch (RuntimeException e) {
    // Thrown immediately for non-retryable errors, or once maxRetries
    // is exhausted for retryable ones.
}
```

Thread Safety

VertexAiChatModel instances are thread-safe and can be shared across multiple threads. The Builder, however, is not thread-safe: configure and build the model on a single thread, then share the resulting instance.

Performance Considerations

  • Reuse model instances: Creating models is expensive (authentication, connection setup)
  • Batch conversations: Send multiple turns in one request when possible
  • Adjust maxOutputTokens: Lower values reduce latency and cost
  • Monitor token usage: Use response.tokenUsage() to track consumption
  • Set appropriate maxRetries: Balance resilience vs latency

Install with Tessl CLI

npx tessl i tessl/maven-dev-langchain4j--langchain4j-vertex-ai@1.11.0
