tessl/maven-dev-langchain4j--langchain4j-vertex-ai

LangChain4j integration for Google Vertex AI models including chat, language, embedding, image, and scoring capabilities

docs/models/language/api.md

Language Model API Reference

Class Definition

public class VertexAiLanguageModel implements LanguageModel {
    public Response<String> generate(String prompt);
    public static Builder builder();
}

Methods

generate

/**
 * Generate text completion from a prompt.
 *
 * @param prompt The input text prompt
 * @return Response containing generated text and metadata
 */
public Response<String> generate(String prompt);

builder

/**
 * Create a new builder for configuring a VertexAiLanguageModel.
 *
 * @return A new Builder instance
 */
public static Builder builder();

Constructor

/**
 * Create a VertexAiLanguageModel with explicit parameters.
 *
 * @param endpoint GCP API endpoint URL
 * @param project Google Cloud Project ID
 * @param location GCP region
 * @param publisher Model publisher (typically "google")
 * @param modelName Model name/version
 * @param temperature Sampling temperature (0.0-1.0)
 * @param maxOutputTokens Maximum output tokens
 * @param topK Top-K sampling parameter
 * @param topP Top-P nucleus sampling parameter
 * @param maxRetries Maximum retry attempts
 */
public VertexAiLanguageModel(
    String endpoint,
    String project,
    String location,
    String publisher,
    String modelName,
    Double temperature,
    Integer maxOutputTokens,
    Integer topK,
    Double topP,
    Integer maxRetries
);

Builder Methods

Required Configuration

/**
 * Set the GCP API endpoint URL.
 *
 * @param endpoint The API endpoint (e.g., "https://us-central1-aiplatform.googleapis.com/v1/")
 * @return This builder
 */
public Builder endpoint(String endpoint);

/**
 * Set the Google Cloud Project ID.
 *
 * @param project The GCP project ID
 * @return This builder
 */
public Builder project(String project);

/**
 * Set the GCP region/location.
 *
 * @param location The region (e.g., "us-central1", "europe-west1")
 * @return This builder
 */
public Builder location(String location);

/**
 * Set the model publisher.
 *
 * @param publisher The publisher name, typically "google"
 * @return This builder
 */
public Builder publisher(String publisher);

/**
 * Set the model name/version.
 *
 * @param modelName The model name (e.g., "text-bison@001", "text-bison@002")
 * @return This builder
 */
public Builder modelName(String modelName);

Optional Configuration

/**
 * Set the sampling temperature for randomness control.
 *
 * @param temperature Temperature value between 0.0 (deterministic) and 1.0 (random)
 * @return This builder
 */
public Builder temperature(Double temperature);

/**
 * Set the maximum number of output tokens.
 *
 * @param maxOutputTokens Maximum tokens in response (default: 200)
 * @return This builder
 */
public Builder maxOutputTokens(Integer maxOutputTokens);

/**
 * Set the top-K sampling parameter.
 *
 * @param topK Number of highest probability tokens to consider
 * @return This builder
 */
public Builder topK(Integer topK);

/**
 * Set the top-P (nucleus) sampling parameter.
 *
 * @param topP Cumulative probability threshold for token selection
 * @return This builder
 */
public Builder topP(Double topP);

/**
 * Set the maximum number of retry attempts on API failures.
 *
 * @param maxRetries Maximum retry attempts (default: 3)
 * @return This builder
 */
public Builder maxRetries(Integer maxRetries);

/**
 * Build the VertexAiLanguageModel instance.
 *
 * @return Configured VertexAiLanguageModel instance
 */
public VertexAiLanguageModel build();
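Putting the builder methods together, a minimal configuration might look like the sketch below. The project ID is a placeholder, and the import path assumes the class lives in `dev.langchain4j.model.vertexai`, as in the published langchain4j artifact:

```java
import dev.langchain4j.model.vertexai.VertexAiLanguageModel;

public class BuilderExample {
    public static void main(String[] args) {
        // Required configuration: endpoint, project, location, publisher, modelName.
        // Optional knobs (temperature, maxOutputTokens, maxRetries) shown with sample values.
        VertexAiLanguageModel model = VertexAiLanguageModel.builder()
                .endpoint("https://us-central1-aiplatform.googleapis.com/v1/")
                .project("my-gcp-project")   // placeholder GCP project ID
                .location("us-central1")
                .publisher("google")
                .modelName("text-bison@001")
                .temperature(0.2)
                .maxOutputTokens(300)
                .maxRetries(3)
                .build();
    }
}
```

Calling `generate` on the resulting instance requires valid Google Cloud credentials in the environment.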

Parameter Details

endpoint

  • Type: String
  • Required: Yes
  • Format: https://{region}-aiplatform.googleapis.com/v1/
  • Examples: See Common Configuration

project

  • Type: String
  • Required: Yes
  • Description: Google Cloud Project ID

location

  • Type: String
  • Required: Yes
  • Values: us-central1, europe-west1, asia-northeast1, etc.

publisher

  • Type: String
  • Required: Yes
  • Value: "google"

modelName

  • Type: String
  • Required: Yes
  • Values:
    • text-bison@001 - PaLM 2 text generation
    • text-bison@002 - PaLM 2 text generation (updated)
    • text-bison-32k - Extended context (32k tokens)

temperature

  • Type: Double
  • Required: No
  • Range: 0.0 to 1.0
  • Default: Varies by model
  • Behavior:
    • 0.0 - Deterministic, focused
    • 0.5 - Balanced
    • 1.0 - Creative, varied

maxOutputTokens

  • Type: Integer
  • Required: No
  • Default: 200
  • Description: Maximum response length in tokens

topK

  • Type: Integer
  • Required: No
  • Description: Top-K sampling parameter
  • Behavior: Lower = focused, Higher = diverse

topP

  • Type: Double
  • Required: No
  • Range: 0.0 to 1.0
  • Description: Nucleus sampling threshold

maxRetries

  • Type: Integer
  • Required: No
  • Default: 3
  • Description: Retry attempts for failed API calls

Response Type

Response<String>

public class Response<T> {
    public T content();
    public TokenUsage tokenUsage();
    public FinishReason finishReason();
}

TokenUsage

public class TokenUsage {
    public Integer inputTokenCount();
    public Integer outputTokenCount();
    public Integer totalTokenCount();
}

FinishReason

public enum FinishReason {
    STOP,           // Natural completion
    LENGTH,         // Max tokens reached
    SAFETY,         // Safety filter triggered
    RECITATION,     // Recitation detected
    OTHER           // Other reason
}
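A sketch of reading all three parts of the response after a call to `generate`, assuming the `Response`, `TokenUsage`, and `FinishReason` types come from `dev.langchain4j.model.output` as in the langchain4j core library:

```java
import dev.langchain4j.model.output.FinishReason;
import dev.langchain4j.model.output.Response;
import dev.langchain4j.model.output.TokenUsage;

// Assumes `model` is a configured VertexAiLanguageModel instance.
Response<String> response = model.generate("Explain nucleus sampling in one sentence.");

String text = response.content();            // the generated completion
TokenUsage usage = response.tokenUsage();    // input/output/total token counts
FinishReason reason = response.finishReason();

System.out.println(text);
System.out.printf("tokens: in=%d out=%d total=%d%n",
        usage.inputTokenCount(), usage.outputTokenCount(), usage.totalTokenCount());
if (reason == FinishReason.LENGTH) {
    // Completion was truncated; consider raising maxOutputTokens.
    System.out.println("Response hit the maxOutputTokens limit.");
}
```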

Error Handling

Automatic Retry

The model automatically retries the following errors, up to maxRetries times:

  • Network timeouts
  • Rate limiting (429)
  • Server errors (5xx)

Non-Retryable Errors

These errors are not retried; an exception is thrown immediately:

  • Invalid parameters (400)
  • Authentication errors (401/403)
  • Malformed requests
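At the call site, the two categories above collapse into one failure path: transient errors only surface after the model has exhausted its retries. This reference does not specify the concrete exception types thrown, so the sketch below uses a generic catch:

```java
// Assumes `model` is a configured VertexAiLanguageModel and `prompt` a String.
try {
    Response<String> response = model.generate(prompt);
    System.out.println(response.content());
} catch (RuntimeException e) {
    // Non-retryable errors (400, 401/403, malformed requests) arrive here
    // immediately; transient errors (timeouts, 429, 5xx) arrive only after
    // maxRetries attempts have failed.
    System.err.println("Generation failed: " + e.getMessage());
}
```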

Thread Safety

VertexAiLanguageModel instances are thread-safe and can be safely reused across requests and threads. The Builder is not thread-safe and should not be shared between threads.

Performance

  • Reuse model instances across requests
  • Adjust maxOutputTokens to balance latency and completeness
  • Use lower temperature for faster, more deterministic responses
  • Monitor token usage via response.tokenUsage()
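The first recommendation above can be sketched as a single shared instance serving a thread pool; this relies on the thread-safety guarantee stated in the previous section, and the prompts here are illustrative:

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// One model instance, built once and reused (thread-safe per the note above).
// Assumes `model` is a configured VertexAiLanguageModel.
List<String> prompts = List.of(
        "Summarize the water cycle.",
        "Name three sorting algorithms.");

ExecutorService pool = Executors.newFixedThreadPool(4);
for (String prompt : prompts) {
    pool.submit(() -> {
        var response = model.generate(prompt);
        System.out.println(response.content());
    });
}
pool.shutdown();
```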

Install with Tessl CLI

npx tessl i tessl/maven-dev-langchain4j--langchain4j-vertex-ai
