tessl/maven-dev-langchain4j--langchain4j-vertex-ai

LangChain4j integration for Google Vertex AI models including chat, language, embedding, image, and scoring capabilities

Embedding Model API Reference

Class Definition

public class VertexAiEmbeddingModel extends DimensionAwareEmbeddingModel {
    public Response<List<Embedding>> embedAll(List<TextSegment> segments);
    public String modelName();
    public List<Integer> calculateTokensCounts(List<TextSegment> segments);
    public static Builder builder();

    public static final Integer DEFAULT_MAX_SEGMENTS_PER_BATCH = 250;
    public static final Integer DEFAULT_MAX_TOKENS_PER_BATCH = 20000;
}

Methods

/**
 * Embed multiple text segments into vectors.
 *
 * @param segments List of text segments to embed
 * @return Response containing list of embeddings with metadata
 */
public Response<List<Embedding>> embedAll(List<TextSegment> segments);

/**
 * Get the model name being used.
 *
 * @return The model name string
 */
public String modelName();

/**
 * Calculate token counts for text segments.
 *
 * @param segments List of text segments
 * @return List of token counts corresponding to each segment
 */
public List<Integer> calculateTokensCounts(List<TextSegment> segments);

/**
 * Create a new builder for configuring a VertexAiEmbeddingModel.
 *
 * @return A new Builder instance
 */
public static Builder builder();

Builder Methods

Required

public Builder endpoint(String endpoint);        // API endpoint
public Builder project(String project);          // GCP project ID
public Builder location(String location);        // GCP region
public Builder publisher(String publisher);      // Model publisher
public Builder modelName(String modelName);      // Model name/version

Optional

public Builder maxRetries(Integer maxRetries);                          // Default: 2
public Builder maxSegmentsPerBatch(Integer maxSegmentsPerBatch);        // Default: 250
public Builder maxTokensPerBatch(Integer maxTokensPerBatch);            // Default: 20,000
public Builder taskType(TaskType taskType);                             // Task optimization
public Builder titleMetadataKey(String titleMetadataKey);               // Default: "title"
public Builder autoTruncate(Boolean autoTruncate);                      // Default: false
public Builder outputDimensionality(Integer outputDimensionality);      // Custom dimension
public Builder credentials(GoogleCredentials credentials);              // Custom auth
public VertexAiEmbeddingModel build();

TaskType Enum

public enum TaskType {
    RETRIEVAL_QUERY,        // Query for retrieval tasks
    RETRIEVAL_DOCUMENT,     // Document for retrieval tasks
    SEMANTIC_SIMILARITY,    // Semantic similarity comparison
    CLASSIFICATION,         // Text classification
    CLUSTERING,             // Text clustering
    QUESTION_ANSWERING,     // Question answering
    FACT_VERIFICATION,      // Fact verification
    CODE_RETRIEVAL_QUERY    // Code retrieval query
}

VertexAiEmbeddingModelName Enum

public enum VertexAiEmbeddingModelName {
    MULTIMODALEMBEDDING("multimodalembedding", 1408),
    TEXT_EMBEDDING_004("text-embedding-004", 768),
    TEXT_EMBEDDING_PREVIEW_0815("text-embedding-preview-0815", 768),
    TEXT_MULTILINGUAL_EMBEDDING_002("text-multilingual-embedding-002", 768),
    TEXTEMBEDDING_GECKO_MULTILINGUAL_001("textembedding-gecko-multilingual@001", 768),
    TEXTEMBEDDING_GECKO_001("textembedding-gecko@001", 768),
    TEXTEMBEDDING_GECKO_002("textembedding-gecko@002", 768),
    TEXTEMBEDDING_GECKO_003("textembedding-gecko@003", 768);

    public String toString();
    public Integer dimension();
    public static Integer knownDimension(String modelName);
}

Deprecated Constructor

/**
 * @deprecated Since version 1.2.0, use builder() instead
 */
@Deprecated
public VertexAiEmbeddingModel(
    String endpoint,
    String project,
    String location,
    String publisher,
    String modelName,
    Integer maxRetries,
    Integer maxSegmentsPerBatch,
    Integer maxTokensPerBatch,
    TaskType taskType,
    String titleMetadataKey,
    Integer outputDimensionality,
    Boolean autoTruncate
);

Parameter Details

endpoint

  • Required: Yes
  • Format: {region}-aiplatform.googleapis.com:443 (note: different from other models)
  • Examples: us-central1-aiplatform.googleapis.com:443
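
The endpoint format above can be assembled from a region ID. The following is a minimal sketch, not part of the library: `VertexEndpoints` and `endpointFor` are hypothetical names, and the region pattern is an assumption based on IDs such as `us-central1`.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of LangChain4j: builds an endpoint string
// in the {region}-aiplatform.googleapis.com:443 format described above.
final class VertexEndpoints {

    // Assumption: region IDs are lowercase words joined by hyphens, e.g. "us-central1".
    private static final Pattern REGION = Pattern.compile("[a-z]+(-[a-z0-9]+)+");

    static String endpointFor(String region) {
        if (region == null || !REGION.matcher(region).matches()) {
            throw new IllegalArgumentException("Unexpected region format: " + region);
        }
        return region + "-aiplatform.googleapis.com:443";
    }

    public static void main(String[] args) {
        System.out.println(endpointFor("us-central1"));
    }
}
```

The resulting string is the kind of value passed to `endpoint(String)`; the same region would typically also be passed to `location(String)`.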

maxSegmentsPerBatch

  • Type: Integer
  • Default: 250
  • Description: Maximum segments per API call
  • Constant: DEFAULT_MAX_SEGMENTS_PER_BATCH

maxTokensPerBatch

  • Type: Integer
  • Default: 20,000
  • Description: Maximum total tokens per batch
  • Constant: DEFAULT_MAX_TOKENS_PER_BATCH

taskType

  • Type: TaskType enum
  • Required: No
  • Purpose: Optimizes embeddings for specific use cases
  • Usage:
    • RETRIEVAL_DOCUMENT: For indexing documents
    • RETRIEVAL_QUERY: For search queries
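    • Others (SEMANTIC_SIMILARITY, CLASSIFICATION, CLUSTERING, QUESTION_ANSWERING, FACT_VERIFICATION, CODE_RETRIEVAL_QUERY): Optimize embeddings for the named task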

titleMetadataKey

  • Type: String
  • Default: "title"
  • Description: Metadata key for document titles
  • Note: When a title is available under this key, the model uses it to produce better embeddings

autoTruncate

  • Type: Boolean
  • Default: false
  • Description: Auto-truncate texts exceeding model limits
  • Behavior: If false, an exception is thrown for oversized texts

outputDimensionality

  • Type: Integer
  • Required: No
  • Description: Custom embedding dimension (model-dependent)
  • Note: Not all models support all dimensions

Response Types

Response<List<Embedding>>

public class Response<T> {
    public T content();
    public TokenUsage tokenUsage();
    public FinishReason finishReason();
}

Embedding

public class Embedding {
    public float[] vector();
    public int dimension();
}
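
A common use of the `float[]` returned by `vector()` is comparing two embeddings. The following is a minimal sketch working on plain arrays; `CosineSketch` and `cosine` are hypothetical names and not part of the library, which may offer its own similarity utilities.

```java
// Illustrative helper, not part of LangChain4j: cosine similarity of two
// embedding vectors given as plain float arrays.
final class CosineSketch {

    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("Cosine similarity is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Identical directions score close to 1, orthogonal directions score 0.
        System.out.println(cosine(new float[] {1f, 0f}, new float[] {2f, 0f}));
        System.out.println(cosine(new float[] {1f, 0f}, new float[] {0f, 1f}));
    }
}
```

Both vectors must come from the same model and the same `outputDimensionality`; otherwise their dimensions will not match.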

Batching Behavior

The model automatically batches requests based on:

  • maxSegmentsPerBatch: Max segments in one API call
  • maxTokensPerBatch: Max total tokens in one batch

If the input exceeds either limit, the model automatically splits it into multiple API calls.
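
The splitting described above can be pictured as a greedy pass over per-segment token counts. This is a minimal sketch only: `BatchingSketch` and `planBatches` are hypothetical names, the library's actual batching code may differ, and the handling of a single oversized segment is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not the library's implementation.
final class BatchingSketch {

    // Groups segment indices into batches so that each batch holds at most
    // maxSegments entries and at most maxTokens tokens in total.
    // Assumption: a single segment larger than maxTokens gets a batch of its own.
    static List<List<Integer>> planBatches(List<Integer> tokenCounts, int maxSegments, int maxTokens) {
        List<List<Integer>> batches = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        int currentTokens = 0;
        for (int i = 0; i < tokenCounts.size(); i++) {
            int tokens = tokenCounts.get(i);
            boolean wouldOverflow = current.size() >= maxSegments || currentTokens + tokens > maxTokens;
            if (wouldOverflow && !current.isEmpty()) {
                // Close the current batch and start a new one.
                batches.add(current);
                current = new ArrayList<>();
                currentTokens = 0;
            }
            current.add(i);
            currentTokens += tokens;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        // Five segments, at most 3 segments and 100 tokens per batch.
        System.out.println(planBatches(List.of(40, 40, 40, 10, 10), 3, 100));
        // prints [[0, 1], [2, 3, 4]]
    }
}
```

In this sketch the token counts play the role of the values returned by `calculateTokensCounts()`, and each inner list corresponds to one API call.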

Error Handling

  • Transient failures are retried automatically, up to maxRetries times
  • If autoTruncate is true, oversized texts are truncated
  • If autoTruncate is false, an exception is thrown for oversized texts
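
The retry rule above can be sketched as a simple loop. This is an illustration under stated assumptions, not the library's code: `RetrySketch`, `callWithRetries`, and `TransientFailure` are hypothetical names, maxRetries is assumed to count re-attempts after the first call, and the delay that real implementations usually add between attempts is left out.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Illustrative sketch only; not the library's retry implementation.
final class RetrySketch {

    // Hypothetical marker for failures worth retrying (for example a timeout).
    static final class TransientFailure extends RuntimeException {
        TransientFailure(String message) {
            super(message);
        }
    }

    // Assumption: maxRetries counts re-attempts, so at most maxRetries + 1 calls are made.
    // Any exception other than TransientFailure propagates immediately.
    static <T> T callWithRetries(Supplier<T> action, int maxRetries) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        TransientFailure last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return action.get();
            } catch (TransientFailure e) {
                last = e; // remember the failure and try again
            }
        }
        throw last; // all attempts failed
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        String result = callWithRetries(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new TransientFailure("temporary");
            }
            return "ok";
        }, 2);
        System.out.println(result + " after " + calls.get() + " calls");
        // prints ok after 3 calls
    }
}
```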

Thread Safety

Model instances are thread-safe and can be reused across threads. The Builder is not thread-safe.

Performance

  • Reuse model instances instead of building a new one for each request
  • Use the taskType that matches your use case for best quality
  • Adjust maxSegmentsPerBatch and maxTokensPerBatch based on text lengths
  • Monitor token counts via calculateTokensCounts()

Install with Tessl CLI

npx tessl i tessl/maven-dev-langchain4j--langchain4j-vertex-ai
