
tessl/maven-dev-langchain4j--langchain4j-pgvector

LangChain4j PGVector integration for PostgreSQL-based vector embedding storage and retrieval


Embedding Store Creation

Create and configure PgVectorEmbeddingStore instances through one of two builders: one that takes individual connection parameters (host, port, database, user, password), and one that takes an existing DataSource.

Capabilities

Builder Creation

Create a builder for configuring PgVectorEmbeddingStore with individual connection parameters.

/**
 * Creates a builder for PgVectorEmbeddingStore with individual connection parameters
 * @return PgVectorEmbeddingStoreBuilder instance for configuration
 */
public static PgVectorEmbeddingStoreBuilder builder();

Usage Example:

import dev.langchain4j.store.embedding.pgvector.PgVectorEmbeddingStore;

PgVectorEmbeddingStore store = PgVectorEmbeddingStore.builder()
    .host("localhost")
    .port(5432)
    .database("postgres")
    .user("my_user")
    .password("my_password")
    .table("embeddings")
    .dimension(384)
    .build();

DataSource Builder Creation

Create a builder for configuring PgVectorEmbeddingStore with an existing DataSource (recommended for production).

/**
 * Creates a builder for PgVectorEmbeddingStore with DataSource
 * @return DatasourceBuilder instance for configuration
 */
public static DatasourceBuilder datasourceBuilder();

Usage Example:

import javax.sql.DataSource;
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
import dev.langchain4j.store.embedding.pgvector.PgVectorEmbeddingStore;

// Create connection pool
HikariConfig config = new HikariConfig();
config.setJdbcUrl("jdbc:postgresql://localhost:5432/postgres");
config.setUsername("my_user");
config.setPassword("my_password");
config.setMaximumPoolSize(10);
DataSource dataSource = new HikariDataSource(config);

// Build store with DataSource
PgVectorEmbeddingStore store = PgVectorEmbeddingStore.datasourceBuilder()
    .datasource(dataSource)
    .table("embeddings")
    .dimension(384)
    .build();

PgVectorEmbeddingStoreBuilder

Builder class for creating PgVectorEmbeddingStore with individual connection parameters.

Connection Configuration

/**
 * Sets the hostname of the PostgreSQL server
 * @param host Hostname of the PostgreSQL server
 * @return Builder instance for chaining
 */
PgVectorEmbeddingStoreBuilder host(String host);

/**
 * Sets the port number of the PostgreSQL server
 * @param port Port number of the PostgreSQL server
 * @return Builder instance for chaining
 */
PgVectorEmbeddingStoreBuilder port(Integer port);

/**
 * Sets the username for database authentication
 * @param user Username for database authentication
 * @return Builder instance for chaining
 */
PgVectorEmbeddingStoreBuilder user(String user);

/**
 * Sets the password for database authentication
 * @param password Password for database authentication
 * @return Builder instance for chaining
 */
PgVectorEmbeddingStoreBuilder password(String password);

/**
 * Sets the name of the database to connect to
 * @param database Name of the database
 * @return Builder instance for chaining
 */
PgVectorEmbeddingStoreBuilder database(String database);
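The host, port, and database values above describe the same coordinates that the DataSource examples in this document express as a single JDBC URL. The following is a minimal sketch of that mapping, useful when moving from the parameter-based builder to a pooled DataSource. `PgJdbcUrl` is a hypothetical helper, not part of LangChain4j, and it does not claim to reproduce how the builder opens its connection internally. IPv6 literals and URL query parameters are not handled.

```java
/** Hypothetical helper (not part of LangChain4j): formats a PostgreSQL JDBC URL. */
final class PgJdbcUrl {

    private PgJdbcUrl() {
    }

    /** Returns jdbc:postgresql://host:port/database after basic argument checks. */
    static String of(String host, int port, String database) {
        if (host == null || host.isBlank()) {
            throw new IllegalArgumentException("host must not be blank");
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port must be between 1 and 65535");
        }
        if (database == null || database.isBlank()) {
            throw new IllegalArgumentException("database must not be blank");
        }
        return "jdbc:postgresql://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        // Same coordinates as the builder examples in this document.
        System.out.println(of("localhost", 5432, "postgres"));
    }
}
```

The resulting string can be passed to `HikariConfig.setJdbcUrl(...)` as in the DataSource example.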

Table Configuration

/**
 * Sets the table name to store embeddings
 * @param table Name of the database table
 * @return Builder instance for chaining
 */
PgVectorEmbeddingStoreBuilder table(String table);

/**
 * Sets the dimensionality of embedding vectors
 * Must match the dimension of the embedding model being used
 * @param dimension Dimensionality of embedding vectors
 * @return Builder instance for chaining
 */
PgVectorEmbeddingStoreBuilder dimension(Integer dimension);
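Because the configured dimension must match the embedding model, a mismatch is easier to diagnose when it is caught before a vector reaches the database. The sketch below shows one way to do that with a plain length check. `DimensionCheck` is a hypothetical helper, not part of LangChain4j, and it assumes the embedding is available as a `float[]`.

```java
/** Hypothetical helper (not part of LangChain4j): guards against dimension mismatches. */
final class DimensionCheck {

    private DimensionCheck() {
    }

    /** Throws if the vector length differs from the dimension the store was configured with. */
    static void requireDimension(float[] vector, int expectedDimension) {
        if (vector == null) {
            throw new IllegalArgumentException("vector must not be null");
        }
        if (vector.length != expectedDimension) {
            throw new IllegalArgumentException(
                    "expected " + expectedDimension + " dimensions but got " + vector.length);
        }
    }

    public static void main(String[] args) {
        requireDimension(new float[384], 384); // passes silently
        try {
            requireDimension(new float[1536], 384); // model and store disagree
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```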

/**
 * Sets whether to automatically create the embeddings table
 * Default: true
 * @param createTable Whether to create table automatically
 * @return Builder instance for chaining
 */
PgVectorEmbeddingStoreBuilder createTable(Boolean createTable);

/**
 * Sets whether to drop the table, including all stored embeddings, before recreating it (useful for tests)
 * Default: false
 * @param dropTableFirst Whether to drop table first
 * @return Builder instance for chaining
 */
PgVectorEmbeddingStoreBuilder dropTableFirst(Boolean dropTableFirst);

Index Configuration

/**
 * Sets whether to use IVFFlat index
 * IVFFlat index divides vectors into lists and searches a subset of those lists
 * Has faster build times and uses less memory than HNSW, but lower query performance (in terms of the speed-recall tradeoff)
 * Default: false
 * @param useIndex Whether to enable IVFFlat index
 * @return Builder instance for chaining
 */
PgVectorEmbeddingStoreBuilder useIndex(Boolean useIndex);

/**
 * Sets the number of lists for the IVFFlat index
 * Required if useIndex is true, must be greater than zero
 * Recommended starting point (per the pgvector documentation): total_rows / 1000 for up to 1M rows, sqrt(total_rows) for over 1M rows
 * @param indexListSize Number of lists for IVFFlat index
 * @return Builder instance for chaining
 */
PgVectorEmbeddingStoreBuilder indexListSize(Integer indexListSize);
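As a rough illustration of choosing a value for `indexListSize`, the sketch below applies the starting-point heuristic from the pgvector documentation: rows / 1000 for tables of up to one million rows, and the square root of the row count above that. `IvfFlatLists` is a hypothetical helper, not part of LangChain4j. The result is only a starting point and is clamped to at least 1 because the builder requires a value greater than zero.

```java
/** Hypothetical helper (not part of LangChain4j): suggests an IVFFlat list count. */
final class IvfFlatLists {

    private IvfFlatLists() {
    }

    /** rows / 1000 up to 1M rows, sqrt(rows) above that; never less than 1. */
    static int suggestedLists(long totalRows) {
        if (totalRows < 0) {
            throw new IllegalArgumentException("totalRows must not be negative");
        }
        long lists = totalRows <= 1_000_000L
                ? totalRows / 1000
                : Math.round(Math.sqrt((double) totalRows));
        // Clamp into the valid int range, with a floor of 1.
        return (int) Math.max(1L, Math.min(lists, (long) Integer.MAX_VALUE));
    }

    public static void main(String[] args) {
        System.out.println(suggestedLists(100_000L));
        System.out.println(suggestedLists(4_000_000L));
    }
}
```

The returned value would then be passed to `indexListSize(...)` together with `useIndex(true)`.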

Search Configuration

/**
 * Sets the search mode to use
 * Default: SearchMode.VECTOR
 * @param searchMode Search mode (VECTOR or HYBRID)
 * @return Builder instance for chaining
 */
PgVectorEmbeddingStoreBuilder searchMode(SearchMode searchMode);

/**
 * Sets the PostgreSQL text search configuration for full-text search
 * Used for determining language-specific parsing and stemming in HYBRID mode
 * Default: "simple"
 * Common values: "simple", "english", "french", "german", etc.
 * @param textSearchConfig PostgreSQL text search configuration name
 * @return Builder instance for chaining
 */
PgVectorEmbeddingStoreBuilder textSearchConfig(String textSearchConfig);

/**
 * Sets the RRF k parameter for hybrid search
 * Used in Reciprocal Rank Fusion: Score = 1/(k + rank_vector) + 1/(k + rank_keyword)
 * Lower values (20-40) emphasize top results more
 * Higher values (80-100) create more balanced rankings
 * Default: 60
 * @param rrfK RRF k parameter, must be greater than zero
 * @return Builder instance for chaining
 */
PgVectorEmbeddingStoreBuilder rrfK(Integer rrfK);
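To make the effect of `rrfK` concrete, the sketch below evaluates the Reciprocal Rank Fusion formula quoted above for two documents: one ranked 1st by vector search and 9th by keyword search, and one ranked 5th by both. `RrfSketch` is a hypothetical illustration of the formula only, not the store's implementation. It assumes 1-based ranks and that each document appears in both result lists.

```java
/** Hypothetical illustration (not LangChain4j code) of the RRF scoring formula. */
final class RrfSketch {

    private RrfSketch() {
    }

    /** Score = 1/(k + rankVector) + 1/(k + rankKeyword), with 1-based ranks. */
    static double score(int k, int rankVector, int rankKeyword) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be greater than zero");
        }
        if (rankVector < 1 || rankKeyword < 1) {
            throw new IllegalArgumentException("ranks are 1-based");
        }
        return 1.0 / (k + rankVector) + 1.0 / (k + rankKeyword);
    }

    public static void main(String[] args) {
        for (int k : new int[] {20, 60, 100}) {
            double topHeavy = score(k, 1, 9);  // 1st in vector search, 9th in keyword search
            double balanced = score(k, 5, 5);  // 5th in both
            // The ratio shrinks toward 1 as k grows, i.e. top ranks matter less.
            System.out.printf("k=%d topHeavy=%.6f balanced=%.6f ratio=%.4f%n",
                    k, topHeavy, balanced, topHeavy / balanced);
        }
    }
}
```

With a small k, the document that reaches rank 1 in one list gains a larger advantage over the evenly ranked one. As k grows, the two scores converge, which is the "more balanced rankings" behavior described above.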

Metadata Configuration

/**
 * Sets the metadata storage configuration
 * Configures how metadata associated with embeddings is stored and indexed
 * Default: COMBINED_JSON mode
 * @param metadataStorageConfig Metadata storage configuration
 * @return Builder instance for chaining
 */
PgVectorEmbeddingStoreBuilder metadataStorageConfig(MetadataStorageConfig metadataStorageConfig);

Build Method

/**
 * Builds the PgVectorEmbeddingStore instance
 * @return Configured PgVectorEmbeddingStore instance
 * @throws IllegalArgumentException if required parameters are missing or invalid
 */
PgVectorEmbeddingStore build();

DatasourceBuilder

Builder class for creating PgVectorEmbeddingStore with an existing DataSource.

DataSource Configuration

/**
 * Sets the DataSource object used for database connections
 * @param datasource The DataSource for connection pooling
 * @return Builder instance for chaining
 */
DatasourceBuilder datasource(DataSource datasource);

Table Configuration

/**
 * Sets the table name to store embeddings
 * @param table Name of the database table
 * @return Builder instance for chaining
 */
DatasourceBuilder table(String table);

/**
 * Sets the dimensionality of embedding vectors
 * Must match the dimension of the embedding model being used
 * @param dimension Dimensionality of embedding vectors
 * @return Builder instance for chaining
 */
DatasourceBuilder dimension(Integer dimension);

/**
 * Sets whether to automatically create the embeddings table
 * Default: true
 * @param createTable Whether to create table automatically
 * @return Builder instance for chaining
 */
DatasourceBuilder createTable(Boolean createTable);

/**
 * Sets whether to drop the table, including all stored embeddings, before recreating it (useful for tests)
 * Default: false
 * @param dropTableFirst Whether to drop table first
 * @return Builder instance for chaining
 */
DatasourceBuilder dropTableFirst(Boolean dropTableFirst);

Index Configuration

/**
 * Sets whether to use IVFFlat index
 * Default: false
 * @param useIndex Whether to enable IVFFlat index
 * @return Builder instance for chaining
 */
DatasourceBuilder useIndex(Boolean useIndex);

/**
 * Sets the number of lists for the IVFFlat index
 * Required if useIndex is true, must be greater than zero
 * @param indexListSize Number of lists for IVFFlat index
 * @return Builder instance for chaining
 */
DatasourceBuilder indexListSize(Integer indexListSize);

Search Configuration

/**
 * Sets the search mode to use
 * Default: SearchMode.VECTOR
 * @param searchMode Search mode (VECTOR or HYBRID)
 * @return Builder instance for chaining
 */
DatasourceBuilder searchMode(SearchMode searchMode);

/**
 * Sets the PostgreSQL text search configuration for full-text search
 * Default: "simple"
 * @param textSearchConfig PostgreSQL text search configuration name
 * @return Builder instance for chaining
 */
DatasourceBuilder textSearchConfig(String textSearchConfig);

/**
 * Sets the RRF k parameter for hybrid search
 * Default: 60
 * @param rrfK RRF k parameter, must be greater than zero
 * @return Builder instance for chaining
 */
DatasourceBuilder rrfK(Integer rrfK);

Metadata Configuration

/**
 * Sets the metadata storage configuration
 * Default: COMBINED_JSON mode
 * @param metadataStorageConfig Metadata storage configuration
 * @return Builder instance for chaining
 */
DatasourceBuilder metadataStorageConfig(MetadataStorageConfig metadataStorageConfig);

Build Method

/**
 * Builds the PgVectorEmbeddingStore instance
 * @return Configured PgVectorEmbeddingStore instance
 * @throws IllegalArgumentException if required parameters are missing or invalid
 */
PgVectorEmbeddingStore build();

Configuration Examples

Minimal Configuration

Required parameters only:

PgVectorEmbeddingStore store = PgVectorEmbeddingStore.builder()
    .host("localhost")
    .port(5432)
    .database("postgres")
    .user("my_user")
    .password("my_password")
    .table("embeddings")
    .dimension(384)
    .build();

Production Configuration

With connection pooling and indexing:

// Setup connection pool
HikariConfig config = new HikariConfig();
config.setJdbcUrl("jdbc:postgresql://localhost:5432/postgres");
config.setUsername("my_user");
config.setPassword("my_password");
config.setMaximumPoolSize(10);
DataSource dataSource = new HikariDataSource(config);

// Build store
PgVectorEmbeddingStore store = PgVectorEmbeddingStore.datasourceBuilder()
    .datasource(dataSource)
    .table("embeddings")
    .dimension(384)
    .useIndex(true)
    .indexListSize(100)
    .createTable(true)
    .build();

Hybrid Search Configuration

With full-text search enabled:

PgVectorEmbeddingStore store = PgVectorEmbeddingStore.builder()
    .host("localhost")
    .port(5432)
    .database("postgres")
    .user("my_user")
    .password("my_password")
    .table("embeddings")
    .dimension(384)
    .searchMode(SearchMode.HYBRID)
    .textSearchConfig("english")  // Language-specific text search
    .rrfK(60)  // RRF parameter for balancing vector and keyword rankings
    .build();

Custom Metadata Storage

With JSONB storage for better query performance:

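import dev.langchain4j.store.embedding.pgvector.DefaultMetadataStorageConfig;
import dev.langchain4j.store.embedding.pgvector.MetadataStorageConfig;
import dev.langchain4j.store.embedding.pgvector.MetadataStorageMode;
import dev.langchain4j.store.embedding.pgvector.PgVectorEmbeddingStore;
import java.util.Collections;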

MetadataStorageConfig metadataConfig = DefaultMetadataStorageConfig.builder()
    .storageMode(MetadataStorageMode.COMBINED_JSONB)
    .columnDefinitions(Collections.singletonList("metadata JSONB NULL"))
    .build();

PgVectorEmbeddingStore store = PgVectorEmbeddingStore.builder()
    .host("localhost")
    .port(5432)
    .database("postgres")
    .user("my_user")
    .password("my_password")
    .table("embeddings")
    .dimension(384)
    .metadataStorageConfig(metadataConfig)
    .build();

Important Notes

  • Required Parameters (PgVectorEmbeddingStoreBuilder): host, port, database, user, password, table, dimension
  • Required Parameters (DatasourceBuilder): datasource, table, dimension
  • Index Requirements: If useIndex is true, indexListSize must be provided and must be greater than zero
  • Dimension: Must match the embedding model's output dimension (e.g., 384 for AllMiniLmL6V2, 1536 for OpenAI text-embedding-ada-002)
  • Default Behavior: By default, the table is created automatically if it doesn't exist
  • PGVector Extension: The PGVector extension is automatically created if it doesn't exist
  • Connection Pooling: Using DataSource with a connection pool (like HikariCP) is strongly recommended for production
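The requirements listed above can be checked before calling `build()`, so that configuration mistakes surface with a clear message. The following is a minimal sketch of such a pre-flight check covering the settings shared by both builders. `StoreSettingsCheck` is a hypothetical helper, not part of LangChain4j, and the rule that `dimension` must be positive is an assumption added here for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical pre-flight check (not part of LangChain4j) for shared builder settings. */
final class StoreSettingsCheck {

    private StoreSettingsCheck() {
    }

    /** Returns a human-readable list of problems; an empty list means the settings look valid. */
    static List<String> problems(String table, Integer dimension, Boolean useIndex, Integer indexListSize) {
        List<String> problems = new ArrayList<>();
        if (table == null || table.isBlank()) {
            problems.add("table is required");
        }
        // Assumption: a vector dimension is always a positive number.
        if (dimension == null || dimension <= 0) {
            problems.add("dimension is required and must be greater than zero");
        }
        // indexListSize only matters when the IVFFlat index is enabled.
        if (Boolean.TRUE.equals(useIndex) && (indexListSize == null || indexListSize <= 0)) {
            problems.add("indexListSize must be greater than zero when useIndex is true");
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println(problems("embeddings", 384, true, null));
    }
}
```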
