Common vector store functionality for Spring AI providing a portable abstraction layer for integrating vector databases with comprehensive filtering, similarity search, and observability support.
The Spring AI Vector Store module provides a comprehensive abstraction layer for integrating vector databases into Spring AI applications. It defines the core VectorStore interface for managing documents with embeddings, performing similarity searches, and filtering results based on metadata. The module includes an in-memory implementation (SimpleVectorStore), a sophisticated filter expression system with SQL-like syntax, and built-in observability support through Micrometer. It provides portable APIs that work across different vector database providers in the Spring AI ecosystem.
Maven pom.xml:
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-vector-store</artifactId>
<version>1.1.2</version>
</dependency>
Or Gradle build.gradle:
implementation 'org.springframework.ai:spring-ai-vector-store:1.1.2'
Essential imports for vector store operations:
// Core vector store interfaces and classes
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.ai.vectorstore.SimpleVectorStore;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.document.Document;
// Filter expression system
import org.springframework.ai.vectorstore.filter.Filter;
import org.springframework.ai.vectorstore.filter.FilterExpressionBuilder;
import org.springframework.ai.vectorstore.filter.FilterExpressionTextParser;
// Observation and metrics
import org.springframework.ai.vectorstore.observation.AbstractObservationVectorStore;
import org.springframework.ai.vectorstore.observation.VectorStoreObservationContext;
import org.springframework.ai.vectorstore.observation.DefaultVectorStoreObservationConvention;
import io.micrometer.observation.ObservationRegistry;
// Configuration
import org.springframework.ai.vectorstore.properties.CommonVectorStoreProperties;
import org.springframework.ai.vectorstore.SpringAIVectorStoreTypes;
import org.springframework.ai.embedding.EmbeddingModel;
import org.springframework.ai.vectorstore.SimpleVectorStore;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.document.Document;
import java.util.List;
import java.util.Map;
// Create a SimpleVectorStore instance
SimpleVectorStore vectorStore = SimpleVectorStore.builder(embeddingModel).build();
// Add documents with metadata
List<Document> documents = List.of(
new Document("Spring Framework documentation", Map.of("category", "framework", "year", 2024)),
new Document("Java programming guide", Map.of("category", "language", "year", 2023)),
new Document("Vector databases overview", Map.of("category", "database", "year", 2024))
);
vectorStore.add(documents);
// Perform similarity search
SearchRequest request = SearchRequest.builder()
.query("What is Spring?")
.topK(3)
.similarityThreshold(0.7)
.build();
List<Document> results = vectorStore.similaritySearch(request);
// Search with metadata filtering using SQL-like syntax
SearchRequest filteredRequest = SearchRequest.builder()
.query("programming concepts")
.topK(5)
.similarityThreshold(0.6)
.filterExpression("year >= 2024 && category == 'framework'")
.build();
List<Document> filteredResults = vectorStore.similaritySearch(filteredRequest);
// Delete documents by ID
vectorStore.delete(List.of("doc-id-1", "doc-id-2"));
// Delete documents by filter
vectorStore.delete("category == 'outdated'");
import java.io.File;
import java.io.IOException;
// Save vector store state to file
vectorStore.save(new File("vectorstore.json"));
// Load vector store state from file
SimpleVectorStore loadedStore = SimpleVectorStore.builder(embeddingModel).build();
try {
loadedStore.load(new File("vectorstore.json"));
} catch (IOException e) {
// Handle file not found or parsing errors
}
// Load from a classpath resource (this import belongs with the others at the top of the file)
import org.springframework.core.io.ClassPathResource;
loadedStore.load(new ClassPathResource("data/vectorstore.json"));
import org.springframework.ai.vectorstore.filter.FilterExpressionBuilder;
FilterExpressionBuilder b = new FilterExpressionBuilder();
// Complex filter: (year >= 2023 OR featured == true) AND status != 'archived'
SearchRequest advancedRequest = SearchRequest.builder()
.query("machine learning")
.topK(10)
.similarityThreshold(0.75)
.filterExpression(b.and(
b.group(b.or(
b.gte("year", 2023),
b.eq("featured", true)
)),
b.ne("status", "archived")
).build())
.build();
List<Document> advancedResults = vectorStore.similaritySearch(advancedRequest);
The Spring AI Vector Store module is built around several key design components:
VectorStore Interface: Main contract combining DocumentWriter and VectorStoreRetriever capabilities, providing operations for adding, deleting, and searching documents based on embedding similarity. Extends both interfaces to provide a unified API for CRUD operations on vector data.
SearchRequest: Immutable configuration object for similarity search parameters including query text, top-K results, similarity threshold, and metadata filters. Built using the builder pattern for type-safe construction. Default values: topK=4, similarityThreshold=0.0 (accept all).
SimpleVectorStore: Reference implementation providing in-memory vector storage with JSON persistence, suitable for testing and small-scale deployments. Uses cosine similarity for vector comparison and supports SpEL-based filter expressions.
Portable Filter DSL: Store-agnostic filter expression model using records (Filter.Expression, Filter.Key, Filter.Value, Filter.Group). Supports comparison operators (EQ, NE, GT, GTE, LT, LTE), logical operators (AND, OR, NOT), inclusion checks (IN, NIN), and null checks (ISNULL, ISNOTNULL).
FilterExpressionBuilder: Programmatic DSL for building type-safe filter expressions with method chaining. Provides fluent API methods like eq(), and(), or(), in(), etc. Returns Op wrapper objects that can be chained and built into Filter.Expression.
FilterExpressionTextParser: ANTLR4-based parser converting SQL-like text expressions (e.g., "year >= 2020 && country == 'US'") into portable Filter.Expression objects. Supports string literals (single quotes), numeric literals, boolean literals (true/false), array syntax for IN/NIN, and parentheses for grouping.
FilterExpressionConverter: Strategy pattern interface for converting portable expressions to provider-specific formats. Implementations include:
SimpleVectorStoreFilterExpressionConverter: Converts to SpEL (Spring Expression Language) for in-memory filtering
PineconeFilterExpressionConverter: Converts to Pinecone JSON metadata filter format
PrintFilterExpressionConverter: Converts to human-readable debug format
AbstractObservationVectorStore: Template base class adding Micrometer observation support to all VectorStore implementations. Provides template methods doAdd(), doDelete(), and doSimilaritySearch() that subclasses implement. Automatically wraps operations in observation contexts for metrics collection.
VectorStoreObservationContext: Metadata container for tracking vector store operations (add, delete, query) with metrics like dimensions, similarity metric, query parameters, and response data. Includes fields for collection name, namespace, field name, top-K, filter expression, similarity threshold, and response documents.
VectorStoreObservationConvention: Customizable convention for defining observation names and key-value pairs, enabling integration with distributed tracing and metrics systems. Default implementation provides standard low-cardinality keys (db.system, db.operation.name, similarity metric) and high-cardinality keys (collection name, query details, response data).
All major components (VectorStore, SearchRequest, VectorStoreObservationContext) use fluent builder APIs for configuration, enabling clear, type-safe instantiation with optional parameters. Builders follow the self-typing pattern to maintain type safety across inheritance hierarchies.
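The interplay of cosine similarity, similarityThreshold, and topK described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the SimpleVectorStore implementation: the class SimilarityRankingSketch, the Scored record, and the search method are hypothetical names, and embeddings are supplied directly as float arrays instead of coming from an EmbeddingModel.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class SimilarityRankingSketch {

    /** Hypothetical result type: a document id plus its similarity to the query. */
    public record Scored(String id, double score) { }

    /** Cosine similarity: dot(x, y) / (|x| * |y|). */
    public static double cosineSimilarity(float[] x, float[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0;
        double normX = 0.0;
        double normY = 0.0;
        for (int i = 0; i < x.length; i++) {
            dot += x[i] * y[i];
            normX += x[i] * x[i];
            normY += y[i] * y[i];
        }
        return dot / (Math.sqrt(normX) * Math.sqrt(normY));
    }

    /**
     * Scores every stored vector, drops those below the threshold,
     * sorts by score (highest first), and keeps at most topK hits.
     */
    public static List<Scored> search(Map<String, float[]> store, float[] query,
                                      int topK, double threshold) {
        List<Scored> hits = new ArrayList<>();
        for (Map.Entry<String, float[]> entry : store.entrySet()) {
            double score = cosineSimilarity(query, entry.getValue());
            if (score >= threshold) {
                hits.add(new Scored(entry.getKey(), score));
            }
        }
        hits.sort(Comparator.comparingDouble(Scored::score).reversed());
        return hits.subList(0, Math.min(topK, hits.size()));
    }
}
```

With a threshold of 0.0 every stored vector with a non-negative score is a candidate and topK alone bounds the result size, which matches the "accept all" default noted for SearchRequest.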
Core vector database operations including creating stores, adding/deleting documents, performing similarity searches, and managing embeddings. The VectorStore interface provides a portable abstraction that works across multiple vector database providers.
package org.springframework.ai.vectorstore;
/**
* Main interface for managing and querying documents in a vector database.
* Extends DocumentWriter and VectorStoreRetriever for complete CRUD operations.
*/
public interface VectorStore extends DocumentWriter, VectorStoreRetriever {
/**
* Returns the name of this vector store implementation.
* Defaults to the simple class name.
*/
default String getName();
/**
* Adds documents to the vector store.
* Documents will be embedded using the configured EmbeddingModel.
* If a document already has an embedding, it will be used; otherwise,
* the embedding model will generate one from the document's text content.
*
* @param documents List of Document objects to store
* @throws RuntimeException if the provider detects duplicate IDs
*/
void add(List<Document> documents);
/**
* Deletes documents by their IDs.
* If an ID doesn't exist, it is silently ignored (no error thrown).
*
* @param idList List of document IDs to delete
*/
void delete(List<String> idList);
/**
* Deletes documents matching the filter expression.
* Uses portable Filter.Expression for store-agnostic filtering.
*
* @param filterExpression Filter.Expression defining which documents to delete
* @throws IllegalStateException if the underlying delete operation fails
* @throws UnsupportedOperationException if filter-based deletion is not supported
*/
void delete(Filter.Expression filterExpression);
/**
* Deletes documents using a SQL-like filter string.
* Converts the string to a Filter.Expression and delegates to delete(Expression).
*
* @param filterExpression String filter like "year >= 2020 && category == 'old'"
* @throws IllegalArgumentException if the filter expression is null
* @throws IllegalStateException if the underlying delete operation fails
*/
default void delete(String filterExpression);
/**
* Performs similarity search using SearchRequest configuration.
* Returns documents ordered by similarity score (highest first).
*
* @param request SearchRequest with query, topK, threshold, and filter parameters
* @return List of similar documents ordered by similarity (highest first)
*/
List<Document> similaritySearch(SearchRequest request);
/**
* Convenience method for simple text-based similarity search.
* Uses default topK (4) and no filtering.
*
* @param query The text query to search for
* @return List of similar documents
*/
default List<Document> similaritySearch(String query);
/**
* Returns the native vector store client if available.
* Due to type erasure, callers should specify the expected client type.
*
* @param <T> The type of the native client
* @return Optional containing the native client, or empty if unavailable
*/
default <T> Optional<T> getNativeClient();
}
/**
* Configuration for similarity search requests.
* Use SearchRequest.builder() to create instances.
*/
public class SearchRequest {
/**
* Similarity threshold that accepts all scores (0.0 means no threshold filtering).
*/
public static final double SIMILARITY_THRESHOLD_ACCEPT_ALL = 0.0;
/**
* Default number of top results to return.
*/
public static final int DEFAULT_TOP_K = 4;
/**
* Returns the text query for embedding similarity comparison.
*/
public String getQuery();
/**
* Returns the number of top similar results to return.
*/
public int getTopK();
/**
* Returns the minimum similarity threshold for filtering results.
* Only documents with similarity >= threshold are returned.
*/
public double getSimilarityThreshold();
/**
* Returns the filter expression for metadata filtering.
*/
public Filter.Expression getFilterExpression();
/**
* Checks if a filter expression is present.
*/
public boolean hasFilterExpression();
/**
* Creates a new SearchRequest builder.
*/
public static Builder builder();
/**
* Creates a builder initialized with values from an existing SearchRequest.
*/
public static Builder from(SearchRequest originalSearchRequest);
/**
* Builder for SearchRequest with fluent API.
*/
public static final class Builder {
/**
* Sets the text query for embedding similarity comparison.
*
* @param query Text to convert to embedding and compare against stored documents
* @return This builder for chaining
* @throws IllegalArgumentException if query is null
*/
public Builder query(String query);
/**
* Sets the number of top similar results to return.
*
* @param topK Number of results (must be >= 0)
* @return This builder for chaining
* @throws IllegalArgumentException if topK is negative
*/
public Builder topK(int topK);
/**
* Sets the minimum similarity threshold for filtering results.
* Only documents with similarity >= threshold are returned.
* This is a client-side post-processing filter applied after retrieval.
*
* @param threshold Similarity threshold in range [0.0, 1.0]
* 0.0 = accept all, 1.0 = exact match required
* @return This builder for chaining
* @throws IllegalArgumentException if threshold not in [0.0, 1.0]
*/
public Builder similarityThreshold(double threshold);
/**
* Disables similarity threshold filtering by setting it to 0.0.
*
* @return This builder for chaining
*/
public Builder similarityThresholdAll();
/**
* Sets a programmatic filter expression for metadata filtering.
* The filter is portable across all vector store implementations.
*
* @param expression Filter.Expression or null for no filtering
* @return This builder for chaining
*/
public Builder filterExpression(@Nullable Filter.Expression expression);
/**
* Sets a SQL-like text filter expression for metadata filtering.
* Examples:
* - "country == 'UK' && year >= 2020"
* - "category IN ['tech', 'science'] && active == true"
* - "(year > 2020 OR featured == true) && status != 'archived'"
*
* @param textExpression SQL-like filter string or null for no filtering
* @return This builder for chaining
*/
public Builder filterExpression(@Nullable String textExpression);
/**
* Builds the immutable SearchRequest.
*/
public SearchRequest build();
}
}
/**
* Simple in-memory vector store implementation.
* Suitable for testing, development, and small-scale deployments.
* Supports saving/loading state to/from JSON files.
*/
public class SimpleVectorStore extends AbstractObservationVectorStore {
/**
* Creates a builder for SimpleVectorStore.
*
* @param embeddingModel The embedding model to use for document vectorization
* @return SimpleVectorStoreBuilder instance
*/
public static SimpleVectorStoreBuilder builder(EmbeddingModel embeddingModel);
/**
* Saves the current vector store state to a JSON file.
* The file contains all documents with their embeddings and metadata.
*
* @param file Target file for saving
*/
public void save(File file);
/**
* Loads vector store state from a JSON file.
* Replaces all existing documents in the store.
*
* @param file Source file to load from
* @throws IOException if reading or parsing fails
*/
public void load(File file) throws IOException;
/**
* Loads vector store state from a Spring Resource.
* Supports classpath resources, file system resources, and URL resources.
*
* @param resource Spring Resource to load from
* @throws IOException if reading or parsing fails
*/
public void load(Resource resource) throws IOException;
/**
* Mathematical operations for embedding vectors.
*/
public static class EmbeddingMath {
/**
* Computes cosine similarity between two embedding vectors.
* Returns a value in range [-1.0, 1.0] where 1.0 is identical direction,
* 0.0 is orthogonal, and -1.0 is opposite direction.
*
* @param vectorX First embedding vector
* @param vectorY Second embedding vector
* @return Cosine similarity score
*/
public static double cosineSimilarity(float[] vectorX, float[] vectorY);
/**
* Computes dot product of two embedding vectors.
*
* @param vectorX First embedding vector
* @param vectorY Second embedding vector
* @return Dot product result
*/
public static float dotProduct(float[] vectorX, float[] vectorY);
/**
* Computes the Euclidean norm (magnitude) of an embedding vector.
*
* @param vector Embedding vector
* @return Vector norm (magnitude)
*/
public static float norm(float[] vector);
}
}
Sophisticated metadata filtering system with portable expressions that work across all vector store implementations. Supports SQL-like syntax with comparison operators (==, !=, <, <=, >, >=), logical operators (AND, OR, NOT), inclusion checks (IN, NIN), and null checks (IS NULL, IS NOT NULL).
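To make the idea of a portable expression concrete, the sketch below evaluates a small expression tree against a document's metadata map using only the JDK. It is an illustration, not the library's converter logic: Key, Value, Expr, and Type are hypothetical stand-ins for the Filter records, grouping and null checks are omitted, and a missing metadata key is assumed to fail every comparison.

```java
import java.util.List;
import java.util.Map;

public class FilterEvalSketch {

    /** Hypothetical subset of the operator types. */
    public enum Type { AND, OR, NOT, EQ, NE, GT, GTE, LT, LTE, IN, NIN }

    public interface Operand { }
    public record Key(String name) implements Operand { }
    public record Value(Object value) implements Operand { }
    /** Expression triple (type, left, right); right is null for NOT. */
    public record Expr(Type type, Operand left, Operand right) implements Operand { }

    /** Returns true when the metadata satisfies the expression tree. */
    public static boolean matches(Expr e, Map<String, Object> metadata) {
        return switch (e.type()) {
            case AND -> matches((Expr) e.left(), metadata) && matches((Expr) e.right(), metadata);
            case OR -> matches((Expr) e.left(), metadata) || matches((Expr) e.right(), metadata);
            case NOT -> !matches((Expr) e.left(), metadata);
            default -> compare(e, metadata);
        };
    }

    /** Evaluates a leaf comparison: left is a Key, right is a Value. */
    private static boolean compare(Expr e, Map<String, Object> metadata) {
        Object actual = metadata.get(((Key) e.left()).name());
        Object expected = ((Value) e.right()).value();
        if (actual == null) {
            return false; // simplifying assumption: missing keys never match
        }
        return switch (e.type()) {
            case EQ -> actual.equals(expected);
            case NE -> !actual.equals(expected);
            case IN -> ((List<?>) expected).contains(actual);
            case NIN -> !((List<?>) expected).contains(actual);
            case GT -> number(actual) > number(expected);
            case GTE -> number(actual) >= number(expected);
            case LT -> number(actual) < number(expected);
            case LTE -> number(actual) <= number(expected);
            default -> throw new IllegalArgumentException("Not a comparison: " + e.type());
        };
    }

    private static double number(Object o) {
        return ((Number) o).doubleValue();
    }
}
```

A real store would typically translate the tree into its native query language instead of evaluating it per document, which is the role the converter implementations play.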
package org.springframework.ai.vectorstore.filter;
/**
* Container for portable filter expression components.
* Filters are store-agnostic and converted to provider-specific formats at runtime.
*/
public class Filter {
/**
* Filter expression operation types.
*/
public enum ExpressionType {
/** Logical AND: combines two expressions, both must be true */
AND,
/** Logical OR: combines two expressions, at least one must be true */
OR,
/** Equality: key equals value */
EQ,
/** Not equal: key does not equal value */
NE,
/** Greater than: key > value */
GT,
/** Greater than or equal: key >= value */
GTE,
/** Less than: key < value */
LT,
/** Less than or equal: key <= value */
LTE,
/** Inclusion: key value is in array of values */
IN,
/** Non-inclusion: key value is not in array of values */
NIN,
/** Logical negation: inverts the expression */
NOT,
/** Null check: key is null */
ISNULL,
/** Not null check: key is not null */
ISNOTNULL
}
/**
* Marker interface for all filter expression components.
*/
public interface Operand { }
/**
* Represents a metadata key in filter expressions.
*
* @param key The metadata key name (e.g., "country", "year", "isActive")
*/
public record Key(String key) implements Operand { }
/**
* Represents a constant value or array of values.
* Supports Numeric, Boolean, and String data types.
*
* @param value Single value or array (e.g., "US", 2020, true, List.of("A", "B"))
*/
public record Value(Object value) implements Operand { }
/**
* Represents a boolean filter expression as a triple: (type, left, right).
*
* For comparison expressions (EQ, NE, GT, etc.):
* - left must be a Key
* - right must be a Value
*
* For logical expressions (AND, OR):
* - left and right must be Expression or Group
*
* For unary expressions (NOT, ISNULL, ISNOTNULL):
* - left is the operand
* - right is null
*
* @param type The expression operation type
* @param left Left operand
* @param right Right operand (null for unary operations)
*/
public record Expression(ExpressionType type, Operand left, Operand right)
implements Operand {
/**
* Constructor for unary operations (NOT, ISNULL, ISNOTNULL).
* Sets right operand to null.
*
* @param type Expression type
* @param operand The operand to apply the operation to
*/
public Expression(ExpressionType type, Operand operand);
}
/**
* Represents expression grouping for precedence control (parentheses).
* The grouped expression is evaluated with higher precedence.
*
* @param content The inner expression to evaluate as a group
*/
public record Group(Expression content) implements Operand { }
}
/**
* Fluent DSL builder for constructing Filter.Expression instances programmatically.
* Provides a more intuitive API than manual Expression construction.
*/
public class FilterExpressionBuilder {
/**
* Creates equality expression: key == value
*/
public Op eq(String key, Object value);
/**
* Creates not-equal expression: key != value
*/
public Op ne(String key, Object value);
/**
* Creates greater-than expression: key > value
*/
public Op gt(String key, Object value);
/**
* Creates greater-than-or-equal expression: key >= value
*/
public Op gte(String key, Object value);
/**
* Creates less-than expression: key < value
*/
public Op lt(String key, Object value);
/**
* Creates less-than-or-equal expression: key <= value
*/
public Op lte(String key, Object value);
/**
* Combines two expressions with AND: left && right
*/
public Op and(Op left, Op right);
/**
* Combines two expressions with OR: left || right
*/
public Op or(Op left, Op right);
/**
* Negates an expression: NOT operand
*/
public Op not(Op operand);
/**
* Creates inclusion expression: key IN [values...]
*/
public Op in(String key, Object... values);
/**
* Creates non-inclusion expression: key NOT IN [values...]
*/
public Op nin(String key, Object... values);
/**
* Creates null check expression: key IS NULL
*/
public Op isNull(String key);
/**
* Creates not-null check expression: key IS NOT NULL
*/
public Op isNotNull(String key);
/**
* Groups an expression for precedence: (operand)
*/
public Op group(Op operand);
/**
* Wrapper for filter expressions that enables method chaining.
*/
public record Op(Filter.Expression expression) {
/**
* Extracts the underlying Filter.Expression.
*/
public Filter.Expression build();
}
}
/**
* Parser for SQL-like text filter expressions.
* Uses ANTLR4 grammar to parse text into Filter.Expression objects.
*/
public class FilterExpressionTextParser {
/**
* Parses a SQL-like text expression into a Filter.Expression.
*
* Supported syntax:
* - Comparison: ==, !=, <, <=, >, >=
* - Logical: &&, ||, NOT
* - Inclusion: IN, NOT IN (with array syntax)
* - Null checks: IS NULL, IS NOT NULL
* - Grouping: ( )
* - Literals: strings ('text'), numbers, booleans (true, false)
*
* @param textExpression SQL-like filter string
* @return Parsed Filter.Expression
* @throws FilterExpressionParseException if parsing fails
*/
public Filter.Expression parse(String textExpression);
/**
* Clears the internal parser cache.
* Call this if you need to free memory from cached parsing state.
*/
public void clearCache();
}
Built-in observability support through Micrometer for monitoring vector store operations. Tracks metrics like query performance, document counts, embedding dimensions, and similarity thresholds. Integrates with distributed tracing systems for end-to-end visibility.
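The template-method approach used for observability (public operations wrap doAdd() and doDelete() hooks) can be pictured with a JDK-only sketch. This is not Micrometer's API and not the AbstractObservationVectorStore source: ObservedStore, Observation, and InMemoryStore are hypothetical names, documents are plain strings, and an "observation" is reduced to an operation name plus elapsed time.

```java
import java.util.ArrayList;
import java.util.List;

public class ObservedStoreSketch {

    /** Hypothetical observation record: operation name and elapsed nanoseconds. */
    public record Observation(String operation, long elapsedNanos) { }

    /** Base class: public methods are final and wrap the subclass hooks. */
    public abstract static class ObservedStore {
        private final List<Observation> observations = new ArrayList<>();

        public final void add(List<String> documents) {
            observe("add", () -> doAdd(documents));
        }

        public final void delete(List<String> ids) {
            observe("delete", () -> doDelete(ids));
        }

        /** Hooks that concrete stores implement without any timing code. */
        protected abstract void doAdd(List<String> documents);
        protected abstract void doDelete(List<String> ids);

        /** Times the action and records it even if the action throws. */
        private void observe(String operation, Runnable action) {
            long start = System.nanoTime();
            try {
                action.run();
            } finally {
                observations.add(new Observation(operation, System.nanoTime() - start));
            }
        }

        public List<Observation> observations() {
            return List.copyOf(observations);
        }
    }

    /** Trivial store that treats each document string as its own id. */
    public static class InMemoryStore extends ObservedStore {
        private final List<String> docs = new ArrayList<>();

        @Override
        protected void doAdd(List<String> documents) {
            docs.addAll(documents);
        }

        @Override
        protected void doDelete(List<String> ids) {
            docs.removeAll(ids);
        }

        public int size() {
            return docs.size();
        }
    }
}
```

The design point is that a subclass cannot forget instrumentation: callers only reach the hooks through the wrapped public methods.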
package org.springframework.ai.vectorstore.observation;
/**
* Abstract base implementation providing Micrometer observation support.
* Custom vector store implementations should extend this class and
* delegate to createObservation() for operation tracking.
*/
public abstract class AbstractObservationVectorStore implements VectorStore {
/**
* Constructor initializing observation support from builder configuration.
*
* @param builder AbstractVectorStoreBuilder with observation configuration
*/
protected AbstractObservationVectorStore(AbstractVectorStoreBuilder<?> builder);
/**
* Creates an observation for a vector store operation.
* Implementations should call this method to wrap operations with observability.
*
* @param context VectorStoreObservationContext with operation metadata
* @return Observation that can be started and stopped
*/
protected Observation createObservation(VectorStoreObservationContext context);
}
/**
* Observation context for vector store operations.
* Contains metadata about the operation for metrics collection and tracing.
*/
public class VectorStoreObservationContext extends Observation.Context {
/**
* Vector store operation types.
*/
public enum Operation {
/** Adding documents to vector store */
ADD("add"),
/** Deleting documents from vector store */
DELETE("delete"),
/** Querying/searching vector store */
QUERY("query");
public final String value;
}
/**
* Creates observation context for a vector store operation.
*
* @param databaseSystem Database system identifier (e.g., "pinecone", "chroma", "simple")
* @param operationName Operation name ("add", "delete", "query")
*/
public VectorStoreObservationContext(String databaseSystem, String operationName);
/**
* Creates a builder for VectorStoreObservationContext.
*
* @param databaseSystem Database system identifier
* @param operationName Operation name
* @return Builder instance
*/
public static Builder builder(String databaseSystem, String operationName);
/**
* Creates a builder using an Operation enum.
*
* @param databaseSystem Database system identifier
* @param operation Operation enum value
* @return Builder instance
*/
public static Builder builder(String databaseSystem, Operation operation);
// Getters and setters for context properties
public String getDatabaseSystem();
public String getOperationName();
public String getCollectionName();
public void setCollectionName(String collectionName);
public Integer getDimensions();
public void setDimensions(Integer dimensions);
public String getSimilarityMetric();
public void setSimilarityMetric(String similarityMetric);
public SearchRequest getQueryRequest();
public void setQueryRequest(SearchRequest queryRequest);
public List<Document> getQueryResponse();
public void setQueryResponse(List<Document> queryResponse);
}
/**
* Convention for creating observations from VectorStoreObservationContext.
* Defines observation names and key-value pairs for metrics.
*/
public interface VectorStoreObservationConvention {
/**
* Returns the observation name.
* Used as the metric/span name in observability systems.
*/
String getName();
/**
* Returns contextual name for the observation.
* Typically includes operation name (e.g., "query documents").
*/
String getContextualName(VectorStoreObservationContext context);
/**
* Returns low cardinality key-values for metrics.
* These are dimensions with limited distinct values (e.g., operation type, db system).
*/
KeyValues getLowCardinalityKeyValues(VectorStoreObservationContext context);
/**
* Returns high cardinality key-values for metrics.
* These are dimensions with many distinct values (e.g., query text, document IDs).
*/
KeyValues getHighCardinalityKeyValues(VectorStoreObservationContext context);
}
/**
* Default observation convention for vector store operations.
* Provides standard naming and key-value structure for metrics.
*/
public class DefaultVectorStoreObservationConvention
implements VectorStoreObservationConvention {
/**
* Default name for vector store observations.
*/
public static final String DEFAULT_NAME = "db.vector.client.operation";
/**
* Default constructor using the default observation name.
*/
public DefaultVectorStoreObservationConvention();
/**
* Constructor with custom observation name.
*
* @param name Custom name for observations (overrides default)
*/
public DefaultVectorStoreObservationConvention(String name);
}
Common configuration properties and native image support for Spring AI vector store implementations.
package org.springframework.ai.vectorstore.properties;
/**
* Common configuration properties for vector stores.
* Can be used with Spring Boot configuration properties binding.
*/
public class CommonVectorStoreProperties {
/**
* Whether to initialize the vector store schema on startup.
* Default: false
*
* When true, the vector store will create necessary tables,
* indexes, or collections on startup.
*/
private boolean initializeSchema = false;
public boolean isInitializeSchema();
public void setInitializeSchema(boolean initializeSchema);
}
/**
* Constants for vector store type identification.
* Used in observations, metrics, and logging to identify the underlying vector database provider.
*/
public class SpringAIVectorStoreTypes {
public static final String VECTOR_STORE_PREFIX = "spring.ai.vectorstore";
public static final String TYPE = VECTOR_STORE_PREFIX + ".type";
// Provider type constants
public static final String AZURE = "azure_ai_search";
public static final String CASSANDRA = "cassandra";
public static final String CHROMA = "chroma";
public static final String ELASTICSEARCH = "elasticsearch";
public static final String MILVUS = "milvus";
public static final String MONGODB_ATLAS = "mongodb_atlas";
public static final String NEO4J = "neo4j";
public static final String PGVECTOR = "pgvector";
public static final String PINECONE = "pinecone";
public static final String QDRANT = "qdrant";
public static final String REDIS = "redis";
public static final String SIMPLE = "simple";
public static final String WEAVIATE = "weaviate";
// ... and more provider types
}
This module depends on several core Spring AI interfaces and classes:
Document: Core document class with ID, text, metadata, and optional embeddings. The primary data structure for storing and retrieving content. Supports similarity scores for search results.
Content: Interface representing content with text and metadata. Implemented by SimpleVectorStoreContent for internal storage.
EmbeddingModel: Interface for converting text to embedding vectors. Required by all VectorStore implementations. Provides embed(String text) and embed(List<String> texts) methods.
BatchingStrategy: Strategy interface for batching document operations (e.g., TokenCountBatchingStrategy). Configured via builder to optimize bulk operations.
DocumentWriter: Interface for writing documents to a store. VectorStore extends this interface to provide add() and accept() functionality.
VectorStoreRetriever: Functional interface for retrieving documents by similarity. VectorStore extends this for similaritySearch() functionality.
Resource: Spring Framework interface for abstracting resources (files, classpath, etc.). Used by SimpleVectorStore.load() for loading from various sources.
IdGenerator: Interface for generating document IDs (e.g., RandomIdGenerator). Used by SimpleVectorStoreContent constructors when IDs are not provided.
These interfaces and classes are part of the broader Spring AI framework and are documented separately.
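import org.springframework.ai.vectorstore.SimpleVectorStore;
import org.springframework.ai.vectorstore.observation.DefaultVectorStoreObservationConvention;
import org.springframework.ai.embedding.TokenCountBatchingStrategy;
import io.micrometer.observation.ObservationRegistry;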
// Configure vector store with full observability
SimpleVectorStore vectorStore = SimpleVectorStore.builder(embeddingModel)
.observationRegistry(observationRegistry)
.customObservationConvention(new DefaultVectorStoreObservationConvention())
.batchingStrategy(new TokenCountBatchingStrategy())
.build();
// Use text filters for simple cases
SearchRequest simpleFilter = SearchRequest.builder()
.query("search text")
.filterExpression("year >= 2023 && status == 'active'")
.build();
// Use programmatic filters for complex, dynamic cases
FilterExpressionBuilder b = new FilterExpressionBuilder();
Filter.Expression dynamicFilter = b.and(
b.gte("year", currentYear),
b.in("category", userSelectedCategories.toArray())
).build();
SearchRequest complexFilter = SearchRequest.builder()
.query("search text")
.filterExpression(dynamicFilter)
.build();
import java.io.IOException;
// Graceful loading with fallback
SimpleVectorStore store = SimpleVectorStore.builder(embeddingModel).build();
try {
store.load(new File("vectorstore.json"));
logger.info("Loaded existing vector store");
} catch (IOException e) {
logger.warn("Could not load vector store, starting fresh", e);
// Initialize with default documents if needed
store.add(getDefaultDocuments());
}
// Periodic persistence
try {
store.save(new File("vectorstore.json"));
} catch (Exception e) {
logger.error("Failed to save vector store", e);
// Consider alternative storage or alerting
}
// Start with permissive threshold for exploration
SearchRequest exploratory = SearchRequest.builder()
.query("broad topic")
.topK(20)
.similarityThreshold(0.5) // Lower threshold for more results
.build();
// Use strict threshold for production relevance
SearchRequest production = SearchRequest.builder()
.query("specific query")
.topK(5)
.similarityThreshold(0.8) // Higher threshold for quality
.build();
// Accept all for debugging or analysis
SearchRequest debug = SearchRequest.builder()
.query("test query")
.topK(100)
.similarityThresholdAll() // No threshold filtering
.build();
// Add documents with rich metadata for filtering
List<Document> documents = List.of(
new Document("content", Map.of(
"category", "technical",
"year", 2024,
"author", "John Doe",
"tags", List.of("java", "spring", "ai"),
"featured", true,
"lastUpdated", Instant.now().toString()
))
);
// Delete old documents using metadata filters
vectorStore.delete("year < 2020 OR status == 'archived'");
// Search with multi-dimensional filtering
SearchRequest metadataSearch = SearchRequest.builder()
.query("spring boot tutorial")
.filterExpression("category == 'technical' && year >= 2023 && featured == true")
.topK(10)
.build();

// Some providers throw on duplicate IDs, others silently update
try {
vectorStore.add(documentsWithDuplicateIds);
} catch (RuntimeException e) {
// Handle duplicate ID error
// Consider generating new IDs or updating existing documents
}

List<Document> results = vectorStore.similaritySearch(request);
if (results.isEmpty()) {
// No documents matched the query or passed the similarity threshold
// Consider:
// 1. Lowering the similarity threshold
// 2. Broadening the filter expression
// 3. Checking if documents exist in the store
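}

import org.springframework.ai.vectorstore.filter.FilterExpressionTextParser;
// The parse exception is a nested class of the parser
import org.springframework.ai.vectorstore.filter.FilterExpressionTextParser.FilterExpressionParseException;
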
FilterExpressionTextParser parser = new FilterExpressionTextParser();
try {
Filter.Expression expr = parser.parse(userProvidedFilter);
} catch (FilterExpressionParseException e) {
// Invalid filter syntax
// Provide user feedback or use a default filter
logger.error("Invalid filter expression: {}", userProvidedFilter, e);
}

// Not all vector stores support filter-based deletion
try {
vectorStore.delete("category == 'old'");
} catch (UnsupportedOperationException e) {
// Fall back to ID-based deletion
// First, query for matching documents, then delete by ID
SearchRequest findOld = SearchRequest.builder()
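.query("*") // Placeholder text: similarity search has no match-all query, so the filter does the selecting
.filterExpression("category == 'old'")
.topK(1000) // Upper bound per pass; repeat the search and delete if more matches may remain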
.build();
List<String> idsToDelete = vectorStore.similaritySearch(findOld)
.stream()
.map(Document::getId)
.toList();
vectorStore.delete(idsToDelete);
}

The BatchingStrategy interface customizes how documents are grouped into batches for embedding and storage operations. The default TokenCountBatchingStrategy groups documents so that each batch stays within the embedding model's token limit.
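
As a rough illustration of what batching against a token limit involves, here is a minimal plain-Java sketch. It is not Spring AI's TokenCountBatchingStrategy: the class name `TokenBudgetBatcher`, the whitespace-based token estimate, and the `maxTokensPerBatch` parameter are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of token-budget batching; not part of Spring AI. */
class TokenBudgetBatcher {

    /** Rough token estimate: whitespace-separated words (an assumption, not a real tokenizer). */
    static int estimateTokens(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    /** Groups texts, in order, into batches whose estimated token total stays within the budget. */
    static List<List<String>> batch(List<String> texts, int maxTokensPerBatch) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int currentTokens = 0;
        for (String text : texts) {
            int tokens = estimateTokens(text);
            if (tokens > maxTokensPerBatch) {
                throw new IllegalArgumentException(
                        "Single text exceeds the batch budget: " + tokens + " tokens");
            }
            // Start a new batch when adding this text would overflow the budget.
            if (currentTokens + tokens > maxTokensPerBatch) {
                batches.add(current);
                current = new ArrayList<>();
                currentTokens = 0;
            }
            current.add(text);
            currentTokens += tokens;
        }
        // Flush the final, partially filled batch.
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> texts = List.of("one two three", "four five", "six seven eight nine", "ten");
        System.out.println(batch(texts, 5));
    }
}
```

The sketch rejects a single text that is larger than the budget instead of splitting it; a real strategy has to pick its own policy for that case. In Spring AI, batching is configured through the store builder as in the following snippet.
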
import org.springframework.ai.embedding.BatchingStrategy;
import org.springframework.ai.embedding.TokenCountBatchingStrategy;
// Configure batching for large document sets
BatchingStrategy strategy = new TokenCountBatchingStrategy();
SimpleVectorStore store = SimpleVectorStore.builder(embeddingModel)
.batchingStrategy(strategy)
.build();
// Add large document set - will be automatically batched
store.add(largeDocumentList);

Documents that already have embeddings will not be re-embedded:
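// Document with pre-computed embedding
// EmbeddingModel.embed(String) returns the vector directly as float[]
float[] existingEmbedding = embeddingModel.embed("text");
Document docWithEmbedding = new Document("text", Map.of("key", "value"));
// Assumes a Spring AI version in which Document still exposes setEmbedding
docWithEmbedding.setEmbedding(existingEmbedding);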
// This will NOT call the embedding model
vectorStore.add(List.of(docWithEmbedding));

SimpleVectorStore is suitable for:
For production use cases with large datasets, consider:
The portable API design allows easy migration between vector store implementations:
// Development with SimpleVectorStore
VectorStore devStore = SimpleVectorStore.builder(embeddingModel).build();
// Production with Pinecone (example - actual implementation varies)
VectorStore prodStore = PineconeVectorStore.builder(embeddingModel)
.apiKey(pineconeApiKey)
.environment("us-west1-gcp")
.indexName("production-index")
.build();
// Same API for both
SearchRequest request = SearchRequest.builder()
.query("search text")
.topK(10)
.filterExpression("category == 'tech'")
.build();
List<Document> devResults = devStore.similaritySearch(request);
List<Document> prodResults = prodStore.similaritySearch(request);

import org.junit.jupiter.api.Test;
import static org.assertj.core.api.Assertions.assertThat;
@Test
void testVectorStoreSearch() {
// Use SimpleVectorStore for fast unit tests
VectorStore testStore = SimpleVectorStore.builder(mockEmbeddingModel).build();
testStore.add(List.of(
new Document("Spring Boot guide", Map.of("type", "tutorial")),
new Document("Java basics", Map.of("type", "reference"))
));
SearchRequest request = SearchRequest.builder()
.query("Spring framework")
.topK(1)
.build();
List<Document> results = testStore.similaritySearch(request);
assertThat(results).hasSize(1);
assertThat(results.get(0).getText()).contains("Spring Boot");
}

// For testing with real vector databases
// (illustrative: the container class and builder options depend on the provider module)
@Testcontainers
class VectorStoreIntegrationTest {
@Container
static ChromaContainer chroma = new ChromaContainer();
@Test
void testWithRealVectorStore() {
VectorStore store = ChromaVectorStore.builder(embeddingModel)
.host(chroma.getHost())
.port(chroma.getPort())
.build();
// Test with real vector store
}
}
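
The unit test shown earlier relies on `mockEmbeddingModel`, and its ranking assertion only holds if that mock returns vectors that reflect the text. One way to get this without calling a real model is a deterministic bag-of-words hashing embedder. The following is a minimal sketch in plain Java; the class `HashingTestEmbedder` and its methods are hypothetical helpers, not part of Spring AI, and adapting it to the `EmbeddingModel` interface is left out because that wiring depends on the Spring AI version in use.

```java
import java.util.Locale;

/** Hypothetical test helper: deterministic word-hashing embeddings; not part of Spring AI. */
class HashingTestEmbedder {

    private final int dimensions;

    HashingTestEmbedder(int dimensions) {
        this.dimensions = dimensions;
    }

    /** Counts each lowercase word in a hash bucket, then scales the vector to unit length. */
    float[] embed(String text) {
        float[] vector = new float[dimensions];
        for (String word : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (word.isEmpty()) {
                continue; // split() can yield an empty first element
            }
            vector[Math.floorMod(word.hashCode(), dimensions)] += 1f;
        }
        double norm = 0;
        for (float v : vector) {
            norm += v * v;
        }
        norm = Math.sqrt(norm);
        if (norm > 0) {
            for (int i = 0; i < vector.length; i++) {
                vector[i] = (float) (vector[i] / norm);
            }
        }
        return vector;
    }

    /** Cosine similarity of two unit-length vectors, which reduces to their dot product. */
    static double cosine(float[] a, float[] b) {
        double dot = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
        }
        return dot;
    }

    public static void main(String[] args) {
        HashingTestEmbedder embedder = new HashingTestEmbedder(64);
        float[] query = embedder.embed("Spring framework");
        System.out.println(cosine(query, embedder.embed("Spring Boot guide")));
        System.out.println(cosine(query, embedder.embed("Java basics")));
    }
}
```

Texts that share words land in shared buckets and score higher, so "Spring framework" ranks "Spring Boot guide" above "Java basics". Hash collisions can inflate scores for unrelated texts, so this is only suitable for small, hand-picked test fixtures.
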