Spring AI Transformers

ONNX-based Transformer models for text embeddings within the Spring AI framework. Provides local, CPU- or GPU-accelerated embedding generation using ONNX Runtime, with no external API dependencies for inference.

Package Information

  • Package: org.springframework.ai:spring-ai-transformers
  • Version: 1.1.2
  • Language: Java
  • Type: Maven

Installation

Maven:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-transformers</artifactId>
    <version>1.1.2</version>
</dependency>

Gradle:

implementation 'org.springframework.ai:spring-ai-transformers:1.1.2'

Quick Start

Spring Boot Auto-Configuration

@Service
public class MyService {
    private final TransformersEmbeddingModel embeddingModel;

    public MyService(TransformersEmbeddingModel embeddingModel) {
        this.embeddingModel = embeddingModel;
    }

    public float[] embed(String text) {
        return embeddingModel.embed(text);
    }
}

Manual Initialization

TransformersEmbeddingModel model = new TransformersEmbeddingModel();
model.afterPropertiesSet(); // Required before use
float[] embedding = model.embed("Hello world");

Core Concepts

Architecture

  • TransformersEmbeddingModel: Main API implementing Spring AI's EmbeddingModel interface
  • ONNX Runtime: Executes pre-trained transformer models
  • HuggingFace Tokenizers: Converts text to token IDs
  • ResourceCacheService: Caches models and tokenizers locally
  • Default Model: all-MiniLM-L6-v2 (384 dimensions)

Key Features

  • Local inference (no external API calls)
  • CPU and GPU support
  • Automatic resource caching
  • Spring Boot auto-configuration
  • Batch processing
  • Micrometer observation support

API Quick Reference

Embedding Operations

// Single text
float[] embed(String text);

// Document with metadata
float[] embed(Document document);

// Batch processing
List<float[]> embed(List<String> texts);

// With response metadata
EmbeddingResponse embedForResponse(List<String> texts);

// Full request
EmbeddingResponse call(EmbeddingRequest request);

// Get dimensions
int dimensions();
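
A short usage sketch of these operations, assuming an already-initialized model and the standard Spring AI EmbeddingResponse accessors (getResults(), getOutput()):

TransformersEmbeddingModel model = new TransformersEmbeddingModel();
model.afterPropertiesSet();

// Single text -> one vector (384 dimensions with the default model)
float[] vector = model.embed("What is Spring AI?");

// Batch call that also returns response metadata
EmbeddingResponse response = model.embedForResponse(List.of("first text", "second text"));
float[] first = response.getResults().get(0).getOutput();

// Dimensions are resolved from the loaded model
int dims = model.dimensions();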

Configuration

// Model and tokenizer
void setModelResource(String modelResourceUri);
void setTokenizerResource(String tokenizerResourceUri);
void setTokenizerOptions(Map<String, String> tokenizerOptions);

// Hardware
void setGpuDeviceId(int gpuDeviceId);

// Caching
void setDisableCaching(boolean disableCaching);
void setResourceCacheDirectory(String resourceCacheDir);

// Model output
void setModelOutputName(String modelOutputName);

// Initialization (required)
void afterPropertiesSet() throws Exception;
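
Putting the setters together, a sketch of programmatic configuration before initialization. The model and tokenizer URIs below are placeholders for your own exported ONNX artifacts, not files shipped with the library:

TransformersEmbeddingModel model = new TransformersEmbeddingModel();

// Placeholder URIs -- replace with your own ONNX model and tokenizer
model.setModelResource("classpath:/models/my-model.onnx");
model.setTokenizerResource("classpath:/tokenizers/my-tokenizer.json");
model.setTokenizerOptions(Map.of("modelMaxLength", "512"));

// Optional: run on GPU device 0 and keep downloads in a persistent cache
model.setGpuDeviceId(0);
model.setResourceCacheDirectory("/var/cache/spring-ai");

model.afterPropertiesSet(); // must be called once, after all setters
float[] embedding = model.embed("configured model");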

Spring Boot Properties

# Model configuration
spring.ai.embedding.transformer.onnx.model-uri=classpath:/models/model.onnx
spring.ai.embedding.transformer.onnx.gpu-device-id=0
spring.ai.embedding.transformer.onnx.model-output-name=last_hidden_state

# Tokenizer configuration
spring.ai.embedding.transformer.tokenizer.uri=classpath:/tokenizers/tokenizer.json
spring.ai.embedding.transformer.tokenizer.options.modelMaxLength=512

# Cache configuration
spring.ai.embedding.transformer.cache.enabled=true
spring.ai.embedding.transformer.cache.directory=/var/cache/spring-ai

# Metadata mode
spring.ai.embedding.transformer.metadata-mode=NONE

Core Imports

import org.springframework.ai.transformers.TransformersEmbeddingModel;
import org.springframework.ai.transformers.ResourceCacheService;
import org.springframework.ai.embedding.EmbeddingResponse;
import org.springframework.ai.embedding.EmbeddingRequest;
import org.springframework.ai.document.Document;
import org.springframework.ai.document.MetadataMode;

Default Behavior

  • Model: all-MiniLM-L6-v2
  • Dimensions: 384
  • Hardware: CPU (GPU via setGpuDeviceId(0))
  • Cache location: {java.io.tmpdir}/spring-ai-onnx-generative
  • Metadata mode: NONE
  • Caching: Enabled

Common Patterns

Singleton Pattern (Recommended)

@Configuration
public class EmbeddingConfig {
    @Bean
    public TransformersEmbeddingModel embeddingModel() throws Exception {
        TransformersEmbeddingModel model = new TransformersEmbeddingModel();
        model.afterPropertiesSet();
        return model;
    }
}

GPU with Fallback

TransformersEmbeddingModel model = new TransformersEmbeddingModel();
model.setGpuDeviceId(0);
try {
    model.afterPropertiesSet();
} catch (Exception e) {
    if (e.getMessage() != null && e.getMessage().contains("GPU")) {
        model.setGpuDeviceId(-1); // Fallback to CPU
        model.afterPropertiesSet();
    }
}

Batch Processing

List<String> texts = List.of("text1", "text2", "text3");
List<float[]> embeddings = model.embed(texts); // More efficient than loop

Key Types

// Main model class
public class TransformersEmbeddingModel extends AbstractEmbeddingModel implements InitializingBean

// Resource caching
public class ResourceCacheService

// Metadata handling
public enum MetadataMode { ALL, EMBED, INFERENCE, NONE }

// Response types
public class EmbeddingResponse
public class Embedding
public class EmbeddingRequest
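
For example, a Document can be embedded directly; the metadata keys below are illustrative only, and with the default MetadataMode.NONE only the document text contributes to the embedding:

Document document = new Document(
        "Spring AI brings AI abstractions to Spring applications.",
        Map.of("source", "docs", "section", "overview"));

float[] vector = embeddingModel.embed(document);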

Error Handling

// Initialization errors
try {
    model.afterPropertiesSet();
} catch (Exception e) {
    // Handle: model loading, GPU, cache, network errors
}

// Runtime errors
try {
    float[] embedding = model.embed(text);
} catch (IllegalArgumentException e) {
    // Null parameters
} catch (RuntimeException e) {
    // ONNX inference errors
}

Performance Characteristics

  • First call: Includes model download (~50MB) and initialization (100-500ms)
  • Subsequent calls: 5-20ms (CPU), 2-5ms (GPU)
  • Batch processing: 30-50% faster than sequential
  • Memory footprint: ~170MB per model instance
  • Thread-safe: after initialization, a single instance can serve concurrent callers (see the sketch below)
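
A minimal sketch of concurrent use, assuming one shared, initialized instance (the thread pool and texts are illustrative):

TransformersEmbeddingModel shared = new TransformersEmbeddingModel();
shared.afterPropertiesSet();

ExecutorService pool = Executors.newFixedThreadPool(4);
for (String text : List.of("first", "second", "third", "fourth")) {
    pool.submit(() -> {
        float[] vector = shared.embed(text); // safe to call from multiple threads
        // ... store or index the vector
    });
}
pool.shutdown();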

Documentation

Guides

  • Quick Start Guide - Detailed getting started
  • Configuration Guide - Complete configuration options
  • Integration Guide - Spring Boot integration

Examples

Reference

Troubleshooting

Auto-Configuration Not Working

# Enable debug logging
debug=true
logging.level.org.springframework.boot.autoconfigure=DEBUG

GPU Initialization Fails

# Fallback to CPU
spring.ai.embedding.transformer.onnx.gpu-device-id=-1

Slow Startup

# Use local resources
spring.ai.embedding.transformer.onnx.model-uri=classpath:/models/model.onnx
spring.ai.embedding.transformer.cache.directory=/persistent/cache

Version Compatibility

  • Spring AI: 1.x.x
  • Spring Boot: 3.x.x (for auto-configuration)
  • Java: 17+
  • ONNX Runtime: Compatible with ONNX opset 13+