tessl/maven-com-embabel-agent--embabel-agent-test-support

Multi-module test support framework for Embabel Agent applications providing integration testing, mock AI services, and test configuration utilities

Testing Embeddings Guide

Step-by-step guide for testing embedding operations without making real API calls.

What is FakeEmbeddingModel?

FakeEmbeddingModel is a fake implementation of Spring AI's EmbeddingModel interface that generates random embeddings. This allows you to test embedding-related code without requiring API keys or making network calls.

Prerequisites

  • embabel-agent-test-common dependency installed
  • Basic understanding of embeddings and Spring AI

Basic Usage Pattern

Step 1: Create Fake Embedding Model

import com.embabel.common.test.ai.FakeEmbeddingModel

val embeddingModel = FakeEmbeddingModel(dimensions = 1536)

Step 2: Use in Your Tests

import org.springframework.ai.document.Document

val document = Document("test content")
val embedding = embeddingModel.embed(document)
// Returns a random FloatArray with 1536 elements

Step 3: Assert on Structure

assertEquals(1536, embedding.size)
assertTrue(embedding.all { it.isFinite() })
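
The two checks in Step 3 recur in most tests in this guide, so it can help to wrap them in one helper. The sketch below is plain Kotlin with no Spring AI dependency. `requireValidEmbedding` is a hypothetical name and is not part of the test-support library.

```kotlin
// Hypothetical helper (not part of embabel-agent-test-support).
// Verifies the two structural properties this guide asserts on:
// the vector has the expected length and every component is a finite number.
fun requireValidEmbedding(embedding: FloatArray, expectedDimensions: Int) {
    require(embedding.size == expectedDimensions) {
        "Expected $expectedDimensions dimensions but got ${embedding.size}"
    }
    // indexOfFirst returns -1 when no element matches the predicate.
    val badIndex = embedding.indexOfFirst { !it.isFinite() }
    require(badIndex == -1) {
        "Non-finite value ${embedding[badIndex]} at index $badIndex"
    }
}
```

In a test this could be called as `requireValidEmbedding(embeddingModel.embed(document), 1536)`. On failure it throws an IllegalArgumentException whose message names the offending size or index.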

Single Document Embedding

Basic Example

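// Imports assume JUnit 5 (Jupiter); adjust if your project uses another test framework
import com.embabel.common.test.ai.FakeEmbeddingModel
import org.junit.jupiter.api.Assertions.assertEquals
import org.junit.jupiter.api.Assertions.assertTrue
import org.junit.jupiter.api.Test
import org.springframework.ai.document.Document

@Test
fun `test document embedding`() {
    // Create fake model with default dimensions
    val model = FakeEmbeddingModel()

    // Embed a document
    val document = Document("Sample text for embedding")
    val embedding = model.embed(document)

    // Assert on embedding structure
    assertEquals(1536, embedding.size)
    assertTrue(embedding.isNotEmpty())
}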

Custom Dimensions

@Test
fun `test with custom dimensions`() {
    // Create model with 768 dimensions (BERT-base size)
    val model = FakeEmbeddingModel(dimensions = 768)

    val document = Document("test")
    val embedding = model.embed(document)

    assertEquals(768, embedding.size)
}

Java Example

@Test
void testDocumentEmbedding() {
    // Create fake model
    FakeEmbeddingModel model = new FakeEmbeddingModel(1536);

    // Embed document
    Document document = new Document("test content");
    float[] embedding = model.embed(document);

    // Assert
    assertEquals(1536, embedding.length);
}

Batch Embedding

Multiple Texts

@Test
fun `test batch embedding`() {
    val model = FakeEmbeddingModel(dimensions = 512)

    // Embed multiple texts
    val texts = listOf(
        "First document",
        "Second document",
        "Third document"
    )
    val embeddings = model.embed(texts)

    // Assert on batch results
    assertEquals(3, embeddings.size)
    embeddings.forEach { embedding ->
        assertEquals(512, embedding.size)
    }
}

Large Batch

@Test
fun `test large batch embedding`() {
    val model = FakeEmbeddingModel()

    // Generate many texts
    val texts = (1..100).map { "Document $it" }

    val embeddings = model.embed(texts)

    assertEquals(100, embeddings.size)
    embeddings.forEach {
        assertEquals(1536, it.size)
    }
}

Using EmbeddingRequest

Basic Request

@Test
fun `test with embedding request`() {
    val model = FakeEmbeddingModel(dimensions = 384)

    // Create request
    val request = EmbeddingRequest(
        listOf("query 1", "query 2"),
        null  // options
    )

    // Call model
    val response = model.call(request)

    // Assert on response
    assertEquals(2, response.results.size)

    response.results.forEachIndexed { index, embedding ->
        assertEquals(index, embedding.index)
        assertEquals(384, embedding.output.size)
    }
}

With Options

@Test
fun `test with embedding options`() {
    val model = FakeEmbeddingModel()

    val options = EmbeddingOptions.builder()
        .withModel("fake-model")
        .build()

    val request = EmbeddingRequest(
        listOf("text"),
        options
    )

    val response = model.call(request)

    assertNotNull(response)
    assertEquals(1, response.results.size)
}

Testing Services with Embeddings

Service Integration

class EmbeddingService(private val embeddingModel: EmbeddingModel) {
    fun embedDocument(text: String): FloatArray {
        return embeddingModel.embed(Document(text))
    }
}

@Test
fun `test embedding service`() {
    // Use fake model
    val fakeModel = FakeEmbeddingModel(dimensions = 768)
    val service = EmbeddingService(fakeModel)

    // Test service
    val embedding = service.embedDocument("test text")

    assertEquals(768, embedding.size)
}

Vector Store Testing

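@Test
fun `test vector store with fake embeddings`() {
    val embeddingModel = FakeEmbeddingModel()
    // Spring AI 1.x creates SimpleVectorStore through its builder
    val vectorStore = SimpleVectorStore.builder(embeddingModel).build()

    // Add documents
    vectorStore.add(listOf(
        Document("doc1"),
        Document("doc2"),
        Document("doc3")
    ))

    // Search (results are not semantically meaningful, but the call path is exercised)
    val results = vectorStore.similaritySearch(
        SearchRequest.builder().query("query").topK(2).build()
    )

    // Safe call: the return type may be declared nullable, depending on the Spring AI version
    assertEquals(2, results?.size)
}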

Semantic Search Testing

@Test
fun `test semantic search structure`() {
    val embeddingModel = FakeEmbeddingModel(dimensions = 512)
    val searchEngine = SemanticSearchEngine(embeddingModel)

    // Index documents
    val documents = listOf(
        "Machine learning is a subset of AI",
        "Deep learning uses neural networks",
        "Python is a programming language"
    )
    searchEngine.indexDocuments(documents)

    // Perform search
    val results = searchEngine.search("AI and ML", topK = 2)

    // Assert on structure (not semantic quality)
    assertEquals(2, results.size)
    results.forEach { result ->
        assertNotNull(result.document)
        assertTrue(result.score >= 0.0)
    }
}
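
`SemanticSearchEngine` in the test above is application code rather than part of the test-support library, and this guide does not define it. As a rough sketch of what such a class might look like, the version below takes the embedder as a plain function, which is an assumption made so the code compiles without Spring AI, and rescales cosine similarity so that scores are never negative. All names are hypothetical.

```kotlin
import kotlin.math.sqrt

// Hypothetical result type exposing the two fields the test reads.
class SearchResult(val document: String, val score: Double)

// Hypothetical engine. The embedder is a plain function so the sketch
// compiles without Spring AI; real code would delegate to an EmbeddingModel.
class SemanticSearchEngine(private val embed: (String) -> FloatArray) {
    private val index = mutableListOf<Pair<String, FloatArray>>()

    // Embeds each document once and stores the vector next to its text.
    fun indexDocuments(documents: List<String>) {
        documents.forEach { text -> index.add(text to embed(text)) }
    }

    // Scores every indexed document against the query and keeps the best topK.
    fun search(query: String, topK: Int): List<SearchResult> {
        val queryVector = embed(query)
        return index
            .map { (text, vector) -> SearchResult(text, score(queryVector, vector)) }
            .sortedByDescending { it.score }
            .take(topK)
    }

    // Cosine similarity rescaled from [-1, 1] to [0, 1].
    private fun score(a: FloatArray, b: FloatArray): Double {
        require(a.size == b.size) { "Dimension mismatch: ${a.size} vs ${b.size}" }
        var dot = 0.0
        var normA = 0.0
        var normB = 0.0
        for (i in a.indices) {
            dot += a[i].toDouble() * b[i].toDouble()
            normA += a[i].toDouble() * a[i].toDouble()
            normB += b[i].toDouble() * b[i].toDouble()
        }
        // A zero vector has no direction; report the neutral midpoint.
        if (normA == 0.0 || normB == 0.0) return 0.5
        val cosine = (dot / (sqrt(normA) * sqrt(normB))).coerceIn(-1.0, 1.0)
        return (cosine + 1.0) / 2.0
    }
}
```

With a fake model the ranking is arbitrary, so a test against a class like this should assert only on the result count and the score range, as the test above does.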

Testing Different Dimension Sizes

Common Dimensions

@Test
fun `test common embedding dimensions`() {
    // OpenAI text-embedding-ada-002: 1536
    val openAiModel = FakeEmbeddingModel(dimensions = 1536)
    assertEquals(1536, openAiModel.embed(Document("test")).size)

    // BERT base: 768
    val bertModel = FakeEmbeddingModel(dimensions = 768)
    assertEquals(768, bertModel.embed(Document("test")).size)

    // Sentence Transformers all-MiniLM-L6-v2: 384
    val sentenceModel = FakeEmbeddingModel(dimensions = 384)
    assertEquals(384, sentenceModel.embed(Document("test")).size)
}

Testing Dimension Compatibility

@Test
fun `test service handles different dimensions`() {
    val service = EmbeddingProcessingService()

    // Test with various dimensions
    val dims = listOf(128, 256, 512, 768, 1024, 1536, 2048)

    dims.forEach { dim ->
        val model = FakeEmbeddingModel(dimensions = dim)
        val result = service.process("text", model)

        assertNotNull(result)
        assertEquals(dim, result.embedding.size)
    }
}

Spring Integration

With TestConfiguration

@SpringBootTest
class EmbeddingIntegrationTest {

    @TestConfiguration
    class TestConfig {
        @Bean
        fun embeddingModel(): EmbeddingModel {
            return FakeEmbeddingModel(dimensions = 768)
        }
    }

    @Autowired
    private lateinit var embeddingModel: EmbeddingModel

    @Test
    fun `test with Spring bean`() {
        val embedding = embeddingModel.embed(Document("test"))
        assertEquals(768, embedding.size)
    }
}

With FakeAiConfiguration

@SpringBootTest
@Import(FakeAiConfiguration::class)
class EmbeddingServiceTest {

    @Autowired
    private lateinit var embeddingService: EmbeddingService

    @Test
    fun `test embedding service`() {
        val embedding = embeddingService.embed("test text")

        assertNotNull(embedding)
        assertEquals(1536, embedding.size)  // Default dimensions
    }
}

Custom Configuration

@SpringBootTest
class CustomEmbeddingTest {

    @TestConfiguration
    class TestConfig {
        @Bean
        @Primary
        fun customEmbedding(): EmbeddingService {
            val fakeModel = FakeEmbeddingModel(dimensions = 384)
            return SpringAiEmbeddingService(
                fakeModel,
                "custom-embedding-model",
                "CustomProvider"
            )
        }
    }

    @Autowired
    private lateinit var embeddingService: EmbeddingService

    @Test
    fun `test with custom configuration`() {
        val embedding = embeddingService.embed("test")
        assertEquals(384, embedding.size)
    }
}

Advanced Patterns

Pattern: Document Metadata Testing

@Test
fun `test embedding with document metadata`() {
    val model = FakeEmbeddingModel()

    val document = Document(
        "content",
        mapOf(
            "source" to "test",
            "timestamp" to System.currentTimeMillis()
        )
    )

    val embedding = model.embed(document)

    // Embedding structure is correct
    assertEquals(1536, embedding.size)
    // Metadata is preserved in document (not in embedding)
    assertEquals("test", document.metadata["source"])
}

Pattern: Pipeline Testing

@Test
fun `test embedding pipeline`() {
    val model = FakeEmbeddingModel(dimensions = 512)

    // Create processing pipeline
    val pipeline = DocumentPipeline(model)

    val documents = listOf(
        Document("doc1"),
        Document("doc2"),
        Document("doc3")
    )

    // Process through pipeline
    val processedDocs = pipeline.process(documents)

    // Verify all documents were embedded
    assertEquals(3, processedDocs.size)
    processedDocs.forEach { doc ->
        assertNotNull(doc.embedding)
        assertEquals(512, doc.embedding?.size)
    }
}
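
`DocumentPipeline` is likewise application code that this guide leaves undefined. A minimal sketch of the shape the test implies follows. It assumes a hypothetical `EmbeddedDocument` type with a nullable embedding, in place of Spring AI's `Document`, and a function-typed embedder in place of `EmbeddingModel`.

```kotlin
// Hypothetical document type with an optional embedding, mirroring the
// nullable `doc.embedding` that the test reads.
class EmbeddedDocument(val text: String, val embedding: FloatArray? = null)

// Hypothetical pipeline: returns new documents with embeddings attached and
// leaves the inputs untouched. The embedder stands in for an EmbeddingModel.
class DocumentPipeline(private val embed: (String) -> FloatArray) {
    fun process(documents: List<EmbeddedDocument>): List<EmbeddedDocument> =
        documents.map { doc -> EmbeddedDocument(doc.text, embed(doc.text)) }
}
```

Returning new objects instead of mutating the inputs keeps the test simple: the original list can still be compared against the processed one.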

Pattern: Similarity Testing Structure

@Test
fun `test similarity calculation structure`() {
    val model = FakeEmbeddingModel()

    val text1 = "First text"
    val text2 = "Second text"

    val emb1 = model.embed(Document(text1))
    val emb2 = model.embed(Document(text2))

    // Calculate similarity (cosine, euclidean, etc.)
    val similarity = cosineSimilarity(emb1, emb2)

    // Assert similarity is in valid range
    assertTrue(similarity in -1.0..1.0)
}
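
The test above calls `cosineSimilarity`, which this guide does not define. One possible implementation, offered as a sketch and not as the library's own helper, is:

```kotlin
import kotlin.math.sqrt

// Hypothetical helper: cosine of the angle between two vectors,
// computed in double precision to limit rounding error.
fun cosineSimilarity(a: FloatArray, b: FloatArray): Double {
    require(a.size == b.size) { "Vectors must have the same dimensions: ${a.size} vs ${b.size}" }
    var dot = 0.0
    var normA = 0.0
    var normB = 0.0
    for (i in a.indices) {
        dot += a[i].toDouble() * b[i].toDouble()
        normA += a[i].toDouble() * a[i].toDouble()
        normB += b[i].toDouble() * b[i].toDouble()
    }
    // A zero vector has no direction; return 0 instead of dividing by zero.
    if (normA == 0.0 || normB == 0.0) return 0.0
    return dot / (sqrt(normA) * sqrt(normB))
}
```

Floating-point rounding can push the result a hair outside the range from -1 to 1 for nearly parallel vectors, so a strict range assertion may need a small tolerance.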

Important Considerations

Random Embeddings

  • Embeddings are random, not semantically meaningful
  • Each call generates new random values
  • Don't test semantic similarity with fake embeddings
  • Do test structure, storage, retrieval, and processing
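
To make the points above concrete, here is a stand-in random embedding source in plain Kotlin. It is an assumption for illustration only and is not FakeEmbeddingModel's actual implementation; the value range and the optional seed are choices made for this sketch.

```kotlin
import kotlin.random.Random

// Stand-in for a random embedding source. This is an illustrative assumption,
// NOT FakeEmbeddingModel's actual implementation.
// Each component is drawn uniformly from [0, 1).
fun randomEmbedding(dimensions: Int, random: Random = Random.Default): FloatArray =
    FloatArray(dimensions) { random.nextFloat() }
```

Two calls to `randomEmbedding(1536)` return arrays of the same size whose contents almost certainly differ, so an assertion on specific values, or on equality between two embeddings of the same text, would be flaky. Passing a seeded `Random(42)` makes this stand-in repeatable. This guide does not say whether FakeEmbeddingModel supports seeding.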

What to Test

✓ Embedding storage and retrieval
✓ Vector database integration
✓ Document processing pipelines
✓ Dimension compatibility
✓ Error handling
✓ Batch processing
✓ API integration structure

✗ Semantic similarity quality
✗ Actual embedding model behavior
✗ Real-world search relevance

Performance Testing

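Fake embeddings are generated without network calls, so they are fast enough to load-test the code around the model. The timing reflects your own pipeline, not a real provider's latency: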

@Test
fun `test embedding performance`() {
    val model = FakeEmbeddingModel()

    val startTime = System.currentTimeMillis()

    // Embed many documents
    val texts = (1..10000).map { "Document $it" }
    val embeddings = model.embed(texts)

    val duration = System.currentTimeMillis() - startTime

    assertEquals(10000, embeddings.size)
    println("Embedded 10k documents in ${duration}ms")
}
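
When several tests need timing, the start/stop bookkeeping can be factored out. The helper below is a hypothetical sketch. It uses `System.nanoTime()`, which is monotonic, where `System.currentTimeMillis()` follows the wall clock and can jump if the system time is adjusted.

```kotlin
// Hypothetical helper: runs the block once and returns its result together
// with the elapsed time in whole milliseconds.
fun <T> timed(block: () -> T): Pair<T, Long> {
    val start = System.nanoTime()
    val result = block()
    val elapsedMs = (System.nanoTime() - start) / 1_000_000
    return result to elapsedMs
}
```

Usage would look like `val (embeddings, ms) = timed { model.embed(texts) }`. A single run includes JVM warm-up, so treat the number as a rough indicator and avoid asserting on a tight threshold.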

Troubleshooting

Wrong Dimensions

Problem: Embeddings have unexpected dimensions.

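Solution: Check how the model is constructed. FakeEmbeddingModel() without arguments uses the default of 1536 dimensions; pass dimensions explicitly to match what your code expects: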

// Explicit dimensions
val model = FakeEmbeddingModel(dimensions = 768)
assertEquals(768, model.embed(Document("test")).size)

Type Mismatches

Problem: Java/Kotlin type compatibility issues.

Solution: Use appropriate types:

// Kotlin: MutableList<FloatArray>
val embeddings: MutableList<FloatArray> = model.embed(texts)

// Java: List<float[]>
List<float[]> embeddings = model.embed(texts);
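
Java's `float[]` is the same type as Kotlin's `FloatArray`, but code around the model sometimes expects `DoubleArray` or `List<Float>` instead. The conversions below are a sketch in plain Kotlin. The function names are hypothetical.

```kotlin
// Widens a FloatArray embedding to DoubleArray for APIs that expect doubles.
fun embeddingToDoubles(embedding: FloatArray): DoubleArray =
    DoubleArray(embedding.size) { i -> embedding[i].toDouble() }

// Boxes a FloatArray into List<Float>, for example before serializing it.
fun embeddingToList(embedding: FloatArray): List<Float> = embedding.toList()

// Narrows a list of doubles back to FloatArray; this can lose precision.
fun doublesToEmbedding(values: List<Double>): FloatArray =
    FloatArray(values.size) { i -> values[i].toFloat() }
```

Boxing into `List<Float>` allocates one object per component, so prefer the primitive arrays on hot paths.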

Spring Bean Not Found

Problem: EmbeddingService bean not available.

Solution: Import FakeAiConfiguration, or define the bean yourself in a @TestConfiguration class:

@SpringBootTest
@Import(FakeAiConfiguration::class)  // Add this
class MyTest { ... }

Next Steps

  • Spring Test Setup Guide - Configure Spring tests
  • FakeEmbeddingModel API - Complete API reference
  • Test Configuration Module - Module details
  • Testing Patterns - More patterns

tessl i tessl/maven-com-embabel-agent--embabel-agent-test-support@0.3.0
