tessl/maven-com-embabel-agent--embabel-agent-test-support

Multi-module test support framework for Embabel Agent applications providing integration testing, mock AI services, and test configuration utilities

Stubbing LLM Calls Guide

Step-by-step guide for stubbing LLM operations in your tests.

What is Stubbing?

Stubbing allows you to control what the mocked LLM returns in your tests. Instead of making real API calls, you define the response the LLM should return for specific inputs.

Prerequisites

  • Test class extends EmbabelMockitoIntegrationTest
  • Basic understanding of predicates and lambda expressions
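
The stub methods in this guide take a predicate over the prompt. As a refresher, and assuming the prompt is matched as a plain String (as the examples in this guide suggest), the sketch below shows how a java.util.function.Predicate<String> is written and evaluated. PredicateBasics is a hypothetical name used for illustration, not a class from the test-support library.

```java
import java.util.function.Predicate;

// Illustrative only: PredicateBasics is a hypothetical name, not a library class.
final class PredicateBasics {

    // A lambda that takes the prompt text and answers true or false.
    static final Predicate<String> MENTIONS_KEYWORD = prompt -> prompt.contains("keyword");

    public static void main(String[] args) {
        // test() evaluates the predicate against a concrete value.
        System.out.println(MENTIONS_KEYWORD.test("a prompt with keyword inside")); // true
        System.out.println(MENTIONS_KEYWORD.test("an unrelated prompt"));          // false
        // negate() flips the result.
        System.out.println(MENTIONS_KEYWORD.negate().test("an unrelated prompt")); // true
    }
}
```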

Basic Stubbing Pattern

Step 1: Identify the Operation

Determine what type of LLM operation your code performs:

| Operation | Use When | Stub Method |
| --- | --- | --- |
| Text generation | LLM returns plain text | whenGenerateText() |
| Object creation | LLM returns structured object | whenCreateObject() |

Step 2: Write the Stub

// For text generation
whenGenerateText(prompt -> prompt.contains("keyword"))
    .thenReturn("Mocked response");

// For object creation
whenCreateObject(prompt -> prompt.contains("keyword"), OutputClass.class)
    .thenReturn(mockedObject);

Step 3: Execute Your Code

String result = myAgent.process("user input");

The stub intercepts the LLM call and returns your mocked response.

Stubbing Text Generation

Simple Text Stub

@Test
void testSimpleText() {
    // Stub: When prompt contains "summarize", return this text
    whenGenerateText(prompt -> prompt.contains("summarize"))
        .thenReturn("This is a summary of the document.");

    // Execute code that uses LLM
    String summary = myAgent.summarize(document);

    // Assert result
    assertEquals("This is a summary of the document.", summary);
}

Multiple Text Stubs

@Test
void testMultipleStubs() {
    // Stub first operation
    whenGenerateText(p -> p.contains("greet"))
        .thenReturn("Hello!");

    // Stub second operation
    whenGenerateText(p -> p.contains("farewell"))
        .thenReturn("Goodbye!");

    // Execute both operations
    String greeting = myAgent.generateGreeting();
    String farewell = myAgent.generateFarewell();

    // Assert both
    assertEquals("Hello!", greeting);
    assertEquals("Goodbye!", farewell);
}

Complex Predicate

@Test
void testComplexPredicate() {
    // Stub with multiple conditions
    whenGenerateText(prompt ->
        prompt.contains("analyze") &&
        prompt.length() > 50 &&
        prompt.toLowerCase().contains("data")
    ).thenReturn("Complex analysis result");

    String result = myAgent.analyzeData(largeDataset);

    assertEquals("Complex analysis result", result);
}
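
When a predicate grows to several conditions, naming each part and combining them with Predicate.and() can make the intent easier to read and to reuse. The sketch below expresses the same three conditions this way. AnalysisPredicates and its constants are hypothetical names, and the approach assumes the stub method accepts a standard java.util.function.Predicate<String>.

```java
import java.util.Locale;
import java.util.function.Predicate;

// Hypothetical helper; the names are for illustration only.
final class AnalysisPredicates {
    static final Predicate<String> ASKS_FOR_ANALYSIS = p -> p.contains("analyze");
    static final Predicate<String> IS_LONG = p -> p.length() > 50;
    static final Predicate<String> MENTIONS_DATA =
        p -> p.toLowerCase(Locale.ROOT).contains("data");

    // Equivalent to chaining the three conditions with &&.
    static final Predicate<String> COMPLEX =
        ASKS_FOR_ANALYSIS.and(IS_LONG).and(MENTIONS_DATA);
}
```

Under that assumption, the stub could be written as whenGenerateText(AnalysisPredicates.COMPLEX), and each part can be unit-tested on its own.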

Stubbing Object Creation

Simple Object Stub

@Test
void testObjectCreation() {
    // Create expected object
    Person expectedPerson = new Person("Alice", 30);

    // Stub object creation
    whenCreateObject(
        prompt -> prompt.contains("extract person"),
        Person.class
    ).thenReturn(expectedPerson);

    // Execute code that extracts person
    Person result = myAgent.extractPerson("Alice is 30 years old");

    // Assert result
    assertEquals(expectedPerson, result);
}
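
The object examples use Person and Address without defining them. A minimal sketch, assuming they are simple value types, is to declare them as Java records (Java 16+). These definitions are hypothetical and only mirror the constructor calls used in this guide.

```java
// Hypothetical domain types assumed by the examples; not part of the library.
record Person(String name, int age) {}

record Address(String street) {}
```

Records get a value-based equals(), so assertEquals also passes when the expected object is built separately from the instance the stub returns.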

Multiple Object Stubs

@Test
void testMultipleObjects() {
    Person person = new Person("Bob", 25);
    Address address = new Address("123 Main St");

    // Stub person extraction
    whenCreateObject(p -> p.contains("person"), Person.class)
        .thenReturn(person);

    // Stub address extraction
    whenCreateObject(p -> p.contains("address"), Address.class)
        .thenReturn(address);

    // Execute
    Person p = myAgent.extractPerson(text);
    Address a = myAgent.extractAddress(text);

    // Assert
    assertEquals(person, p);
    assertEquals(address, a);
}

Advanced Stubbing Techniques

Stub with Interaction Matching

Match on both prompt and model configuration:

@Test
void testWithInteraction() {
    // Stub only when using gpt-4 with temperature 0.0
    whenGenerateText(
        prompt -> prompt.contains("precise"),
        interaction ->
            interaction.getModel().equals("gpt-4") &&
            interaction.getTemperature() == 0.0
    ).thenReturn("Precise result");

    String result = myAgent.preciseModeAnalysis();

    assertEquals("Precise result", result);
}

Conditional Responses

Return different responses based on prompt content:

@Test
void testConditionalResponses() {
    // Stub for positive sentiment
    whenGenerateText(p -> p.contains("happy"))
        .thenReturn("Positive sentiment detected");

    // Stub for negative sentiment
    whenGenerateText(p -> p.contains("sad"))
        .thenReturn("Negative sentiment detected");

    String pos = myAgent.analyzeSentiment("I am happy");
    String neg = myAgent.analyzeSentiment("I am sad");

    assertEquals("Positive sentiment detected", pos);
    assertEquals("Negative sentiment detected", neg);
}

Exception Throwing

Stub to throw exceptions for error testing:

@Test
void testErrorHandling() {
    // Stub to throw exception
    whenGenerateText(p -> p.contains("error"))
        .thenThrow(new RuntimeException("LLM service unavailable"));

    // Assert the exception propagates to the caller
    assertThrows(RuntimeException.class, () -> {
        myAgent.processWithError();
    });
}

Custom Answer Logic

Use .thenAnswer() for dynamic responses:

@Test
void testDynamicResponse() {
    whenGenerateText(p -> true).thenAnswer(invocation -> {
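        String prompt = invocation.getArgument(0);
        // Guard the end index: the real prompt may be shorter than 20 characters
        return "You asked about: " + prompt.substring(0, Math.min(20, prompt.length()));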
    });

    String result = myAgent.process("long prompt text here");

    assertTrue(result.startsWith("You asked about:"));
}

Kotlin Examples

Simple Stub (Kotlin)

@Test
fun `test simple stub`() {
    // Kotlin lambda syntax
    whenGenerateText { it.contains("summarize") }
        .thenReturn("Summary result")

    val result = myAgent.summarize(document)

    assertEquals("Summary result", result)
}

Object Creation (Kotlin)

@Test
fun `test object creation`() {
    val person = Person("Alice", 30)

    whenCreateObject({ it.contains("extract") }, Person::class.java)
        .thenReturn(person)

    val result = myAgent.extractPerson(text)

    assertEquals(person, result)
}

Common Patterns

Pattern 1: Stub All Calls

// Catch-all stub for any prompt
whenGenerateText(p -> true)
    .thenReturn("Default response");

Pattern 2: Sequential Responses

// Return different responses for multiple calls
whenGenerateText(p -> p.contains("step"))
    .thenReturn("Step 1")
    .thenReturn("Step 2")
    .thenReturn("Step 3");
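
In Mockito, chained thenReturn calls are consumed one per invocation, and the last value keeps being returned once the chain is exhausted. Assuming the stub returned by whenGenerateText follows that Mockito convention, the plain-Java model below shows the resulting sequence. SequentialResponses is a hypothetical class written for illustration, not the library's implementation.

```java
import java.util.List;

// Plain-Java model of consecutive stubbing; not the library's implementation.
final class SequentialResponses {
    private final List<String> responses;
    private int calls = 0;

    SequentialResponses(String... responses) {
        if (responses.length == 0) {
            throw new IllegalArgumentException("at least one response is required");
        }
        this.responses = List.of(responses);
    }

    // Returns the next response; repeats the last one when the list runs out.
    String next() {
        int index = Math.min(calls, responses.size() - 1);
        calls++;
        return responses.get(index);
    }
}
```

With three responses configured, a fourth call in this model yields "Step 3" again, which is worth keeping in mind when the code under test loops more often than expected.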

Pattern 3: Partial Match

// Case-insensitive match on part of the prompt
whenGenerateText(p -> p.toLowerCase().contains("search"))
    .thenReturn("Search results");

Pattern 4: Regex Match

// Use regex for complex matching; (?s) lets '.' span line breaks in multi-line prompts
whenGenerateText(p -> p.matches("(?s).*\\d{4}-\\d{2}-\\d{2}.*"))
    .thenReturn("Date-based response");
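
String.matches() must match the entire input, and '.' does not match line terminators unless DOTALL mode is enabled, so a pattern wrapped in ".*" on both sides can miss a date inside a multi-line prompt. The sketch below contrasts that form with Matcher.find(), which searches anywhere in the text. DateMatching is a hypothetical helper name.

```java
import java.util.regex.Pattern;

// Hypothetical helper comparing two ways to look for a yyyy-MM-dd date in a prompt.
final class DateMatching {
    private static final Pattern ISO_DATE = Pattern.compile("\\d{4}-\\d{2}-\\d{2}");

    // matches() must consume the whole string, and '.' stops at line breaks by default.
    static boolean viaMatches(String prompt) {
        return prompt.matches(".*\\d{4}-\\d{2}-\\d{2}.*");
    }

    // find() looks for the pattern anywhere, regardless of line breaks.
    static boolean viaFind(String prompt) {
        return ISO_DATE.matcher(prompt).find();
    }
}
```

A predicate such as p -> DateMatching.viaFind(p) is therefore the safer choice when prompts are built from multi-line templates.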

Troubleshooting

Stub Not Being Used

Problem: Your code doesn't receive the stubbed response.

Solutions:

  1. Check that your predicate matches the actual prompt
  2. Verify the stub is set up before the code under test runs
  3. Ensure you're stubbing the correct operation type (text vs object)

Debug technique:

whenGenerateText(p -> {
    System.out.println("Actual prompt: " + p);
    return p.contains("expected");
}).thenReturn("response");
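
The same debugging idea can be packaged as a reusable wrapper so that any predicate can be traced without editing its body. This is a sketch under the assumption that the stub methods accept a standard java.util.function.Predicate<String>; DebugPredicates and logging are hypothetical names.

```java
import java.util.function.Predicate;

// Hypothetical helper; not part of the test-support library.
final class DebugPredicates {

    // Wraps a predicate so each evaluated prompt and its outcome are printed.
    static Predicate<String> logging(String label, Predicate<String> delegate) {
        return prompt -> {
            boolean matched = delegate.test(prompt);
            System.out.println("[" + label + "] matched=" + matched + " prompt=" + prompt);
            return matched;
        };
    }
}
```

A stub would then wrap its predicate, for example DebugPredicates.logging("summary", p -> p.contains("expected")), and the wrapper can be removed once the predicate matches as intended.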

Multiple Stubs Conflicting

Problem: Multiple stubs match the same prompt.

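Solution: When several stubs match, Mockito uses the most recently defined one, so a later catch-all overrides earlier, more specific stubs. Keep predicates specific:

// Bad: both match prompts containing "text", so the later catch-all wins and "A" is never returned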
whenGenerateText(p -> p.contains("text")).thenReturn("A");
whenGenerateText(p -> true).thenReturn("B");

// Good: Specific conditions
whenGenerateText(p -> p.contains("specific text")).thenReturn("A");
whenGenerateText(p -> p.contains("other text")).thenReturn("B");
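
To make the ordering rule concrete, the plain-Java model below resolves a prompt against registered stubs newest-first, mirroring the "most recently defined matching stub wins" behaviour described above. StubOrderModel is a hypothetical class written for illustration and does not reproduce how Mockito or the library stores stubs internally. It uses a record, so it needs Java 16 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Plain-Java model of "most recently defined matching stub wins"; not the library's code.
final class StubOrderModel {
    private record Stub(Predicate<String> matcher, String response) {}

    private final List<Stub> stubs = new ArrayList<>();

    void stub(Predicate<String> matcher, String response) {
        stubs.add(new Stub(matcher, response));
    }

    // Walks the stubs newest-first and returns the first match.
    Optional<String> resolve(String prompt) {
        for (int i = stubs.size() - 1; i >= 0; i--) {
            Stub stub = stubs.get(i);
            if (stub.matcher().test(prompt)) {
                return Optional.of(stub.response());
            }
        }
        return Optional.empty();
    }
}
```

In this model, registering the catch-all first and the specific stub second makes the catch-all behave as a fallback, whereas the reverse order hides the specific stub entirely.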

Wrong Object Type

Problem: The stub returns the wrong type of object.

Solution: Ensure the class in whenCreateObject matches what your code expects:

// Correct
whenCreateObject(p -> true, Person.class).thenReturn(person);

// Wrong if the code requests a Person: this stub is registered for Employee
whenCreateObject(p -> true, Employee.class).thenReturn(employee);

Best Practices

  1. Be Specific: Use specific predicates to avoid stub conflicts
  2. Keep It Simple: Don't over-complicate predicates
  3. Verify After: Always verify the stub was used (see Verification Guide)
  4. Test Errors: Include tests that stub exceptions
  5. Use Constants: Define common predicates as constants for reuse

// Reusable predicates
private static final Predicate<String> SUMMARIZE_PROMPT =
    p -> p.contains("summarize");

@Test
void testWithReusablePredicate() {
    whenGenerateText(SUMMARIZE_PROMPT).thenReturn("Summary");
    // ...
}

tessl i tessl/maven-com-embabel-agent--embabel-agent-test-support@0.3.0