
tessl/maven-org-springframework-ai--spring-ai-client-chat

Spring AI Chat Client provides a fluent API for building AI-powered applications with LLMs, supporting advisors, streaming, structured outputs, and conversation memory


Response Handling

The Spring AI Chat Client supports both synchronous (blocking) and streaming (reactive) execution. In either mode, a response can be read as raw text or retrieved as the full ChatResponse object; synchronous calls can also convert the response to a typed entity.

Imports

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.ChatClientResponse;
import org.springframework.ai.chat.client.ResponseEntity;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.converter.StructuredOutputConverter;
import org.springframework.core.ParameterizedTypeReference;
import org.springframework.lang.Nullable;
import reactor.core.publisher.Flux;

Response Spec Interfaces

CallResponseSpec

Interface for processing synchronous (blocking) responses.

interface CallResponseSpec {
    @Nullable
    String content();
    @Nullable
    ChatResponse chatResponse();
    ChatClientResponse chatClientResponse();
    @Nullable
    <T> T entity(Class<T> type);
    @Nullable
    <T> T entity(ParameterizedTypeReference<T> type);
    @Nullable
    <T> T entity(StructuredOutputConverter<T> structuredOutputConverter);
    <T> ResponseEntity<ChatResponse, T> responseEntity(Class<T> type);
    <T> ResponseEntity<ChatResponse, T> responseEntity(
        ParameterizedTypeReference<T> type
    );
    <T> ResponseEntity<ChatResponse, T> responseEntity(
        StructuredOutputConverter<T> structuredOutputConverter
    );
}

StreamResponseSpec

Interface for processing streaming (reactive) responses.

interface StreamResponseSpec {
    Flux<String> content();
    Flux<ChatResponse> chatResponse();
    Flux<ChatClientResponse> chatClientResponse();
}

Synchronous Response Handling

Getting Text Content

The simplest way to get the response is as a String.

String content();

Example:

String answer = chatClient
    .prompt("What is Spring Framework?")
    .call()
    .content();

System.out.println(answer);
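
Because content() is declared @Nullable in CallResponseSpec, printing or concatenating the result directly can surface a literal "null". The following is a minimal sketch of a null-safe wrapper using only the JDK. ContentDefaults and orFallback are hypothetical names introduced for illustration; they are not part of Spring AI.

```java
import java.util.Optional;

// Hypothetical helper for illustration; not part of Spring AI.
final class ContentDefaults {
    private ContentDefaults() {}

    /** Returns the trimmed content, or the fallback when content is null or blank. */
    static String orFallback(String content, String fallback) {
        return Optional.ofNullable(content)
            .map(String::trim)
            .filter(s -> !s.isEmpty())
            .orElse(fallback);
    }
}
```

With the call above, this could be applied as ContentDefaults.orFallback(answer, "(no answer)") before the value is displayed or stored.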

Getting ChatResponse

Access the full ChatResponse object containing metadata, model information, and usage statistics.

ChatResponse chatResponse();

Example:

import org.springframework.ai.chat.metadata.ChatGenerationMetadata;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.chat.model.Generation;

// chatResponse() is declared @Nullable; check for null in production code
ChatResponse response = chatClient
    .prompt("Explain Java")
    .call()
    .chatResponse();

// Access response details
Generation generation = response.getResult();
String content = generation.getOutput().getText();
ChatGenerationMetadata metadata = generation.getMetadata();

// Access usage information (token counts are Integer values)
var usage = response.getMetadata().getUsage();
Integer promptTokens = usage.getPromptTokens();
Integer completionTokens = usage.getCompletionTokens();
Integer totalTokens = usage.getTotalTokens();

Converting to Typed Entities

Convert the response to a typed Java object. The AI model's output is parsed as JSON and mapped to the specified class.

With Class:

<T> T entity(Class<T> entityClass);

Example:

record Summary(String title, String content, List<String> tags) {}

Summary summary = chatClient
    .prompt("Summarize this article: " + article)
    .call()
    .entity(Summary.class);

System.out.println("Title: " + summary.title());
System.out.println("Tags: " + summary.tags());

With Generic Types:

<T> T entity(ParameterizedTypeReference<T> entityTypeRef);

Use ParameterizedTypeReference for generic types like List<T> or Map<K, V>.

Example:

import org.springframework.core.ParameterizedTypeReference;

List<String> items = chatClient
    .prompt("List 5 programming languages")
    .call()
    .entity(new ParameterizedTypeReference<List<String>>() {});

items.forEach(System.out::println);

Complex Example:

record Task(String name, String priority) {}

Map<String, List<Task>> tasksByProject = chatClient
    .prompt("Organize these tasks by project: " + tasks)
    .call()
    .entity(new ParameterizedTypeReference<Map<String, List<Task>>>() {});
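
A ParameterizedTypeReference is needed for generic targets because of type erasure: there is no List<String>.class literal, so the element type has to be captured another way. The usual idiom is an anonymous subclass whose generic superclass records the full type, which is why the examples above end in {}. The sketch below reproduces that idiom with only the JDK. TypeToken is a hypothetical stand-in written for illustration; it is not Spring's class and omits the validation a real implementation would need.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

// Hypothetical stand-in illustrating the "super type token" idiom.
abstract class TypeToken<T> {
    private final Type type;

    protected TypeToken() {
        // For an anonymous subclass, the generic superclass is TypeToken<...>
        // with the actual type argument preserved in class metadata.
        Type superclass = getClass().getGenericSuperclass();
        if (!(superclass instanceof ParameterizedType)) {
            throw new IllegalStateException("TypeToken must be created with a type argument");
        }
        this.type = ((ParameterizedType) superclass).getActualTypeArguments()[0];
    }

    /** Returns the captured type, including any generic parameters. */
    Type getType() {
        return type;
    }
}
```

Creating new TypeToken<java.util.List<String>>() {} and calling getType().getTypeName() yields java.util.List<java.lang.String>, showing that the element type survives at runtime.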

Getting ResponseEntity

Get both the raw ChatResponse and the converted entity together.

<T> ResponseEntity<ChatResponse, T> responseEntity(Class<T> type);
<T> ResponseEntity<ChatResponse, T> responseEntity(
    ParameterizedTypeReference<T> type
);
<T> ResponseEntity<ChatResponse, T> responseEntity(
    StructuredOutputConverter<T> structuredOutputConverter
);

Example:

import org.springframework.ai.chat.client.ResponseEntity;

record Answer(String text) {}

ResponseEntity<ChatResponse, Answer> responseEntity = chatClient
    .prompt("What is 2+2?")
    .call()
    .responseEntity(Answer.class);

// Access both response and entity
ChatResponse chatResponse = responseEntity.getResponse();
Answer answer = responseEntity.getEntity();

// Get metadata from response
var usage = chatResponse.getMetadata().getUsage();
System.out.println("Tokens used: " + usage.getTotalTokens());

// Use the entity
System.out.println("Answer: " + answer.text());

Streaming Response Handling

Streaming responses use Project Reactor's Flux for reactive processing. Content is delivered incrementally as it's generated by the model.

Streaming Text Content

Flux<String> content();

Example:

Flux<String> stream = chatClient
    .prompt("Tell me a long story")
    .stream()
    .content();

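// Option 1: print chunks as they arrive
stream.subscribe(chunk -> System.out.print(chunk));

// Option 2 (use instead of option 1): collect all chunks into one String.
// Subscribing to the same Flux twice would typically send the request twice.
String complete = stream.collectList()
    .map(chunks -> String.join("", chunks))
    .block();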

With Buffering:

import java.time.Duration;

Flux<String> stream = chatClient
    .prompt("Write an essay")
    .stream()
    .content();

stream
    .buffer(Duration.ofMillis(100)) // Group chunks into 100 ms batches
    .subscribe(chunks -> {
        // chunks is a List<String> holding everything received in the window
        String buffered = String.join("", chunks);
        System.out.print(buffered);
    });
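
Time-based buffering reduces the number of print calls, but chunk and batch boundaries still fall at arbitrary positions, often mid-word or mid-line. When a consumer needs complete lines (for example, to render list items one at a time), the chunks have to be reassembled. The following is a minimal JDK-only sketch of that reassembly. LineAssembler is a hypothetical helper for illustration, not part of Spring AI or Reactor, and it is not thread-safe.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Spring AI or Reactor.
final class LineAssembler {
    private final StringBuilder pending = new StringBuilder();

    /** Accepts one streamed chunk and returns the lines it completed, in order. */
    List<String> accept(String chunk) {
        pending.append(chunk);
        List<String> lines = new ArrayList<>();
        int newline;
        while ((newline = pending.indexOf("\n")) >= 0) {
            lines.add(pending.substring(0, newline));
            pending.delete(0, newline + 1); // drop the line and its terminator
        }
        return lines;
    }

    /** Returns the trailing partial line (possibly empty) once the stream has completed. */
    String flush() {
        String rest = pending.toString();
        pending.setLength(0);
        return rest;
    }
}
```

In a reactive pipeline, one way to apply it would be stream.flatMapIterable(assembler::accept), followed by a call to flush() when the stream completes.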

Streaming ChatResponse

Stream ChatResponse objects as they arrive.

Flux<ChatResponse> chatResponse();

Example:

Flux<ChatResponse> stream = chatClient
    .prompt("Generate text")
    .stream()
    .chatResponse();

stream.subscribe(response -> {
    String content = response.getResult()
        .getOutput()
        .getText();
    System.out.print(content);
});

Streaming ChatClientResponse

Stream ChatClientResponse objects containing both ChatResponse and context.

Flux<ChatClientResponse> chatClientResponse();

Example:

import org.springframework.ai.chat.client.ChatClientResponse;

Flux<ChatClientResponse> stream = chatClient
    .prompt("Generate a report")
    .stream()
    .chatClientResponse();

stream.subscribe(clientResponse -> {
    ChatResponse response = clientResponse.chatResponse();
    if (response != null) {
        String content = response.getResult()
            .getOutput()
            .getText();
        System.out.print(content);
    }

    // Access context for advisor-shared data
    Map<String, Object> context = clientResponse.context();
});

Note: For structured output from streaming, you need to aggregate the stream first and then parse the complete response. The streaming API does not support incremental entity parsing.
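
The aggregate-then-parse order described in the note can be sketched without any framework types. In the sketch below, StreamedEntityParser and parseAggregated are hypothetical names, and the parser is passed in as a plain function so the example stays self-contained. In an application it would presumably be a StructuredOutputConverter's convert method, but that wiring is an assumption and is not shown.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical helper for illustration; not part of Spring AI.
final class StreamedEntityParser {
    private StreamedEntityParser() {}

    /**
     * Joins all streamed chunks into the complete response text first,
     * then runs the parser exactly once on the result.
     */
    static <T> T parseAggregated(List<String> chunks, Function<String, T> parser) {
        String complete = String.join("", chunks);
        return parser.apply(complete);
    }
}
```

The order matters because a single chunk is rarely valid on its own. If a model streams the number 42 as the chunks "4" and "2", parsing each chunk separately gives 4 and 2, while parsing the aggregated text gives 42. The same applies to JSON, where an individual chunk is usually an incomplete document.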

Aggregating Streaming Responses

The ChatClientMessageAggregator utility helps aggregate streaming responses into complete messages.

class ChatClientMessageAggregator {
    Flux<ChatClientResponse> aggregateChatClientResponse(
        Flux<ChatClientResponse> chatClientResponses,
        Consumer<ChatClientResponse> aggregationHandler
    );
}

Example:

import org.springframework.ai.chat.client.ChatClientMessageAggregator;
import org.springframework.ai.chat.client.ChatClientResponse;

// Any Flux<ChatClientResponse> can be aggregated; this one comes from the streaming API
Flux<ChatClientResponse> stream = chatClient
    .prompt("Generate a report")
    .stream()
    .chatClientResponse();

ChatClientMessageAggregator aggregator = new ChatClientMessageAggregator();
Flux<ChatClientResponse> aggregated = aggregator
    .aggregateChatClientResponse(
        stream,
        completeResponse -> {
            // Called when streaming completes
            System.out.println("Complete response: " + completeResponse);
        }
    );

aggregated.subscribe(response -> {
    // Process each chunk
});

Handling Errors

Synchronous Error Handling

try {
    String response = chatClient
        .prompt("What is AI?")
        .call()
        .content();
} catch (Exception e) {
    System.err.println("Error: " + e.getMessage());
    // Handle error
}
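
The streaming example in the next section substitutes a fallback value when the stream fails. A comparable pattern for blocking calls can be sketched as below. SafeCall and orFallback are hypothetical names for illustration, not part of Spring AI, and the sketch assumes that failures surface as unchecked exceptions.

```java
import java.util.function.Supplier;

// Hypothetical helper for illustration; not part of Spring AI.
final class SafeCall {
    private SafeCall() {}

    /**
     * Runs the call and returns its result, or the fallback when the call
     * throws a RuntimeException or returns null.
     */
    static <T> T orFallback(Supplier<T> call, T fallback) {
        try {
            T result = call.get();
            return result != null ? result : fallback;
        } catch (RuntimeException e) {
            System.err.println("Call failed, using fallback: " + e.getMessage());
            return fallback;
        }
    }
}
```

It could wrap the call above as SafeCall.orFallback(() -> chatClient.prompt("What is AI?").call().content(), "Fallback response"). Swallowing every exception is a deliberate simplification; production code would usually distinguish retryable failures from permanent ones.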

Streaming Error Handling

Flux<String> stream = chatClient
    .prompt("Generate text")
    .stream()
    .content();

stream
    .doOnError(error -> {
        System.err.println("Stream error: " + error.getMessage());
    })
    .onErrorResume(error -> {
        // Provide fallback
        return Flux.just("Fallback response");
    })
    .subscribe(chunk -> System.out.print(chunk));

ResponseEntity Type

The ResponseEntity record wraps both the raw response and the converted entity.

record ResponseEntity<R, E>(
    @Nullable R response,
    @Nullable E entity
) {
    R getResponse();
    E getEntity();
}

Type Parameters:

  • R - Response type (typically ChatResponse)
  • E - Entity type (your custom class)

Example:

record Data(String value) {}

ResponseEntity<ChatResponse, Data> entity = chatClient
    .prompt("Get data")
    .call()
    .responseEntity(Data.class);

// Access response metadata
ChatResponse response = entity.getResponse();
if (response != null) {
    var usage = response.getMetadata().getUsage();
    System.out.println("Tokens: " + usage.getTotalTokens());
}

// Access converted entity
Data data = entity.getEntity();
System.out.println("Value: " + data.value());

Complete Examples

Synchronous with Structured Output

record WeatherInfo(
    String location,
    double temperature,
    String conditions,
    List<String> forecast
) {}

WeatherInfo weather = chatClient
    .prompt("What's the weather in Paris?")
    .call()
    .entity(WeatherInfo.class);

System.out.println(weather.location() + ": " + weather.temperature() + "°C");
weather.forecast().forEach(day -> System.out.println("  - " + day));

Streaming with Real-time Display

Flux<String> story = chatClient
    .prompt("Write a short story about a robot")
    .stream()
    .content();

// Print in real-time
story.subscribe(
    chunk -> System.out.print(chunk),
    error -> System.err.println("Error: " + error),
    () -> System.out.println("\n[Story complete]")
);

ResponseEntity with Metrics

record Analysis(String sentiment, double confidence, List<String> keywords) {}

ResponseEntity<ChatResponse, Analysis> result = chatClient
    .prompt("Analyze: " + text)
    .call()
    .responseEntity(Analysis.class);

// Log metrics
ChatResponse response = result.getResponse();
System.out.println("Tokens used: " +
    response.getMetadata().getUsage().getTotalTokens());

// Use analysis
Analysis analysis = result.getEntity();
System.out.println("Sentiment: " + analysis.sentiment() +
    " (confidence: " + analysis.confidence() + ")");

Streaming with Aggregation

import reactor.core.publisher.Mono;

Flux<String> stream = chatClient
    .prompt("Generate a report")
    .stream()
    .content();

// Collect all chunks into single string
Mono<String> complete = stream
    .collect(StringBuilder::new, StringBuilder::append)
    .map(StringBuilder::toString);

String fullReport = complete.block();
System.out.println(fullReport);

Install with Tessl CLI

npx tessl i tessl/maven-org-springframework-ai--spring-ai-client-chat@1.1.0
