Spring Boot-compatible Ollama integration providing ChatModel and EmbeddingModel implementations for running large language models locally with support for streaming, tool calling, model management, and observability.
Enable models to show their reasoning process before providing answers.
Thinking/reasoning models (like Qwen3, DeepSeek, GPT-OSS) can emit their internal reasoning traces before generating a final answer. This improves answer quality for complex problems and provides transparency into the model's thought process.
Thinking capabilities are available only in specific models, such as the Qwen3 thinking variants, DeepSeek-R1/V3.1, and GPT-OSS.
The ThinkOption interface controls thinking behavior and has two implementations:
package org.springframework.ai.ollama.api;
public sealed interface ThinkOption {
Object toJsonValue();
}

Boolean enable/disable for most thinking models.
public record ThinkBoolean(boolean enabled) implements ThinkOption

Constants:
ThinkBoolean.ENABLED - Enable thinking
ThinkBoolean.DISABLED - Disable thinking

Supported by: Qwen3 thinking variants and DeepSeek models.
String-based level control for the GPT-OSS model.
public record ThinkLevel(String level) implements ThinkOption

Constants:
ThinkLevel.LOW - Low thinking intensity
ThinkLevel.MEDIUM - Medium thinking intensity
ThinkLevel.HIGH - High thinking intensity

Supported by: the GPT-OSS model.
Valid values: "low", "medium", "high"
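To make the two variants concrete, here is a minimal standalone sketch (my own illustration, not the library source) of how ThinkBoolean and ThinkLevel could map to the wire value sent in the request's think field: a boolean for on/off models, a string level for GPT-OSS.

```java
// Standalone sketch of the ThinkOption hierarchy described above.
// Class and method names mirror the documented API, but this is an
// illustrative re-implementation, not the Spring AI source.
public class ThinkOptionSketch {

    sealed interface ThinkOption permits ThinkBoolean, ThinkLevel {
        Object toJsonValue();
    }

    // Boolean on/off, e.g. for Qwen3 thinking variants and DeepSeek models
    record ThinkBoolean(boolean enabled) implements ThinkOption {
        @Override public Object toJsonValue() { return enabled; }
    }

    // Intensity level, e.g. "low"/"medium"/"high" for gpt-oss
    record ThinkLevel(String level) implements ThinkOption {
        @Override public Object toJsonValue() { return level; }
    }

    public static void main(String[] args) {
        System.out.println(new ThinkBoolean(true).toJsonValue());  // prints: true
        System.out.println(new ThinkLevel("high").toJsonValue());  // prints: high
    }
}
```

The sealed interface lets the request serializer branch exhaustively over the two cases, emitting either a JSON boolean or a JSON string for the same field.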
For Qwen 3, DeepSeek models:
OllamaChatOptions options = OllamaChatOptions.builder()
.model("qwen3:4b-thinking")
.enableThinking() // Enable reasoning traces
.build();
ChatResponse response = chatModel.call(
new Prompt("Solve: If a car travels 60 miles in 90 minutes, what is its speed in mph?", options)
);
// Response includes the thinking process

OllamaChatOptions options = OllamaChatOptions.builder()
.model("qwen3:4b-thinking")
.disableThinking() // Disable for this request
.build();

For the GPT-OSS model:
// Low intensity
OllamaChatOptions options = OllamaChatOptions.builder()
.model("gpt-oss")
.thinkLow()
.build();
// Medium intensity
OllamaChatOptions options = OllamaChatOptions.builder()
.model("gpt-oss")
.thinkMedium()
.build();
// High intensity
OllamaChatOptions options = OllamaChatOptions.builder()
.model("gpt-oss")
.thinkHigh()
.build();

The thinking process is included in the response message:
ChatResponse response = chatModel.call(new Prompt(complexQuestion, options));
// Get the main response
String answer = response.getResult().getOutput().getText();
// Access thinking trace (if available)
AssistantMessage message = response.getResult().getOutput();
String thinkingTrace = (String) message.getMetadata().get("thinking");
if (thinkingTrace != null) {
System.out.println("Model's reasoning:");
System.out.println(thinkingTrace);
System.out.println("\nFinal answer:");
System.out.println(answer);
}

Using OllamaApi directly:
ChatRequest request = ChatRequest.builder("qwen3:4b-thinking")
.messages(List.of(
Message.builder(Role.USER)
.content("What is 15% of 240?")
.build()
))
.enableThinking()
.build();
ChatResponse response = ollamaApi.chat(request);
// Thinking is in the message
String thinking = response.message().thinking();
String content = response.message().content();
System.out.println("Reasoning: " + thinking);
System.out.println("Answer: " + content);

Enable thinking mode (boolean true).
public Builder enableThinking()

Returns: Builder
Example:
OllamaChatOptions options = OllamaChatOptions.builder()
.model("qwen3:4b-thinking")
.enableThinking()
.build();

Disable thinking mode (boolean false).
public Builder disableThinking()

Returns: Builder
Example:
OllamaChatOptions options = OllamaChatOptions.builder()
.model("qwen3:4b-thinking")
.disableThinking()
.build();

Set thinking level to "low" (GPT-OSS only).
public Builder thinkLow()

Returns: Builder
Set thinking level to "medium" (GPT-OSS only).
public Builder thinkMedium()

Returns: Builder
Set thinking level to "high" (GPT-OSS only).
public Builder thinkHigh()

Returns: Builder
Set think option explicitly.
public Builder thinkOption(ThinkOption thinkOption)

Parameters:
thinkOption (ThinkOption): a ThinkBoolean or ThinkLevel instance

Returns: Builder
Example:
OllamaChatOptions options = OllamaChatOptions.builder()
.model("qwen3:4b-thinking")
.thinkOption(ThinkOption.ThinkBoolean.ENABLED)
.build();

The same methods are available on ChatRequest.Builder:
ChatRequest request = ChatRequest.builder("qwen3:4b-thinking")
.messages(messages)
.enableThinking() // or .disableThinking(), .thinkLow(), etc.
.build();

In Ollama 0.12+, thinking-capable models auto-enable thinking:
// These models auto-enable thinking by default:
// - qwen3:*-thinking
// - deepseek-r1
// - deepseek-v3.1
// No need to call enableThinking() - already enabled
OllamaChatOptions options = OllamaChatOptions.builder()
.model("qwen3:4b-thinking")
.build();
// Thinking is enabled by default

Standard models (llama3, mistral, etc.) don't support thinking:
// This has no effect - model doesn't support thinking
OllamaChatOptions options = OllamaChatOptions.builder()
.model("llama3")
.enableThinking() // Ignored
.build();

@Service
public class MathSolver {
private final OllamaChatModel thinkingModel;
public MathSolver(OllamaApi ollamaApi) {
this.thinkingModel = OllamaChatModel.builder()
.ollamaApi(ollamaApi)
.defaultOptions(OllamaChatOptions.builder()
.model("qwen3:4b-thinking")
.enableThinking()
.temperature(0.2) // Lower temp for math
.build())
.build();
}
public record Solution(String reasoning, String answer) {}
public Solution solve(String problem) {
ChatResponse response = thinkingModel.call(new Prompt(problem));
AssistantMessage message = response.getResult().getOutput();
String reasoning = (String) message.getMetadata().get("thinking");
String answer = message.getText();
return new Solution(reasoning, answer);
}
}
// Usage
MathSolver solver = new MathSolver(ollamaApi);
Solution solution = solver.solve(
"A train leaves Chicago at 3pm traveling at 60mph. " +
"Another train leaves New York (800 miles away) at 4pm traveling at 70mph. " +
"When do they meet?"
);
System.out.println("Reasoning:");
System.out.println(solution.reasoning());
System.out.println("\nAnswer:");
System.out.println(solution.answer());

public class ModelComparison {
public void compareModels(String question) {
OllamaApi ollamaApi = OllamaApi.builder().build();
// Standard model
OllamaChatModel standardModel = OllamaChatModel.builder()
.ollamaApi(ollamaApi)
.defaultOptions(OllamaChatOptions.builder()
.model("qwen3:4b") // Non-thinking version
.build())
.build();
// Thinking model
OllamaChatModel thinkingModel = OllamaChatModel.builder()
.ollamaApi(ollamaApi)
.defaultOptions(OllamaChatOptions.builder()
.model("qwen3:4b-thinking")
.enableThinking()
.build())
.build();
// Compare responses
Prompt prompt = new Prompt(question);
System.out.println("=== Standard Model ===");
ChatResponse standardResponse = standardModel.call(prompt);
System.out.println(standardResponse.getResult().getOutput().getText());
System.out.println("\n=== Thinking Model ===");
ChatResponse thinkingResponse = thinkingModel.call(prompt);
AssistantMessage message = thinkingResponse.getResult().getOutput();
String reasoning = (String) message.getMetadata().get("thinking");
if (reasoning != null) {
System.out.println("Reasoning: " + reasoning);
}
System.out.println("Answer: " + message.getText());
}
}

public class AdaptiveChat {
private final OllamaChatModel chatModel;
public AdaptiveChat(OllamaApi ollamaApi) {
this.chatModel = OllamaChatModel.builder()
.ollamaApi(ollamaApi)
.defaultOptions(OllamaChatOptions.builder()
.model("qwen3:4b-thinking")
.build())
.build();
}
public String chat(String message, boolean needsReasoning) {
OllamaChatOptions options = OllamaChatOptions.builder()
.thinkOption(needsReasoning
? ThinkOption.ThinkBoolean.ENABLED
: ThinkOption.ThinkBoolean.DISABLED)
.build();
ChatResponse response = chatModel.call(new Prompt(message, options));
return response.getResult().getOutput().getText();
}
public String complexQuestion(String question) {
return chat(question, true); // Enable thinking
}
public String simpleQuestion(String question) {
return chat(question, false); // Disable thinking for speed
}
}
// Usage
AdaptiveChat chat = new AdaptiveChat(ollamaApi);
// Simple question - fast response without thinking
String greeting = chat.simpleQuestion("Hello, how are you?");
// Complex question - detailed reasoning
String solution = chat.complexQuestion(
"If I invest $10,000 at 5% annual interest, " +
"compounded quarterly, how much will I have after 3 years?"
);

public class GPTOSSThinking {
private final OllamaApi ollamaApi;
public GPTOSSThinking(OllamaApi ollamaApi) {
this.ollamaApi = ollamaApi;
}
public String solveWithLevel(String problem, ThinkOption.ThinkLevel level) {
OllamaChatModel model = OllamaChatModel.builder()
.ollamaApi(ollamaApi)
.defaultOptions(OllamaChatOptions.builder()
.model("gpt-oss")
.thinkOption(level)
.build())
.build();
ChatResponse response = model.call(new Prompt(problem));
return response.getResult().getOutput().getText();
}
public void compareThinkingLevels(String problem) {
System.out.println("=== Low Thinking ===");
System.out.println(solveWithLevel(problem, ThinkOption.ThinkLevel.LOW));
System.out.println("\n=== Medium Thinking ===");
System.out.println(solveWithLevel(problem, ThinkOption.ThinkLevel.MEDIUM));
System.out.println("\n=== High Thinking ===");
System.out.println(solveWithLevel(problem, ThinkOption.ThinkLevel.HIGH));
}
}

Enable thinking for math, logic, and reasoning problems:
OllamaChatOptions options = OllamaChatOptions.builder()
.model("qwen3:4b-thinking")
.enableThinking()
.build();
// Math problems
// Logic puzzles
// Multi-step reasoning
// Strategy planning

Use thinking for code review and debugging:
String codeReview = """
Review this code and explain any issues:
public int divide(int a, int b) {
return a / b;
}
""";
// Model will reason about edge cases, errors, etc.

Help users understand decision reasoning:
String decision = """
I need to choose between two job offers:
- Job A: $100k salary, 2 weeks vacation, close to home
- Job B: $120k salary, 3 weeks vacation, 1 hour commute
What factors should I consider?
""";
// Model shows reasoning about trade-offs

Show step-by-step problem solving:
String teaching = """
Explain how to solve this algebra problem step by step:
Solve for x: 3x + 5 = 2x + 12
""";
// Thinking trace shows each step of the solution
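The compound-interest question posed to AdaptiveChat earlier has a checkable answer, which is useful when validating a thinking model's output. A quick standalone calculation (plain Java, no Spring AI; CompoundInterestCheck is my own helper, not part of the library) using A = P(1 + r/n)^(nt):

```java
// Sanity check for the compound-interest example above:
// A = P * (1 + r/n)^(n*t), with quarterly compounding (n = 4).
public class CompoundInterestCheck {

    static double amount(double principal, double annualRate,
                         int periodsPerYear, int years) {
        return principal * Math.pow(1 + annualRate / periodsPerYear,
                                    (double) periodsPerYear * years);
    }

    public static void main(String[] args) {
        // $10,000 at 5% annual interest, compounded quarterly, for 3 years
        double result = amount(10_000, 0.05, 4, 3);
        System.out.printf("$%.2f%n", result); // about $11,607.55
    }
}
```

Comparing a model's final answer against a deterministic computation like this is a simple way to evaluate whether enabling thinking actually improves accuracy on numeric problems.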