tessl/maven-dev-langchain4j--langchain4j-ollama

Java integration library enabling LangChain4j applications to use Ollama's local language models with support for chat, streaming, embeddings, and advanced reasoning features

docs/chat-models.md

Chat Models

Chat models provide conversational AI capabilities with full context management, supporting both synchronous and streaming interactions.

OllamaChatModel

Synchronous chat model for blocking request/response interactions.

Class Signature

package dev.langchain4j.model.ollama;

public class OllamaChatModel
    extends OllamaBaseChatModel
    implements ChatModel

Thread Safety: Immutable after build(); safe for concurrent requests from multiple threads

Nullability: Instance never null after successful build()

Creating an Instance

public static OllamaChatModel.OllamaChatModelBuilder builder()

Returns a builder for creating OllamaChatModel instances.

Returns: Fresh OllamaChatModelBuilder instance

  • Never null
  • Not thread-safe (each thread needs own builder)

Example:

OllamaChatModel model = OllamaChatModel.builder()
    .baseUrl("http://localhost:11434")
    .modelName("llama2")
    .temperature(0.7)
    .maxRetries(3)
    .build();

Public Methods

doChat

public ChatResponse doChat(ChatRequest chatRequest)

Executes a synchronous chat request and returns the complete response.

Parameters:

  • chatRequest - The chat request containing messages and parameters (must not be null)

Returns: ChatResponse containing the AI's response message and metadata

  • Never null
  • Always contains at least one message
  • May contain thinking text if enabled

Throws:

  • IllegalArgumentException - If chatRequest is null
  • RuntimeException - If the request times out, network connectivity fails, or the Ollama server returns an error (model not found, etc.). Because doChat declares no checked exceptions, timeout and I/O failures surface wrapped in unchecked exceptions.

Thread Safety: Safe for concurrent calls; no shared mutable state

Retry Behavior: Automatically retries on transient failures up to maxRetries times

Example:

import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.data.message.UserMessage;

try {
    ChatRequest request = ChatRequest.builder()
        .messages(UserMessage.from("Tell me a joke"))
        .build();

    ChatResponse response = model.doChat(request);
    System.out.println(response.aiMessage().text());

} catch (RuntimeException e) {
    // Timeouts, network failures, and server errors (e.g., model not found)
    // all surface as unchecked exceptions, since doChat declares no checked ones
    System.err.println("Request failed: " + e.getMessage());
}
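The retry behavior described above can be approximated with a plain-Java sketch. `withRetries` here is a hypothetical helper for illustration only, not the library's internal implementation; with `maxRetries = 2` there are up to three attempts in total.

```java
import java.util.concurrent.Callable;

public class RetrySketch {
    // Hypothetical sketch: retry up to maxRetries times, then rethrow.
    // A real implementation would retry only transient failures.
    static <T> T withRetries(Callable<T> call, int maxRetries) throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e;
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Fails twice, then succeeds: maxRetries=2 allows up to 3 attempts
        String result = withRetries(() -> {
            if (++calls[0] < 3) throw new RuntimeException("transient");
            return "ok";
        }, 2);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```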

defaultRequestParameters

public OllamaChatRequestParameters defaultRequestParameters()

Returns the default request parameters configured for this model.

Returns: OllamaChatRequestParameters - The default parameters

  • Never null
  • Immutable
  • Returns copy; modifications don't affect model

Example:

OllamaChatRequestParameters defaults = model.defaultRequestParameters();
Double temperature = defaults.temperature();  // May be null if not set

listeners

public List<ChatModelListener> listeners()

Returns the list of registered chat model listeners for observability.

Returns: List<ChatModelListener> - Registered listeners

  • Never null
  • May be empty
  • Immutable list
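For observability, listeners are registered at build time and then visible through this accessor. A hedged sketch, assuming the builder's `listeners(...)` method and the `ChatModelListener` interface from `dev.langchain4j.model.chat.listener` as in upstream LangChain4j:

```java
import dev.langchain4j.model.chat.listener.ChatModelListener;
import dev.langchain4j.model.chat.listener.ChatModelRequestContext;
import dev.langchain4j.model.ollama.OllamaChatModel;
import java.util.List;

ChatModelListener logging = new ChatModelListener() {
    @Override
    public void onRequest(ChatModelRequestContext context) {
        System.out.println("Chat request sent");
    }
};

OllamaChatModel model = OllamaChatModel.builder()
    .baseUrl("http://localhost:11434")
    .modelName("llama2")
    .listeners(List.of(logging))
    .build();

List<ChatModelListener> registered = model.listeners();  // contains the listener above
```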

provider

public ModelProvider provider()

Returns the model provider identifier.

Returns: ModelProvider.OLLAMA - Always returns OLLAMA constant

supportedCapabilities

public Set<Capability> supportedCapabilities()

Returns the set of capabilities supported by this model.

Returns: Set<Capability> - Supported capabilities

  • Never null
  • May be empty if not configured
  • Immutable set
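A hedged sketch of branching on capabilities and checking the provider, assuming the `Capability` enum from `dev.langchain4j.model.chat` and `ModelProvider` from `dev.langchain4j.model` as in upstream LangChain4j:

```java
import dev.langchain4j.model.ModelProvider;
import dev.langchain4j.model.chat.Capability;
import java.util.Set;

Set<Capability> caps = model.supportedCapabilities();
if (caps.contains(Capability.RESPONSE_FORMAT_JSON_SCHEMA)) {
    // Safe to request JSON-schema-constrained responses
}

assert model.provider() == ModelProvider.OLLAMA;
```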

OllamaChatModel.OllamaChatModelBuilder

Builder for configuring and creating OllamaChatModel instances.

Class Signature

public static class OllamaChatModelBuilder
    extends OllamaBaseChatModel.Builder<OllamaChatModel, OllamaChatModelBuilder>

Thread Safety: Not thread-safe; each thread must use its own builder instance

Builder Methods

maxRetries

public OllamaChatModelBuilder maxRetries(Integer maxRetries)

Sets the maximum number of retry attempts for failed requests.

Parameters:

  • maxRetries - Maximum retry attempts
    • Valid range: >= 0
    • Default if not set: 2
    • Null means use default

Returns: This builder instance (never null)

Throws:

  • IllegalArgumentException - If maxRetries < 0

Note: Retry only applies to transient failures (network errors, timeouts); server errors (model not found) are not retried

Example:

OllamaChatModel model = OllamaChatModel.builder()
    .maxRetries(5)
    .build();

modelName

public OllamaChatModelBuilder modelName(String modelName)

Sets the name of the Ollama model to use.

Parameters:

  • modelName - Model name (e.g., "llama2", "mistral", "deepseek-r1")
    • Must not be null or empty
    • Must be available in Ollama server
    • Can include tag (e.g., "llama2:13b")

Returns: This builder instance (never null)

Throws:

  • IllegalStateException at build() - If modelName not set
  • RuntimeException at runtime - If model not found on server

Example:

OllamaChatModel model = OllamaChatModel.builder()
    .modelName("llama2:13b")
    .build();

temperature

public OllamaChatModelBuilder temperature(Double temperature)

Sets the sampling temperature for randomness control.

Parameters:

  • temperature - Temperature value
    • Valid range: 0.0-2.0+ (higher values possible but not recommended)
    • Default if not set: Model-specific default
    • 0.0 = deterministic (with same seed)
    • 0.7-0.9 = balanced
    • > 1.0 = very creative/random
    • Null means use model default

Returns: This builder instance (never null)

Example:

OllamaChatModel model = OllamaChatModel.builder()
    .temperature(0.7)
    .build();

think

public OllamaChatModelBuilder think(Boolean think)

Controls thinking/reasoning mode for models like DeepSeek R1.

Parameters:

  • think - Thinking mode
    • true: LLM thinks and returns thoughts in separate thinking field
    • false: LLM does not think
    • null (default): Reasoning LLMs prepend thoughts delimited by <think> tags
    • Only effective for reasoning-capable models

Returns: This builder instance (never null)

See also: returnThinking() to control whether thinking text is returned

Example:

OllamaChatModel model = OllamaChatModel.builder()
    .modelName("deepseek-r1")
    .think(true)
    .build();

returnThinking

public OllamaChatModelBuilder returnThinking(Boolean returnThinking)

Controls whether thinking/reasoning text is returned in AiMessage.thinking() and whether streaming onPartialThinking callbacks are invoked.

Parameters:

  • returnThinking - Whether to parse and return thinking text
    • Default if not set: false
    • true: Thinking text returned in AiMessage.thinking()
    • false: Thinking text not returned
    • Null means use default

Returns: This builder instance (never null)

Note: This only controls whether to return thinking text; it does not enable thinking. Use think() to enable thinking.

Example:

OllamaChatModel model = OllamaChatModel.builder()
    .think(true)
    .returnThinking(true)
    .build();

ChatResponse response = model.doChat(request);
String thinking = response.aiMessage().thinking();  // Non-null if thinking occurred

build

public OllamaChatModel build()

Builds and returns the configured OllamaChatModel instance.

Returns: Configured OllamaChatModel

  • Never null
  • Immutable
  • Thread-safe

Throws:

  • IllegalStateException - If required parameters missing (e.g., modelName)
  • IllegalArgumentException - If parameter values invalid

Example:

try {
    OllamaChatModel model = OllamaChatModel.builder()
        .modelName("llama2")
        .build();
} catch (IllegalStateException e) {
    System.err.println("Missing required parameter: " + e.getMessage());
}

OllamaStreamingChatModel

Streaming chat model for real-time token-by-token responses.

Class Signature

package dev.langchain4j.model.ollama;

public class OllamaStreamingChatModel
    extends OllamaBaseChatModel
    implements StreamingChatModel

Thread Safety: Immutable after build(); safe for concurrent requests

Streaming Threading: Callbacks invoked on HTTP client thread; ensure thread-safe callback implementations

Creating an Instance

public static OllamaStreamingChatModel.OllamaStreamingChatModelBuilder builder()

Returns a builder for creating OllamaStreamingChatModel instances.

Returns: Fresh OllamaStreamingChatModelBuilder instance (not thread-safe)

Example:

OllamaStreamingChatModel model = OllamaStreamingChatModel.builder()
    .baseUrl("http://localhost:11434")
    .modelName("llama2")
    .build();

Public Methods

doChat

public void doChat(ChatRequest chatRequest, StreamingChatResponseHandler handler)

Executes a streaming chat request, invoking the handler as tokens arrive.

Parameters:

  • chatRequest - The chat request containing messages and parameters (must not be null)
  • handler - Handler for streaming response callbacks (must not be null)

Returns: void - Method returns immediately; response arrives via callbacks

Throws:

  • IllegalArgumentException - If chatRequest or handler is null

Error Handling: Errors during streaming trigger handler.onError(Throwable)

Thread Safety:

  • Safe for concurrent calls
  • Handler callbacks invoked on HTTP client thread
  • Ensure handler implementation is thread-safe if shared

No Retry: Streaming operations do not automatically retry on failure

Example:

import dev.langchain4j.model.chat.response.StreamingChatResponseHandler;
import dev.langchain4j.model.chat.response.PartialThinking;
import dev.langchain4j.model.chat.response.ChatResponse;

model.doChat(request, new StreamingChatResponseHandler() {
    @Override
    public void onPartialResponse(String partialResponse) {
        System.out.print(partialResponse);  // Must be thread-safe
    }

    @Override
    public void onPartialThinking(PartialThinking thinking) {
        // Called when thinking text arrives (if returnThinking=true)
        System.out.print("[thinking] " + thinking.text());
    }

    @Override
    public void onCompleteResponse(ChatResponse response) {
        System.out.println("\n[Done]");
    }

    @Override
    public void onError(Throwable error) {
        System.err.println("Error: " + error.getMessage());
    }
});
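A common pattern is bridging the callback API to a blocking result with CompletableFuture. This self-contained sketch uses a stand-in `TokenHandler` interface and a `fakeStream` method so it runs without a server; with the real model you would complete the future from `onCompleteResponse` and `onError` instead.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

interface TokenHandler {
    void onToken(String token);
    void onComplete(String full);
    void onError(Throwable t);
}

public class StreamingBridge {
    // Stand-in for model.doChat(request, handler): delivers tokens, then completes
    static void fakeStream(List<String> tokens, TokenHandler handler) {
        StringBuilder sb = new StringBuilder();
        for (String t : tokens) {
            handler.onToken(t);
            sb.append(t);
        }
        handler.onComplete(sb.toString());
    }

    public static void main(String[] args) {
        CompletableFuture<String> future = new CompletableFuture<>();
        fakeStream(List.of("Hello", ", ", "world"), new TokenHandler() {
            @Override public void onToken(String token) { System.out.print(token); }
            @Override public void onComplete(String full) { future.complete(full); }
            @Override public void onError(Throwable t) { future.completeExceptionally(t); }
        });
        String result = future.join();  // blocks until the stream completes
        System.out.println();
        System.out.println("result=" + result);
    }
}
```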

Usage Examples

Basic Synchronous Chat

import dev.langchain4j.model.ollama.OllamaChatModel;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.data.message.UserMessage;

OllamaChatModel model = OllamaChatModel.builder()
    .baseUrl("http://localhost:11434")
    .modelName("llama2")
    .temperature(0.7)
    .build();

try {
    ChatRequest request = ChatRequest.builder()
        .messages(UserMessage.from("What is 2+2?"))
        .build();

    ChatResponse response = model.doChat(request);
    System.out.println(response.aiMessage().text());

} catch (RuntimeException e) {
    // Network and server failures surface as unchecked exceptions
    System.err.println("Request failed: " + e.getMessage());
}

Multi-turn Conversation

import dev.langchain4j.data.message.ChatMessage;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.data.message.AiMessage;
import java.util.ArrayList;
import java.util.List;

OllamaChatModel model = OllamaChatModel.builder()
    .modelName("llama2")
    .build();

List<ChatMessage> history = new ArrayList<>();

// First turn
history.add(UserMessage.from("Hi, I'm learning Java."));
ChatRequest request1 = ChatRequest.builder().messages(history).build();
ChatResponse response1 = model.doChat(request1);
history.add(response1.aiMessage());

System.out.println("AI: " + response1.aiMessage().text());

// Second turn
history.add(UserMessage.from("Can you recommend a good book?"));
ChatRequest request2 = ChatRequest.builder().messages(history).build();
ChatResponse response2 = model.doChat(request2);
history.add(response2.aiMessage());

System.out.println("AI: " + response2.aiMessage().text());

Note: Models are stateless; the caller must maintain conversation history
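Because the model is stateless, an unbounded history grows the prompt with every turn. A minimal self-contained sketch of window-trimming (plain strings stand in for ChatMessage to keep it runnable; LangChain4j's ChatMemory implementations provide a production version of this idea):

```java
import java.util.ArrayList;
import java.util.List;

public class WindowMemory {
    // Keep only the most recent maxMessages entries
    static List<String> trim(List<String> history, int maxMessages) {
        int from = Math.max(0, history.size() - maxMessages);
        return new ArrayList<>(history.subList(from, history.size()));
    }

    public static void main(String[] args) {
        List<String> history = new ArrayList<>(List.of("u1", "a1", "u2", "a2", "u3"));
        List<String> window = trim(history, 4);
        System.out.println(window);  // keeps the 4 most recent messages
    }
}
```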

Best Practices

1. Choose the Right Model Type

// For simple request/response
OllamaChatModel model = OllamaChatModel.builder()
    .modelName("llama2")
    .build();

// For real-time user interfaces
OllamaStreamingChatModel streamingModel = OllamaStreamingChatModel.builder()
    .modelName("llama2")
    .build();

2. Configure Retry Logic

import java.time.Duration;

OllamaChatModel model = OllamaChatModel.builder()
    .modelName("llama2")
    .maxRetries(3)  // For production, use 3-5 retries
    .timeout(Duration.ofMinutes(2))
    .build();

3. Implement Proper Error Handling

import java.io.IOException;
import java.net.http.HttpTimeoutException;

try {
    ChatResponse response = model.doChat(request);
    // Process response
} catch (RuntimeException e) {
    // doChat declares no checked exceptions; inspect the cause to
    // distinguish failure modes (cause types depend on the HTTP client)
    if (e.getCause() instanceof HttpTimeoutException) {
        logger.error("Request timed out", e);
        // Retry or fall back
    } else if (e.getCause() instanceof IOException) {
        logger.error("Network error", e);
        // Check connectivity
    } else {
        logger.error("Server error", e);
        // Check model availability
    }
}

4. Thread-Safe Handler Implementation

import dev.langchain4j.model.chat.response.StreamingChatResponseHandler;
import dev.langchain4j.model.chat.response.ChatResponse;
import java.util.concurrent.atomic.AtomicReference;

// Handler must be thread-safe if shared across requests
class ThreadSafeHandler implements StreamingChatResponseHandler {
    private final AtomicReference<String> response = new AtomicReference<>("");

    @Override
    public void onPartialResponse(String partial) {
        // AtomicReference.updateAndGet is atomic; no extra locking needed
        response.updateAndGet(current -> current + partial);
    }

    @Override
    public void onCompleteResponse(ChatResponse completeResponse) {
        System.out.println("Complete: " + response.get());
    }

    @Override
    public void onError(Throwable error) {
        System.err.println("Error: " + error.getMessage());
    }
}

See Also

  • Request Parameters - Detailed parameter documentation
  • Language Models - Simple text completion
  • Model Management - Managing Ollama models

Install with Tessl CLI

npx tessl i tessl/maven-dev-langchain4j--langchain4j-ollama
