Quarkus LangChain4j Ollama Extension

Quarkus extension for integrating local Ollama language models with LangChain4j, enabling AI-powered applications with chat models, streaming capabilities, embeddings, and function calling support.

Package Information

  • Package Name: quarkus-langchain4j-ollama
  • Group ID: io.quarkiverse.langchain4j
  • Artifact ID: quarkus-langchain4j-ollama
  • Package Type: Maven
  • Language: Java
  • Version: 1.7.4
  • Installation: Add to pom.xml:
<dependency>
    <groupId>io.quarkiverse.langchain4j</groupId>
    <artifactId>quarkus-langchain4j-ollama</artifactId>
    <version>1.7.4</version>
</dependency>

Core Imports

import io.quarkiverse.langchain4j.ollama.*;
import io.quarkiverse.langchain4j.ollama.runtime.config.*;
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.chat.StreamingChatModel;
import dev.langchain4j.model.embedding.EmbeddingModel;

For declarative AI services:

import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.UserMessage;
import io.quarkiverse.langchain4j.RegisterAiService;

Basic Usage

Simple Chat Model via CDI

import jakarta.inject.Inject;
import dev.langchain4j.model.chat.ChatModel;

public class MyService {
    @Inject
    ChatModel chatModel;

    public String chat(String message) {
        return chatModel.chat(message);
    }
}

Configuration in application.properties:

quarkus.langchain4j.ollama.chat-model.model-id=llama3.2
quarkus.langchain4j.ollama.chat-model.temperature=0.7

Declarative AI Service

import io.quarkiverse.langchain4j.RegisterAiService;
import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.UserMessage;

@RegisterAiService
public interface ChatAssistant {
    @SystemMessage("You are a helpful assistant.")
    @UserMessage("Answer this question: {question}")
    String chat(String question);
}

// Usage
@Inject
ChatAssistant assistant;

String answer = assistant.chat("What is Quarkus?");

Embedding Model

import jakarta.inject.Inject;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.data.embedding.Embedding;

@Inject
EmbeddingModel embeddingModel;

Embedding embedding = embeddingModel.embed("Some text to embed").content();
float[] vector = embedding.vector();

Architecture

This extension integrates Ollama with Quarkus using the LangChain4j framework:

  • Model Providers: OllamaEmbeddingModel and OllamaStreamingChatLanguageModel implement LangChain4j interfaces
  • HTTP Client: OllamaClient and OllamaRestApi handle communication with Ollama server
  • CDI Integration: Automatic bean creation via OllamaRecorder for dependency injection
  • Configuration: Comprehensive SmallRye Config integration with named configurations
  • Data Models: Type-safe request/response objects with builder patterns
  • Tool Calling: Native support for function calling and tool execution

Capabilities

Chat Models

Chat models for conversational AI with support for both synchronous and streaming responses. Includes built-in function calling capabilities for tool execution.

// Via CDI injection
@Inject
ChatModel chatModel;

@Inject
StreamingChatModel streamingChatModel;


Embedding Models

Embedding models for generating vector representations of text, useful for semantic search, RAG (Retrieval-Augmented Generation), and similarity analysis.

// Via CDI injection
@Inject
EmbeddingModel embeddingModel;

// Programmatic API
class OllamaEmbeddingModel {
    static Builder builder();
    Response<List<Embedding>> embedAll(List<TextSegment> textSegments);
}


Configuration

Comprehensive configuration system supporting default and named configurations, with extensive options for model behavior, timeouts, logging, and TLS.

Configuration prefix: quarkus.langchain4j.ollama

# Default configuration
quarkus.langchain4j.ollama.base-url=http://localhost:11434
quarkus.langchain4j.ollama.chat-model.model-id=llama3.2
quarkus.langchain4j.ollama.chat-model.temperature=0.8

# Named configuration
quarkus.langchain4j.ollama.my-model.chat-model.model-id=llama3.1
quarkus.langchain4j.ollama.my-model.chat-model.temperature=0.7
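
Client-level options (timeout, request/response logging, TLS) hang off the same prefix and mirror the `OllamaClient` constructor parameters shown below. The exact key names here are assumed from the standard client options; verify them against the extension's generated configuration reference:

```properties
# Client options (key names assumed; check the extension's config reference)
quarkus.langchain4j.ollama.timeout=60s
quarkus.langchain4j.ollama.log-requests=true
quarkus.langchain4j.ollama.log-responses=true
```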


Data Models

Type-safe data models for requests, responses, messages, and options. Includes support for tool calling, image inputs, and extensible message formats.

record ChatRequest(String model, List<Message> messages, List<Tool> tools, Options options, String format, Boolean stream) { }
record ChatResponse(String model, String createdAt, Message message, Boolean done, Integer promptEvalCount, Integer evalCount) { }
record Message(Role role, String content, List<ToolCall> toolCalls, List<String> images, Map<String, Object> additionalFields) { }
record Options(Double temperature, Integer topK, Double topP, Double repeatPenalty, Integer seed, Integer numPredict, Integer numCtx, List<String> stop) { }
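
To illustrate how these records compose, here is a minimal, self-contained sketch. The records are re-declared locally in simplified form for illustration only; the real ones live in `io.quarkiverse.langchain4j.ollama` and carry the full component lists shown above:

```java
import java.util.List;

public class DataModelSketch {
    // Simplified local stand-ins for the extension's records (illustration only).
    enum Role { SYSTEM, USER, ASSISTANT }
    record Message(Role role, String content) { }
    record Options(Double temperature, Integer numPredict) { }
    record ChatRequest(String model, List<Message> messages, Options options, Boolean stream) { }

    // Assemble a non-streaming chat request for a llama3.2 model.
    static ChatRequest sampleRequest() {
        return new ChatRequest(
                "llama3.2",
                List.of(new Message(Role.SYSTEM, "You are terse."),
                        new Message(Role.USER, "What is Quarkus?")),
                new Options(0.7, 256),
                false);
    }

    public static void main(String[] args) {
        ChatRequest request = sampleRequest();
        System.out.println(request.model() + ": " + request.messages().size() + " messages");
    }
}
```

Because these are plain records, request construction is just nested constructor calls; the extension serializes the resulting object graph to the Ollama JSON wire format.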


Tool Calling

Function calling support allowing models to invoke external tools and business logic. Enables agentic workflows and dynamic capability extension.

record Tool(Type type, Function function) { }
record ToolCall(FunctionCall function) { }

// Nested types
record Tool.Function(String name, String description, Parameters parameters) { }
record Tool.Function.Parameters(String type, Map<String, Map<String, Object>> properties, List<String> required) { }
record ToolCall.FunctionCall(String name, Map<String, Object> arguments) { }
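
The `Parameters` record carries a JSON-Schema-style description of the tool's arguments. A self-contained sketch of building one, with the records re-declared locally in simplified form for illustration:

```java
import java.util.List;
import java.util.Map;

public class ToolSketch {
    // Simplified stand-ins for the extension's Tool records (illustration only).
    record Parameters(String type, Map<String, Map<String, Object>> properties, List<String> required) { }
    record Function(String name, String description, Parameters parameters) { }

    // Describe a getWeather(location) tool the model may choose to call.
    static Function weatherTool() {
        Parameters params = new Parameters(
                "object",
                Map.of("location", Map.of(
                        "type", "string",
                        "description", "City name, e.g. San Francisco")),
                List.of("location"));
        return new Function("getWeather", "Get current weather for a location", params);
    }

    public static void main(String[] args) {
        System.out.println(weatherTool().name());
    }
}
```

When the model decides to call the tool, the response carries a `ToolCall.FunctionCall` whose `name` and `arguments` map back onto this schema.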


HTTP Client API

Low-level HTTP client for direct Ollama API access when CDI injection is not available or when fine-grained control is needed.

class OllamaClient {
    OllamaClient(String baseUrl, Duration timeout, boolean logRequests, boolean logResponses, boolean logCurl, String configName, String tlsConfigurationName);
    ChatResponse chat(ChatRequest request);
    Multi<ChatResponse> streamingChat(ChatRequest request);
    EmbeddingResponse embedding(EmbeddingRequest request);
}


Common Usage Patterns

Named Configurations

Use multiple model configurations within the same application:

# Default model
quarkus.langchain4j.ollama.chat-model.model-id=llama3.2

# Fast model for simple tasks
quarkus.langchain4j.ollama.fast.chat-model.model-id=llama3.2:1b
quarkus.langchain4j.ollama.fast.chat-model.temperature=0.5

# Creative model for content generation
quarkus.langchain4j.ollama.creative.chat-model.model-id=llama3.2
quarkus.langchain4j.ollama.creative.chat-model.temperature=1.2

@Inject
@Named("fast")
ChatModel fastModel;

@Inject
@Named("creative")
ChatModel creativeModel;

Streaming Responses

Stream responses for real-time user feedback:

import dev.langchain4j.model.chat.StreamingChatModel;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.response.StreamingChatResponseHandler;
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.data.message.UserMessage;

@Inject
StreamingChatModel streamingModel;

ChatRequest request = ChatRequest.builder()
    .messages(List.of(UserMessage.from("Tell me a story")))
    .build();

streamingModel.chat(request, new StreamingChatResponseHandler() {
    @Override
    public void onPartialResponse(String token) {
        System.out.print(token);
    }

    @Override
    public void onCompleteResponse(ChatResponse response) {
        System.out.println("\n[Complete]");
    }

    @Override
    public void onError(Throwable error) {
        error.printStackTrace();
    }
});

Function Calling with AI Services

import dev.langchain4j.agent.tool.Tool;

public class WeatherTools {
    @Tool("Get current weather for a location")
    public String getWeather(String location) {
        return "Sunny, 72°F in " + location;
    }
}

@RegisterAiService(tools = WeatherTools.class)
public interface WeatherAssistant {
    String chat(String message);
}

// The model can automatically call getWeather() when needed
@Inject
WeatherAssistant assistant;
String response = assistant.chat("What's the weather in San Francisco?");

RAG with Embeddings

@Inject
EmbeddingModel embeddingModel;

// Embed documents
List<TextSegment> documents = List.of(
    TextSegment.from("Quarkus is a Java framework."),
    TextSegment.from("Ollama runs models locally.")
);
Response<List<Embedding>> embeddings = embeddingModel.embedAll(documents);

// Later, embed query and find similar documents
Embedding queryEmbedding = embeddingModel.embed("What is Quarkus?").content();
// ... compute similarity with document embeddings
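
The similarity step is typically cosine similarity over the raw vectors. A minimal, dependency-free sketch of that computation (the `float[]` inputs would come from `Embedding.vector()`):

```java
public class CosineSimilarity {
    // Cosine similarity: dot(a, b) / (|a| * |b|); 1.0 means identical direction.
    static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        float[] query = {0.1f, 0.9f, 0.2f};
        float[] doc   = {0.2f, 0.8f, 0.1f};
        System.out.println(cosine(query, doc)); // near 1.0 for similar vectors
    }
}
```

Rank the document embeddings by this score against the query embedding and feed the top results into the prompt.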

Error Handling

All model operations may throw runtime exceptions:

  • Connection errors when Ollama server is unavailable
  • Timeout errors based on configured timeout
  • Model-specific errors (e.g., model not found, out of memory)

Handle appropriately:

try {
    String response = chatModel.chat("Hello");
} catch (Exception e) {
    logger.error("Failed to generate response", e);
    // Fallback logic
}

Requirements

  • Ollama server running locally or remotely (default: http://localhost:11434)
  • Required Ollama models downloaded (e.g., ollama pull llama3.2)
  • Quarkus 3.x or later
  • Java 17 or later

Install with Tessl CLI

npx tessl i tessl/maven-io-quarkiverse-langchain4j--quarkus-langchain4j-ollama@1.7.0
