Quarkus LangChain4j Ollama Extension

Quarkus extension for integrating local Ollama language models with LangChain4j, enabling AI-powered applications with chat models, streaming capabilities, embeddings, and function calling support.

Package Information

  • Package Name: quarkus-langchain4j-ollama
  • Group ID: io.quarkiverse.langchain4j
  • Artifact ID: quarkus-langchain4j-ollama
  • Package Type: Maven
  • Language: Java
  • Version: 1.7.4
  • Installation: Add to pom.xml:
<dependency>
    <groupId>io.quarkiverse.langchain4j</groupId>
    <artifactId>quarkus-langchain4j-ollama</artifactId>
    <version>1.7.4</version>
</dependency>

Core Imports

import io.quarkiverse.langchain4j.ollama.*;
import io.quarkiverse.langchain4j.ollama.runtime.config.*;
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.chat.StreamingChatModel;
import dev.langchain4j.model.embedding.EmbeddingModel;

For declarative AI services:

import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.UserMessage;
import io.quarkiverse.langchain4j.RegisterAiService;

Basic Usage

Simple Chat Model via CDI

import jakarta.inject.Inject;
import dev.langchain4j.model.chat.ChatModel;

public class MyService {
    @Inject
    ChatModel chatModel;

    public String chat(String message) {
        return chatModel.chat(message);
    }
}

Configuration in application.properties:

quarkus.langchain4j.ollama.chat-model.model-id=llama3.2
quarkus.langchain4j.ollama.chat-model.temperature=0.7

Declarative AI Service

import io.quarkiverse.langchain4j.RegisterAiService;
import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.UserMessage;

@RegisterAiService
public interface ChatAssistant {
    @SystemMessage("You are a helpful assistant.")
    @UserMessage("Answer this question: {question}")
    String chat(String question);
}

// Usage
@Inject
ChatAssistant assistant;

String answer = assistant.chat("What is Quarkus?");

Embedding Model

import jakarta.inject.Inject;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.data.embedding.Embedding;

@Inject
EmbeddingModel embeddingModel;

Embedding embedding = embeddingModel.embed("Some text to embed").content();
float[] vector = embedding.vector();

Architecture

This extension integrates Ollama with Quarkus using the LangChain4j framework:

  • Model Providers: OllamaEmbeddingModel and OllamaStreamingChatLanguageModel implement LangChain4j interfaces
  • HTTP Client: OllamaClient and OllamaRestApi handle communication with Ollama server
  • CDI Integration: Automatic bean creation via OllamaRecorder for dependency injection
  • Configuration: Comprehensive SmallRye Config integration with named configurations
  • Data Models: Type-safe request/response objects with builder patterns
  • Tool Calling: Native support for function calling and tool execution

Capabilities

Chat Models

Chat models for conversational AI with support for both synchronous and streaming responses. Includes built-in function calling capabilities for tool execution.

// Via CDI injection
@Inject
ChatModel chatModel;

@Inject
StreamingChatModel streamingChatModel;


Embedding Models

Embedding models for generating vector representations of text, useful for semantic search, RAG (Retrieval-Augmented Generation), and similarity analysis.

// Via CDI injection
@Inject
EmbeddingModel embeddingModel;

// Programmatic API
class OllamaEmbeddingModel {
    static Builder builder();
    Response<List<Embedding>> embedAll(List<TextSegment> textSegments);
}


Configuration

Comprehensive configuration system supporting default and named configurations, with extensive options for model behavior, timeouts, logging, and TLS.

Configuration prefix: quarkus.langchain4j.ollama

# Default configuration
quarkus.langchain4j.ollama.base-url=http://localhost:11434
quarkus.langchain4j.ollama.chat-model.model-id=llama3.2
quarkus.langchain4j.ollama.chat-model.temperature=0.8

# Named configuration
quarkus.langchain4j.ollama.my-model.chat-model.model-id=llama3.1
quarkus.langchain4j.ollama.my-model.chat-model.temperature=0.7


Data Models

Type-safe data models for requests, responses, messages, and options. Includes support for tool calling, image inputs, and extensible message formats.

record ChatRequest(String model, List<Message> messages, List<Tool> tools, Options options, String format, Boolean stream) { }
record ChatResponse(String model, String createdAt, Message message, Boolean done, Integer promptEvalCount, Integer evalCount) { }
record Message(Role role, String content, List<ToolCall> toolCalls, List<String> images, Map<String, Object> additionalFields) { }
record Options(Double temperature, Integer topK, Double topP, Double repeatPenalty, Integer seed, Integer numPredict, Integer numCtx, List<String> stop) { }
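Because these are records, instances can be assembled directly through their canonical constructors. A minimal sketch based on the signatures above (the Role enum constant USER is an assumption; unused fields are passed as null):

```java
// Hypothetical illustration; Role.USER is assumed to exist in
// io.quarkiverse.langchain4j.ollama as the user-message role.
Message userMessage = new Message(Role.USER, "Why is the sky blue?", null, null, Map.of());

// Sampling options matching the Options record signature above
Options options = new Options(0.7, 40, 0.9, 1.1, 42, 256, 4096, List.of("\n\n"));

// A non-streaming chat request for the llama3.2 model, no tools, default format
ChatRequest request = new ChatRequest("llama3.2", List.of(userMessage), null, options, null, false);
```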


Tool Calling

Function calling support allowing models to invoke external tools and business logic. Enables agentic workflows and dynamic capability extension.

record Tool(Type type, Function function) { }
record ToolCall(FunctionCall function) { }

// Nested types
record Tool.Function(String name, String description, Parameters parameters) { }
record Tool.Function.Parameters(String type, Map<String, Map<String, Object>> properties, List<String> required) { }
record ToolCall.FunctionCall(String name, Map<String, Object> arguments) { }
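Using the canonical constructors of these records, a tool definition can be assembled by hand. A sketch (the Type enum constant FUNCTION and the JSON-schema-style parameter shape are assumptions inferred from the signatures above):

```java
// Describe a single required "location" string argument, JSON-schema style
Tool.Function.Parameters params = new Tool.Function.Parameters(
        "object",
        Map.of("location", Map.of("type", "string", "description", "City name")),
        List.of("location"));

Tool.Function function = new Tool.Function(
        "getWeather", "Get current weather for a location", params);

// Tool.Type.FUNCTION is assumed to be the enum constant for function tools
Tool weatherTool = new Tool(Tool.Type.FUNCTION, function);
```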


HTTP Client API

Low-level HTTP client for direct Ollama API access when CDI injection is not available or when fine-grained control is needed.

class OllamaClient {
    OllamaClient(String baseUrl, Duration timeout, boolean logRequests, boolean logResponses, boolean logCurl, String configName, String tlsConfigurationName);
    ChatResponse chat(ChatRequest request);
    Multi<ChatResponse> streamingChat(ChatRequest request);
    EmbeddingResponse embedding(EmbeddingRequest request);
}
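A sketch of direct client use with the constructor shown above (message construction mirrors the record signatures from the Data Models section; passing null for unused options and config names is an assumption):

```java
OllamaClient client = new OllamaClient(
        "http://localhost:11434",
        Duration.ofSeconds(60),
        true,   // logRequests
        false,  // logResponses
        false,  // logCurl
        null,   // configName (default configuration)
        null);  // tlsConfigurationName

ChatRequest request = new ChatRequest(
        "llama3.2",
        List.of(new Message(Role.USER, "Hello", null, null, Map.of())),
        null, null, null, false);

ChatResponse response = client.chat(request);
System.out.println(response.message().content());
```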


Common Usage Patterns

Named Configurations

Use multiple model configurations within the same application:

# Default model
quarkus.langchain4j.ollama.chat-model.model-id=llama3.2

# Fast model for simple tasks
quarkus.langchain4j.ollama.fast.chat-model.model-id=llama3.2:1b
quarkus.langchain4j.ollama.fast.chat-model.temperature=0.5

# Creative model for content generation
quarkus.langchain4j.ollama.creative.chat-model.model-id=llama3.2
quarkus.langchain4j.ollama.creative.chat-model.temperature=1.2

Inject the named models using the @ModelName qualifier:

import jakarta.inject.Inject;
import io.quarkiverse.langchain4j.ModelName;

@Inject
@ModelName("fast")
ChatModel fastModel;

@Inject
@ModelName("creative")
ChatModel creativeModel;

Streaming Responses

Stream responses for real-time user feedback:

import dev.langchain4j.model.chat.StreamingChatModel;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.response.StreamingChatResponseHandler;
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.data.message.UserMessage;

@Inject
StreamingChatModel streamingModel;

ChatRequest request = ChatRequest.builder()
    .messages(List.of(UserMessage.from("Tell me a story")))
    .build();

streamingModel.chat(request, new StreamingChatResponseHandler() {
    @Override
    public void onPartialResponse(String token) {
        System.out.print(token);
    }

    @Override
    public void onCompleteResponse(ChatResponse response) {
        System.out.println("\n[Complete]");
    }

    @Override
    public void onError(Throwable error) {
        error.printStackTrace();
    }
});
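Alternatively, streaming can be expressed declaratively in an AI service by returning a Mutiny Multi<String>, which Quarkus LangChain4j supports as a streaming return type; a sketch:

```java
import io.smallrye.mutiny.Multi;
import io.quarkiverse.langchain4j.RegisterAiService;

@RegisterAiService
public interface StreamingAssistant {
    // Each emitted item is a partial token of the response
    Multi<String> stream(String question);
}

// Usage: subscribe and print tokens as they arrive
@Inject
StreamingAssistant assistant;

void run() {
    assistant.stream("Tell me a story")
             .subscribe().with(System.out::print);
}
```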

Function Calling with AI Services

import dev.langchain4j.agent.tool.Tool;

public class WeatherTools {
    @Tool("Get current weather for a location")
    public String getWeather(String location) {
        return "Sunny, 72°F in " + location;
    }
}

@RegisterAiService(tools = WeatherTools.class)
public interface WeatherAssistant {
    String chat(String message);
}

// The model can automatically call getWeather() when needed
@Inject
WeatherAssistant assistant;
String response = assistant.chat("What's the weather in San Francisco?");

RAG with Embeddings

@Inject
EmbeddingModel embeddingModel;

// Embed documents
List<TextSegment> documents = List.of(
    TextSegment.from("Quarkus is a Java framework."),
    TextSegment.from("Ollama runs models locally.")
);
Response<List<Embedding>> embeddings = embeddingModel.embedAll(documents);

// Later, embed query and find similar documents
Embedding queryEmbedding = embeddingModel.embed("What is Quarkus?").content();
// ... compute similarity with document embeddings
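The similarity step can be computed directly on the raw vectors. A minimal cosine-similarity helper in plain Java, independent of LangChain4j:

```java
public class CosineSimilarity {
    // Cosine similarity: dot(a, b) / (|a| * |b|), in [-1, 1] for non-zero vectors
    public static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        float[] v1 = {1f, 0f, 1f};
        float[] v2 = {1f, 0f, 1f};
        float[] v3 = {0f, 1f, 0f};
        System.out.println(cosine(v1, v2)); // identical vectors -> 1.0
        System.out.println(cosine(v1, v3)); // orthogonal vectors -> 0.0
    }
}
```

Rank the document embeddings by this score against the query embedding to retrieve the most relevant segments.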

Error Handling

All model operations may throw runtime exceptions:

  • Connection errors when Ollama server is unavailable
  • Timeout errors based on configured timeout
  • Model-specific errors (e.g., model not found, out of memory)

Handle appropriately:

try {
    String response = chatModel.chat("Hello");
} catch (Exception e) {
    logger.error("Failed to generate response", e);
    // Fallback logic
}
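If the quarkus-smallrye-fault-tolerance extension is on the classpath (an assumption; it is a separate dependency), MicroProfile Fault Tolerance annotations can wrap model calls with retries and a fallback; a sketch:

```java
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;
import org.eclipse.microprofile.faulttolerance.Fallback;
import org.eclipse.microprofile.faulttolerance.Retry;
import org.eclipse.microprofile.faulttolerance.Timeout;

@ApplicationScoped
public class ResilientChatService {
    @Inject
    ChatModel chatModel;

    @Retry(maxRetries = 2)
    @Timeout(30000)          // milliseconds
    @Fallback(fallbackMethod = "fallbackChat")
    public String chat(String message) {
        return chatModel.chat(message);
    }

    // Invoked when retries are exhausted or the call times out
    String fallbackChat(String message) {
        return "The model is currently unavailable. Please try again later.";
    }
}
```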

Requirements

  • Ollama server running locally or remotely (default: http://localhost:11434)
  • Required Ollama models downloaded (e.g., ollama pull llama3.2)
  • Quarkus 3.x or later
  • Java 17 or later

Install with Tessl CLI

npx tessl i tessl/maven-io-quarkiverse-langchain4j--quarkus-langchain4j-ollama@1.7.0