tessl/maven-io-quarkiverse-langchain4j--quarkus-langchain4j-watsonx

Quarkus extension for integrating IBM watsonx.ai foundation models with LangChain4j. Provides chat models, generation models, streaming models, embedding models, and scoring models for IBM watsonx.ai. Includes comprehensive configuration options, support for tool/function calling, text extraction from documents in Cloud Object Storage, and experimental built-in services for Google search, weather, and web crawling. Designed for enterprise Java applications using the Quarkus framework with built-in dependency injection and native compilation support.


docs/chat-models.md

Chat Models

Modern chat-based text generation models with support for tool/function calling, streaming responses, JSON schema outputs, and multi-turn conversations. These models implement LangChain4j's ChatModel and StreamingChatModel interfaces for seamless integration with the LangChain4j ecosystem.

Capabilities

Synchronous Chat Model

Execute synchronous chat completions with full support for tools, system messages, user messages, and structured outputs.

public class WatsonxChatModel implements dev.langchain4j.model.chat.ChatModel {
    public static Builder builder();
    public ChatResponse doChat(ChatRequest chatRequest);
    public List<ChatModelListener> listeners();
    public ChatRequestParameters defaultRequestParameters();
    public Set<Capability> supportedCapabilities();
    public WatsonxRestApi getClient();
    public String getModelId();
    public String getProjectId();
    public String getSpaceId();
    public String getVersion();
}

Supported Capabilities:

  • Capability.JSON_SCHEMA_RESPONSE_FORMAT - Structured JSON outputs with schema validation

Example Usage:

import io.quarkiverse.langchain4j.watsonx.WatsonxChatModel;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.data.message.SystemMessage;
import dev.langchain4j.model.chat.response.ChatResponse;
import java.net.URL;

// Build model
WatsonxChatModel model = WatsonxChatModel.builder()
    .modelId("meta-llama/llama-4-maverick-17b-128e-instruct-fp8")
    .url(new URL("https://us-south.ml.cloud.ibm.com"))
    .projectId("your-project-id")
    .tokenGenerator(tokenGenerator)
    .temperature(0.7)
    .maxTokens(2048)
    .build();

// Single-turn conversation
ChatResponse response = model.chat(UserMessage.from("What is the capital of France?"));
String answer = response.aiMessage().text();

// Multi-turn conversation with system message
ChatResponse response2 = model.chat(
    SystemMessage.from("You are a helpful geography tutor"),
    UserMessage.from("What is the capital of France?")
);

Multi-turn Conversation Example:

import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.data.message.ChatMessage;
import java.util.ArrayList;
import java.util.List;

List<ChatMessage> messages = new ArrayList<>();
messages.add(SystemMessage.from("You are a helpful assistant"));
messages.add(UserMessage.from("What is 2+2?"));

ChatResponse response1 = model.chat(messages);
messages.add(response1.aiMessage());

// Continue conversation
messages.add(UserMessage.from("What about 3+3?"));
ChatResponse response2 = model.chat(messages);
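For longer conversations, the manual list bookkeeping above can be delegated to LangChain4j's `ChatMemory` abstraction. A minimal sketch using `MessageWindowChatMemory` from the core LangChain4j library (not part of this extension), which evicts the oldest messages once the window is full:

```java
import dev.langchain4j.memory.ChatMemory;
import dev.langchain4j.memory.chat.MessageWindowChatMemory;

// Keep at most the 10 most recent messages; older ones are evicted automatically
ChatMemory memory = MessageWindowChatMemory.withMaxMessages(10);
memory.add(SystemMessage.from("You are a helpful assistant"));
memory.add(UserMessage.from("What is 2+2?"));

ChatResponse first = model.chat(memory.messages());
memory.add(first.aiMessage());

// Continue the conversation; history is trimmed transparently
memory.add(UserMessage.from("What about 3+3?"));
ChatResponse second = model.chat(memory.messages());
```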

Streaming Chat Model

Stream chat responses in real-time for improved user experience with long-form content generation.

public class WatsonxStreamingChatModel implements dev.langchain4j.model.chat.StreamingChatModel {
    public static Builder builder();
    public void doChat(ChatRequest chatRequest, StreamingChatResponseHandler handler);
    public List<ChatModelListener> listeners();
    public ChatRequestParameters defaultRequestParameters();
    public Set<Capability> supportedCapabilities();
    public WatsonxRestApi getClient();
    public String getModelId();
    public String getProjectId();
    public String getSpaceId();
    public String getVersion();
}

Example Usage:

import io.quarkiverse.langchain4j.watsonx.WatsonxStreamingChatModel;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.response.StreamingChatResponseHandler;

WatsonxStreamingChatModel streamingModel = WatsonxStreamingChatModel.builder()
    .modelId("meta-llama/llama-4-maverick-17b-128e-instruct-fp8")
    .url(new URL("https://us-south.ml.cloud.ibm.com"))
    .projectId("your-project-id")
    .tokenGenerator(tokenGenerator)
    .temperature(0.7)
    .maxTokens(2048)
    .build();

streamingModel.chat(
    ChatRequest.builder()
        .messages(List.of(UserMessage.from("Write a short story about a robot")))
        .build(),
    new StreamingChatResponseHandler() {
        private final StringBuilder fullResponse = new StringBuilder();

        @Override
        public void onPartialResponse(String partialResponse) {
            fullResponse.append(partialResponse);
            System.out.print(partialResponse);
        }

        @Override
        public void onCompleteResponse(ChatResponse response) {
            System.out.println("\nTokens used: " + response.tokenUsage().totalTokenCount());
        }

        @Override
        public void onError(Throwable error) {
            System.err.println("Error: " + error.getMessage());
        }
    }
);
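When a caller ultimately needs the complete response, a common pattern is to bridge the callback-based handler to a `CompletableFuture`. A sketch, assuming the same `streamingModel` as above:

```java
import java.util.concurrent.CompletableFuture;

CompletableFuture<ChatResponse> future = new CompletableFuture<>();

streamingModel.chat(
    ChatRequest.builder()
        .messages(List.of(UserMessage.from("Write a short story about a robot")))
        .build(),
    new StreamingChatResponseHandler() {
        @Override
        public void onPartialResponse(String partialResponse) {
            System.out.print(partialResponse); // forward tokens as they arrive
        }

        @Override
        public void onCompleteResponse(ChatResponse response) {
            future.complete(response);
        }

        @Override
        public void onError(Throwable error) {
            future.completeExceptionally(error);
        }
    }
);

ChatResponse complete = future.join(); // blocks until the stream finishes
```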

Chat Model Builder

Configure chat models with extensive customization options.

public static class Builder extends Watsonx.Builder<WatsonxChatModel, Builder> {
    // Inherited base parameters
    public Builder modelId(String modelId);
    public Builder version(String version);
    public Builder spaceId(String spaceId);
    public Builder projectId(String projectId);
    public Builder url(URL url);
    public Builder timeout(Duration timeout);
    public Builder tokenGenerator(TokenGenerator tokenGenerator);
    public Builder logRequests(boolean logRequests);
    public Builder logResponses(boolean logResponses);
    public Builder logCurl(boolean logCurl);
    public Builder listeners(List<ChatModelListener> listeners);

    // Chat-specific parameters
    public Builder toolChoice(ToolChoice toolChoice);
    public Builder toolChoiceName(String toolChoiceName);
    public Builder frequencyPenalty(Double frequencyPenalty);
    public Builder logprobs(Boolean logprobs);
    public Builder topLogprobs(Integer topLogprobs);
    public Builder maxTokens(Integer maxTokens);
    public Builder n(Integer n);
    public Builder presencePenalty(Double presencePenalty);
    public Builder seed(Integer seed);
    public Builder stop(List<String> stop);
    public Builder temperature(Double temperature);
    public Builder topP(Double topP);
    public Builder responseFormat(ResponseFormat responseFormat);
    public Builder responseFormatText(String responseFormatText);

    public WatsonxChatModel build();
}

Parameter Details:

  • modelId (String): Watsonx model identifier

    • Default: "meta-llama/llama-4-maverick-17b-128e-instruct-fp8"
    • Examples: "ibm/granite-13b-chat-v2", "meta-llama/llama-3-70b-instruct"
  • version (String): API version

    • Default: "2025-04-23"
    • Format: "YYYY-MM-DD"
  • spaceId (String): Deployment space ID (mutually exclusive with projectId)

  • projectId (String): Project ID (mutually exclusive with spaceId)

  • url (URL, required): Watsonx API base URL

    • US South: https://us-south.ml.cloud.ibm.com
    • EU Germany: https://eu-de.ml.cloud.ibm.com
    • Japan: https://jp-tok.ml.cloud.ibm.com
  • timeout (Duration): Request timeout

    • Default: 10 seconds
  • tokenGenerator (TokenGenerator, required): Handles IBM Cloud IAM authentication

  • logRequests (boolean): Enable request body logging

    • Default: false
  • logResponses (boolean): Enable response body logging

    • Default: false
  • logCurl (boolean): Log requests as cURL commands

    • Default: false
  • listeners (List<ChatModelListener>): Request/response event listeners

  • toolChoice (ToolChoice): Tool selection strategy

    • Values: ToolChoice.AUTO, ToolChoice.REQUIRED
    • AUTO: Model decides whether to use tools
    • REQUIRED: Model must use at least one tool
  • toolChoiceName (String): Specific tool name to call

    • Forces model to use the named tool
  • frequencyPenalty (Double): Penalize frequent tokens

    • Range: -2.0 to 2.0
    • Default: 0.0
    • Positive values reduce repetition
  • logprobs (Boolean): Return log probabilities

    • Default: false
    • Useful for analyzing model confidence
  • topLogprobs (Integer): Number of top log probabilities to return

    • Range: 0 to 20
    • Requires logprobs=true
  • maxTokens (Integer): Maximum tokens to generate

    • Default: 1024
    • Includes prompt tokens in some models
  • n (Integer): Number of completions to generate

    • Default: 1
    • Generates multiple independent responses
  • presencePenalty (Double): Penalize tokens that have appeared

    • Range: -2.0 to 2.0
    • Default: 0.0
    • Positive values encourage topic diversity
  • seed (Integer): Random seed for reproducibility

    • Makes generations deterministic
  • stop (List<String>): Stop sequences (max 4)

    • Generation stops when any sequence is encountered
  • temperature (Double): Sampling temperature

    • Range: 0.0 to 2.0
    • Default: 1.0
    • Lower = more focused, Higher = more creative
  • topP (Double): Nucleus sampling parameter

    • Range: 0.0 to 1.0
    • Default: 1.0
    • Lower values = more focused sampling
  • responseFormat (ResponseFormat): Structured output format

    • ResponseFormat.TEXT: Plain text response
    • ResponseFormat.JSON: JSON object response
    • ResponseFormat.jsonSchema(schema): JSON with schema validation
  • responseFormatText (String): Response format as text

    • Values: "text", "json_object", "json_schema"

Builder Example:

WatsonxChatModel model = WatsonxChatModel.builder()
    .modelId("meta-llama/llama-4-maverick-17b-128e-instruct-fp8")
    .url(new URL("https://us-south.ml.cloud.ibm.com"))
    .projectId("abc123")
    .tokenGenerator(tokenGenerator)
    .temperature(0.8)
    .maxTokens(2048)
    .frequencyPenalty(0.5)
    .presencePenalty(0.3)
    .stop(List.of("\n\n", "END"))
    .seed(42)
    .toolChoice(ToolChoice.AUTO)
    .logRequests(true)
    .logResponses(true)
    .build();

Tool/Function Calling

Enable models to call external functions/tools for extended capabilities.

Defining Tools:

import dev.langchain4j.agent.tool.Tool;
import dev.langchain4j.agent.tool.P;

public class WeatherTools {
    @Tool("Get current weather for a city")
    public String getCurrentWeather(
        @P("City name") String city,
        @P("Country code") String country
    ) {
        // Implementation
        return "Temperature: 22°C, Condition: Sunny";
    }

    @Tool("Get weather forecast for next 5 days")
    public String getForecast(
        @P("City name") String city
    ) {
        // Implementation
        return "5-day forecast data...";
    }
}

Using Tools with ChatModel:

import dev.langchain4j.service.AiServices;

interface WeatherAssistant {
    String chat(String userMessage);
}

WeatherAssistant assistant = AiServices.builder(WeatherAssistant.class)
    .chatModel(chatModel)
    .tools(new WeatherTools())
    .build();

String response = assistant.chat("What's the weather in Paris?");
// Model will automatically call getCurrentWeather("Paris", "FR") and use result

Manual Tool Specification:

import dev.langchain4j.agent.tool.ToolSpecification;
import dev.langchain4j.agent.tool.ToolExecutionRequest;
import dev.langchain4j.data.message.ToolExecutionResultMessage;

ToolSpecification weatherTool = ToolSpecification.builder()
    .name("get_weather")
    .description("Get current weather for a city")
    .addParameter("city", "string", "City name")
    .addParameter("country", "string", "Country code")
    .build();

ChatRequest request = ChatRequest.builder()
    .messages(List.of(UserMessage.from("What's the weather in Paris?")))
    .toolSpecifications(List.of(weatherTool))
    .build();

ChatResponse response = chatModel.chat(request);

// Check if model wants to call tool
if (response.aiMessage().hasToolExecutionRequests()) {
    List<ChatMessage> messages = new ArrayList<>();
    messages.add(UserMessage.from("What's the weather in Paris?"));
    messages.add(response.aiMessage());

    for (ToolExecutionRequest toolRequest : response.aiMessage().toolExecutionRequests()) {
        // Execute tool
        String result = executeWeatherTool(
            toolRequest.arguments().get("city"),
            toolRequest.arguments().get("country")
        );

        // Append each tool result to the conversation
        messages.add(ToolExecutionResultMessage.from(toolRequest, result));
    }

    // Send results back to the model for the final answer
    ChatResponse finalResponse = chatModel.chat(messages);
}

Structured JSON Outputs

Generate JSON responses with schema validation.

import dev.langchain4j.model.chat.request.ResponseFormat;
import dev.langchain4j.model.chat.request.json.JsonSchema;

// Define JSON schema
String schemaJson = """
{
  "type": "object",
  "properties": {
    "name": {"type": "string"},
    "age": {"type": "integer"},
    "email": {"type": "string", "format": "email"}
  },
  "required": ["name", "age"],
  "additionalProperties": false
}
""";

JsonSchema schema = JsonSchema.builder()
    .name("user_profile")
    .schema(schemaJson)
    .build();

// Use schema with model
WatsonxChatModel model = WatsonxChatModel.builder()
    .modelId("meta-llama/llama-4-maverick-17b-128e-instruct-fp8")
    .url(url)
    .projectId(projectId)
    .tokenGenerator(tokenGenerator)
    .responseFormat(ResponseFormat.jsonSchema(schema))
    .build();

ChatResponse response = model.chat(
    UserMessage.from("Extract profile: John Doe is 30 years old, email john@example.com")
);

String jsonOutput = response.aiMessage().text();
// Output: {"name": "John Doe", "age": 30, "email": "john@example.com"}

Simple JSON Object Mode:

WatsonxChatModel model = WatsonxChatModel.builder()
    // ... modelId, url, projectId, tokenGenerator as shown above
    .responseFormat(ResponseFormat.JSON)
    .build();

ChatResponse response = model.chat(
    UserMessage.from("Return a JSON object with fields: title, author, year")
);
// Output will be valid JSON object
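The JSON reply still arrives as a plain string, so the application must parse it. A sketch using Jackson's `ObjectMapper`, assuming Jackson is on the classpath (Quarkus applications typically pull it in via quarkus-jackson):

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

// readTree throws the checked JsonProcessingException if the model
// returns malformed JSON, so call it where that can be handled
ObjectMapper mapper = new ObjectMapper();
JsonNode book = mapper.readTree(response.aiMessage().text());

String title = book.get("title").asText();
int year = book.get("year").asInt();
```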

Chat Model Listeners

Monitor and log chat model requests and responses.

import dev.langchain4j.model.chat.listener.ChatModelListener;
import dev.langchain4j.model.chat.listener.ChatModelRequestContext;
import dev.langchain4j.model.chat.listener.ChatModelResponseContext;
import dev.langchain4j.model.chat.listener.ChatModelErrorContext;

public class LoggingChatListener implements ChatModelListener {
    @Override
    public void onRequest(ChatModelRequestContext context) {
        System.out.println("Request: " + context.chatRequest().messages().size() + " messages");
    }

    @Override
    public void onResponse(ChatModelResponseContext context) {
        System.out.println("Response: " + context.chatResponse().aiMessage().text());
        System.out.println("Tokens: " + context.chatResponse().tokenUsage());
        System.out.println("Finish reason: " + context.chatResponse().finishReason());
    }

    @Override
    public void onError(ChatModelErrorContext context) {
        System.err.println("Error: " + context.error().getMessage());
    }
}

// Add listener to model
WatsonxChatModel model = WatsonxChatModel.builder()
    // ... modelId, url, projectId, tokenGenerator as shown above
    .listeners(List.of(new LoggingChatListener()))
    .build();

Dependency Injection with Quarkus

Use Quarkus CDI for automatic model creation and injection.

Configuration:

quarkus.langchain4j.watsonx.base-url=https://us-south.ml.cloud.ibm.com
quarkus.langchain4j.watsonx.api-key=your-api-key
quarkus.langchain4j.watsonx.project-id=your-project-id

# Chat model configuration
quarkus.langchain4j.watsonx.chat-model.model-name=meta-llama/llama-4-maverick-17b-128e-instruct-fp8
quarkus.langchain4j.watsonx.chat-model.temperature=0.7
quarkus.langchain4j.watsonx.chat-model.max-tokens=2048
quarkus.langchain4j.watsonx.chat-model.tool-choice=auto
quarkus.langchain4j.watsonx.chat-model.frequency-penalty=0.5
quarkus.langchain4j.watsonx.chat-model.presence-penalty=0.3

Injection:

import jakarta.inject.Inject;
import jakarta.enterprise.context.ApplicationScoped;
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.chat.StreamingChatModel;

@ApplicationScoped
public class ChatService {
    @Inject
    ChatModel chatModel;

    @Inject
    StreamingChatModel streamingChatModel;

    public String askQuestion(String question) {
        return chatModel.chat(UserMessage.from(question))
            .aiMessage()
            .text();
    }
}
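quarkus-langchain4j also supports declarative AI services, where Quarkus generates the implementation and wires in the configured watsonx chat model. A sketch (GeographyTutor is a hypothetical interface name):

```java
import io.quarkiverse.langchain4j.RegisterAiService;
import dev.langchain4j.service.SystemMessage;

// Quarkus generates the implementation at build time and backs it
// with the chat model configured in application.properties
@RegisterAiService
public interface GeographyTutor {

    @SystemMessage("You are a helpful geography tutor")
    String answer(String question);
}
```

The interface can then be injected like any other CDI bean and called as a plain Java method.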

Types

Watsonx Base Class

public abstract class Watsonx {
    public WatsonxRestApi getClient();
    public String getModelId();
    public String getProjectId();
    public String getSpaceId();
    public String getVersion();

    public static abstract class Builder<M extends Watsonx, B extends Builder<M, B>> {
        public B modelId(String modelId);
        public B version(String version);
        public B spaceId(String spaceId);
        public B projectId(String projectId);
        public B url(URL url);
        public B timeout(Duration timeout);
        public B tokenGenerator(TokenGenerator tokenGenerator);
        public B logRequests(boolean logRequests);
        public B logResponses(boolean logResponses);
        public B logCurl(boolean logCurl);
        public abstract M build();
    }
}

Token Generator

Handles IBM Cloud IAM authentication token generation and caching.

public class TokenGenerator {
    public TokenGenerator(URL iamUrl, Duration timeout, String grantType, String apiKey);
    public Uni<String> generate();
}

Constructor Parameters:

  • iamUrl - IBM Cloud IAM endpoint URL (typically https://iam.cloud.ibm.com)
  • timeout - Request timeout for token generation
  • grantType - Grant type for IAM authentication (typically "urn:ibm:params:oauth:grant-type:apikey")
  • apiKey - IBM Cloud API key

Methods:

  • generate() - Returns a Uni<String> with the access token, automatically managing token lifecycle and caching

The TokenGenerator automatically:

  • Caches tokens until they expire
  • Refreshes expired tokens on demand
  • Serializes refreshes with an internal semaphore, so token generation is thread-safe

Example:

import io.quarkiverse.langchain4j.watsonx.runtime.TokenGenerator;
import java.net.URL;
import java.time.Duration;

TokenGenerator tokenGen = new TokenGenerator(
    new URL("https://iam.cloud.ibm.com"),
    Duration.ofSeconds(30),
    "urn:ibm:params:oauth:grant-type:apikey",
    "your-ibm-cloud-api-key"
);

// Token is generated and cached automatically
WatsonxChatModel model = WatsonxChatModel.builder()
    // ... modelId, url, projectId as shown above
    .tokenGenerator(tokenGen)
    .build();

For Quarkus applications, token generation is configured automatically through application.properties and does not require manual TokenGenerator creation.

Chat Request and Response Types

From LangChain4j:

// Request
public class ChatRequest {
    List<ChatMessage> messages();
    List<ToolSpecification> toolSpecifications();
    ChatRequestParameters parameters();
}

// Response
public class ChatResponse {
    AiMessage aiMessage();
    TokenUsage tokenUsage();
    FinishReason finishReason();
    ChatResponseMetadata metadata();
}

// Token usage
public class TokenUsage {
    Integer inputTokenCount();
    Integer outputTokenCount();
    Integer totalTokenCount();
}

// Finish reason
public enum FinishReason {
    STOP,           // Natural completion
    LENGTH,         // Max tokens reached
    TOOL_EXECUTION, // Tool call requested
    CONTENT_FILTER, // Content filtered
    OTHER
}
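The finish reason tells the caller why generation stopped, which is worth checking before trusting the output. A sketch, assuming a configured `chatModel` as in the earlier examples:

```java
ChatResponse response = chatModel.chat(UserMessage.from("Summarize the report"));

// Branch on how generation ended
switch (response.finishReason()) {
    case STOP -> System.out.println(response.aiMessage().text());
    case LENGTH -> System.out.println("Truncated at maxTokens; consider raising the limit");
    case TOOL_EXECUTION -> System.out.println("Model requested a tool call");
    default -> System.out.println("Finished with: " + response.finishReason());
}
```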

Message Types

From LangChain4j:

public interface ChatMessage {
    MessageType type();
}

public class SystemMessage implements ChatMessage {
    public static SystemMessage from(String text);
    public String text();
}

public class UserMessage implements ChatMessage {
    public static UserMessage from(String text);
    public static UserMessage from(String text, List<Content> contents);
    public String text();
    public List<Content> contents();
}

public class AiMessage implements ChatMessage {
    public static AiMessage from(String text);
    public String text();
    public boolean hasToolExecutionRequests();
    public List<ToolExecutionRequest> toolExecutionRequests();
}

public class ToolExecutionResultMessage implements ChatMessage {
    public static ToolExecutionResultMessage from(ToolExecutionRequest request, String result);
    public String id();
    public String toolName();
    public String text();
}

Content Types

From LangChain4j:

public interface Content {
    ContentType type();
}

public class TextContent implements Content {
    public static TextContent from(String text);
    public String text();
}

public class ImageContent implements Content {
    public static ImageContent from(String url);
    public static ImageContent from(String url, ImageContentDetail detail);
    public String url();
    public ImageContentDetail detail();
}

public enum ImageContentDetail {
    AUTO, LOW, HIGH
}
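Text and image content can be combined in a single user message for vision-capable models. A sketch using the types above (the detail enum is named ImageContentDetail here; in some LangChain4j versions it is called DetailLevel, so check your version):

```java
import dev.langchain4j.data.message.ImageContent;
import dev.langchain4j.data.message.TextContent;
import dev.langchain4j.data.message.UserMessage;

// Build a multimodal user message with both text and an image URL
UserMessage message = UserMessage.from(
    TextContent.from("Describe this image"),
    ImageContent.from("https://example.com/photo.jpg", ImageContentDetail.HIGH)
);

ChatResponse response = model.chat(message);
```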

Tool Types

From LangChain4j:

public class ToolSpecification {
    String name();
    String description();
    Map<String, Object> parameters();

    public static Builder builder() {
        return new Builder();
    }

    public static class Builder {
        public Builder name(String name);
        public Builder description(String description);
        public Builder addParameter(String name, String type, String description);
        public Builder parameters(Map<String, Object> parameters);
        public ToolSpecification build();
    }
}

public class ToolExecutionRequest {
    String id();
    String name();
    Map<String, Object> arguments();
}

public enum ToolChoice {
    AUTO,     // Model decides whether to use tools
    REQUIRED  // Model must use at least one tool
}

Capability

From LangChain4j:

public enum Capability {
    JSON_SCHEMA_RESPONSE_FORMAT  // Supports JSON schema validation
}

Install with Tessl CLI

npx tessl i tessl/maven-io-quarkiverse-langchain4j--quarkus-langchain4j-watsonx
