tessl/maven-io-quarkiverse-langchain4j--quarkus-langchain4j-azure-openai

Quarkus extension for Azure OpenAI integration with LangChain4j, providing ChatModel, StreamingChatModel, EmbeddingModel, and ImageModel implementations with Azure-specific authentication and configuration support.

docs/chat-models.md
Chat Models

Azure OpenAI chat models provide both synchronous and streaming interfaces for conversational AI applications. The extension includes AzureOpenAiChatModel for synchronous chat completion and AzureOpenAiStreamingChatModel for token-by-token streaming responses.

Core API

AzureOpenAiChatModel

package io.quarkiverse.langchain4j.azure.openai;

public class AzureOpenAiChatModel implements dev.langchain4j.model.chat.ChatModel {
    /**
     * Execute a chat completion request synchronously.
     *
     * @param chatRequest The chat request containing messages and optional tool specifications
     * @return ChatResponse containing the AI message, token usage, and finish reason
     */
    public ChatResponse doChat(ChatRequest chatRequest);

    /**
     * Create a builder for configuring the chat model.
     *
     * @return Builder instance
     */
    public static Builder builder();
}

AzureOpenAiStreamingChatModel

package io.quarkiverse.langchain4j.azure.openai;

public class AzureOpenAiStreamingChatModel implements dev.langchain4j.model.chat.StreamingChatModel {
    /**
     * Execute a streaming chat completion request where tokens are received incrementally.
     *
     * @param chatRequest The chat request containing messages and optional tool specifications
     * @param handler Handler for processing streaming responses
     */
    public void doChat(ChatRequest chatRequest, StreamingChatResponseHandler handler);

    /**
     * Create a builder for configuring the streaming chat model.
     *
     * @return Builder instance
     */
    public static Builder builder();
}

Builder API

Both chat model classes provide a builder with the following methods:

Required Configuration

public static class Builder {
    /**
     * Set the Azure OpenAI endpoint URL.
     * Format: https://{resource-name}.openai.azure.com/openai/deployments/{deployment-name}
     *
     * @param endpoint The full endpoint URL (required)
     * @return This builder
     */
    public Builder endpoint(String endpoint);

    /**
     * Set the Azure OpenAI API version.
     * Format: YYYY-MM-DD (e.g., "2024-10-21")
     *
     * @param apiVersion The API version (required)
     * @return This builder
     */
    public Builder apiVersion(String apiVersion);

    /**
     * Set the Azure OpenAI API key for authentication.
     * Either apiKey or adToken must be provided, but not both.
     *
     * @param apiKey The API key
     * @return This builder
     */
    public Builder apiKey(String apiKey);

    /**
     * Set the Azure AD token for authentication.
     * Either apiKey or adToken must be provided, but not both.
     *
     * @param adToken The Azure AD token
     * @return This builder
     */
    public Builder adToken(String adToken);
}

Sampling Parameters

public static class Builder {
    /**
     * Set the sampling temperature (0 to 2).
     * Higher values make output more random, lower values more deterministic.
     * Default: 0.7
     *
     * @param temperature The temperature value
     * @return This builder
     */
    public Builder temperature(Double temperature);

    /**
     * Set the top-p (nucleus sampling) parameter.
     * Alternative to temperature. 0.1 means only tokens in top 10% probability are considered.
     * Default: 1.0
     *
     * @param topP The top-p value
     * @return This builder
     */
    public Builder topP(Double topP);
}

Note: The seed(Integer seed) method is only available on AzureOpenAiChatModel.Builder, not on AzureOpenAiStreamingChatModel.Builder. For the synchronous chat model:

// AzureOpenAiChatModel.Builder only
public static class Builder {
    /**
     * Set the seed for deterministic sampling (AzureOpenAiChatModel only).
     * Makes best effort to return consistent results with same seed and parameters.
     * Requires API version 2023-12-01-preview or later.
     *
     * @param seed The seed value
     * @return This builder
     */
    public Builder seed(Integer seed);
}

Response Control

public static class Builder {
    /**
     * Set maximum number of tokens in the completion.
     * Total tokens (prompt + completion) cannot exceed model's context length.
     *
     * @param maxTokens Maximum tokens to generate
     * @return This builder
     */
    public Builder maxTokens(Integer maxTokens);

    /**
     * Set presence penalty (-2.0 to 2.0).
     * Positive values increase likelihood of talking about new topics.
     * Default: 0
     *
     * @param presencePenalty The presence penalty value
     * @return This builder
     */
    public Builder presencePenalty(Double presencePenalty);

    /**
     * Set frequency penalty (-2.0 to 2.0).
     * Positive values decrease likelihood of repeating the same line verbatim.
     * Default: 0
     *
     * @param frequencyPenalty The frequency penalty value
     * @return This builder
     */
    public Builder frequencyPenalty(Double frequencyPenalty);

    /**
     * Set the response format.
     * Some models support specific formats like "json_object".
     *
     * @param responseFormat The response format (e.g., "text", "json_object")
     * @return This builder
     */
    public Builder responseFormat(String responseFormat);
}
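For example, responseFormat can be combined with maxTokens to request structured JSON output. A minimal sketch (the endpoint, key, and deployment name are placeholders; the deployment must be a model that supports JSON mode, and note that Azure OpenAI generally requires the prompt itself to mention JSON when "json_object" is used):

```java
import io.quarkiverse.langchain4j.azure.openai.AzureOpenAiChatModel;

// Sketch: request JSON output and cap the completion length.
AzureOpenAiChatModel jsonModel = AzureOpenAiChatModel.builder()
    .endpoint("https://my-resource.openai.azure.com/openai/deployments/gpt-4")
    .apiKey("your-api-key")
    .apiVersion("2024-10-21")
    .responseFormat("json_object")  // model is constrained to emit a JSON object
    .maxTokens(500)                 // completion will not exceed 500 tokens
    .build();
```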

Network and Reliability

public static class Builder {
    /**
     * Set timeout for API calls.
     * Default: 60 seconds
     *
     * @param timeout The timeout duration
     * @return This builder
     */
    public Builder timeout(java.time.Duration timeout);

    /**
     * Set proxy for network requests.
     *
     * @param proxy The proxy configuration
     * @return This builder
     */
    public Builder proxy(java.net.Proxy proxy);
}

Note: The maxRetries(Integer maxRetries) method is only available on AzureOpenAiChatModel.Builder, not on AzureOpenAiStreamingChatModel.Builder. For retry configuration on the synchronous chat model:

// AzureOpenAiChatModel.Builder only
public static class Builder {
    /**
     * Set maximum number of retry attempts (AzureOpenAiChatModel only).
     * Default: 1 (no retries)
     * Deprecated: Use MicroProfile Fault Tolerance instead.
     *
     * @param maxRetries Maximum retry attempts (must be >= 1)
     * @return This builder
     */
    public Builder maxRetries(Integer maxRetries);
}
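Since maxRetries is deprecated, retry behavior can instead be declared with MicroProfile Fault Tolerance on the calling method. A minimal sketch, assuming the quarkus-smallrye-fault-tolerance extension is on the classpath:

```java
import dev.langchain4j.model.chat.ChatModel;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;
import org.eclipse.microprofile.faulttolerance.Retry;

@ApplicationScoped
public class ResilientChatService {

    @Inject
    ChatModel chatModel;

    // Retry transient failures up to 3 times, waiting 500 ms between attempts.
    @Retry(maxRetries = 3, delay = 500)
    public String ask(String question) {
        return chatModel.chat(question);
    }
}
```

This moves retry policy out of the model configuration and into the application layer, where it can also be combined with timeouts, fallbacks, and circuit breakers.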

Observability

public static class Builder {
    /**
     * Enable request logging.
     * Default: false
     *
     * @param logRequests Whether to log requests
     * @return This builder
     */
    public Builder logRequests(Boolean logRequests);

    /**
     * Enable response logging.
     * Default: false
     *
     * @param logResponses Whether to log responses
     * @return This builder
     */
    public Builder logResponses(Boolean logResponses);

    /**
     * Enable cURL-format request logging.
     * Default: false
     *
     * @param logCurl Whether to log requests as cURL commands
     * @return This builder
     */
    public Builder logCurl(Boolean logCurl);

    /**
     * Set chat model listeners for monitoring and custom behavior.
     *
     * @param listeners List of ChatModelListener instances
     * @return This builder
     */
    public Builder listeners(java.util.List<dev.langchain4j.model.chat.listener.ChatModelListener> listeners);
}
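A listener can be sketched by implementing the callback methods of ChatModelListener, for example to emit simple request/response telemetry. The context accessor names below (chatRequest(), chatResponse(), error()) are assumed from recent LangChain4j versions:

```java
import dev.langchain4j.model.chat.listener.ChatModelErrorContext;
import dev.langchain4j.model.chat.listener.ChatModelListener;
import dev.langchain4j.model.chat.listener.ChatModelRequestContext;
import dev.langchain4j.model.chat.listener.ChatModelResponseContext;

// Sketch: a listener that prints basic telemetry for each model call.
ChatModelListener loggingListener = new ChatModelListener() {
    @Override
    public void onRequest(ChatModelRequestContext ctx) {
        System.out.println("Sending " + ctx.chatRequest().messages().size() + " message(s)");
    }

    @Override
    public void onResponse(ChatModelResponseContext ctx) {
        System.out.println("Token usage: " + ctx.chatResponse().metadata().tokenUsage());
    }

    @Override
    public void onError(ChatModelErrorContext ctx) {
        System.err.println("Model call failed: " + ctx.error().getMessage());
    }
};

// Attach via the builder:
// AzureOpenAiChatModel.builder()...listeners(java.util.List.of(loggingListener))...build();
```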

Additional Configuration

public static class Builder {
    /**
     * Set token count estimator for tracking token usage.
     *
     * @param tokenizer TokenCountEstimator instance
     * @return This builder
     */
    public Builder tokenizer(dev.langchain4j.model.TokenCountEstimator tokenizer);

    /**
     * Set configuration name for named model instances.
     * Used for CDI integration with @ModelName qualifier.
     *
     * @param configName The configuration name
     * @return This builder
     */
    public Builder configName(String configName);

    /**
     * Build the chat model instance.
     *
     * @return Configured AzureOpenAiChatModel or AzureOpenAiStreamingChatModel instance
     */
    public AzureOpenAiChatModel build();  // or AzureOpenAiStreamingChatModel for streaming
}

Usage Examples

Synchronous Chat

import io.quarkiverse.langchain4j.azure.openai.AzureOpenAiChatModel;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.data.message.UserMessage;
import java.time.Duration;
import java.util.List;

// Build the model
AzureOpenAiChatModel chatModel = AzureOpenAiChatModel.builder()
    .endpoint("https://my-resource.openai.azure.com/openai/deployments/gpt-4")
    .apiKey("your-api-key")
    .apiVersion("2024-10-21")
    .temperature(0.7)
    .maxTokens(1000)
    .timeout(Duration.ofSeconds(60))
    .build();

// Create a chat request
ChatRequest request = ChatRequest.builder()
    .messages(List.of(UserMessage.from("What is Quarkus?")))
    .build();

// Execute the request
ChatResponse response = chatModel.doChat(request);
String answer = response.aiMessage().text();

Streaming Chat

import io.quarkiverse.langchain4j.azure.openai.AzureOpenAiStreamingChatModel;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.response.StreamingChatResponseHandler;
import dev.langchain4j.data.message.UserMessage;
import java.util.List;

// Build the streaming model
AzureOpenAiStreamingChatModel streamingModel = AzureOpenAiStreamingChatModel.builder()
    .endpoint("https://my-resource.openai.azure.com/openai/deployments/gpt-4")
    .apiKey("your-api-key")
    .apiVersion("2024-10-21")
    .temperature(0.7)
    .build();

// Create a chat request
ChatRequest request = ChatRequest.builder()
    .messages(List.of(UserMessage.from("Explain Azure OpenAI")))
    .build();

// Execute with streaming handler
streamingModel.doChat(request, new StreamingChatResponseHandler() {
    @Override
    public void onPartialResponse(String token) {
        System.out.print(token);  // Print each token as it arrives
    }

    @Override
    public void onCompleteResponse(ChatResponse response) {
        System.out.println("\nComplete!");
    }

    @Override
    public void onError(Throwable error) {
        System.err.println("Error: " + error.getMessage());
    }
});

Using with Tool/Function Calling

import dev.langchain4j.agent.tool.ToolSpecification;
import dev.langchain4j.agent.tool.ToolParameters;

// Define a tool specification
ToolSpecification weatherTool = ToolSpecification.builder()
    .name("get_weather")
    .description("Get the current weather for a location")
    .parameters(ToolParameters.builder()
        .required("location")
        .addProperty("location", "string", "City name")
        .addProperty("unit", "string", "Temperature unit (celsius/fahrenheit)")
        .build())
    .build();

// Create request with tool specifications
ChatRequest request = ChatRequest.builder()
    .messages(List.of(UserMessage.from("What's the weather in Paris?")))
    .toolSpecifications(List.of(weatherTool))
    .build();

ChatResponse response = chatModel.doChat(request);

// Check if model wants to call a tool
if (response.aiMessage().hasToolExecutionRequests()) {
    // Handle tool execution requests
}
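Handling the tool execution requests typically means running each requested tool locally, wrapping the result in a ToolExecutionResultMessage, and sending the extended conversation back to the model for a final answer. A sketch continuing the example above (executeWeatherTool is a hypothetical local function):

```java
import dev.langchain4j.agent.tool.ToolExecutionRequest;
import dev.langchain4j.data.message.ChatMessage;
import dev.langchain4j.data.message.ToolExecutionResultMessage;
import java.util.ArrayList;
import java.util.List;

// Start from the original conversation plus the model's tool-calling message.
List<ChatMessage> messages = new ArrayList<>(request.messages());
messages.add(response.aiMessage());

for (ToolExecutionRequest toolRequest : response.aiMessage().toolExecutionRequests()) {
    // toolRequest.name() and toolRequest.arguments() (a JSON string) identify the call.
    String result = executeWeatherTool(toolRequest.arguments()); // hypothetical tool runner
    messages.add(ToolExecutionResultMessage.from(toolRequest, result));
}

// Send the tool results back so the model can produce its final answer.
ChatResponse followUp = chatModel.doChat(ChatRequest.builder()
    .messages(messages)
    .toolSpecifications(List.of(weatherTool))
    .build());
```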

CDI Integration with Configuration

Configure via application.properties:

quarkus.langchain4j.azure-openai.api-key=your-key
quarkus.langchain4j.azure-openai.endpoint=https://my-resource.openai.azure.com/openai/deployments/gpt-4
quarkus.langchain4j.azure-openai.chat-model.temperature=0.7
quarkus.langchain4j.azure-openai.chat-model.max-tokens=1000

Inject and use:

import dev.langchain4j.model.chat.ChatModel;
import jakarta.inject.Inject;

public class MyService {
    @Inject
    ChatModel chatModel;

    public String askQuestion(String question) {
        return chatModel.chat(question);
    }
}

Multiple Named Configurations

# Production model (conservative)
quarkus.langchain4j.azure-openai.production.api-key=prod-key
quarkus.langchain4j.azure-openai.production.endpoint=https://prod.openai.azure.com/openai/deployments/gpt-4
quarkus.langchain4j.azure-openai.production.chat-model.temperature=0.3

# Creative model (experimental)
quarkus.langchain4j.azure-openai.creative.api-key=creative-key
quarkus.langchain4j.azure-openai.creative.endpoint=https://creative.openai.azure.com/openai/deployments/gpt-4
quarkus.langchain4j.azure-openai.creative.chat-model.temperature=0.9

Inject specific configurations:

import io.quarkiverse.langchain4j.ModelName;

@Inject
@ModelName("production")
ChatModel productionModel;

@Inject
@ModelName("creative")
ChatModel creativeModel;

API Version Compatibility

  • API Version 2023-12-01-preview and later: Support the tools parameter for function calling
  • Earlier API versions: Automatically use the deprecated functions parameter
  • Default version: 2024-10-21

The extension automatically detects the API version and uses the appropriate parameter format.

Important Notes

  1. Authentication: Exactly one of apiKey or adToken must be provided
  2. Temperature vs Top-P: It's recommended to alter one or the other, not both
  3. Seed Support: Requires API version 2023-12-01-preview or later
  4. Max Retries: Minimum value is 1 (meaning one attempt with no retries)
  5. Context Length: Total tokens (prompt + max_tokens) cannot exceed the model's context length, which varies by model and deployment
  6. Native Compilation: Both models support Quarkus native compilation
  7. Model Listeners: Available for monitoring, metrics collection, and custom behavior

Related Types

// From LangChain4j framework
package dev.langchain4j.model.chat.request;
public class ChatRequest {
    public List<ChatMessage> messages();
    public List<ToolSpecification> toolSpecifications();
}

package dev.langchain4j.model.chat.response;
public class ChatResponse {
    public dev.langchain4j.data.message.AiMessage aiMessage();
    public dev.langchain4j.model.output.TokenUsage tokenUsage();
    public dev.langchain4j.model.output.FinishReason finishReason();
}

package dev.langchain4j.model.chat.response;
public interface StreamingChatResponseHandler {
    void onPartialResponse(String token);
    void onCompleteResponse(ChatResponse response);
    void onError(Throwable error);
}

Install with Tessl CLI

npx tessl i tessl/maven-io-quarkiverse-langchain4j--quarkus-langchain4j-azure-openai@1.7.0
