The Quarkus LangChain4j OpenAI extension provides seamless integration between Quarkus and OpenAI's large language models, letting developers incorporate LLMs into their applications with support for chat, streaming, embeddings, moderation, and image generation.
Comprehensive API reference for OpenAI chat models in Quarkus, covering both synchronous and streaming chat models. The extension provides an enhanced builder pattern through Service Provider Interface (SPI) registration, adding Quarkus-specific capabilities to the standard LangChain4j builders.
The chat models implementation uses an SPI-based pattern where Quarkus-enhanced builders are automatically used when creating OpenAI chat models. The builders extend LangChain4j's base builders to add named configuration support (configName), named TLS configuration, HTTP proxy configuration, and curl-style request logging.
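The discovery mechanism is plain `java.util.ServiceLoader`. A minimal, self-contained sketch of the fallback pattern (toy `BuilderFactory`/`Builder` types standing in for the real LangChain4j SPI classes, not the actual API):

```java
import java.util.ServiceLoader;

public class SpiDiscoveryDemo {
    // Toy stand-ins for the real SPI types (assumptions, not the actual classes).
    public interface BuilderFactory {
        Builder get();
    }

    public static class Builder {
        public String describe() { return "default builder"; }
    }

    // Mirrors the pattern: if a factory is registered under META-INF/services,
    // it is used; otherwise the plain base builder is returned. In the real
    // extension, the Quarkus factory is the registered provider, so callers of
    // OpenAiChatModel.builder() transparently receive the enhanced builder.
    public static Builder builder() {
        return ServiceLoader.load(BuilderFactory.class)
                .findFirst()
                .map(BuilderFactory::get)
                .orElseGet(Builder::new);
    }

    public static void main(String[] args) {
        // No provider is registered on this classpath, so the fallback is used.
        System.out.println(builder().describe());
    }
}
```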
Factory class implementing the Service Provider Interface for creating Quarkus-enhanced OpenAI chat model builders.
/**
* SPI factory for creating OpenAI chat models with Quarkus extensions.
*
* Registered via: META-INF/services/dev.langchain4j.model.openai.spi.OpenAiChatModelBuilderFactory
*
* This factory is automatically discovered and used when calling
* OpenAiChatModel.builder(), providing Quarkus-specific functionality
* transparently.
*/
public class QuarkusOpenAiChatModelBuilderFactory
implements OpenAiChatModelBuilderFactory {
/**
* Creates a new Quarkus-enhanced builder instance.
*
* Returns:
* Builder instance with both Quarkus-specific and LangChain4j methods
*/
@Override
public OpenAiChatModel.OpenAiChatModelBuilder get();
}
Enhanced builder class extending LangChain4j's OpenAiChatModelBuilder with Quarkus-specific methods.
/**
* Enhanced builder for OpenAI chat models with Quarkus features.
*
* Extends: dev.langchain4j.model.openai.OpenAiChatModel.OpenAiChatModelBuilder
*
* Usage:
* ChatModel model = OpenAiChatModel.builder()
* .configName("premium") // Quarkus-specific
* .tlsConfigurationName("custom-tls") // Quarkus-specific
* .apiKey("sk-...") // LangChain4j inherited
* .modelName("gpt-4o-mini") // LangChain4j inherited
* .build();
*/
public static class Builder extends OpenAiChatModel.OpenAiChatModelBuilder {
/**
* Set the named configuration to use.
*
* Parameters:
* configName - Name of configuration defined in application.properties
*
* Returns:
* This builder for method chaining
*
* When specified, the builder loads settings from the named configuration
* instead of the default configuration. For example, configName("premium")
* loads from quarkus.langchain4j.openai.premium.* properties.
*
* Example:
* .configName("premium") // Uses quarkus.langchain4j.openai.premium.*
*/
public Builder configName(String configName);
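The naming convention above can be sketched as a simple string mapping. This helper is purely illustrative (the real resolution is done by Quarkus config, not by code like this); the `"default"` fallback is based on the multi-model example later in this document:

```java
public class ConfigNameDemo {
    // Illustrative sketch of the property-key convention: named configurations
    // insert the config name between the base prefix and the property key,
    // while the default (or unnamed) configuration omits it.
    static String propertyKey(String configName, String key) {
        if (configName == null || configName.equals("default")) {
            return "quarkus.langchain4j.openai." + key;
        }
        return "quarkus.langchain4j.openai." + configName + "." + key;
    }

    public static void main(String[] args) {
        // configName("premium") -> quarkus.langchain4j.openai.premium.*
        System.out.println(propertyKey("premium", "chat-model.model-name"));
        // default configuration -> quarkus.langchain4j.openai.*
        System.out.println(propertyKey(null, "api-key"));
    }
}
```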
/**
* Set the named TLS configuration for HTTPS connections.
*
* Parameters:
* tlsConfigurationName - Name of Quarkus TLS configuration
*
* Returns:
* This builder for method chaining
*
* References a Quarkus named TLS configuration defined via
* quarkus.tls.{name}.* properties for custom certificates,
* client authentication, or custom trust stores.
*
* Example:
* .tlsConfigurationName("custom-certs")
*/
public Builder tlsConfigurationName(String tlsConfigurationName);
/**
* Set HTTP proxy for API requests.
*
* Parameters:
* proxy - java.net.Proxy instance (HTTP or SOCKS)
*
* Returns:
* This builder for method chaining
*
* Configures HTTP proxy for routing OpenAI API requests through
* corporate proxies or network gateways.
*
* Example:
* Proxy proxy = new Proxy(Proxy.Type.HTTP,
* new InetSocketAddress("proxy.company.com", 8080));
* .proxy(proxy)
*/
public Builder proxy(Proxy proxy);
/**
* Enable curl-style request logging.
*
* Parameters:
* logCurl - true to enable curl logging
*
* Returns:
* This builder for method chaining
*
* When enabled, logs requests in curl command format, useful for
* debugging and reproducing requests outside the application.
*
* Example output:
* curl -X POST https://api.openai.com/v1/chat/completions \
* -H "Authorization: Bearer sk-..." \
* -d '{"model":"gpt-4o-mini","messages":[...]}'
*
* Example:
* .logCurl(true)
*/
public Builder logCurl(boolean logCurl);
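To make the example output above concrete, here is a hypothetical formatter that assembles a curl command from request parts; the extension's actual log format may differ in detail:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CurlFormatDemo {
    // Hypothetical helper approximating the curl-style log line shown above.
    static String toCurl(String url, Map<String, String> headers, String body) {
        StringBuilder sb = new StringBuilder("curl -X POST ").append(url);
        for (Map.Entry<String, String> h : headers.entrySet()) {
            sb.append(" -H \"").append(h.getKey()).append(": ")
              .append(h.getValue()).append('"');
        }
        return sb.append(" -d '").append(body).append('\'').toString();
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Authorization", "Bearer sk-...");
        System.out.println(toCurl("https://api.openai.com/v1/chat/completions",
                headers, "{\"model\":\"gpt-4o-mini\"}"));
    }
}
```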
/**
* Build the OpenAI chat model instance.
*
* Returns:
* Configured OpenAiChatModel instance
*
* Creates the chat model with all configured settings. If configName
* was specified, applies settings from that named configuration.
* Validates required settings and initializes the underlying HTTP client.
*/
@Override
public OpenAiChatModel build();
/**
* Public fields (direct access, though builder methods are recommended).
*/
public String configName; // Named configuration reference
public String tlsConfigurationName; // Named TLS configuration
public boolean logCurl; // Curl logging flag
public Proxy proxy; // HTTP proxy configuration
}
All methods from OpenAiChatModel.OpenAiChatModelBuilder are available. Key methods include:
/**
* Core configuration methods inherited from LangChain4j.
*
* These methods are part of the standard LangChain4j API and work
* seamlessly with Quarkus enhancements.
*/
/**
* Set the OpenAI API base URL.
*
* Parameters:
* baseUrl - API endpoint URL
*
* Default: "https://api.openai.com/v1/"
*
* Use for OpenAI-compatible providers or custom deployments.
*
* Example:
* .baseUrl("https://custom-openai.example.com/v1")
*/
public Builder baseUrl(String baseUrl);
/**
* Set the OpenAI API key.
*
* Parameters:
* apiKey - Your OpenAI API key (format: sk-...)
*
* Required: Yes
*
* Obtain from: https://platform.openai.com/api-keys
*
* Example:
* .apiKey("sk-proj-...")
*/
public Builder apiKey(String apiKey);
/**
* Set the OpenAI organization ID.
*
* Parameters:
* organizationId - Organization identifier
*
* Required: No
*
* For users belonging to multiple organizations. Find at:
* https://platform.openai.com/account/organization
*
* Example:
* .organizationId("org-...")
*/
public Builder organizationId(String organizationId);
/**
* Set the model name.
*
* Parameters:
* modelName - OpenAI model identifier
*
* Default: "gpt-4o-mini"
*
* Common values:
* - "gpt-4o-mini" - Fast, cost-effective model
* - "gpt-4o" - High-intelligence flagship model
* - "gpt-4-turbo" - GPT-4 Turbo with 128K context
* - "o1-preview" - Advanced reasoning model
* - "o1-mini" - Fast reasoning model
* - "gpt-3.5-turbo" - Legacy fast model
*
* Example:
* .modelName("gpt-4o")
*/
public Builder modelName(String modelName);
/**
* Set sampling temperature.
*
* Parameters:
* temperature - Sampling temperature (0.0 to 2.0)
*
* Default: 1.0
*
* Controls randomness in responses:
* - 0.0: Deterministic, focused outputs
* - 0.7: Balanced creativity and consistency
* - 1.0: Default OpenAI setting
* - 1.5-2.0: Highly creative, more random
*
* Recommendation: Use either temperature OR topP, not both.
* Higher temperature means more risk-taking by the model.
*
* Example:
* .temperature(0.7)
*/
public Builder temperature(Double temperature);
/**
* Set nucleus sampling parameter.
*
* Parameters:
* topP - Nucleus sampling threshold (0.0 to 1.0)
*
* Default: 1.0
*
* Alternative to temperature sampling. The model considers results
* of tokens with topP probability mass:
* - 0.1: Only top 10% probability tokens
* - 0.5: Top 50% probability tokens
* - 1.0: All tokens considered
*
* Recommendation: Use either topP OR temperature, not both.
*
* Example:
* .topP(0.9)
*/
public Builder topP(Double topP);
/**
* Set maximum tokens to generate.
*
* Parameters:
* maxTokens - Maximum number of tokens
*
* Deprecated: Use maxCompletionTokens() instead
*
* This method is deprecated in favor of maxCompletionTokens which
* provides more accurate control for newer models, especially
* reasoning models where output includes reasoning tokens.
*
* Example:
* .maxTokens(2048) // Deprecated
*/
@Deprecated
public Builder maxTokens(Integer maxTokens);
/**
* Set maximum completion tokens.
*
* Parameters:
* maxCompletionTokens - Upper bound for completion tokens
*
* Recommended: Use this instead of deprecated maxTokens()
*
* For newer OpenAI models, this provides an upper bound for tokens
* generated in the completion, including both visible output tokens
* and reasoning tokens (for reasoning models like o1).
*
* The total of prompt tokens + maxCompletionTokens cannot exceed
* the model's context window:
* - gpt-4o-mini: 128K context
* - gpt-4o: 128K context
* - o1-preview: 128K context
*
* Example:
* .maxCompletionTokens(4096)
*/
public Builder maxCompletionTokens(Integer maxCompletionTokens);
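The budget constraint described above (prompt tokens + maxCompletionTokens must fit in the context window) is simple arithmetic; a small sketch of computing the remaining completion budget:

```java
public class TokenBudgetDemo {
    // Illustrates the constraint: prompt tokens plus completion tokens
    // cannot exceed the model's context window.
    static int maxCompletionBudget(int contextWindow, int promptTokens) {
        int budget = contextWindow - promptTokens;
        if (budget <= 0) {
            throw new IllegalArgumentException("prompt already fills the context window");
        }
        return budget;
    }

    public static void main(String[] args) {
        // 128K context (e.g. gpt-4o-mini) with a ~2,000-token prompt
        System.out.println(maxCompletionBudget(128_000, 2_000)); // prints 126000
    }
}
```

Passing a larger value to maxCompletionTokens than this budget allows will cause the API request to fail.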
/**
* Set presence penalty.
*
* Parameters:
* presencePenalty - Penalty value (-2.0 to 2.0)
*
* Default: 0.0
*
* Penalizes new tokens based on whether they appear in the text so far.
* Positive values increase the model's likelihood to talk about new
* topics rather than repeating existing ones.
*
* - Negative values: Encourage repetition
* - 0.0: No penalty
* - Positive values: Discourage repetition, encourage new topics
*
* Example:
* .presencePenalty(0.6) // Encourage discussing new topics
*/
public Builder presencePenalty(Double presencePenalty);
/**
* Set frequency penalty.
*
* Parameters:
* frequencyPenalty - Penalty value (-2.0 to 2.0)
*
* Default: 0.0
*
* Penalizes new tokens based on their existing frequency in the text
* so far. Positive values decrease the model's likelihood to repeat
* the same line verbatim.
*
* - Negative values: Allow more repetition
* - 0.0: No penalty
* - Positive values: Discourage verbatim repetition
*
* Example:
* .frequencyPenalty(0.5) // Reduce verbatim repetition
*/
public Builder frequencyPenalty(Double frequencyPenalty);
/**
* Set request timeout.
*
* Parameters:
* timeout - Maximum time to wait for response
*
* Default: 10 seconds
*
* Maximum duration to wait for OpenAI API responses.
* For longer-running requests (e.g., complex reasoning),
* increase this value.
*
* Example:
* .timeout(Duration.ofSeconds(30))
*/
public Builder timeout(Duration timeout);
/**
* Set maximum retry attempts.
*
* Parameters:
* maxRetries - Maximum number of retries
*
* Default: 1 attempt (no automatic retries)
* Deprecated: Use MicroProfile Fault Tolerance instead
*
* Number of retry attempts for failed requests. Built-in retry
* is deprecated in favor of MicroProfile Fault Tolerance patterns.
*
* Example:
* .maxRetries(3)
*/
@Deprecated
public Builder maxRetries(Integer maxRetries);
/**
* Enable request logging.
*
* Parameters:
* logRequests - true to log requests
*
* Default: false
*
* When enabled, logs full request payloads sent to OpenAI API.
* Useful for debugging but may expose sensitive data in logs.
*
* Example:
* .logRequests(true)
*/
public Builder logRequests(Boolean logRequests);
/**
* Enable response logging.
*
* Parameters:
* logResponses - true to log responses
*
* Default: false
*
* When enabled, logs full response payloads from OpenAI API.
* Useful for debugging and monitoring.
*
* Example:
* .logResponses(true)
*/
public Builder logResponses(Boolean logResponses);
/**
* Set response format for structured outputs.
*
* Parameters:
* responseFormat - Format specification string
*
* Enables JSON mode or structured outputs. Common values:
* - "json_object": Request JSON-formatted responses
* - "json_schema": Enforce specific JSON schema
* - "text": Default text responses
*
* When using "json_object", include "JSON" in the system message
* or user message to ensure consistent JSON responses.
*
* Example:
* .responseFormat("json_object")
*/
public Builder responseFormat(String responseFormat);
/**
* Enable strict JSON schema validation.
*
* Parameters:
* strictJsonSchema - true for strict validation
*
* Default: false
*
* When enabled with responseFormat("json_schema"), enforces that
* responses strictly follow the provided JSON schema. Requires
* compatible models (gpt-4o-mini, gpt-4o, etc.).
*
* Example:
* .responseFormat("json_schema")
* .strictJsonSchema(true)
*/
public Builder strictJsonSchema(Boolean strictJsonSchema);
/**
* Set stop sequences.
*
* Parameters:
* stop - List of stop sequences (up to 4)
*
* The model will stop generating when it encounters any of these
* sequences. The stop sequence will not be included in the output.
*
* Example:
* .stop(List.of("\n\n", "END", "###"))
*/
public Builder stop(List<String> stop);
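The API applies stop sequences server-side; the following sketch only illustrates the semantics described above, namely that generation halts at the earliest stop sequence and the sequence itself is excluded from the output:

```java
import java.util.List;

public class StopSequenceDemo {
    // Client-side illustration of stop-sequence semantics: cut the text at
    // the earliest occurrence of any stop sequence, excluding the sequence.
    static String truncateAtStop(String text, List<String> stop) {
        int cut = text.length();
        for (String s : stop) {
            int idx = text.indexOf(s);
            if (idx >= 0 && idx < cut) {
                cut = idx;
            }
        }
        return text.substring(0, cut);
    }

    public static void main(String[] args) {
        System.out.println(truncateAtStop("Answer: 42 END extra", List.of("END", "###")));
    }
}
```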
/**
* Set service tier for request processing.
*
* Parameters:
* serviceTier - Service tier identifier
*
* Default: "auto"
*
* Controls processing priority and pricing:
* - "auto": Use project default setting
* - "default": Standard pricing and performance
* - "flex": Lower priority, may have longer latency
* - "priority": Guaranteed higher priority processing
*
* The actual tier used is returned in the response and may differ
* from the requested tier based on availability.
*
* Example:
* .serviceTier("priority")
*/
public Builder serviceTier(String serviceTier);
/**
* Set default request parameters.
*
* Parameters:
* defaultRequestParameters - OpenAiChatRequestParameters instance
*
* Provides access to advanced parameters including reasoningEffort
* for reasoning models (o1 series).
*
* For reasoning models, reasoningEffort controls compute vs. speed:
* - "minimal": Fastest, least reasoning
* - "low": Quick responses with basic reasoning
* - "medium": Balanced reasoning and speed
* - "high": Most thorough reasoning (gpt-5-pro default)
*
* Example:
* OpenAiChatRequestParameters params = OpenAiChatRequestParameters.builder()
* .reasoningEffort("high")
* .build();
* .defaultRequestParameters(params)
*/
public Builder defaultRequestParameters(OpenAiChatRequestParameters defaultRequestParameters);
/**
* Add chat model listeners for observability.
*
* Parameters:
* listeners - List of ChatModelListener instances
*
* Listeners receive callbacks for request/response events, enabling:
* - Custom logging and monitoring
* - Metrics collection
* - Cost tracking
* - Distributed tracing integration
*
* Example:
* List<ChatModelListener> listeners = List.of(
* new MetricsListener(),
* new CostTrackingListener()
* );
* .listeners(listeners)
*/
public Builder listeners(List<ChatModelListener> listeners);
Factory class for creating Quarkus-enhanced OpenAI streaming chat model builders.
/**
* SPI factory for creating OpenAI streaming chat models with Quarkus extensions.
*
* Registered via: META-INF/services/dev.langchain4j.model.openai.spi.OpenAiStreamingChatModelBuilderFactory
*
* This factory is automatically discovered and used when calling
* OpenAiStreamingChatModel.builder(), providing Quarkus-specific
* functionality for streaming responses.
*/
public class QuarkusOpenAiStreamingChatModelBuilderFactory
implements OpenAiStreamingChatModelBuilderFactory {
/**
* Creates a new Quarkus-enhanced streaming builder instance.
*
* Returns:
* Builder instance for streaming chat models
*/
@Override
public OpenAiStreamingChatModel.OpenAiStreamingChatModelBuilder get();
}
Enhanced builder for streaming chat models with Quarkus features.
/**
* Enhanced builder for OpenAI streaming chat models with Quarkus features.
*
* Extends: dev.langchain4j.model.openai.OpenAiStreamingChatModel.OpenAiStreamingChatModelBuilder
*
* Streaming models provide responses incrementally as they are generated,
* enabling real-time user experiences with streaming text display.
*
* Usage:
* StreamingChatModel model = OpenAiStreamingChatModel.builder()
* .configName("streaming")
* .apiKey("sk-...")
* .modelName("gpt-4o-mini")
* .build();
*/
public static class Builder extends OpenAiStreamingChatModel.OpenAiStreamingChatModelBuilder {
/**
* Set the named configuration to use.
*
* Parameters:
* configName - Name of configuration defined in application.properties
*
* Returns:
* This builder for method chaining
*
* When specified, loads settings from the named configuration.
* For example, configName("streaming") loads from
* quarkus.langchain4j.openai.streaming.* properties.
*
* Example:
* .configName("streaming")
*/
public Builder configName(String configName);
/**
* Set the named TLS configuration for HTTPS connections.
*
* Parameters:
* tlsConfigurationName - Name of Quarkus TLS configuration
*
* Returns:
* This builder for method chaining
*
* References a Quarkus named TLS configuration for custom
* certificates or client authentication.
*
* Example:
* .tlsConfigurationName("custom-certs")
*/
public Builder tlsConfigurationName(String tlsConfigurationName);
/**
* Set HTTP proxy for API requests.
*
* Parameters:
* proxy - java.net.Proxy instance
*
* Returns:
* This builder for method chaining
*
* Configures HTTP proxy for streaming API requests.
*
* Example:
* .proxy(new Proxy(Proxy.Type.HTTP,
* new InetSocketAddress("proxy.example.com", 8080)))
*/
public Builder proxy(Proxy proxy);
/**
* Enable curl-style request logging.
*
* Parameters:
* logCurl - true to enable curl logging
*
* Returns:
* This builder for method chaining
*
* Logs streaming requests in curl format for debugging.
* Note: Streaming responses are not included in curl format.
*
* Example:
* .logCurl(true)
*/
public Builder logCurl(boolean logCurl);
/**
* Build the OpenAI streaming chat model instance.
*
* Returns:
* Configured OpenAiStreamingChatModel instance
*
* Creates the streaming chat model with all configured settings.
*/
@Override
public OpenAiStreamingChatModel build();
/**
* Public fields (direct access, though builder methods are recommended).
*/
public String configName; // Named configuration reference
public String tlsConfigurationName; // Named TLS configuration
public boolean logCurl; // Curl logging flag
public Proxy proxy; // HTTP proxy configuration
}
The streaming builder inherits most methods from the non-streaming builder, with the same semantics. Key methods include:
/**
* Streaming-specific configuration methods inherited from LangChain4j.
*
* Most configuration methods are identical to non-streaming builders.
* The key differences are in how responses are consumed.
*/
/**
* Set the OpenAI API base URL.
*/
public Builder baseUrl(String baseUrl);
/**
* Set the OpenAI API key.
*/
public Builder apiKey(String apiKey);
/**
* Set the OpenAI organization ID.
*/
public Builder organizationId(String organizationId);
/**
* Set the model name.
*
* Common streaming models:
* - "gpt-4o-mini" - Fast streaming
* - "gpt-4o" - High-quality streaming
* - "gpt-4-turbo" - GPT-4 Turbo streaming
*/
public Builder modelName(String modelName);
/**
* Set sampling temperature (0.0 to 2.0).
*/
public Builder temperature(Double temperature);
/**
* Set nucleus sampling parameter (0.0 to 1.0).
*/
public Builder topP(Double topP);
/**
* Set maximum tokens to generate.
*
* Deprecated: Use maxCompletionTokens() instead
*/
@Deprecated
public Builder maxTokens(Integer maxTokens);
/**
* Set maximum completion tokens.
*/
public Builder maxCompletionTokens(Integer maxCompletionTokens);
/**
* Set presence penalty (-2.0 to 2.0).
*/
public Builder presencePenalty(Double presencePenalty);
/**
* Set frequency penalty (-2.0 to 2.0).
*/
public Builder frequencyPenalty(Double frequencyPenalty);
/**
* Set request timeout.
*
* Note: For streaming, this is the timeout for the initial connection
* and first token. The stream can continue beyond this timeout once
* established.
*/
public Builder timeout(Duration timeout);
/**
* Enable request logging.
*/
public Builder logRequests(Boolean logRequests);
/**
* Enable response logging.
*
* For streaming, logs the complete response after the stream completes.
*/
public Builder logResponses(Boolean logResponses);
/**
* Set response format for structured outputs.
*/
public Builder responseFormat(String responseFormat);
/**
* Set stop sequences.
*/
public Builder stop(List<String> stop);
/**
* Add chat model listeners for observability.
*
* Listeners can observe streaming events including partial responses.
*/
public Builder listeners(List<ChatModelListener> listeners);
Configuration interface for chat model settings, used for declarative configuration.
/**
* Configuration interface for OpenAI chat models.
*
* Configuration prefix:
* - Default: quarkus.langchain4j.openai.chat-model
* - Named: quarkus.langchain4j.openai.{name}.chat-model
*
* All properties can be set in application.properties or application.yaml.
*/
@ConfigGroup
public interface ChatModelConfig {
/**
* Model name to use.
*
* Property: model-name
* Default: "gpt-4o-mini"
*
* Returns:
* The OpenAI model identifier
*/
@WithDefault("gpt-4o-mini")
String modelName();
/**
* Sampling temperature.
*
* Property: temperature
* Default: 1.0
* Range: 0.0 to 2.0
*
* Returns:
* The temperature value
*/
@WithDefault("1.0")
Double temperature();
/**
* Nucleus sampling parameter.
*
* Property: top-p
* Default: 1.0
* Range: 0.0 to 1.0
*
* Returns:
* The topP value
*/
@WithDefault("1.0")
Double topP();
/**
* Maximum tokens to generate.
*
* Property: max-tokens
* Deprecated: Use max-completion-tokens instead
*
* Returns:
* Optional max tokens value
*/
@Deprecated
Optional<Integer> maxTokens();
/**
* Maximum completion tokens.
*
* Property: max-completion-tokens
*
* Returns:
* Optional max completion tokens value
*/
Optional<Integer> maxCompletionTokens();
/**
* Presence penalty.
*
* Property: presence-penalty
* Default: 0.0
* Range: -2.0 to 2.0
*
* Returns:
* The presence penalty value
*/
@WithDefault("0")
Double presencePenalty();
/**
* Frequency penalty.
*
* Property: frequency-penalty
* Default: 0.0
* Range: -2.0 to 2.0
*
* Returns:
* The frequency penalty value
*/
@WithDefault("0")
Double frequencyPenalty();
/**
* Enable request logging.
*
* Property: log-requests
* Default: false
*
* Returns:
* Optional boolean for request logging
*/
Optional<Boolean> logRequests();
/**
* Enable response logging.
*
* Property: log-responses
* Default: false
*
* Returns:
* Optional boolean for response logging
*/
Optional<Boolean> logResponses();
/**
* Response format specification.
*
* Property: response-format
*
* Returns:
* Optional response format string
*/
Optional<String> responseFormat();
/**
* Enable strict JSON schema validation.
*
* Property: strict-json-schema
*
* Returns:
* Optional boolean for strict schema validation
*/
Optional<Boolean> strictJsonSchema();
/**
* Stop sequences.
*
* Property: stop
*
* Returns:
* Optional list of stop sequences
*/
Optional<List<String>> stop();
/**
* Reasoning effort for reasoning models.
*
* Property: reasoning-effort
* Valid values: "minimal", "low", "medium", "high"
*
* Returns:
* Optional reasoning effort level
*/
Optional<String> reasoningEffort();
/**
* Service tier for request processing.
*
* Property: service-tier
* Default: "auto"
* Valid values: "auto", "default", "flex", "priority"
*
* Returns:
* Optional service tier specification
*/
Optional<String> serviceTier();
}
Simple chat model usage with configuration:
// application.properties
// quarkus.langchain4j.openai.api-key=sk-...
// quarkus.langchain4j.openai.chat-model.model-name=gpt-4o-mini
// quarkus.langchain4j.openai.chat-model.temperature=0.7
import jakarta.inject.Inject;
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.model.chat.response.ChatResponse;
public class ChatService {
@Inject
ChatModel chatModel;
public String askQuestion(String question) {
// Simple text-to-text chat
return chatModel.chat(question);
}
public String conversationWithContext(String userMessage) {
// Using the full message API for more control
ChatResponse response = chatModel.chat(
UserMessage.from(userMessage)
);
return response.aiMessage().text();
}
}
Implementing streaming responses for real-time user feedback:
import dev.langchain4j.model.chat.StreamingChatModel;
import dev.langchain4j.model.chat.response.StreamingChatResponseHandler;
import dev.langchain4j.model.chat.response.ChatResponse;
import jakarta.inject.Inject;
public class StreamingChatService {
@Inject
StreamingChatModel streamingChatModel;
public void streamResponse(String userMessage,
java.util.function.Consumer<String> tokenHandler,
java.util.function.Consumer<ChatResponse> completionHandler) {
streamingChatModel.chat(userMessage, new StreamingChatResponseHandler() {
@Override
public void onPartialResponse(String token) {
// Called for each token as it arrives
tokenHandler.accept(token);
}
@Override
public void onCompleteResponse(ChatResponse response) {
// Called when streaming completes
completionHandler.accept(response);
}
@Override
public void onError(Throwable error) {
System.err.println("Streaming error: " + error.getMessage());
}
});
}
// Example usage with Server-Sent Events (SSE); the Sse factory is
// typically injected with @Context in a JAX-RS resource
public void streamToClient(String userMessage,
jakarta.ws.rs.sse.Sse sse,
jakarta.ws.rs.sse.SseEventSink sseEventSink) {
streamingChatModel.chat(userMessage, new StreamingChatResponseHandler() {
@Override
public void onPartialResponse(String token) {
// Send each token to the client via SSE
sseEventSink.send(sse.newEvent(token));
}
@Override
public void onCompleteResponse(ChatResponse response) {
sseEventSink.close();
}
@Override
public void onError(Throwable error) {
sseEventSink.close();
}
});
}
}
Managing multiple chat models with different configurations:
# application.properties
# Default configuration - for general queries
quarkus.langchain4j.openai.api-key=sk-default-key
quarkus.langchain4j.openai.chat-model.model-name=gpt-4o-mini
quarkus.langchain4j.openai.chat-model.temperature=0.7
# Premium configuration - for complex analysis
quarkus.langchain4j.openai.premium.api-key=sk-premium-key
quarkus.langchain4j.openai.premium.chat-model.model-name=gpt-4o
quarkus.langchain4j.openai.premium.chat-model.temperature=0.3
quarkus.langchain4j.openai.premium.chat-model.max-completion-tokens=8192
# Creative configuration - for content generation
quarkus.langchain4j.openai.creative.api-key=sk-creative-key
quarkus.langchain4j.openai.creative.chat-model.model-name=gpt-4o
quarkus.langchain4j.openai.creative.chat-model.temperature=1.2
quarkus.langchain4j.openai.creative.chat-model.presence-penalty=0.6
import dev.langchain4j.model.openai.OpenAiChatModel;
import dev.langchain4j.model.chat.ChatModel;
public class MultiModelService {
private final ChatModel defaultModel;
private final ChatModel premiumModel;
private final ChatModel creativeModel;
public MultiModelService() {
// Each model uses its named configuration
this.defaultModel = OpenAiChatModel.builder()
.configName("default") // Uses default config
.build();
this.premiumModel = OpenAiChatModel.builder()
.configName("premium") // Uses premium config
.build();
this.creativeModel = OpenAiChatModel.builder()
.configName("creative") // Uses creative config
.build();
}
public String analyzeComplexQuery(String query) {
// Use premium model for complex analysis
return premiumModel.chat(query);
}
public String generateCreativeContent(String prompt) {
// Use creative model for content generation
return creativeModel.chat(prompt);
}
public String quickAnswer(String question) {
// Use default model for quick responses
return defaultModel.chat(question);
}
}
Using JSON mode to extract structured data:
# application.properties
quarkus.langchain4j.openai.api-key=sk-...
quarkus.langchain4j.openai.chat-model.model-name=gpt-4o-mini
quarkus.langchain4j.openai.chat-model.response-format=json_object
quarkus.langchain4j.openai.chat-model.strict-json-schema=true
import dev.langchain4j.model.chat.ChatModel;
import jakarta.inject.Inject;
import jakarta.json.Json;
import jakarta.json.JsonObject;
import jakarta.json.JsonReader;
import java.io.StringReader;
public class StructuredDataService {
@Inject
ChatModel chatModel;
public JsonObject extractPersonInfo(String text) {
String prompt = """
Extract person information from the following text as JSON.
Include fields: name, age, occupation, location.
Text: %s
Respond only with valid JSON.
""".formatted(text);
String jsonResponse = chatModel.chat(prompt);
// Parse JSON response
try (JsonReader reader = Json.createReader(new StringReader(jsonResponse))) {
return reader.readObject();
}
}
public record Product(String name, String category, double price, String description) {}
public Product extractProductInfo(String description) {
String prompt = """
Extract product information from the description as JSON.
Required JSON format:
{
"name": "product name",
"category": "product category",
"price": numeric_price,
"description": "brief description"
}
Description: %s
""".formatted(description);
String jsonResponse = chatModel.chat(prompt);
try (JsonReader reader = Json.createReader(new StringReader(jsonResponse))) {
JsonObject obj = reader.readObject();
return new Product(
obj.getString("name"),
obj.getString("category"),
obj.getJsonNumber("price").doubleValue(),
obj.getString("description")
);
}
}
}
Using OpenAI's reasoning models (o1 series) with reasoning effort control:
# application.properties
# Reasoning model configuration
quarkus.langchain4j.openai.reasoning.api-key=sk-...
quarkus.langchain4j.openai.reasoning.chat-model.model-name=o1-preview
quarkus.langchain4j.openai.reasoning.chat-model.reasoning-effort=high
quarkus.langchain4j.openai.reasoning.chat-model.max-completion-tokens=16384
quarkus.langchain4j.openai.reasoning.timeout=60s
# Fast reasoning configuration
quarkus.langchain4j.openai.fast-reasoning.api-key=sk-...
quarkus.langchain4j.openai.fast-reasoning.chat-model.model-name=o1-mini
quarkus.langchain4j.openai.fast-reasoning.chat-model.reasoning-effort=medium
quarkus.langchain4j.openai.fast-reasoning.chat-model.max-completion-tokens=8192
import dev.langchain4j.model.openai.OpenAiChatModel;
import dev.langchain4j.model.chat.ChatModel;
import java.time.Duration;
public class ReasoningModelService {
private final ChatModel reasoningModel;
private final ChatModel fastReasoningModel;
public ReasoningModelService() {
// High-effort reasoning model for complex problems
this.reasoningModel = OpenAiChatModel.builder()
.configName("reasoning")
.build();
// Fast reasoning model for simpler problems
this.fastReasoningModel = OpenAiChatModel.builder()
.configName("fast-reasoning")
.build();
}
public String solveComplexProblem(String problem) {
String prompt = """
Solve the following problem step by step.
Show your reasoning process.
Problem: %s
""".formatted(problem);
// Uses high reasoning effort for thorough analysis
return reasoningModel.chat(prompt);
}
public String quickReasoning(String question) {
// Uses medium reasoning effort for faster responses
return fastReasoningModel.chat(question);
}
// Programmatic reasoning model configuration
public ChatModel createCustomReasoningModel() {
return OpenAiChatModel.builder()
.apiKey("sk-...")
.modelName("o1-preview")
.maxCompletionTokens(32768)
.timeout(Duration.ofSeconds(90))
.defaultRequestParameters(
dev.langchain4j.model.openai.OpenAiChatRequestParameters.builder()
.reasoningEffort("high")
.build()
)
.build();
}
}
Configuring chat models for enterprise environments:
# application.properties
# Enterprise configuration with proxy and custom TLS
quarkus.langchain4j.openai.enterprise.api-key=sk-...
quarkus.langchain4j.openai.enterprise.base-url=https://api.openai.com/v1/
quarkus.langchain4j.openai.enterprise.tls-configuration-name=custom-certs
quarkus.langchain4j.openai.enterprise.proxy-type=HTTP
quarkus.langchain4j.openai.enterprise.proxy-host=proxy.company.com
quarkus.langchain4j.openai.enterprise.proxy-port=8080
quarkus.langchain4j.openai.enterprise.log-requests-curl=true
quarkus.langchain4j.openai.enterprise.chat-model.model-name=gpt-4o-mini
quarkus.langchain4j.openai.enterprise.timeout=30s
# Custom TLS configuration
quarkus.tls.custom-certs.trust-store.pem.certs=company-root-ca.pem
quarkus.tls.custom-certs.key-store.p12.path=client-cert.p12
quarkus.tls.custom-certs.key-store.p12.password=changeit
import dev.langchain4j.model.openai.OpenAiChatModel;
import dev.langchain4j.model.chat.ChatModel;
import java.net.InetSocketAddress;
import java.net.Proxy;
public class EnterpriseModelService {
// Using configuration-based approach
private final ChatModel configuredModel;
// Using programmatic approach
private final ChatModel programmaticModel;
public EnterpriseModelService() {
// Configuration-based (recommended for enterprise)
this.configuredModel = OpenAiChatModel.builder()
.configName("enterprise")
.build();
// Programmatic configuration
Proxy proxy = new Proxy(
Proxy.Type.HTTP,
new InetSocketAddress("proxy.company.com", 8080)
);
this.programmaticModel = OpenAiChatModel.builder()
.apiKey("sk-...")
.modelName("gpt-4o-mini")
.tlsConfigurationName("custom-certs")
.proxy(proxy)
.logCurl(true) // Debug with curl-style logging
.logRequests(true)
.logResponses(true)
.build();
}
public String secureQuery(String query) {
// Uses enterprise configuration with TLS and proxy
return configuredModel.chat(query);
}
}
Best practices for sampling parameters:
import dev.langchain4j.model.openai.OpenAiChatModel;
import dev.langchain4j.model.chat.ChatModel;
public class SamplingParametersDemo {
// Deterministic responses - use for consistent outputs
private final ChatModel deterministicModel = OpenAiChatModel.builder()
.apiKey("sk-...")
.modelName("gpt-4o-mini")
.temperature(0.0) // Deterministic
.build();
// Balanced creativity - good for most use cases
private final ChatModel balancedModel = OpenAiChatModel.builder()
.apiKey("sk-...")
.modelName("gpt-4o-mini")
.temperature(0.7) // Balanced
.build();
// High creativity - for creative writing
private final ChatModel creativeModel = OpenAiChatModel.builder()
.apiKey("sk-...")
.modelName("gpt-4o")
.temperature(1.2) // More creative
.presencePenalty(0.6) // Encourage new topics
.frequencyPenalty(0.5) // Reduce repetition
.build();
// Using topP instead of temperature
private final ChatModel nucleusSamplingModel = OpenAiChatModel.builder()
.apiKey("sk-...")
.modelName("gpt-4o-mini")
.topP(0.9) // Use nucleus sampling instead
// Don't set temperature when using topP
.build();
public String factualQuery(String question) {
// Use deterministic model for factual questions
return deterministicModel.chat(question);
}
public String creativeStory(String prompt) {
// Use creative model for storytelling
return creativeModel.chat(prompt);
}
public String generalQuery(String question) {
// Use balanced model for general queries
return balancedModel.chat(question);
}
}
Implementing custom listeners for monitoring and cost tracking:
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.chat.listener.ChatModelListener;
import dev.langchain4j.model.chat.listener.ChatModelRequestContext;
import dev.langchain4j.model.chat.listener.ChatModelResponseContext;
import dev.langchain4j.model.chat.listener.ChatModelErrorContext;
import dev.langchain4j.model.openai.OpenAiChatModel;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
public class ObservabilityExample {
// Custom listener for metrics
public static class MetricsListener implements ChatModelListener {
private final AtomicInteger requestCount = new AtomicInteger(0);
private final AtomicLong totalTokens = new AtomicLong(0);
@Override
public void onRequest(ChatModelRequestContext requestContext) {
requestCount.incrementAndGet();
System.out.println("Request #" + requestCount.get());
}
@Override
public void onResponse(ChatModelResponseContext responseContext) {
if (responseContext.chatResponse().tokenUsage() != null) {
long tokens = responseContext.chatResponse().tokenUsage().totalTokenCount();
totalTokens.addAndGet(tokens);
System.out.println("Tokens used: " + tokens);
System.out.println("Total tokens: " + totalTokens.get());
}
}
@Override
public void onError(ChatModelErrorContext errorContext) {
System.err.println("Error: " + errorContext.error().getMessage());
}
public int getRequestCount() {
return requestCount.get();
}
public long getTotalTokens() {
return totalTokens.get();
}
}
// Custom listener for cost tracking
public static class CostTrackingListener implements ChatModelListener {
private static final double COST_PER_1K_INPUT_TOKENS = 0.00015;
private static final double COST_PER_1K_OUTPUT_TOKENS = 0.0006;
private final AtomicLong totalCostMicros = new AtomicLong(0);
@Override
public void onResponse(ChatModelResponseContext responseContext) {
if (responseContext.chatResponse().tokenUsage() != null) {
long inputTokens = responseContext.chatResponse().tokenUsage().inputTokenCount();
long outputTokens = responseContext.chatResponse().tokenUsage().outputTokenCount();
double costInput = (inputTokens / 1000.0) * COST_PER_1K_INPUT_TOKENS;
double costOutput = (outputTokens / 1000.0) * COST_PER_1K_OUTPUT_TOKENS;
double totalCost = costInput + costOutput;
// Track in micro-dollars: rounding to whole cents would truncate sub-cent requests to zero
long costMicros = Math.round(totalCost * 1_000_000);
totalCostMicros.addAndGet(costMicros);
System.out.printf("Request cost: $%.6f (Total: $%.6f)%n",
totalCost, totalCostMicros.get() / 1_000_000.0);
}
}
public double getTotalCostDollars() {
return totalCostMicros.get() / 1_000_000.0;
}
}
public static void main(String[] args) {
MetricsListener metricsListener = new MetricsListener();
CostTrackingListener costListener = new CostTrackingListener();
ChatModel model = OpenAiChatModel.builder()
.apiKey("sk-...")
.modelName("gpt-4o-mini")
.listeners(List.of(metricsListener, costListener))
.build();
// Use the model
model.chat("What is the capital of France?");
model.chat("Explain quantum computing in simple terms");
// View metrics
System.out.println("Total requests: " + metricsListener.getRequestCount());
System.out.println("Total tokens: " + metricsListener.getTotalTokens());
System.out.println("Total cost: $" + costListener.getTotalCostDollars());
}
}
Configuration-based (Recommended for production):
Programmatic (Recommended for dynamic scenarios):
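As a sketch of the configuration-based route, a named model configuration can be declared once in application.properties and selected with configName(), following the same pattern as the enterprise example above (the prod name here is illustrative):

```properties
# Named configuration "prod", selected via OpenAiChatModel.builder().configName("prod")
quarkus.langchain4j.openai.prod.api-key=${OPENAI_API_KEY}
quarkus.langchain4j.openai.prod.chat-model.model-name=gpt-4o-mini
quarkus.langchain4j.openai.prod.chat-model.temperature=0.7
quarkus.langchain4j.openai.prod.timeout=30s
```

Keeping credentials and model choices in configuration lets them vary per environment without a rebuild, which is why this route is recommended for production.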
Always use maxCompletionTokens() instead of deprecated maxTokens():
// Correct approach
ChatModel model = OpenAiChatModel.builder()
.maxCompletionTokens(4096) // Recommended
.build();
// Deprecated approach
ChatModel legacyModel = OpenAiChatModel.builder()
.maxTokens(4096) // Deprecated - prefer maxCompletionTokens()
.build();
Choose one sampling method, not both:
// Correct - use temperature
.temperature(0.7)
// Correct - use topP
.topP(0.9)
// Incorrect - don't use both
.temperature(0.7)
.topP(0.9) // Don't do this
Limit to 4 stop sequences maximum:
// Correct
.stop(List.of("\n\n", "END", "###"))
// Incorrect - too many
.stop(List.of("a", "b", "c", "d", "e")) // Exceeds the maximum of 4
For reasoning models (o1 series), use appropriate configuration:
// Reasoning model best practices
OpenAiChatModel model = OpenAiChatModel.builder()
.modelName("o1-preview")
.maxCompletionTokens(16384) // Reasoning needs more tokens
.timeout(Duration.ofSeconds(60)) // Longer timeout
.defaultRequestParameters(
OpenAiChatRequestParameters.builder()
.reasoningEffort("high")
.build()
)
.build();
Understand service tier tradeoffs:
Use different logging levels for different environments:
# Development - verbose logging
quarkus.langchain4j.openai.log-requests=true
quarkus.langchain4j.openai.log-responses=true
quarkus.langchain4j.openai.log-requests-curl=true
# Production - minimal logging
quarkus.langchain4j.openai.log-requests=false
quarkus.langchain4j.openai.log-responses=false
quarkus.langchain4j.openai.log-requests-curl=false
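The per-environment split above can also live in a single application.properties file using standard Quarkus configuration profiles (the %dev/%prod prefixes are core Quarkus behavior, not specific to this extension):

```properties
# Applied only in dev mode (mvn quarkus:dev) - verbose logging
%dev.quarkus.langchain4j.openai.log-requests=true
%dev.quarkus.langchain4j.openai.log-responses=true
%dev.quarkus.langchain4j.openai.log-requests-curl=true
# Applied in the prod profile - logging off (also the default)
%prod.quarkus.langchain4j.openai.log-requests=false
%prod.quarkus.langchain4j.openai.log-responses=false
%prod.quarkus.langchain4j.openai.log-requests-curl=false
```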
Install with Tessl CLI
npx tessl i tessl/maven-io-quarkiverse-langchain4j--quarkus-langchain4j-openai@1.7.0