LangChain4j OpenAI Integration provides Java access to OpenAI APIs, including chat models, embeddings, image generation, audio transcription, and moderation.
A comprehensive Java library for integrating OpenAI's powerful AI capabilities into applications through the LangChain4j framework. This module provides unified access to OpenAI's complete API suite including GPT-4o and GPT-4 chat models, text embeddings, DALL-E image generation, Whisper audio transcription, and content moderation capabilities.
The integration supports advanced features like streaming responses, structured JSON outputs with schema validation, tool/function calling with parallel execution, reasoning capabilities for o1/o3 models, prompt caching, and comprehensive token usage tracking. All models follow consistent builder patterns and support extensive configuration including retry logic, custom HTTP settings, and observability through listeners.
Add the dependency to your `pom.xml`:

```xml
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-open-ai</artifactId>
    <version>1.11.0</version>
</dependency>
```

Or with Gradle:

```groovy
implementation 'dev.langchain4j:langchain4j-open-ai:1.11.0'
```

When implementing OpenAI integration, choose components based on your requirements:
| Requirement | Component | File Reference |
|---|---|---|
| Conversational AI with history | OpenAiChatModel | Chat Models |
| Real-time streaming responses | OpenAiStreamingChatModel | Chat Models |
| Simple text completion (legacy) | OpenAiLanguageModel | Language Models |
| Semantic search / RAG | OpenAiEmbeddingModel | Embedding Models |
| Image generation from text | OpenAiImageModel | Image Models |
| Audio to text transcription | OpenAiAudioTranscriptionModel | Audio Transcription |
| Content policy checking | OpenAiModerationModel | Moderation Models |
| Cost estimation before calls | OpenAiTokenCountEstimator | Token Management |
| Query available models | OpenAiModelCatalog | Model Catalog |
```java
// Chat models
import dev.langchain4j.model.openai.OpenAiChatModel;
import dev.langchain4j.model.openai.OpenAiStreamingChatModel;
import dev.langchain4j.model.openai.OpenAiChatModelName;

// Language models (completion interface)
import dev.langchain4j.model.openai.OpenAiLanguageModel;
import dev.langchain4j.model.openai.OpenAiStreamingLanguageModel;
import dev.langchain4j.model.openai.OpenAiLanguageModelName;

// Embedding models
import dev.langchain4j.model.openai.OpenAiEmbeddingModel;
import dev.langchain4j.model.openai.OpenAiEmbeddingModelName;

// Image generation models
import dev.langchain4j.model.openai.OpenAiImageModel;
import dev.langchain4j.model.openai.OpenAiImageModelName;

// Audio transcription models
import dev.langchain4j.model.openai.OpenAiAudioTranscriptionModel;
import dev.langchain4j.model.openai.OpenAiAudioTranscriptionModelName;

// Moderation models
import dev.langchain4j.model.openai.OpenAiModerationModel;
import dev.langchain4j.model.openai.OpenAiModerationModelName;

// Token management
import dev.langchain4j.model.openai.OpenAiTokenCountEstimator;
import dev.langchain4j.model.openai.OpenAiTokenUsage;

// Request and response metadata
import dev.langchain4j.model.openai.OpenAiChatRequestParameters;
import dev.langchain4j.model.openai.OpenAiChatResponseMetadata;

// Model catalog
import dev.langchain4j.model.openai.OpenAiModelCatalog;

// Core LangChain4j types
import dev.langchain4j.data.message.ChatMessage;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.chat.StreamingChatModel;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.model.audio.AudioTranscriptionRequest;
import dev.langchain4j.model.audio.AudioTranscriptionResponse;
import dev.langchain4j.model.catalog.ModelDescription;
import dev.langchain4j.model.output.Response;
```

Basic chat completion:

```java
import dev.langchain4j.model.openai.OpenAiChatModel;
import dev.langchain4j.model.openai.OpenAiChatModelName;

// Create a chat model with explicit defaults
OpenAiChatModel model = OpenAiChatModel.builder()
        .apiKey(System.getenv("OPENAI_API_KEY")) // Required
        .modelName(OpenAiChatModelName.GPT_4_O)  // Recommended: GPT-4o
        .temperature(0.7)                        // Default: 1.0, range: 0.0-2.0
        .build();

// Generate a response
String response = model.generate("What is the capital of France?");
System.out.println(response); // "The capital of France is Paris."
```

Streaming responses, token by token:

```java
import dev.langchain4j.model.openai.OpenAiStreamingChatModel;
import dev.langchain4j.model.chat.response.StreamingChatResponseHandler;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.data.message.UserMessage;

OpenAiStreamingChatModel streamingModel = OpenAiStreamingChatModel.builder()
        .apiKey(System.getenv("OPENAI_API_KEY"))
        .modelName("gpt-4o")
        .build();

ChatRequest request = ChatRequest.builder()
        .messages(UserMessage.from("Tell me a story"))
        .build();

streamingModel.doChat(request, new StreamingChatResponseHandler() {
    @Override
    public void onNext(String token) {
        System.out.print(token); // Print each token as it arrives
    }

    @Override
    public void onComplete(ChatResponse response) {
        System.out.println("\nDone!");
        System.out.println("Tokens: " + response.tokenUsage().totalTokenCount());
    }

    @Override
    public void onError(Throwable error) {
        error.printStackTrace();
    }
});
```

Batch embeddings:

```java
import dev.langchain4j.model.openai.OpenAiEmbeddingModel;
import dev.langchain4j.model.openai.OpenAiEmbeddingModelName;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.output.Response;
import dev.langchain4j.data.embedding.Embedding;
import java.util.List;

OpenAiEmbeddingModel embeddingModel = OpenAiEmbeddingModel.builder()
        .apiKey(System.getenv("OPENAI_API_KEY"))
        .modelName(OpenAiEmbeddingModelName.TEXT_EMBEDDING_3_SMALL) // Default: 1536 dimensions
        .build();

Response<List<Embedding>> embeddings = embeddingModel.embedAll(
        List.of(
                TextSegment.from("Hello world"),
                TextSegment.from("Goodbye world")
        )
);

System.out.println("Generated " + embeddings.content().size() + " embeddings");
// Each embedding is a float[] of length 1536
```

The LangChain4j OpenAI integration is built around a consistent architecture:
All OpenAI models implement standard LangChain4j interfaces, and every model uses a fluent builder pattern for configuration:

- Call `ModelClass.builder()` to obtain a builder
- Configuration methods return `this` for chaining
- Call `build()` to create the model instance

Streaming models use a handler-based approach, and the integration provides comprehensive token usage tracking throughout.
Provides access to OpenAI's conversational models (GPT-4o, GPT-4, GPT-3.5, o1, o3) with support for multi-turn conversations, system messages, and chat history. Includes both synchronous and streaming interfaces.
```java
public class OpenAiChatModel implements ChatModel {
    public static OpenAiChatModelBuilder builder();
    public Response<AiMessage> generate(List<ChatMessage> messages);
    public ChatResponse doChat(ChatRequest chatRequest);
    public OpenAiChatRequestParameters defaultRequestParameters();
    public Set<Capability> supportedCapabilities();
    public List<ChatModelListener> listeners();
    public ModelProvider provider();
}

public class OpenAiStreamingChatModel implements StreamingChatModel {
    public static OpenAiStreamingChatModelBuilder builder();
    public void generate(List<ChatMessage> messages, StreamingResponseHandler<AiMessage> handler);
    public void doChat(ChatRequest chatRequest, StreamingChatResponseHandler handler);
    public OpenAiChatRequestParameters defaultRequestParameters();
    public List<ChatModelListener> listeners();
    public ModelProvider provider();
}
```

Legacy completion interface for models such as gpt-3.5-turbo-instruct. Supports simple text-to-text completion without conversation context; the chat models are recommended for most use cases.
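Based on the interface listed below, a completion call looks roughly like this (a sketch; `generate` wraps the result in a `Response` carrying token usage):

```java
import dev.langchain4j.model.openai.OpenAiLanguageModel;
import dev.langchain4j.model.openai.OpenAiLanguageModelName;
import dev.langchain4j.model.output.Response;

OpenAiLanguageModel languageModel = OpenAiLanguageModel.builder()
        .apiKey(System.getenv("OPENAI_API_KEY"))
        .modelName(OpenAiLanguageModelName.GPT_3_5_TURBO_INSTRUCT)
        .build();

Response<String> completion = languageModel.generate("Translate to French: Hello");
System.out.println(completion.content());
System.out.println("Tokens: " + completion.tokenUsage().totalTokenCount());
```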
```java
public class OpenAiLanguageModel implements LanguageModel {
    public static OpenAiLanguageModelBuilder builder();
    public Response<String> generate(String prompt);
    public String modelName();
}

public class OpenAiStreamingLanguageModel implements StreamingLanguageModel {
    public static OpenAiStreamingLanguageModelBuilder builder();
    public void generate(String prompt, StreamingResponseHandler<String> handler);
    public String modelName();
}
```

Generate dense vector representations of text for semantic search, clustering, and similarity comparisons. Supports text-embedding-3-small, text-embedding-3-large, and text-embedding-ada-002.
```java
public class OpenAiEmbeddingModel extends DimensionAwareEmbeddingModel {
    public static OpenAiEmbeddingModelBuilder builder();
    public Response<Embedding> embed(String text);
    public Response<List<Embedding>> embedAll(List<TextSegment> textSegments);
    public Integer knownDimension();
    public String modelName();
}
```

Generate artistic images from text descriptions using DALL-E 2 or DALL-E 3. Supports various sizes, quality levels, and artistic styles.
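A usage sketch (import paths are assumed to mirror the other `dev.langchain4j.data` types; DALL-E 3 returns a single image per call):

```java
import dev.langchain4j.data.image.Image;
import dev.langchain4j.model.openai.OpenAiImageModel;
import dev.langchain4j.model.openai.OpenAiImageModelName;
import dev.langchain4j.model.output.Response;

OpenAiImageModel imageModel = OpenAiImageModel.builder()
        .apiKey(System.getenv("OPENAI_API_KEY"))
        .modelName(OpenAiImageModelName.DALL_E_3)
        .build();

Response<Image> image = imageModel.generate("A watercolor lighthouse at dusk");
System.out.println(image.content().url());           // Hosted URL (expires after 1 hour)
System.out.println(image.content().revisedPrompt()); // DALL-E 3 may rewrite the prompt
```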
```java
public class OpenAiImageModel implements ImageModel {
    public static OpenAiImageModelBuilder builder();
    public Response<Image> generate(String prompt);
    public Response<List<Image>> generate(String prompt, int n);
    public String modelName();
}
```

Convert audio to text using Whisper and GPT-4o audio models. Supports multiple audio formats and optional speaker diarization.
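A usage sketch; the request builder methods here are an assumption, mirroring the `AudioTranscriptionRequest` accessors documented later:

```java
import dev.langchain4j.model.audio.AudioTranscriptionRequest;
import dev.langchain4j.model.audio.AudioTranscriptionResponse;
import dev.langchain4j.model.openai.OpenAiAudioTranscriptionModel;
import dev.langchain4j.model.openai.OpenAiAudioTranscriptionModelName;
import java.nio.file.Files;
import java.nio.file.Path;

OpenAiAudioTranscriptionModel transcriptionModel = OpenAiAudioTranscriptionModel.builder()
        .apiKey(System.getenv("OPENAI_API_KEY"))
        .modelName(OpenAiAudioTranscriptionModelName.WHISPER_1)
        .build();

AudioTranscriptionRequest request = AudioTranscriptionRequest.builder()
        .audioData(Files.readAllBytes(Path.of("meeting.mp3"))) // assumed builder method
        .fileName("meeting.mp3")
        .language("en") // optional ISO-639-1 hint
        .build();

AudioTranscriptionResponse transcription = transcriptionModel.transcribe(request);
System.out.println(transcription.text());
```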
```java
public class OpenAiAudioTranscriptionModel implements AudioTranscriptionModel {
    public static Builder builder();
    public AudioTranscriptionResponse transcribe(AudioTranscriptionRequest audioRequest);
    public ModelProvider provider();
}
```

Analyze text content for policy violations including hate speech, violence, self-harm, and sexual content. Returns a binary flagged status.
```java
public class OpenAiModerationModel implements ModerationModel {
    public static OpenAiModerationModelBuilder builder();
    public Response<Moderation> moderate(String text);
    public Response<Moderation> moderate(List<ChatMessage> messages);
    public String modelName();
}
```

Estimate token usage and costs before making API calls. Provides detailed token usage information including cached tokens and reasoning tokens.
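Token estimation runs locally against the model's tokenizer, so it can gate requests without a network call. A sketch:

```java
import dev.langchain4j.model.openai.OpenAiChatModelName;
import dev.langchain4j.model.openai.OpenAiTokenCountEstimator;
import java.util.List;

// No API key needed: estimation is local
OpenAiTokenCountEstimator estimator = new OpenAiTokenCountEstimator(OpenAiChatModelName.GPT_4_O);

int tokens = estimator.estimateTokenCountInText("The quick brown fox jumps over the lazy dog");
System.out.println("Estimated tokens: " + tokens);

// encode/decode round-trip through the tokenizer
List<Integer> ids = estimator.encode("Hello world");
System.out.println(estimator.decode(ids)); // "Hello world"
```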
```java
public class OpenAiTokenCountEstimator implements TokenCountEstimator {
    public OpenAiTokenCountEstimator(String modelName);
    public OpenAiTokenCountEstimator(OpenAiChatModelName modelName);
    public OpenAiTokenCountEstimator(OpenAiEmbeddingModelName modelName);
    public OpenAiTokenCountEstimator(OpenAiLanguageModelName modelName);
    public int estimateTokenCountInText(String text);
    public int estimateTokenCountInMessage(ChatMessage message);
    public int estimateTokenCountInMessages(Iterable<ChatMessage> messages);
    public List<Integer> encode(String text);
    public List<Integer> encode(String text, int maxTokensToEncode);
    public String decode(List<Integer> tokens);
}

public class OpenAiTokenUsage extends TokenUsage {
    public static Builder builder();
    public Integer inputTokenCount();
    public Integer outputTokenCount();
    public Integer totalTokenCount();
    public InputTokensDetails inputTokensDetails();
    public OutputTokensDetails outputTokensDetails();
    public OpenAiTokenUsage add(TokenUsage other);
}
```

OpenAI-specific extensions to standard LangChain4j request parameters and response metadata, providing access to advanced features like reasoning effort, service tiers, and detailed token breakdowns.
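These parameters can be attached per request via `ChatRequest` (a sketch; the builder method names are assumed to match the accessors listed below):

```java
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.model.chat.request.ChatRequest;
import dev.langchain4j.model.openai.OpenAiChatRequestParameters;

OpenAiChatRequestParameters parameters = OpenAiChatRequestParameters.builder()
        .modelName("gpt-4o")
        .maxCompletionTokens(256)
        .seed(42)          // Best-effort reproducible sampling
        .user("user-1234") // Per-user tracking
        .store(false)      // Opt out of conversation storage
        .build();

ChatRequest request = ChatRequest.builder()
        .messages(UserMessage.from("Summarize the plot of Hamlet in one sentence."))
        .parameters(parameters)
        .build();
```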
```java
public class OpenAiChatRequestParameters extends DefaultChatRequestParameters {
    public static Builder builder();
    public Integer maxCompletionTokens();
    public Map<String, Integer> logitBias();
    public Boolean parallelToolCalls();
    public Integer seed();
    public String user();
    public Boolean store();
    public Map<String, String> metadata();
    public String serviceTier();
    public String reasoningEffort();
    public Map<String, Object> customParameters();
    public OpenAiChatRequestParameters overrideWith(ChatRequestParameters other);
    public OpenAiChatRequestParameters defaultedBy(ChatRequestParameters defaults);
}

public class OpenAiChatResponseMetadata extends ChatResponseMetadata {
    public static Builder builder();
    public String id();
    public String modelName();
    public OpenAiTokenUsage tokenUsage();
    public FinishReason finishReason();
    public Long created();
    public String serviceTier();
    public String systemFingerprint();
}
```

Query available OpenAI models and their capabilities through the API.
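A sketch of listing available models (the `apiKey` builder option and the `ModelDescription` accessor name are assumptions):

```java
import dev.langchain4j.model.catalog.ModelDescription;
import dev.langchain4j.model.openai.OpenAiModelCatalog;
import java.util.List;

OpenAiModelCatalog catalog = OpenAiModelCatalog.builder()
        .apiKey(System.getenv("OPENAI_API_KEY")) // assumed builder option
        .build();

List<ModelDescription> models = catalog.listModels();
models.forEach(description -> System.out.println(description.name())); // assumed accessor
```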
```java
public class OpenAiModelCatalog implements ModelCatalog {
    public static Builder builder();
    public List<ModelDescription> listModels();
    public ModelProvider provider();
}
```

Experimental and advanced capabilities including the OpenAI Responses API for prompt caching, SPI factories for custom builder creation, and internal utilities.
```java
public class OpenAiResponsesStreamingChatModel implements StreamingChatModel {
    public static Builder builder();
    public void doChat(ChatRequest chatRequest, StreamingChatResponseHandler handler);
    public ChatRequestParameters defaultRequestParameters();
    public List<ChatModelListener> listeners();
    public ModelProvider provider();
}
```
```java
// Chat model names
enum OpenAiChatModelName {
    GPT_3_5_TURBO,  // gpt-3.5-turbo (default snapshot)
    GPT_4,          // gpt-4 (default snapshot)
    GPT_4_TURBO,    // gpt-4-turbo (default snapshot)
    GPT_4_O,        // gpt-4o (default snapshot)
    GPT_4_O_MINI,   // gpt-4o-mini (default snapshot)
    O1,             // o1 (reasoning model)
    O3,             // o3 (reasoning model)
    O3_MINI,        // o3-mini (reasoning model)
    O4_MINI,        // o4-mini (reasoning model)
    GPT_4_1,        // gpt-4.1 (default snapshot)
    GPT_4_1_MINI,   // gpt-4.1-mini (default snapshot)
    GPT_4_1_NANO,   // gpt-4.1-nano (default snapshot)
    GPT_5,          // gpt-5 (default snapshot)
    GPT_5_MINI;     // gpt-5-mini (default snapshot)

    String toString(); // Returns model ID string
}

// Embedding model names
enum OpenAiEmbeddingModelName {
    TEXT_EMBEDDING_3_SMALL, // 1536 dimensions (default), configurable down to 256
    TEXT_EMBEDDING_3_LARGE, // 3072 dimensions (default), configurable down to 256
    TEXT_EMBEDDING_ADA_002; // 1536 dimensions (fixed)

    String toString();                               // Returns model ID string
    Integer dimension();                             // Returns default dimension
    static Integer knownDimension(String modelName); // Get dimension by model name
}

// Language model names
enum OpenAiLanguageModelName {
    GPT_3_5_TURBO_INSTRUCT; // gpt-3.5-turbo-instruct (legacy completion)

    String toString();
}

// Image model names
enum OpenAiImageModelName {
    DALL_E_2, // dall-e-2 (lower quality, multiple images)
    DALL_E_3; // dall-e-3 (higher quality, single image)

    String toString();
}

// Moderation model names
enum OpenAiModerationModelName {
    TEXT_MODERATION_STABLE,     // Frozen, consistent version
    TEXT_MODERATION_LATEST,     // Updates over time, best accuracy
    OMNI_MODERATION_LATEST,     // Supports text + images
    OMNI_MODERATION_2024_09_26; // Frozen version at specific date

    String toString();
}

// Audio transcription model names
enum OpenAiAudioTranscriptionModelName {
    WHISPER_1,                  // whisper-1 (general purpose)
    GPT_4_O_TRANSCRIBE,         // gpt-4o-transcribe (enhanced accuracy)
    GPT_4_O_MINI_TRANSCRIBE,    // gpt-4o-mini-transcribe (fast, good quality)
    GPT_4_O_TRANSCRIBE_DIARIZE; // gpt-4o-transcribe-diarize (speaker identification)

    String toString();
}
```
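The name enums above double as lookup helpers; `toString()` returns the wire-format model ID:

```java
import dev.langchain4j.model.openai.OpenAiChatModelName;
import dev.langchain4j.model.openai.OpenAiEmbeddingModelName;

System.out.println(OpenAiChatModelName.GPT_4_O); // "gpt-4o"

// Embedding enums also expose their default vector dimension
System.out.println(OpenAiEmbeddingModelName.TEXT_EMBEDDING_3_SMALL.dimension()); // 1536
System.out.println(OpenAiEmbeddingModelName.knownDimension("text-embedding-3-large")); // 3072
```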
```java
// Chat messages
interface ChatMessage {
    ChatMessageType type(); // USER, ASSISTANT, SYSTEM, TOOL_EXECUTION_RESULT
    String text();
}

class UserMessage implements ChatMessage {
    public static UserMessage from(String text);
    public static UserMessage from(String name, String text);
    public static UserMessage from(String text, List<Content> contents);
}

class AiMessage implements ChatMessage {
    public static AiMessage from(String text);
    public static AiMessage from(ToolExecutionRequest toolExecutionRequest);
    public String text();
    public boolean hasToolExecutionRequests();
    public List<ToolExecutionRequest> toolExecutionRequests();
}

class SystemMessage implements ChatMessage {
    public static SystemMessage from(String text);
}

// Response wrapper
class Response<T> {
    public T content();                 // The actual response content
    public TokenUsage tokenUsage();     // Token usage information
    public FinishReason finishReason(); // Why generation stopped
}

// Token usage
class TokenUsage {
    public Integer inputTokenCount();  // Tokens in prompt
    public Integer outputTokenCount(); // Tokens in response
    public Integer totalTokenCount();  // Sum of input + output
}

// Embeddings
class Embedding {
    public float[] vector();           // Dense vector representation
    public List<Float> vectorAsList(); // Vector as list
    public int dimension();            // Vector dimensionality
}

// Text segments
class TextSegment {
    public static TextSegment from(String text);
    public static TextSegment from(String text, Metadata metadata);
    public String text();
    public Metadata metadata();
}

// Images
class Image {
    public URI url();              // URL to generated image (expires after 1 hour)
    public String base64Data();    // Base64-encoded image data
    public String revisedPrompt(); // AI-revised version of prompt (DALL-E 3)
}

// Audio
class AudioTranscriptionRequest {
    public byte[] audioData();      // Audio file bytes
    public String fileName();       // File name with extension
    public String language();       // ISO-639-1 code (e.g., "en")
    public String prompt();         // Context hint
    public Double temperature();    // Sampling temperature (0.0-1.0)
    public String responseFormat(); // "json", "text", "srt", "verbose_json", "vtt"
}

class AudioTranscriptionResponse {
    public String text(); // Transcribed text
}

// Moderation
class Moderation {
    public boolean flagged();    // true if content violates policy
    public String flaggedText(); // The specific flagged text (null if not flagged)
}

// Model provider
enum ModelProvider {
    OPEN_AI; // OpenAI provider identifier
}

// Finish reasons
enum FinishReason {
    STOP,           // Natural completion
    LENGTH,         // Max tokens reached
    TOOL_EXECUTION, // Tool call made
    CONTENT_FILTER, // Content filtered
    OTHER;          // Other reason
}

// Capabilities
enum Capability {
    RESPONSE_FORMAT_JSON_SCHEMA, // Structured JSON with schema
    RESPONSE_FORMAT_TEXT,        // Plain text response
    THINKING;                    // Reasoning/thinking capability (o1/o3)
}
```

Streaming handler interfaces:

```java
interface StreamingResponseHandler<T> {

    /**
     * Called when a new token is received.
     * @param token The token text
     */
    void onNext(String token);

    /**
     * Called when generation is complete.
     * @param response The complete response with metadata
     */
    void onComplete(Response<T> response);

    /**
     * Called when an error occurs during generation.
     * @param error The error that occurred
     */
    void onError(Throwable error);
}

interface StreamingChatResponseHandler {

    /**
     * Called when a new token is received.
     * @param token The token text
     */
    void onNext(String token);

    /**
     * Called periodically with the partial accumulated response.
     * @param partialResponse Partial response with accumulated content
     */
    void onPartialResponse(ChatResponse partialResponse);

    /**
     * Called when generation is complete.
     * @param response The complete response with metadata
     */
    void onComplete(ChatResponse response);

    /**
     * Called when an error occurs during generation.
     * @param error The error that occurred
     */
    void onError(Throwable error);
}
```

HTTP client builder from the langchain4j-http-client module:
```java
interface HttpClientBuilder {
    HttpClientBuilder connectTimeout(Duration timeout);
    HttpClientBuilder readTimeout(Duration timeout);
    HttpClientBuilder proxy(Proxy proxy);
    HttpClientBuilder sslContext(SSLContext sslContext);
    HttpClient build();
}
```

Authentication Errors (401):
```java
try {
    OpenAiChatModel model = OpenAiChatModel.builder()
            .apiKey("invalid-key")
            .build();
    model.generate("test");
} catch (Exception e) {
    // Handle: check the API key and verify it has not expired
    System.err.println("Authentication failed: " + e.getMessage());
}
```

Rate Limit Errors (429):
```java
// Automatic retry is built in (default: 2 retries with exponential backoff)
OpenAiChatModel model = OpenAiChatModel.builder()
        .apiKey(apiKey)
        .maxRetries(5) // Increase retries for rate limits
        .build();
```

Context Length Exceeded (400):
```java
// Use the token estimator to validate input size before calling
OpenAiTokenCountEstimator estimator = new OpenAiTokenCountEstimator(OpenAiChatModelName.GPT_4_O);
int tokens = estimator.estimateTokenCountInMessages(messages);
if (tokens > 128_000) { // GPT-4o context window
    // Truncate or summarize messages
}
```

Content Policy Violation (400):
```java
// Use the moderation model to check content before calling
OpenAiModerationModel moderationModel = OpenAiModerationModel.builder()
        .apiKey(apiKey)
        .build();

if (moderationModel.moderate(content).content().flagged()) {
    // Content violates policy; reject it
}
```

| Operation | Typical Latency | Notes |
|---|---|---|
| Chat completion (streaming) | 50-200ms first token | Subsequent tokens: 10-50ms |
| Chat completion (sync) | 1-5 seconds | Depends on response length |
| Embedding (single) | 50-200ms | Batch is more efficient |
| Embedding (batch of 100) | 200-500ms | Use batching for best throughput |
| Image generation (DALL-E 2) | 10-30 seconds | Size dependent |
| Image generation (DALL-E 3) | 20-60 seconds | Quality dependent |
| Audio transcription | 10-30% of audio duration | Diarization adds 50% |
| Moderation | 100-500ms | Very fast |
To keep costs and latency down:

- Use appropriate model tiers: smaller models (e.g., gpt-4o-mini) for simple tasks, larger models for complex reasoning
- Enable caching for repeated prompts
- Batch operations where possible (e.g., `embedAll` instead of one call per segment)

Security and reliability:

- API Key Management: keep keys out of source code (e.g., read them from environment variables)
- Content Filtering: screen user content with the moderation model before sending it to other models
- Data Privacy: use `store(false)` to prevent conversation storage and the `user` parameter for per-user tracking
- Configure `maxRetries` appropriately for your traffic
- Use `OpenAiTokenCountEstimator` to validate inputs before calling the API
- Handle streaming failures in `onError()`

Install with the Tessl CLI:

```shell
npx tessl i tessl/maven-dev-langchain4j--langchain4j-open-ai
```