tessl/maven-dev-langchain4j--langchain4j-open-ai

LangChain4j OpenAI Integration providing Java access to OpenAI APIs including chat models, embeddings, image generation, audio transcription, and moderation.

Language Models

Language models provide OpenAI's legacy completion interface, primarily for models like gpt-3.5-turbo-instruct. Unlike chat models, language models use a simple text-to-text completion format without conversation context or message structure.

For most use cases, the Chat Models interface is recommended as it provides more features and supports newer models. Language models are maintained for backward compatibility and specific completion use cases.

Capabilities

OpenAiLanguageModel

Synchronous language model for text completion tasks. Generates text continuations from a single prompt string.

public class OpenAiLanguageModel implements LanguageModel {
    public static OpenAiLanguageModelBuilder builder();

    // Core generation method
    public Response<String> generate(String prompt);

    // Model information
    public String modelName();
}

OpenAiLanguageModelBuilder

Builder for configuring OpenAiLanguageModel instances with authentication and HTTP settings.

public static class OpenAiLanguageModelBuilder {
    // Core configuration
    public OpenAiLanguageModelBuilder modelName(String modelName);
    public OpenAiLanguageModelBuilder modelName(OpenAiLanguageModelName modelName);
    public OpenAiLanguageModelBuilder baseUrl(String baseUrl);
    public OpenAiLanguageModelBuilder apiKey(String apiKey);
    public OpenAiLanguageModelBuilder organizationId(String organizationId);
    public OpenAiLanguageModelBuilder projectId(String projectId);

    // Generation parameters
    public OpenAiLanguageModelBuilder temperature(Double temperature);

    // HTTP configuration
    public OpenAiLanguageModelBuilder httpClientBuilder(HttpClientBuilder httpClientBuilder);
    public OpenAiLanguageModelBuilder timeout(Duration timeout);
    public OpenAiLanguageModelBuilder maxRetries(Integer maxRetries);
    public OpenAiLanguageModelBuilder customHeaders(Map<String, String> customHeaders);
    public OpenAiLanguageModelBuilder customHeaders(Supplier<Map<String, String>> customHeadersSupplier);
    public OpenAiLanguageModelBuilder customQueryParams(Map<String, String> customQueryParams);

    // Logging
    public OpenAiLanguageModelBuilder logRequests(Boolean logRequests);
    public OpenAiLanguageModelBuilder logResponses(Boolean logResponses);
    public OpenAiLanguageModelBuilder logger(Logger logger);

    // Build
    public OpenAiLanguageModel build();
}

Usage Example

import dev.langchain4j.model.openai.OpenAiLanguageModel;
import dev.langchain4j.model.openai.OpenAiLanguageModelName;
import dev.langchain4j.model.output.Response;
import java.time.Duration;

// Create language model
OpenAiLanguageModel model = OpenAiLanguageModel.builder()
    .apiKey(System.getenv("OPENAI_API_KEY"))
    .modelName(OpenAiLanguageModelName.GPT_3_5_TURBO_INSTRUCT)
    .temperature(0.7)
    .timeout(Duration.ofSeconds(30))
    .maxRetries(3)
    .build();

// Simple completion
String prompt = "Once upon a time in a distant galaxy,";
Response<String> response = model.generate(prompt);

System.out.println("Generated text: " + response.content());
System.out.println("Tokens used: " + response.tokenUsage().totalTokenCount());
System.out.println("Finish reason: " + response.finishReason());

// Multi-line prompt
String complexPrompt = """
    Write a function in Python that calculates the Fibonacci sequence:

    def fibonacci(n):
    """;

Response<String> codeCompletion = model.generate(complexPrompt);
System.out.println(codeCompletion.content());

// With logging enabled
OpenAiLanguageModel debugModel = OpenAiLanguageModel.builder()
    .apiKey(System.getenv("OPENAI_API_KEY"))
    .modelName(OpenAiLanguageModelName.GPT_3_5_TURBO_INSTRUCT)
    .logRequests(true)
    .logResponses(true)
    .build();

Response<String> debugResponse = debugModel.generate("Complete this sentence: The best thing about");

OpenAiStreamingLanguageModel

Streaming language model that returns text completions incrementally as tokens are generated. Useful for providing real-time feedback in user interfaces.

public class OpenAiStreamingLanguageModel implements StreamingLanguageModel {
    public static OpenAiStreamingLanguageModelBuilder builder();

    // Core streaming method
    public void generate(String prompt, StreamingResponseHandler<String> handler);

    // Model information
    public String modelName();
}

OpenAiStreamingLanguageModelBuilder

Builder for configuring OpenAiStreamingLanguageModel instances.

public static class OpenAiStreamingLanguageModelBuilder {
    // Core configuration
    public OpenAiStreamingLanguageModelBuilder modelName(String modelName);
    public OpenAiStreamingLanguageModelBuilder modelName(OpenAiLanguageModelName modelName);
    public OpenAiStreamingLanguageModelBuilder baseUrl(String baseUrl);
    public OpenAiStreamingLanguageModelBuilder apiKey(String apiKey);
    public OpenAiStreamingLanguageModelBuilder organizationId(String organizationId);
    public OpenAiStreamingLanguageModelBuilder projectId(String projectId);

    // Generation parameters
    public OpenAiStreamingLanguageModelBuilder temperature(Double temperature);

    // HTTP configuration
    public OpenAiStreamingLanguageModelBuilder httpClientBuilder(HttpClientBuilder httpClientBuilder);
    public OpenAiStreamingLanguageModelBuilder timeout(Duration timeout);
    public OpenAiStreamingLanguageModelBuilder maxRetries(Integer maxRetries);
    public OpenAiStreamingLanguageModelBuilder customHeaders(Map<String, String> customHeaders);
    public OpenAiStreamingLanguageModelBuilder customHeaders(Supplier<Map<String, String>> customHeadersSupplier);
    public OpenAiStreamingLanguageModelBuilder customQueryParams(Map<String, String> customQueryParams);

    // Logging
    public OpenAiStreamingLanguageModelBuilder logRequests(Boolean logRequests);
    public OpenAiStreamingLanguageModelBuilder logResponses(Boolean logResponses);
    public OpenAiStreamingLanguageModelBuilder logger(Logger logger);

    // Build
    public OpenAiStreamingLanguageModel build();
}

Usage Example

import dev.langchain4j.model.openai.OpenAiStreamingLanguageModel;
import dev.langchain4j.model.openai.OpenAiLanguageModelName;
import dev.langchain4j.model.StreamingResponseHandler;
import dev.langchain4j.model.output.Response;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Create streaming language model
OpenAiStreamingLanguageModel model = OpenAiStreamingLanguageModel.builder()
    .apiKey(System.getenv("OPENAI_API_KEY"))
    .modelName(OpenAiLanguageModelName.GPT_3_5_TURBO_INSTRUCT)
    .temperature(0.8)
    .build();

// Simple streaming with inline handler
System.out.print("Story: ");
model.generate("Once upon a time, there was a brave knight who",
    new StreamingResponseHandler<String>() {
        @Override
        public void onNext(String token) {
            System.out.print(token);
        }

        @Override
        public void onComplete(Response<String> response) {
            System.out.println("\n\nGeneration complete!");
            System.out.println("Total tokens: " + response.tokenUsage().totalTokenCount());
        }

        @Override
        public void onError(Throwable error) {
            System.err.println("Error occurred: " + error.getMessage());
        }
    }
);

// Accumulate streaming output
CompletableFuture<String> future = new CompletableFuture<>();
StringBuilder accumulated = new StringBuilder();

model.generate("Write a haiku about programming:",
    new StreamingResponseHandler<String>() {
        @Override
        public void onNext(String token) {
            accumulated.append(token);
            // Update UI or process the token (processToken is a placeholder for your own handling)
            processToken(token);
        }

        @Override
        public void onComplete(Response<String> response) {
            String fullText = accumulated.toString();
            future.complete(fullText);
        }

        @Override
        public void onError(Throwable error) {
            future.completeExceptionally(error);
        }
    }
);

// Wait for completion and get full text
try {
    String haiku = future.get(30, TimeUnit.SECONDS);
    System.out.println("Complete haiku:\n" + haiku);
} catch (Exception e) {
    e.printStackTrace();
}

// Custom handler for progressive processing
class ProgressiveHandler implements StreamingResponseHandler<String> {
    private final StringBuilder buffer = new StringBuilder();
    private int tokenCount = 0;

    @Override
    public void onNext(String token) {
        buffer.append(token);
        tokenCount++;

        // Process every 10 tokens
        if (tokenCount % 10 == 0) {
            System.out.println("Progress: " + tokenCount + " tokens received");
        }
    }

    @Override
    public void onComplete(Response<String> response) {
        System.out.println("\nFinal text: " + buffer.toString());
        System.out.println("Total tokens: " + response.tokenUsage().totalTokenCount());
    }

    @Override
    public void onError(Throwable error) {
        System.err.println("Failed after " + tokenCount + " tokens: " + error.getMessage());
    }

    public String getCurrentText() {
        return buffer.toString();
    }
}

ProgressiveHandler handler = new ProgressiveHandler();
model.generate("Explain quantum computing in simple terms:", handler);

// Access partial results while streaming
Thread.sleep(1000); // Wait a bit
System.out.println("Partial result: " + handler.getCurrentText());

Model Names

public enum OpenAiLanguageModelName {
    GPT_3_5_TURBO_INSTRUCT("gpt-3.5-turbo-instruct");

    public String toString();
}

Types

LanguageModel Interface

public interface LanguageModel {
    Response<String> generate(String prompt);
}

StreamingLanguageModel Interface

public interface StreamingLanguageModel {
    void generate(String prompt, StreamingResponseHandler<String> handler);
}

StreamingResponseHandler

public interface StreamingResponseHandler<T> {
    /**
     * Called when a new token is received.
     *
     * @param token The token text
     */
    void onNext(String token);

    /**
     * Called when generation is complete.
     *
     * @param response The complete response with metadata
     */
    void onComplete(Response<T> response);

    /**
     * Called when an error occurs during generation.
     *
     * @param error The error that occurred
     */
    void onError(Throwable error);
}

Response

public class Response<T> {
    public T content();
    public TokenUsage tokenUsage();
    public FinishReason finishReason();
}

TokenUsage

public class TokenUsage {
    public Integer inputTokenCount();
    public Integer outputTokenCount();
    public Integer totalTokenCount();
}

FinishReason

public enum FinishReason {
    STOP,        // Natural stop
    LENGTH,      // Max tokens reached
    CONTENT_FILTER,  // Content filtered
    OTHER;       // Other reason
}

Configuration Options

Temperature

Controls randomness in text generation. Range: 0.0 to 2.0

  • 0.0: Deterministic output, always chooses most likely token
  • 0.3-0.7: Focused and coherent, good for factual content
  • 0.7-1.0: Balanced between creativity and coherence (the API default is 1.0)
  • 1.0-1.5: More creative and varied output
  • 1.5-2.0: Highly creative, potentially less coherent

Timeout

Maximum time to wait for the API response

  • Default varies by client
  • Set longer for expected long completions
  • Use Duration.ofSeconds(), Duration.ofMinutes(), etc.

Max Retries

Number of retry attempts on failure

  • Default: 2
  • Includes network errors and rate limiting
  • Uses exponential backoff between retries
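The exact backoff schedule is an implementation detail of the client. As an illustration of how exponential backoff behaves between attempts, a doubling schedule with a capped delay can be sketched as follows (the base delay and cap here are illustrative assumptions, not langchain4j's actual values):

```java
// Sketch of an exponential backoff schedule, as used between retry attempts.
// Base delay and cap are illustrative assumptions, not langchain4j's actual values.
public class RetryBackoff {
    static final long BASE_DELAY_MS = 500;   // wait before the first retry
    static final long MAX_DELAY_MS = 8_000;  // delays are capped here

    // Delay before retry attempt n (0-based): base * 2^n, capped at the maximum.
    public static long delayMillis(int attempt) {
        long delay = BASE_DELAY_MS << Math.min(attempt, 30); // min() avoids shift overflow
        return Math.min(delay, MAX_DELAY_MS);
    }

    public static void main(String[] args) {
        for (int attempt = 0; attempt < 5; attempt++) {
            System.out.println("Retry " + attempt + " after " + delayMillis(attempt) + " ms");
        }
    }
}
```

With these values, delays grow 500 → 1000 → 2000 → 4000 ms and then stay at the 8000 ms cap.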

Base URL

Override the default OpenAI API endpoint

  • Default: https://api.openai.com/v1/
  • Useful for:
    • OpenAI-compatible APIs
    • Local proxy servers
    • Enterprise API gateways
    • Mock servers for testing

Custom Headers

Add custom HTTP headers to requests

  • Static: Map<String, String>
  • Dynamic: Supplier<Map<String, String>>
  • Common use cases:
    • Custom authentication
    • Request tracking headers
    • Proxy authentication
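The dynamic form re-evaluates the supplier on every request, which suits short-lived credentials or per-request tracking IDs. A minimal sketch of such a supplier (the header names here are illustrative, not required by the API):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;
import java.util.function.Supplier;

// A Supplier<Map<String, String>> that is re-invoked per request,
// producing a fresh tracking ID each time. Header names are illustrative.
public class DynamicHeaders {
    public static Supplier<Map<String, String>> trackingHeaders(String clientName) {
        return () -> {
            Map<String, String> headers = new LinkedHashMap<>();
            headers.put("X-Client-Name", clientName);
            headers.put("X-Request-Id", UUID.randomUUID().toString());
            return headers;
        };
    }

    public static void main(String[] args) {
        Supplier<Map<String, String>> supplier = trackingHeaders("my-app");
        // Each invocation yields a new request ID:
        System.out.println(supplier.get());
        System.out.println(supplier.get());
    }
}
```

Such a supplier would then be passed to the builder via `.customHeaders(supplier)`.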

Custom Query Parameters

Add URL query parameters to all requests

  • Format: Map<String, String>
  • Appended to the request URL
  • Useful for API gateways and proxies

Organization ID

Optional OpenAI organization identifier

  • Required for users in multiple organizations
  • Determines billing and rate limits
  • Format: org-...

Project ID

Optional OpenAI project identifier

  • For project-level access control
  • Scopes API usage to specific project
  • Format: proj_...

Differences from Chat Models

Message Structure

  • Language Models: Single string prompt
  • Chat Models: List of structured messages (system, user, assistant)

Conversation Context

  • Language Models: No built-in conversation tracking
  • Chat Models: Maintains conversation history

Tool Calling

  • Language Models: Not supported
  • Chat Models: Full tool/function calling support

Structured Output

  • Language Models: Plain text only
  • Chat Models: JSON schemas, structured formats

Model Availability

  • Language Models: Limited to gpt-3.5-turbo-instruct
  • Chat Models: All modern models (GPT-4o, GPT-4, o1, o3, etc.)

Use Cases

Language Models are best for:

  • Simple text completion tasks
  • Code completion
  • Legacy applications requiring completion API
  • Template filling
  • Text continuation
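Template filling with a completion model typically means splicing values into a prompt skeleton and letting the model continue from it. The substitution step needs no library support; a minimal sketch (the `{{name}}` placeholder syntax is an illustrative choice, not part of langchain4j):

```java
import java.util.Map;

// Builds a completion prompt by substituting {{key}} placeholders.
// The placeholder syntax is an illustrative choice, not part of langchain4j.
public class PromptTemplate {
    public static String fill(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> e : values.entrySet()) {
            result = result.replace("{{" + e.getKey() + "}}", e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String template = "Write a {{tone}} product description for {{product}}:";
        String prompt = fill(template, Map.of("tone", "playful", "product", "a solar lantern"));
        System.out.println(prompt);
        // The resulting string would then be passed to model.generate(prompt).
    }
}
```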

Chat Models are better for:

  • Conversational AI
  • Multi-turn interactions
  • Tool/function calling
  • Structured output generation
  • Most modern applications

Migration from Language Models to Chat Models

Convert language model code to chat model:

// Old: Language Model
OpenAiLanguageModel oldModel = OpenAiLanguageModel.builder()
    .apiKey(apiKey)
    .modelName(OpenAiLanguageModelName.GPT_3_5_TURBO_INSTRUCT)
    .build();

String result = oldModel.generate("What is the capital of France?").content();

// New: Chat Model
OpenAiChatModel newModel = OpenAiChatModel.builder()
    .apiKey(apiKey)
    .modelName(OpenAiChatModelName.GPT_3_5_TURBO)
    .build();

String answer = newModel.generate("What is the capital of France?");

For streaming:

// Old: Streaming Language Model
OpenAiStreamingLanguageModel oldStream = OpenAiStreamingLanguageModel.builder()
    .apiKey(apiKey)
    .modelName(OpenAiLanguageModelName.GPT_3_5_TURBO_INSTRUCT)
    .build();

oldStream.generate(prompt, handler);

// New: Streaming Chat Model
OpenAiStreamingChatModel newStream = OpenAiStreamingChatModel.builder()
    .apiKey(apiKey)
    .modelName(OpenAiChatModelName.GPT_3_5_TURBO)
    .build();

newStream.generate(prompt, handler); // note: the chat handler is a StreamingResponseHandler<AiMessage>

The chat model interface also accepts simple string prompts, making migration straightforward while providing access to advanced features (tools, structured output, newer models) when needed.

Install with Tessl CLI

npx tessl i tessl/maven-dev-langchain4j--langchain4j-open-ai
