tessl/maven-dev-langchain4j--langchain4j-azure-open-ai

LangChain4j integration for Azure OpenAI providing chat, streaming, embeddings, image generation, audio transcription, and token counting capabilities

Image Model

Author: tessl

The image model generates images from text prompts using Azure-hosted DALL-E models and supports customization of quality, size, style, and response format.

Imports

import dev.langchain4j.model.azure.AzureOpenAiImageModel;
import dev.langchain4j.model.azure.AzureOpenAiImageModelName;
import dev.langchain4j.data.image.Image;
import dev.langchain4j.model.output.Response;
import com.azure.ai.openai.models.ImageGenerationQuality;
import com.azure.ai.openai.models.ImageSize;
import com.azure.ai.openai.models.ImageGenerationStyle;
import com.azure.ai.openai.models.ImageGenerationResponseFormat;

Basic Usage

AzureOpenAiImageModel model = AzureOpenAiImageModel.builder()
    .endpoint("https://your-resource.openai.azure.com/")
    .apiKey("your-api-key")
    .deploymentName("dall-e-3")
    .serviceVersion("2024-02-15-preview")
    .size("1024x1024")
    .quality("hd")
    .style("vivid")
    .build();

// Generate an image
Response<Image> response = model.generate("A serene mountain landscape at sunset");
Image image = response.content();

// Get the image URL (expires in 1 hour) or the base64 data
String imageUrl = image.url().toString();  // url() returns a java.net.URI; null if responseFormat is b64_json
String base64Data = image.base64Data();  // null if responseFormat is URL
String revisedPrompt = image.revisedPrompt();  // Model's interpretation
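
Because the returned URL is short-lived, it can help to record when an image was generated and check the remaining lifetime before handing the link to a client. The following is a minimal sketch using only the JDK; ExpiringImageUrl is a hypothetical helper, not part of LangChain4j, and it assumes the one-hour lifetime stated above.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper (not part of LangChain4j): pairs a generated image URL
// with its creation time so callers can tell when the link has likely expired.
final class ExpiringImageUrl {
    // Assumption: the one-hour lifetime stated in this document.
    static final Duration LIFETIME = Duration.ofHours(1);

    private final String url;
    private final Instant createdAt;

    ExpiringImageUrl(String url, Instant createdAt) {
        this.url = url;
        this.createdAt = createdAt;
    }

    String url() {
        return url;
    }

    // True once the assumed lifetime has fully elapsed at the given instant.
    boolean isExpiredAt(Instant now) {
        return !now.isBefore(createdAt.plus(LIFETIME));
    }

    // Time left before the assumed expiry, never negative.
    Duration remainingAt(Instant now) {
        Duration left = Duration.between(now, createdAt.plus(LIFETIME));
        return left.isNegative() ? Duration.ZERO : left;
    }
}
```

A caller would typically create the wrapper with Instant.now() right after generate(...) returns and download or re-generate the image once isExpiredAt(Instant.now()) reports true.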

API

package dev.langchain4j.model.azure;

/**
 * Azure OpenAI image generation using DALL-E models.
 * Thread-safe: Yes - instances are immutable and thread-safe.
 * Generation limit: 1 image per request (DALL-E 3 limitation).
 * Prompt limit: Maximum 4000 characters.
 * URL expiration: Generated image URLs expire after 1 hour.
 * Timeout: Recommend 90-120 seconds (image generation is slow).
 */
class AzureOpenAiImageModel implements dev.langchain4j.model.image.ImageModel {
    static Builder builder();

    /**
     * Generates image from text prompt.
     * @param prompt Text description, 1-4000 characters
     * @return Response with Image (URL or base64) and revised prompt
     * @throws dev.langchain4j.exception.ContentFilteredException if prompt violates policy (not retried)
     * @throws IllegalArgumentException if prompt is null, empty, or > 4000 chars (not retried)
     * @throws dev.langchain4j.exception.TimeoutException if generation exceeds timeout (retried per policy)
     * @throws RuntimeException for network/API errors (retry depends on status)
     */
    dev.langchain4j.model.output.Response<dev.langchain4j.data.image.Image> generate(String prompt);

    class Builder {
        // Mandatory
        Builder endpoint(String endpoint);
        Builder serviceVersion(String serviceVersion);
        Builder deploymentName(String deploymentName);

        // Authentication
        Builder apiKey(String apiKey);
        Builder nonAzureApiKey(String apiKey);
        Builder tokenCredential(com.azure.core.credential.TokenCredential credential);

        // Image parameters
        /**
         * Image quality.
         * @param quality "standard" or "hd"
         * @default "standard"
         */
        Builder quality(String quality);
        Builder quality(com.azure.ai.openai.models.ImageGenerationQuality quality);

        /**
         * Image dimensions.
         * @param size "1024x1024", "1792x1024", or "1024x1792"
         * @default "1024x1024"
         */
        Builder size(String size);
        Builder size(com.azure.ai.openai.models.ImageSize size);

        /**
         * Image style.
         * @param style "vivid" (dramatic) or "natural" (realistic)
         * @default "vivid"
         */
        Builder style(String style);
        Builder style(com.azure.ai.openai.models.ImageGenerationStyle style);

        /**
         * Response format.
         * @param responseFormat "url" or "b64_json"
         * @default "url" (URL expires in 1 hour)
         */
        Builder responseFormat(String responseFormat);
        Builder responseFormat(com.azure.ai.openai.models.ImageGenerationResponseFormat responseFormat);

        /**
         * End-user identifier.
         * @param user User ID for abuse monitoring
         * @default null
         */
        Builder user(String user);

        // HTTP configuration
        /**
         * @default 120 seconds (longer for image generation)
         */
        Builder timeout(java.time.Duration timeout);
        Builder maxRetries(Integer maxRetries);
        Builder retryOptions(com.azure.core.http.policy.RetryOptions retryOptions);
        Builder proxyOptions(com.azure.core.http.ProxyOptions proxyOptions);
        Builder httpClientProvider(com.azure.core.http.HttpClientProvider httpClientProvider);
        Builder openAIClient(com.azure.ai.openai.OpenAIClient client);
        Builder customHeaders(java.util.Map<String, String> customHeaders);
        Builder userAgentSuffix(String userAgentSuffix);
        Builder logRequestsAndResponses(Boolean logRequestsAndResponses);

        AzureOpenAiImageModel build();
    }
}

Configuration

Quality

// Standard quality: Faster, cheaper, good for drafts
.quality("standard")  // or ImageGenerationQuality.STANDARD

// HD quality: Slower, more expensive, higher detail
.quality("hd")  // or ImageGenerationQuality.HD

Size

// Square format (most common, cheapest)
.size("1024x1024")  // or ImageSize.SIZE_1024_X_1024

// Landscape format
.size("1792x1024")  // or ImageSize.SIZE_1792_X_1024

// Portrait format
.size("1024x1792")  // or ImageSize.SIZE_1024_X_1792
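
When the target layout is known only as a width and height, one option is to pick whichever of the three supported sizes has the closest aspect ratio. The sketch below is a hypothetical helper (ImageSizes is not part of the library) that compares ratios on a logarithmic scale so that landscape and portrait are treated symmetrically.

```java
// Hypothetical helper (not part of LangChain4j): chooses the supported size
// string whose aspect ratio is closest to the requested width and height.
final class ImageSizes {
    // The three size strings listed in this section.
    private static final String[] SIZES = {"1024x1024", "1792x1024", "1024x1792"};

    static String closestTo(int width, int height) {
        if (width <= 0 || height <= 0) {
            throw new IllegalArgumentException("width and height must be positive");
        }
        // Compare log ratios so that 2:1 and 1:2 are equally far from 1:1.
        double target = Math.log((double) width / height);
        String best = SIZES[0];
        double bestDistance = Double.MAX_VALUE;
        for (String size : SIZES) {
            String[] parts = size.split("x");
            double ratio = Double.parseDouble(parts[0]) / Double.parseDouble(parts[1]);
            double distance = Math.abs(Math.log(ratio) - target);
            if (distance < bestDistance) {
                bestDistance = distance;
                best = size;
            }
        }
        return best;
    }
}
```

The result can be passed straight to the builder, for example .size(ImageSizes.closestTo(1920, 1080)).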

Style

// Vivid: Hyper-real, dramatic, vibrant (default)
.style("vivid")  // or ImageGenerationStyle.VIVID

// Natural: More realistic, less dramatic
.style("natural")  // or ImageGenerationStyle.NATURAL

Response Format

// URL format: Returns URL (expires in 1 hour)
.responseFormat("url")  // or ImageGenerationResponseFormat.URL

// Base64 format: Returns base64-encoded image data
.responseFormat("b64_json")  // or ImageGenerationResponseFormat.B64_JSON
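
With the base64 format, the string returned by image.base64Data() has to be decoded before it can be stored or served. The following sketch uses only the JDK; Base64Images is a hypothetical helper, and the PNG check assumes the service returns PNG data, which this document does not state.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical helper (not part of LangChain4j) for handling b64_json results.
final class Base64Images {

    // Decodes the base64 payload into raw image bytes.
    static byte[] decode(String base64Data) {
        if (base64Data == null || base64Data.isBlank()) {
            throw new IllegalArgumentException("no base64 data; was responseFormat set to b64_json?");
        }
        return Base64.getDecoder().decode(base64Data);
    }

    // Checks the first four bytes of the PNG signature (assumption: PNG output).
    static boolean looksLikePng(byte[] bytes) {
        return bytes.length >= 4
                && (bytes[0] & 0xFF) == 0x89
                && bytes[1] == 'P' && bytes[2] == 'N' && bytes[3] == 'G';
    }

    // Decodes the payload and writes it to the target file.
    static Path save(String base64Data, Path target) {
        try {
            return Files.write(target, decode(base64Data));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Saving the decoded bytes sidesteps URL expiry entirely, at the cost of a larger response body.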

Model Names

enum AzureOpenAiImageModelName {
    DALL_E_3,      // dall-e-3 (latest)
    DALL_E_3_30;   // dall-e-3-30 (model version 3.0)

    String modelName();
}

Types

package dev.langchain4j.data.image;

/**
 * Generated image representation.
 */
class Image {
    /**
     * Creates a builder for an image.
     * Set either url(...) (expires after 1 hour) or base64Data(...),
     * plus optional mimeType(...) and revisedPrompt(...).
     */
    static Builder builder();

    /**
     * Image URL if responseFormat is URL.
     * @return URL as a java.net.URI, or null
     */
    java.net.URI url();

    /**
     * Base64 image data if responseFormat is B64_JSON.
     * @return Base64 string or null
     */
    String base64Data();

    /**
     * Model's revised/interpreted prompt.
     * DALL-E 3 may revise prompts for safety and quality.
     * @return Revised prompt used for generation
     */
    String revisedPrompt();
}

Prompt Engineering

Be Specific

// Poor: Vague, generic
"a dog"

// Better: Specific details
"a golden retriever puppy sitting in a sunlit garden, photorealistic style, shallow depth of field"

Include Style

"a mountain landscape, oil painting style, impressionist"
"a futuristic city, digital art, cyberpunk aesthetic, neon lighting"
"a vintage travel poster, 1950s advertising style, bold colors"

Specify Composition

"close-up portrait of a cat, shallow depth of field, soft lighting"
"wide-angle view of a beach at sunset, golden hour lighting, long shadows"
"bird's eye view of a city intersection, high contrast, geometric patterns"
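
The three guidelines above (specific subject, explicit style, explicit composition) can be combined mechanically. The sketch below is one hypothetical way to do that; ImagePrompt and its parameter names are illustrative only and not part of the library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of LangChain4j): joins a subject with
// optional style and composition hints into a single comma-separated prompt.
final class ImagePrompt {

    static String compose(String subject, String style, String composition) {
        if (subject == null || subject.isBlank()) {
            throw new IllegalArgumentException("subject must not be blank");
        }
        List<String> parts = new ArrayList<>();
        parts.add(subject.strip());
        // Style and composition are optional; skip them when absent.
        if (style != null && !style.isBlank()) {
            parts.add(style.strip());
        }
        if (composition != null && !composition.isBlank()) {
            parts.add(composition.strip());
        }
        return String.join(", ", parts);
    }
}
```

For example, ImagePrompt.compose("close-up portrait of a cat", "photorealistic style", "shallow depth of field, soft lighting") yields a prompt in the same shape as the examples above.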

Error Handling

try {
    Response<Image> response = model.generate(prompt);
} catch (dev.langchain4j.exception.ContentFilteredException e) {
    // Prompt violated content policy (violence, hate, sexual, child safety)
    System.err.println("Content filtered: " + e.getMessage());
    // Not retried - user must modify prompt
} catch (IllegalArgumentException e) {
    // Invalid prompt: null, empty, or > 4000 characters
    System.err.println("Invalid prompt: " + e.getMessage());
} catch (dev.langchain4j.exception.TimeoutException e) {
    // Generation exceeded timeout (default 120s); unchecked, unlike java.util.concurrent.TimeoutException
    // Retried per policy
    System.err.println("Request timed out");
} catch (RuntimeException e) {
    // Network or API error
    System.err.println("Error: " + e.getMessage());
}
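
If application-level retries are wanted on top of the builder's maxRetries setting, the distinction drawn above (some failures are worth retrying, content-filter and invalid-prompt failures are not) can be expressed as a predicate. The following is a generic sketch using only the JDK; Retries is a hypothetical helper, and which exceptions count as retryable is left to the caller.

```java
import java.time.Duration;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical helper (not part of LangChain4j): retries an action with
// exponential backoff, but only for exceptions the caller marks as retryable.
final class Retries {

    static <T> T call(Supplier<T> action, int maxAttempts, Duration initialDelay,
                      Predicate<RuntimeException> retryable) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Duration delay = initialDelay;
        for (int attempt = 1; ; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                // Give up on the last attempt or on non-retryable failures.
                if (attempt >= maxAttempts || !retryable.test(e)) {
                    throw e;
                }
            }
            sleep(delay);
            delay = delay.multipliedBy(2); // double the wait before the next attempt
        }
    }

    private static void sleep(Duration delay) {
        try {
            Thread.sleep(delay.toMillis());
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry", ie);
        }
    }
}
```

A call site might look like Retries.call(() -> model.generate(prompt), 3, Duration.ofSeconds(2), e -> !(e instanceof IllegalArgumentException)), with the predicate extended to exclude content-filter failures as well.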

Content Policy

Azure OpenAI blocks prompts containing:

Violence and gore
Hate symbols or imagery
Sexual or adult content
Child safety violations
Public figure likenesses without consent

Limitations

Images per request: 1 image only (DALL-E 3)
URL expiration: 1 hour for URL format
Prompt length: 4000 characters maximum
Rate limits: Subject to Azure OpenAI rate limits
Generation time: 10-30 seconds typical
Revisions: Model may revise unsafe prompts
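
The prompt-length limit can be checked locally before a request is sent, which avoids a round trip for input that would be rejected anyway. A minimal sketch, assuming the 4000-character limit listed above; PromptChecks is a hypothetical helper, and String.length() counts UTF-16 code units, which may differ from how the service counts characters.

```java
// Hypothetical helper (not part of LangChain4j): validates a prompt against
// the limits listed in this document before calling the model.
final class PromptChecks {
    // Assumption: the 4000-character maximum stated above.
    static final int MAX_PROMPT_CHARS = 4000;

    // Returns the prompt unchanged if it is acceptable, otherwise throws.
    static String requireValid(String prompt) {
        if (prompt == null || prompt.isBlank()) {
            throw new IllegalArgumentException("prompt must not be null or blank");
        }
        if (prompt.length() > MAX_PROMPT_CHARS) {
            throw new IllegalArgumentException(
                    "prompt has " + prompt.length() + " characters; maximum is " + MAX_PROMPT_CHARS);
        }
        return prompt;
    }
}
```

Because the method returns its argument, it can wrap the call directly: model.generate(PromptChecks.requireValid(prompt)).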

Cost Optimization

// Most cost-effective
.quality("standard")
.size("1024x1024")

// More expensive
.quality("hd")
.size("1792x1024")  // or 1024x1792

Pricing factors:

Quality: HD costs more than standard
Size: Larger images cost more
Model: DALL-E 3 pricing per image

Optimization tips:

Use standard quality for drafts
Use 1024x1024 for most use cases
Cache/save generated images
Use revised prompt for regeneration if needed
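
The caching tip can be sketched as a small in-memory map keyed by the generation settings and the prompt, so that a repeated request does not trigger a second paid generation. ImageCache is a hypothetical helper, and the generator function is a stand-in for code that calls the model and stores the result; it is assumed to return a durable reference (a local path or base64 data) rather than the expiring URL.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper (not part of LangChain4j): remembers the stored result
// for each (settings, prompt) pair so identical requests are generated once.
final class ImageCache {
    private final Map<String, String> stored = new ConcurrentHashMap<>();
    // Stand-in for "generate the image and persist it"; returns a durable reference.
    private final Function<String, String> generator;

    ImageCache(Function<String, String> generator) {
        this.generator = generator;
    }

    // settings describes the model configuration, e.g. "standard/1024x1024/vivid".
    String get(String settings, String prompt) {
        String key = settings + "|" + prompt;
        // computeIfAbsent invokes the generator only when the key is missing.
        return stored.computeIfAbsent(key, k -> generator.apply(prompt));
    }
}
```

An in-memory map is lost on restart; a file-backed or external cache would follow the same keying scheme.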

Install with Tessl CLI

npx tessl i tessl/maven-dev-langchain4j--langchain4j-azure-open-ai@1.11.0
