tessl/maven-org-springframework-ai--spring-ai-azure-openai

Spring AI integration for Azure OpenAI services providing chat completion, text embeddings, image generation, and audio transcription with GPT, DALL-E, and Whisper models

Image Generation

The image generation API creates images from text descriptions using Azure OpenAI's DALL-E models.

Imports

import org.springframework.ai.azure.openai.AzureOpenAiImageModel;
import org.springframework.ai.azure.openai.AzureOpenAiImageOptions;
import org.springframework.ai.azure.openai.metadata.AzureOpenAiImageGenerationMetadata;
import org.springframework.ai.azure.openai.metadata.AzureOpenAiImageResponseMetadata;
import org.springframework.ai.image.ImagePrompt;
import org.springframework.ai.image.ImageResponse;
import org.springframework.ai.image.ImageMessage;
import org.springframework.ai.image.Image;
import com.azure.ai.openai.OpenAIClient;
import com.azure.ai.openai.OpenAIClientBuilder;
import com.azure.core.credential.AzureKeyCredential;

AzureOpenAiImageModel

The main class for generating images.

Thread Safety

Thread-Safe: A single AzureOpenAiImageModel instance can serve concurrent image generation requests from multiple threads.

Recommendation: Create one instance and reuse it across your application rather than creating new instances for each request.

Construction

class AzureOpenAiImageModel implements ImageModel {
    AzureOpenAiImageModel(OpenAIClient openAIClient);

    AzureOpenAiImageModel(
        OpenAIClient microsoftOpenAiClient,
        AzureOpenAiImageOptions options
    );
}

Parameters:

  • openAIClient: Azure OpenAI client instance (required, non-null, throws NullPointerException if null)
  • options: Default image generation options (optional, uses model defaults if null)

Example:

OpenAIClient openAIClient = new OpenAIClientBuilder()
    .credential(new AzureKeyCredential(apiKey))
    .endpoint(endpoint)
    .buildClient();

AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
    .deploymentName("dall-e-3")
    .width(1024)
    .height(1024)
    .build();

AzureOpenAiImageModel imageModel = new AzureOpenAiImageModel(openAIClient, options);

Core Methods

Generate Image

ImageResponse call(ImagePrompt imagePrompt);

Generate one or more images from a text prompt.

Parameters:

  • imagePrompt: The prompt containing the description and optional options (non-null, throws NullPointerException if null)

Returns: ImageResponse containing generated images and metadata (never null)

Throws:

  • HttpResponseException: HTTP errors from Azure API (400, 401, 403, 429, 500)
  • ResourceNotFoundException: Deployment not found (404)
  • NonTransientAiException: Permanent failures (invalid parameters, content filter)
  • TransientAiException: Temporary failures (rate limits, timeouts)
  • NullPointerException: If imagePrompt is null
  • IllegalArgumentException: If prompt text is null or empty, or invalid dimension combinations

Constraints:

  • Prompt text cannot be null or empty
  • Prompt text max length: 4000 characters (DALL-E 3), 1000 characters (DALL-E 2)
  • Image dimensions must be valid combinations for the model
  • DALL-E 3 can only generate 1 image per request (N must be 1)
  • DALL-E 2 can generate 1-10 images per request
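The limits above can be checked before a request is sent, which avoids a round trip for inputs that would be rejected anyway. The following is a minimal sketch using only the JDK; `ImageRequestCheck` is a hypothetical helper, not part of Spring AI, and it assumes you know which underlying model ("dall-e-3" or "dall-e-2") your deployment uses.

```java
import java.util.Map;

/** Hypothetical pre-flight check; not part of Spring AI or the Azure SDK. */
final class ImageRequestCheck {

    // Limits copied from the constraints listed above.
    private static final Map<String, Integer> MAX_PROMPT_CHARS =
            Map.of("dall-e-3", 4000, "dall-e-2", 1000);
    private static final Map<String, Integer> MAX_IMAGES =
            Map.of("dall-e-3", 1, "dall-e-2", 10);

    /** Throws IllegalArgumentException if the request breaks a documented limit. */
    static void check(String model, String promptText, Integer n) {
        if (promptText == null || promptText.isEmpty()) {
            throw new IllegalArgumentException("Prompt text must not be null or empty");
        }
        if (model == null || !MAX_PROMPT_CHARS.containsKey(model)) {
            throw new IllegalArgumentException("Unknown model: " + model);
        }
        int maxChars = MAX_PROMPT_CHARS.get(model);
        if (promptText.length() > maxChars) {
            throw new IllegalArgumentException(
                    "Prompt has " + promptText.length() + " chars, max is " + maxChars);
        }
        int count = (n == null) ? 1 : n;  // null means the default of 1
        int maxImages = MAX_IMAGES.get(model);
        if (count < 1 || count > maxImages) {
            throw new IllegalArgumentException(
                    model + " supports 1 to " + maxImages + " images per request, got " + count);
        }
    }
}
```

Calling `ImageRequestCheck.check("dall-e-3", promptText, options.getN())` before `imageModel.call(...)` would surface these problems locally. The service remains the authority, so treat this as an early warning rather than a replacement for handling its errors.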

Example - Basic Usage:

ImagePrompt prompt = new ImagePrompt("A futuristic city at sunset");
ImageResponse response = imageModel.call(prompt);

Image image = response.getResult().getOutput();
String imageUrl = image.getUrl();
System.out.println("Generated image URL: " + imageUrl);

Example - With Options:

AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
    .width(1792)
    .height(1024)
    .style("vivid")
    .build();
options.setQuality("hd");

ImagePrompt prompt = new ImagePrompt("A serene mountain landscape", options);
ImageResponse response = imageModel.call(prompt);

Example - Multiple Images (DALL-E 2 only):

AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
    .model("dall-e-2")
    .N(4)  // Generate 4 images
    .build();

ImagePrompt prompt = new ImagePrompt("Abstract art with vibrant colors", options);
ImageResponse response = imageModel.call(prompt);

for (var generation : response.getResults()) {
    System.out.println("Image URL: " + generation.getOutput().getUrl());
}

Error Handling:

// ContentFilterException, InvalidParameterException, and RateLimitException are
// assumed to be application-defined exception types; they are not defined on this page.
try {
    ImageResponse response = imageModel.call(prompt);
} catch (HttpResponseException e) {
    int status = e.getResponse().getStatusCode();
    if (status == 400 && e.getMessage().contains("content_policy_violation")) {
        throw new ContentFilterException("Prompt blocked by content filter", e);
    } else if (status == 400 && e.getMessage().contains("invalid_image_size")) {
        throw new InvalidParameterException("Invalid image dimensions", e);
    } else if (status == 429) {
        throw new RateLimitException("Rate limit exceeded", e);
    }
    throw e;  // Rethrow anything not mapped above instead of swallowing it
} catch (IllegalArgumentException e) {
    throw new InvalidParameterException("Invalid prompt or options: " + e.getMessage(), e);
}

Get Default Options

AzureOpenAiImageOptions getDefaultOptions();

Retrieve the default options configured for the model.

Returns: AzureOpenAiImageOptions or null if no defaults configured

AzureOpenAiImageOptions

Configuration class for image generation requests.

Construction

class AzureOpenAiImageOptions implements ImageOptions {
    static Builder builder();
}

Builder

class Builder {
    Builder N(Integer n);
    Builder model(String model);
    Builder deploymentName(String deploymentName);
    Builder responseFormat(String responseFormat);
    Builder width(Integer width);
    Builder height(Integer height);
    Builder user(String user);
    Builder style(String style);
    AzureOpenAiImageOptions build();
}

Builder Methods:

  • All builder methods return this for fluent chaining (never null)
  • All parameters are optional (can be null)
  • build(): Returns non-null AzureOpenAiImageOptions instance

Properties

Number of Images

Integer getN();
void setN(Integer n);

Number of images to generate. Values above 1 are supported only by DALL-E 2 (maximum 10); DALL-E 3 generates exactly 1 image per request.

Constraints:

  • DALL-E 3: Must be 1 or null (throws IllegalArgumentException if > 1)
  • DALL-E 2: 1-10 (throws IllegalArgumentException if < 1 or > 10)
  • Default: 1
  • Type: Integer (nullable)

Example:

AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
    .model("dall-e-2")
    .N(3)
    .build();

Model / Deployment Name

String getModel();
void setModel(String model);
String getDeploymentName();
void setDeploymentName(String deploymentName);

Specifies which DALL-E model to use.

Constraints:

  • Cannot be null or empty (throws IllegalArgumentException)
  • Must match an existing deployment in your Azure OpenAI resource
  • Default: "dall-e-3"

Available Models:

  • "dall-e-3": Latest model, best quality, more expensive
  • "dall-e-2": Previous generation, good quality, more affordable

Example:

AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
    .deploymentName("dall-e-3")
    .build();

Image Dimensions

Integer getWidth();
void setWidth(Integer width);
Integer getHeight();
void setHeight(Integer height);

Dimensions of the generated image in pixels.

DALL-E 3 Supported Sizes:

  • 1024 × 1024 (square)
  • 1792 × 1024 (landscape)
  • 1024 × 1792 (portrait)

DALL-E 2 Supported Sizes:

  • 256 × 256
  • 512 × 512
  • 1024 × 1024

Constraints:

  • Must use supported dimension combinations (throws IllegalArgumentException for invalid combinations)
  • Both width and height must be specified together or both null
  • Cannot mix DALL-E 3 sizes with DALL-E 2 deployment or vice versa

Example:

AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
    .width(1792)
    .height(1024)
    .build();
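The supported combinations listed above are small enough to encode as a lookup table. The sketch below shows one way to check a width/height pair against them before building options. `SupportedSizes` is a hypothetical helper written for this page, and the model keys assume you know which DALL-E version backs your deployment.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical lookup of the size combinations listed above; not part of Spring AI. */
final class SupportedSizes {

    private static final Map<String, Set<String>> SIZES = Map.of(
            "dall-e-3", Set.of("1024x1024", "1792x1024", "1024x1792"),
            "dall-e-2", Set.of("256x256", "512x512", "1024x1024"));

    /** Returns true when the pair is valid for the model, or when both values are unset. */
    static boolean isSupported(String model, Integer width, Integer height) {
        if (width == null && height == null) {
            return true;   // both unset: the model default applies
        }
        if (width == null || height == null || model == null) {
            return false;  // width and height must be given together
        }
        Set<String> sizes = SIZES.get(model);
        return sizes != null && sizes.contains(width + "x" + height);
    }
}
```

For example, `SupportedSizes.isSupported("dall-e-3", 512, 512)` returns false, which matches the rule that DALL-E 2 sizes cannot be used with a DALL-E 3 deployment.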

Size (Alternative to Width/Height)

String getSize();
void setSize(String size);

Alternative way to specify image dimensions as a string.

Valid Values:

  • DALL-E 3: "1024x1024", "1792x1024", "1024x1792"
  • DALL-E 2: "256x256", "512x512", "1024x1024"

Constraints:

  • Format must be "WIDTHxHEIGHT" (case-sensitive)
  • Mutually exclusive with width/height parameters

Example:

AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
    .deploymentName("dall-e-3")
    .build();

options.setSize("1792x1024");
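If your code receives sizes as strings (for example from configuration), it can help to parse and validate the "WIDTHxHEIGHT" format yourself before passing it on. A minimal sketch follows; `ImageSize` is a hypothetical record, and it treats the separator as a lowercase `x` only, in line with the case-sensitivity note above.

```java
/** Hypothetical value type for a "WIDTHxHEIGHT" string; not part of Spring AI. */
record ImageSize(int width, int height) {

    /** Parses strings such as "1792x1024"; rejects anything else. */
    static ImageSize parse(String size) {
        if (size == null) {
            throw new IllegalArgumentException("size must not be null");
        }
        int sep = size.indexOf('x');  // lowercase separator only
        if (sep <= 0 || sep == size.length() - 1) {
            throw new IllegalArgumentException("Expected WIDTHxHEIGHT, got: " + size);
        }
        try {
            int width = Integer.parseInt(size.substring(0, sep));
            int height = Integer.parseInt(size.substring(sep + 1));
            if (width <= 0 || height <= 0) {
                throw new IllegalArgumentException("Dimensions must be positive: " + size);
            }
            return new ImageSize(width, height);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Expected WIDTHxHEIGHT, got: " + size, e);
        }
    }

    /** Formats the size back into the "WIDTHxHEIGHT" form. */
    String asString() {
        return width + "x" + height;
    }
}
```

The parsed values can then be fed to `width(...)` and `height(...)`, or `asString()` can be passed to `setSize(...)`, but not both, since the two forms are mutually exclusive.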

Response Format

String getResponseFormat();
void setResponseFormat(String responseFormat);

Format of the generated image data.

Values:

  • "url": Returns a URL to the generated image (default)
  • "b64_json": Returns the image data as a base64-encoded string

Constraints:

  • Default: "url"
  • Type: String (nullable)
  • URLs expire after 1 hour
  • b64_json useful for immediate processing or storage

Example:

AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
    .responseFormat("b64_json")
    .build();

ImageResponse response = imageModel.call(new ImagePrompt("A sunset", options));
String base64Data = response.getResult().getOutput().getB64Json();
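Once you have the base64 string, decoding it is a single JDK call. The sketch below also adds a cheap sanity check on the decoded bytes. It assumes the payload is a PNG, which the `.png` filenames used elsewhere on this page suggest but which you should confirm for your deployment. `Base64Images` is a hypothetical helper.

```java
import java.util.Arrays;
import java.util.Base64;

/** Hypothetical helper for b64_json payloads; not part of Spring AI. */
final class Base64Images {

    // The fixed 8-byte signature that starts every PNG file.
    private static final byte[] PNG_SIGNATURE =
            {(byte) 0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n'};

    /** Decodes the string; throws IllegalArgumentException if it is not valid base64. */
    static byte[] decode(String b64Json) {
        return Base64.getDecoder().decode(b64Json);
    }

    /** Returns true when the bytes start with the PNG signature. */
    static boolean looksLikePng(byte[] bytes) {
        return bytes.length >= PNG_SIGNATURE.length
                && Arrays.equals(Arrays.copyOf(bytes, PNG_SIGNATURE.length), PNG_SIGNATURE);
    }
}
```

A failed `looksLikePng` check is a hint to inspect the response before writing it to disk with a `.png` extension.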

Quality

String getQuality();
void setQuality(String quality);

Image quality level. Only supported by DALL-E 3.

Values:

  • "standard": Standard quality (default)
  • "hd": High definition, more detailed images

Constraints:

  • DALL-E 3 only (ignored by DALL-E 2)
  • Default: "standard"
  • HD quality takes longer to generate and costs more
  • Type: String (nullable)

Example:

AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
    .build();
options.setQuality("hd");

Style

String getStyle();
void setStyle(String style);

Visual style of the generated image. Only supported by DALL-E 3.

Values:

  • "vivid": Hyper-real and dramatic images (default)
  • "natural": More natural, less hyper-real images

Constraints:

  • DALL-E 3 only (ignored by DALL-E 2)
  • Default: "vivid"
  • Type: String (nullable)

Style Differences:

  • vivid: High contrast, saturated colors, dramatic lighting
  • natural: Softer tones, realistic lighting, natural appearance

Example:

AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
    .style("natural")
    .build();

User Identifier

String getUser();
void setUser(String user);

Optional identifier for the end-user, used for abuse monitoring.

Constraints:

  • Max length: 256 characters
  • Optional (can be null)
  • Type: String (nullable)

Example:

AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
    .user("user-123")
    .build();

Constants

public static final String DEFAULT_IMAGE_MODEL = "dall-e-3";

Enums

ImageModel

enum ImageModel {
    DALL_E_3("dall-e-3"),
    DALL_E_2("dall-e-2");

    String getValue();
}

Example:

String modelName = AzureOpenAiImageOptions.ImageModel.DALL_E_3.getValue();

Metadata Classes

AzureOpenAiImageGenerationMetadata

Metadata for individual image generation.

class AzureOpenAiImageGenerationMetadata implements ImageGenerationMetadata {
    AzureOpenAiImageGenerationMetadata(String revisedPrompt);

    String getRevisedPrompt();
}

The revised prompt shows how DALL-E 3 interpreted and potentially modified the original prompt.

Revised Prompt Behavior:

  • DALL-E 3 automatically enhances prompts for better results
  • May add details about style, lighting, composition
  • May rephrase for clarity or safety
  • Always non-null for DALL-E 3
  • Null for DALL-E 2 (doesn't revise prompts)

Example:

ImageResponse response = imageModel.call(prompt);
ImageGenerationMetadata metadata = response.getResult().getMetadata();

if (metadata instanceof AzureOpenAiImageGenerationMetadata azureMetadata) {
    String revisedPrompt = azureMetadata.getRevisedPrompt();
    System.out.println("Revised prompt: " + revisedPrompt);
}

AzureOpenAiImageResponseMetadata

Metadata for the overall image response.

class AzureOpenAiImageResponseMetadata extends ImageResponseMetadata {
    protected AzureOpenAiImageResponseMetadata(Long created);

    Long getCreated();

    static AzureOpenAiImageResponseMetadata from(ImageGenerations openAiImageResponse);
}

Properties:

  • created: Unix timestamp when images were generated (seconds since epoch, non-null)

Example:

ImageResponse response = imageModel.call(prompt);
AzureOpenAiImageResponseMetadata metadata =
    (AzureOpenAiImageResponseMetadata) response.getMetadata();

Long timestamp = metadata.getCreated();
System.out.println("Image created at: " + timestamp);
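Because `created` is expressed in seconds since the epoch, printing the raw `Long` is rarely useful in logs. A small sketch of converting it with `java.time` follows; `CreatedTimestamps` is a hypothetical helper name.

```java
import java.time.Instant;

/** Hypothetical helper for the `created` value; not part of Spring AI. */
final class CreatedTimestamps {

    /** Interprets the value as seconds (not milliseconds) since the Unix epoch. */
    static Instant toInstant(long createdEpochSeconds) {
        return Instant.ofEpochSecond(createdEpochSeconds);
    }

    /** Formats the timestamp as an ISO-8601 UTC string for logging. */
    static String toIsoUtc(long createdEpochSeconds) {
        return toInstant(createdEpochSeconds).toString();
    }
}
```

With this in place, `CreatedTimestamps.toIsoUtc(metadata.getCreated())` yields a readable UTC timestamp instead of a bare number.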

Usage Examples

Basic Image Generation

OpenAIClient client = new OpenAIClientBuilder()
    .credential(new AzureKeyCredential(apiKey))
    .endpoint(endpoint)
    .buildClient();

AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
    .deploymentName("dall-e-3")
    .build();

AzureOpenAiImageModel imageModel = new AzureOpenAiImageModel(client, options);

ImagePrompt prompt = new ImagePrompt("A majestic eagle soaring over mountains");
ImageResponse response = imageModel.call(prompt);

String imageUrl = response.getResult().getOutput().getUrl();
System.out.println("Generated: " + imageUrl);

High-Definition Image

AzureOpenAiImageOptions hdOptions = AzureOpenAiImageOptions.builder()
    .deploymentName("dall-e-3")
    .width(1024)
    .height(1024)
    .style("vivid")
    .build();
hdOptions.setQuality("hd");

ImagePrompt prompt = new ImagePrompt(
    "A detailed cyberpunk street scene at night with neon lights",
    hdOptions
);

ImageResponse response = imageModel.call(prompt);
Image image = response.getResult().getOutput();

Landscape Image

AzureOpenAiImageOptions landscapeOptions = AzureOpenAiImageOptions.builder()
    .width(1792)
    .height(1024)
    .build();

ImagePrompt prompt = new ImagePrompt(
    "A panoramic view of a tropical beach at sunrise",
    landscapeOptions
);

ImageResponse response = imageModel.call(prompt);

Portrait Image

AzureOpenAiImageOptions portraitOptions = AzureOpenAiImageOptions.builder()
    .width(1024)
    .height(1792)
    .build();

ImagePrompt prompt = new ImagePrompt(
    "A professional portrait of a scientist in a laboratory",
    portraitOptions
);

ImageResponse response = imageModel.call(prompt);

Natural Style Image

AzureOpenAiImageOptions naturalOptions = AzureOpenAiImageOptions.builder()
    .style("natural")
    .build();
naturalOptions.setQuality("hd");

ImagePrompt prompt = new ImagePrompt(
    "A realistic photograph of a coffee shop interior",
    naturalOptions
);

ImageResponse response = imageModel.call(prompt);

Multiple Images with DALL-E 2

AzureOpenAiImageOptions multiOptions = AzureOpenAiImageOptions.builder()
    .model("dall-e-2")
    .N(4)
    .width(512)
    .height(512)
    .build();

ImagePrompt prompt = new ImagePrompt("Abstract geometric patterns", multiOptions);
ImageResponse response = imageModel.call(prompt);

System.out.println("Generated " + response.getResults().size() + " images");
for (var result : response.getResults()) {
    System.out.println("URL: " + result.getOutput().getUrl());
}

Base64 Response Format

AzureOpenAiImageOptions base64Options = AzureOpenAiImageOptions.builder()
    .responseFormat("b64_json")
    .build();

ImagePrompt prompt = new ImagePrompt("A colorful sunset", base64Options);
ImageResponse response = imageModel.call(prompt);

String base64Image = response.getResult().getOutput().getB64Json();
byte[] imageBytes = Base64.getDecoder().decode(base64Image);
// Save or process image bytes

Accessing Revised Prompt

ImagePrompt prompt = new ImagePrompt("A cat");
ImageResponse response = imageModel.call(prompt);

// DALL-E 3 may revise the prompt for better results
ImageGenerationMetadata metadata = response.getResult().getMetadata();
if (metadata instanceof AzureOpenAiImageGenerationMetadata azureMetadata) {
    String original = "A cat";
    String revised = azureMetadata.getRevisedPrompt();
    System.out.println("Original: " + original);
    System.out.println("Revised: " + revised);
    // Revised might be: "A realistic photograph of a fluffy domestic cat..."
}

Per-Request Options Override

// Default options for most requests
AzureOpenAiImageOptions defaultOptions = AzureOpenAiImageOptions.builder()
    .deploymentName("dall-e-3")
    .width(1024)
    .height(1024)
    .build();
defaultOptions.setQuality("standard");

AzureOpenAiImageModel imageModel = new AzureOpenAiImageModel(client, defaultOptions);

// Override for specific request
AzureOpenAiImageOptions hdOptions = AzureOpenAiImageOptions.builder()
    .width(1792)
    .height(1024)
    .build();
hdOptions.setQuality("hd");

ImagePrompt hdPrompt = new ImagePrompt("A detailed landscape", hdOptions);
ImageResponse response = imageModel.call(hdPrompt);
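The override example implies a field-by-field fallback: a value set on the request wins, and anything left unset falls back to the model's defaults. The sketch below illustrates that rule on a simplified, hypothetical record. It is an assumption about the merge semantics made for illustration, not the library's actual merge code, so verify the behavior against your Spring AI version.

```java
/** Hypothetical, simplified stand-in for image options; used only to illustrate merging. */
record SizeAndQuality(Integer width, Integer height, String quality) {

    /** Request values win when non-null; otherwise the default value is used. */
    static SizeAndQuality merge(SizeAndQuality request, SizeAndQuality defaults) {
        if (request == null) {
            return defaults;
        }
        if (defaults == null) {
            return request;
        }
        return new SizeAndQuality(
                firstNonNull(request.width(), defaults.width()),
                firstNonNull(request.height(), defaults.height()),
                firstNonNull(request.quality(), defaults.quality()));
    }

    private static <T> T firstNonNull(T preferred, T fallback) {
        return preferred != null ? preferred : fallback;
    }
}
```

Under this rule, a request that sets only `quality` to "hd" would still use the default 1024 × 1024 size.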

User Tracking

AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
    .user("user-456")
    .build();

ImagePrompt prompt = new ImagePrompt("A robot in a garden", options);
ImageResponse response = imageModel.call(prompt);

Accessing Response Metadata

ImageResponse response = imageModel.call(prompt);

// Image-level metadata
ImageGenerationMetadata imageMetadata = response.getResult().getMetadata();
if (imageMetadata instanceof AzureOpenAiImageGenerationMetadata azureMeta) {
    System.out.println("Revised prompt: " + azureMeta.getRevisedPrompt());
}

// Response-level metadata
if (response.getMetadata() instanceof AzureOpenAiImageResponseMetadata responseMeta) {
    System.out.println("Created at: " + responseMeta.getCreated());
}

Error Handling

Common Exceptions

// Azure SDK exceptions
com.azure.core.exception.HttpResponseException  // HTTP errors (400, 401, 403, 429, 500)
com.azure.core.exception.ResourceNotFoundException  // Deployment not found (404)

// Spring AI exceptions
org.springframework.ai.retry.NonTransientAiException  // Permanent failures
org.springframework.ai.retry.TransientAiException  // Temporary failures (retry-able)

// Java exceptions
java.lang.IllegalArgumentException  // Invalid parameters
java.lang.NullPointerException  // Null required parameters

Exception Scenarios

1. Content Policy Violation (400):

try {
    response = imageModel.call(prompt);
} catch (HttpResponseException e) {
    if (e.getResponse().getStatusCode() == 400 &&
        e.getMessage().contains("content_policy_violation")) {
        throw new ContentFilterException(
            "Prompt blocked by content policy. Avoid NSFW, violence, or harmful content.", e
        );
    }
    throw e;  // Not a content policy violation: propagate unchanged
}

2. Invalid Image Size (400):

try {
    AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
        .deploymentName("dall-e-3")
        .width(512)  // Invalid for DALL-E 3
        .height(512)
        .build();
    response = imageModel.call(new ImagePrompt("test", options));
} catch (IllegalArgumentException e) {
    throw new InvalidParameterException(
        "Invalid image size for DALL-E 3. Use 1024x1024, 1792x1024, or 1024x1792", e
    );
}

3. Too Many Images Requested (400):

try {
    AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
        .deploymentName("dall-e-3")
        .N(4)  // DALL-E 3 only supports 1
        .build();
    response = imageModel.call(new ImagePrompt("test", options));
} catch (IllegalArgumentException e) {
    throw new InvalidParameterException(
        "DALL-E 3 can only generate 1 image per request. Use DALL-E 2 for multiple images.", e
    );
}

4. Rate Limiting (429):

public ImageResponse generateWithRetry(ImagePrompt prompt) {
    int maxRetries = 3;
    int baseDelayMs = 2000;
    
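    for (int attempt = 0; attempt < maxRetries; attempt++) {
        try {
            return imageModel.call(prompt);
        } catch (HttpResponseException e) {
            // Give up on non-429 errors and on the final attempt
            if (e.getResponse().getStatusCode() != 429 || attempt == maxRetries - 1) {
                throw e;
            }
        }
        try {
            // Exponential backoff: 2s after the first failure, 4s after the second
            Thread.sleep((long) baseDelayMs * (1 << attempt));
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();  // Preserve the interrupt flag
            throw new IllegalStateException("Interrupted while waiting to retry", ie);
        }
    }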
    throw new RuntimeException("Max retries exceeded");
}
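The retry loop above can be generalized so that the backoff schedule and the retry decision are separated from the call itself. The following is a JDK-only sketch under stated assumptions: `RetrySketch` is a hypothetical helper, the delay cap is an addition not mentioned above, and the caller supplies the predicate that decides what counts as retryable (for example, an `HttpResponseException` with status 429).

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical generic retry helper; not part of Spring AI or the Azure SDK. */
final class RetrySketch {

    /** Exponential backoff: base, 2x base, 4x base, ... limited to capMs. */
    static long delayMillis(int attempt, long baseMs, long capMs) {
        int shift = Math.min(attempt, 20);  // bound the shift to avoid overflow
        return Math.min(baseMs << shift, capMs);
    }

    /** Runs the call, retrying retryable failures up to maxAttempts in total. */
    static <T> T withRetry(Supplier<T> call, Predicate<RuntimeException> isRetryable,
                           int maxAttempts, long baseMs, long capMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 0; ; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                // Propagate non-retryable errors and the failure of the last attempt
                if (!isRetryable.test(e) || attempt >= maxAttempts - 1) {
                    throw e;
                }
            }
            try {
                Thread.sleep(delayMillis(attempt, baseMs, capMs));
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();  // preserve the interrupt flag
                throw new IllegalStateException("Interrupted while waiting to retry", ie);
            }
        }
    }
}
```

A call site might look like `RetrySketch.withRetry(() -> imageModel.call(prompt), e -> e instanceof HttpResponseException h && h.getResponse().getStatusCode() == 429, 3, 2000, 30000)`. Keeping the predicate outside the helper lets the same loop serve other transient failures without modification.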

5. Prompt Too Long (400):

try {
    String veryLongPrompt = generateLongPrompt(5000);  // Exceeds 4000 char limit
    response = imageModel.call(new ImagePrompt(veryLongPrompt));
} catch (HttpResponseException e) {
    if (e.getResponse().getStatusCode() == 400 &&
        e.getMessage().contains("prompt")) {
        throw new InvalidParameterException(
            "Prompt too long. DALL-E 3 max: 4000 chars, DALL-E 2 max: 1000 chars", e
        );
    }
    throw e;  // Some other HTTP error: propagate unchanged
}

Validation Rules

Parameter Constraints Summary

Deployment Name:

  • Required: Yes (throws NullPointerException if null)
  • Default: "dall-e-3"
  • Format: Non-empty string

N (Number of Images):

  • DALL-E 3: Must be 1 or null
  • DALL-E 2: 1-10
  • Default: 1
  • Type: Integer (nullable)

Width and Height:

  • DALL-E 3: (1024,1024), (1792,1024), (1024,1792)
  • DALL-E 2: (256,256), (512,512), (1024,1024)
  • Must specify both or neither
  • Type: Integer (nullable)

Size:

  • Format: "WIDTHxHEIGHT"
  • Mutually exclusive with width/height
  • Type: String (nullable)

Quality:

  • Values: "standard", "hd"
  • DALL-E 3 only
  • Default: "standard"
  • Type: String (nullable)

Style:

  • Values: "vivid", "natural"
  • DALL-E 3 only
  • Default: "vivid"
  • Type: String (nullable)

Response Format:

  • Values: "url", "b64_json"
  • Default: "url"
  • Type: String (nullable)

User:

  • Max length: 256 characters
  • Type: String (nullable)

Prompt Text:

  • Required: Yes (throws IllegalArgumentException if null or empty)
  • DALL-E 3 max: 4000 characters
  • DALL-E 2 max: 1000 characters

Model Comparison

DALL-E 3

Strengths:

  • Higher quality images
  • Better prompt understanding
  • Automatic prompt enhancement
  • HD quality option
  • Style control (vivid/natural)
  • Better text rendering in images
  • More coherent compositions

Limitations:

  • Only generates 1 image per request
  • Limited size options (1024×1024, 1792×1024, 1024×1792)
  • Higher cost per image
  • Slower generation (typically 10-30 seconds)

Best For: High-quality, detailed images requiring accurate prompt interpretation

Pricing: ~2-4x more expensive than DALL-E 2

DALL-E 2

Strengths:

  • Can generate multiple images (up to 10)
  • Smaller size options (256×256, 512×512)
  • Faster generation (typically 5-15 seconds)
  • Lower cost per image
  • Good for experimentation

Limitations:

  • Lower quality than DALL-E 3
  • Less sophisticated prompt understanding
  • No style or quality controls
  • Weaker text rendering
  • May miss nuanced details

Best For: Generating multiple variations, smaller images, cost-sensitive applications

Pricing: More affordable, good for high-volume use cases

Size Guidelines

DALL-E 3 Sizes

// Square (social media posts, avatars, general purpose)
AzureOpenAiImageOptions.builder().width(1024).height(1024).build()

// Landscape (banners, headers, presentations, desktop wallpapers)
AzureOpenAiImageOptions.builder().width(1792).height(1024).build()

// Portrait (mobile screens, story formats, posters)
AzureOpenAiImageOptions.builder().width(1024).height(1792).build()

DALL-E 2 Sizes

// Small (thumbnails, icons, previews)
AzureOpenAiImageOptions.builder().width(256).height(256).build()

// Medium (general use, web content)
AzureOpenAiImageOptions.builder().width(512).height(512).build()

// Large (detailed images, prints)
AzureOpenAiImageOptions.builder().width(1024).height(1024).build()

Prompt Tips

Effective Prompts

  • Be descriptive and specific
  • Include style, mood, and atmosphere
  • Specify composition and perspective
  • Mention colors, lighting, and details
  • Reference art styles or artists (for inspiration, not copying)

Good Examples:

"A serene Japanese garden in autumn with red maple leaves, stone lanterns,
 and a wooden bridge over a koi pond, photographed at golden hour"

"A futuristic cityscape at night with neon signs, flying vehicles,
 and rain-slicked streets, cyberpunk style, dramatic lighting"

"A cozy reading nook with a window seat, warm sunlight, bookshelves,
 and a sleeping cat, watercolor painting style"

Poor Examples:

"A garden"  // Too vague
"Red"  // Not descriptive enough
"Something cool"  // Ambiguous

Style Modifiers

  • "photorealistic", "oil painting", "watercolor", "digital art"
  • "3D render", "sketch", "anime style", "vintage photograph"
  • "minimalist", "detailed", "abstract", "surreal"
  • "Renaissance style", "Art Nouveau", "impressionist"

Lighting and Mood

  • "golden hour", "dramatic lighting", "soft diffused light"
  • "moody", "cheerful", "mysterious", "peaceful"
  • "high contrast", "pastel colors", "vibrant"
  • "rim lighting", "backlighting", "studio lighting"

Composition Keywords

  • "close-up", "wide angle", "bird's eye view", "low angle"
  • "centered composition", "rule of thirds", "symmetrical"
  • "shallow depth of field", "bokeh background"

Quality Keywords for DALL-E 3

  • "highly detailed", "8k resolution", "professional photography"
  • "cinematic", "award-winning", "masterpiece"
  • "intricate details", "sharp focus", "high quality"

Performance Considerations

Model Instance Reuse

Recommended:

// Create once at application startup
@Bean
public AzureOpenAiImageModel imageModel() {
    return new AzureOpenAiImageModel(client, defaultOptions);
}

// Inject and reuse
@Autowired
private AzureOpenAiImageModel imageModel;

Avoid:

// Don't create new instance per request
for (String prompt : prompts) {
    AzureOpenAiImageModel model = new AzureOpenAiImageModel(...);
    model.call(new ImagePrompt(prompt));  // Inefficient
}

Generation Time

DALL-E 3:

  • Standard quality: 10-20 seconds
  • HD quality: 15-30 seconds
  • Larger sizes may take slightly longer

DALL-E 2:

  • 256x256: 5-10 seconds
  • 512x512: 8-12 seconds
  • 1024x1024: 10-15 seconds
  • Multiple images: sequential generation (N × single image time)

Parallel Processing

ExecutorService executor = Executors.newFixedThreadPool(5);
List<CompletableFuture<ImageResponse>> futures = new ArrayList<>();

for (String promptText : prompts) {
    CompletableFuture<ImageResponse> future = CompletableFuture.supplyAsync(
        () -> imageModel.call(new ImagePrompt(promptText)),
        executor
    );
    futures.add(future);
}

// Stop accepting new tasks; tasks already submitted still run to completion
executor.shutdown();

// Wait for all images
List<ImageResponse> responses = futures.stream()
    .map(CompletableFuture::join)
    .collect(Collectors.toList());
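The same fan-out pattern can be wrapped in a reusable method that always releases the thread pool, even when one of the calls fails. This is a JDK-only sketch: `ParallelSketch` is a hypothetical helper, and the `generate` function stands in for something like `text -> imageModel.call(new ImagePrompt(text))`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

/** Hypothetical helper for bounded parallel generation; not part of Spring AI. */
final class ParallelSketch {

    /** Applies generate to every prompt on a fixed pool; results keep the input order. */
    static <R> List<R> mapAll(List<String> prompts, Function<String, R> generate, int threads) {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<R>> futures = new ArrayList<>();
            for (String prompt : prompts) {
                futures.add(CompletableFuture.supplyAsync(() -> generate.apply(prompt), executor));
            }
            List<R> results = new ArrayList<>();
            for (CompletableFuture<R> future : futures) {
                results.add(future.join());  // throws CompletionException if the task failed
            }
            return results;
        } finally {
            executor.shutdown();  // release the pool's threads on success and on failure
        }
    }
}
```

The pool size effectively limits how many requests are in flight at once, so it is worth keeping it small enough to stay under your deployment's rate limit.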

URL Expiration Handling

Generated image URLs expire after 1 hour:

public void saveGeneratedImage(String prompt) {
    ImageResponse response = imageModel.call(new ImagePrompt(prompt));
    String imageUrl = response.getResult().getOutput().getUrl();
    
    // Download and save immediately (URL expires in 1 hour)
    byte[] imageBytes = downloadImage(imageUrl);
    saveToStorage(imageBytes);
}

public void saveAsBase64(String prompt) {
    AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
        .responseFormat("b64_json")
        .build();
    
    ImageResponse response = imageModel.call(new ImagePrompt(prompt, options));
    String base64Image = response.getResult().getOutput().getB64Json();
    
    // No expiration concerns with base64
    saveBase64ToStorage(base64Image);
}
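If URLs are stored and used later, a guard against stale links can prevent confusing download failures. The sketch below estimates expiry from the response's `created` timestamp. Two assumptions are made here and should be verified for your resource: that the lifetime is counted from `created`, and that it matches the one-hour figure quoted above (the lifetime is passed as a parameter for that reason). `UrlExpiry` is a hypothetical helper.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical expiry estimate for generated image URLs; not part of Spring AI. */
final class UrlExpiry {

    /** Returns true once `now` has reached created + lifetime. */
    static boolean isLikelyExpired(long createdEpochSeconds, Duration lifetime, Instant now) {
        Instant expiresAt = Instant.ofEpochSecond(createdEpochSeconds).plus(lifetime);
        return !now.isBefore(expiresAt);
    }
}
```

For example, `UrlExpiry.isLikelyExpired(metadata.getCreated(), Duration.ofHours(1), Instant.now())` could gate a download attempt and trigger regeneration instead.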

Troubleshooting

Issue: Content policy violations

Symptoms: 400 error with content_policy_violation

Common Triggers:

  • Violence, weapons, gore
  • Nudity or sexual content
  • Hate symbols or offensive content
  • Illegal activities
  • Public figures or celebrities (sometimes)

Solutions:

  1. Rephrase prompt to avoid sensitive topics
  2. Focus on positive, creative descriptions
  3. Avoid named individuals
  4. Remove potentially offensive keywords

Issue: Poor image quality

Symptoms: Blurry, artifacts, incorrect details

Solutions:

  1. Use DALL-E 3 instead of DALL-E 2
  2. Enable HD quality for DALL-E 3
  3. Be more specific in prompt
  4. Add quality keywords ("highly detailed", "professional")
  5. Use larger image sizes

Issue: Inconsistent results

Symptoms: Images don't match prompt expectations

Solutions:

  1. Be more specific and descriptive
  2. Add style and mood keywords
  3. Check revised prompt (DALL-E 3) to understand interpretation
  4. Use composition keywords
  5. Generate multiple images (DALL-E 2) and select best

Issue: Text in images is garbled

Symptoms: Unreadable text in generated images

Known Limitation: Both DALL-E 2 and DALL-E 3 struggle with text rendering

Workarounds:

  1. Use DALL-E 3 (better but still imperfect)
  2. Keep text simple and short
  3. Specify "clearly readable text"
  4. Add text in post-processing instead
  5. Focus prompt on visual elements, not text

Issue: Slow generation

Symptoms: Images take longer than expected

Causes & Solutions:

  1. HD quality: Use standard quality for faster generation
  2. DALL-E 3: Use DALL-E 2 for faster results
  3. Network latency: Choose closer Azure region
  4. Service load: Retry during off-peak hours

Issue: Rate limiting

Symptoms: 429 errors

Solutions:

  1. Implement exponential backoff retry logic
  2. Reduce request frequency
  3. Request quota increase from Azure
  4. Cache generated images to avoid regeneration
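The caching suggestion can be sketched with a concurrent map keyed by prompt text, so a repeated prompt does not trigger a second generation request. `GeneratedImageCache` is a hypothetical class written for illustration, and it assumes the prompt alone identifies the result. If options vary per request, they would need to be part of the key.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical in-memory cache keyed by prompt text; not part of Spring AI. */
final class GeneratedImageCache<V> {

    private final Map<String, V> cache = new ConcurrentHashMap<>();
    private final Function<String, V> generator;

    GeneratedImageCache(Function<String, V> generator) {
        this.generator = generator;
    }

    /** Returns the cached value, invoking the generator only on the first request for a prompt. */
    V get(String prompt) {
        return cache.computeIfAbsent(prompt, generator);
    }

    /** Number of distinct prompts currently cached. */
    int size() {
        return cache.size();
    }
}
```

Two caveats apply. Cache the downloaded bytes or base64 data rather than the URL, since URLs expire. Also note that `computeIfAbsent` may block other updates to the map while the generator runs, which matters when a generation takes many seconds.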

Best Practices

Prompt Engineering

DO:

  • Be specific and descriptive
  • Include style, lighting, mood
  • Use composition keywords
  • Specify desired quality
  • Reference art styles (for inspiration)

DON'T:

  • Use vague or ambiguous terms
  • Include copyrighted characters or brands
  • Request violent or offensive content
  • Expect perfect text rendering
  • Assume first result is best (try variations)

Cost Optimization

Strategies:

  1. Use DALL-E 2 for experimentation, DALL-E 3 for final images
  2. Use standard quality unless HD is essential
  3. Generate smaller sizes when acceptable
  4. Cache and reuse generated images
  5. Implement prompt validation before generation

Quality Optimization

For Best Results:

  1. Use DALL-E 3 with HD quality
  2. Use largest size needed (1792x1024 or 1024x1792)
  3. Include quality keywords in prompt
  4. Be very specific about desired details
  5. Use style="natural" for realistic images, "vivid" for artistic

Storage and Handling

URL-based (default):

  • Download and save within 1 hour
  • Implement error handling for expired URLs
  • Consider using CDN for distribution

Base64-based:

  • Immediate storage without expiration concerns
  • Larger response payload
  • Good for immediate processing

// Download URL-based image (downloadImage is a placeholder for your HTTP client code)
public byte[] downloadAndSave(String imageUrl) throws IOException {
    byte[] imageBytes = downloadImage(imageUrl);
    String filename = "generated_" + System.currentTimeMillis() + ".png";
    Files.write(Path.of("images/" + filename), imageBytes);
    return imageBytes;
}

// Handle base64 image
public void saveBase64Image(String base64Image) throws IOException {
    byte[] imageBytes = Base64.getDecoder().decode(base64Image);
    String filename = "generated_" + System.currentTimeMillis() + ".png";
    Files.write(Path.of("images/" + filename), imageBytes);
}

Common Use Cases

Product Mockups

AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
    .style("natural")
    .width(1024)
    .height(1024)
    .build();
options.setQuality("hd");

ImagePrompt prompt = new ImagePrompt(
    "Professional product photography of a sleek smartphone on a minimalist desk, " +
    "soft studio lighting, white background, high-end commercial photography style",
    options
);

Marketing Materials

AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
    .width(1792)
    .height(1024)
    .style("vivid")
    .build();
options.setQuality("hd");

ImagePrompt prompt = new ImagePrompt(
    "Eye-catching banner for summer sale with vibrant tropical colors, " +
    "beach theme, modern graphic design, energetic and appealing",
    options
);

Concept Art

AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
    .deploymentName("dall-e-3")
    .style("vivid")
    .build();
options.setQuality("hd");

ImagePrompt prompt = new ImagePrompt(
    "Concept art of a futuristic underwater city with bioluminescent architecture, " +
    "glass domes, submarines, dramatic lighting, detailed digital art",
    options
);

Social Media Content

// Square format for Instagram
AzureOpenAiImageOptions socialOptions = AzureOpenAiImageOptions.builder()
    .width(1024)
    .height(1024)
    .style("vivid")
    .build();

ImagePrompt prompt = new ImagePrompt(
    "Inspiring motivational quote background with abstract geometric patterns, " +
    "warm gradient colors, modern minimalist design",
    socialOptions
);

Variations with DALL-E 2

// Generate multiple variations quickly
AzureOpenAiImageOptions variantOptions = AzureOpenAiImageOptions.builder()
    .model("dall-e-2")
    .N(4)
    .width(512)
    .height(512)
    .build();

ImagePrompt prompt = new ImagePrompt(
    "Abstract logo design with geometric shapes and blue colors",
    variantOptions
);

ImageResponse response = imageModel.call(prompt);
// Select best variant from 4 options

Default Values

  • Model: "dall-e-3"
  • Size: 1024 × 1024
  • Quality: "standard"
  • Style: "vivid"
  • Response Format: "url"
  • N: 1