
tessl/maven-org-springframework-ai--spring-ai-openai

OpenAI models support for Spring AI, providing comprehensive integration for chat completion, embeddings, image generation, audio transcription, text-to-speech, and content moderation capabilities within Spring Boot applications.


Image Models

Generate images from text descriptions using OpenAI's DALL-E 2 and DALL-E 3 models. DALL-E 3 adds quality and style options; DALL-E 2 can return several images per request.

Capabilities

OpenAiImageModel

The ImageModel implementation for OpenAI image generation. Its call method takes an ImagePrompt (text plus options) and returns an ImageResponse.

/**
 * OpenAI image model implementation for DALL-E
 */
public class OpenAiImageModel implements ImageModel {
    /**
     * Generate image(s) from a text prompt
     * @param imagePrompt The prompt containing instructions and options
     * @return ImageResponse containing generated images and metadata
     */
    public ImageResponse call(ImagePrompt imagePrompt);

    /**
     * Set custom observation convention for observability
     * @param observationConvention Custom observation convention
     */
    public void setObservationConvention(ImageModelObservationConvention observationConvention);
}

Constructors:

// Basic constructor
public OpenAiImageModel(OpenAiImageApi openAiImageApi);

// With options and retry support
public OpenAiImageModel(
    OpenAiImageApi openAiImageApi,
    OpenAiImageOptions options,
    RetryTemplate retryTemplate
);

// Full constructor with observability
public OpenAiImageModel(
    OpenAiImageApi openAiImageApi,
    OpenAiImageOptions options,
    RetryTemplate retryTemplate,
    ObservationRegistry observationRegistry
);

Note: OpenAiImageModel uses constructor-based initialization only and does not provide a builder pattern. Use the appropriate constructor based on your configuration needs.

Usage Example:

import org.springframework.ai.openai.OpenAiImageModel;
import org.springframework.ai.openai.OpenAiImageOptions;
import org.springframework.ai.openai.api.OpenAiImageApi;
import org.springframework.ai.image.ImagePrompt;
import org.springframework.ai.image.ImageResponse;
import org.springframework.retry.support.RetryTemplate;

// Create API client
var imageApi = OpenAiImageApi.builder()
    .apiKey(System.getenv("OPENAI_API_KEY"))
    .build();

// Configure retry template
var retryTemplate = RetryTemplate.builder()
    .maxAttempts(3)
    .exponentialBackoff(1000, 2.0, 10000)
    .build();

// Create image model with default options
var imageModel = new OpenAiImageModel(
    imageApi,
    OpenAiImageOptions.builder()
        .model(OpenAiImageApi.ImageModel.DALL_E_3.getValue())
        .quality("hd")
        .style("vivid")
        .build(),
    retryTemplate
);

// Generate single image
var response = imageModel.call(
    new ImagePrompt("A serene mountain landscape at sunset with a lake reflection")
);

// Access generated images
response.getResults().forEach(result -> {
    System.out.println("Image URL: " + result.getOutput().getUrl());
});
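
The retry template in the example above allows three attempts with exponential backoff (1000 ms initial interval, multiplier 2.0, 10000 ms cap). As a rough illustration of the waits this implies, here is a stdlib-only sketch. It assumes the usual exponential rule (wait the initial interval after the first failure, multiply after each further failure, cap at the maximum). `BackoffSchedule` and `delaysMillis` are hypothetical names and are not part of Spring Retry.

```java
import java.util.ArrayList;
import java.util.List;

final class BackoffSchedule {

    /** Waits in milliseconds between attempts; there is one wait fewer than there are attempts. */
    static List<Long> delaysMillis(int maxAttempts, long initialMs, double multiplier, long maxMs) {
        List<Long> delays = new ArrayList<>();
        double next = initialMs;
        for (int failed = 1; failed < maxAttempts; failed++) {
            delays.add(Math.min((long) next, maxMs)); // cap each wait at maxMs
            next *= multiplier;
        }
        return delays;
    }

    public static void main(String[] args) {
        // Same numbers as the RetryTemplate configuration above
        System.out.println(delaysMillis(3, 1000, 2.0, 10000)); // prints [1000, 2000]
    }
}
```

Under these assumptions a request that fails all three attempts spends about three seconds waiting, on top of the time taken by the calls themselves.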

Multiple Images with DALL-E 2:

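// DALL-E 2 supports generating multiple images in one request.
// Assumption: per-call options are merged with the model's default options, so the
// model used here should not carry DALL-E 3-only defaults such as quality or style.
var options = OpenAiImageOptions.builder()
    .model(OpenAiImageApi.ImageModel.DALL_E_2.getValue())
    .n(4)  // Generate 4 images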
    .size("512x512")
    .build();

var response = imageModel.call(
    new ImagePrompt("A cute robot playing with a ball", options)
);

System.out.println("Generated " + response.getResults().size() + " images");

Custom Size and Quality:

// DALL-E 3 with custom size and quality
var options = OpenAiImageOptions.builder()
    .model(OpenAiImageApi.ImageModel.DALL_E_3.getValue())
    .size("1792x1024")  // Wide format
    .quality("hd")      // High definition
    .style("natural")   // More natural, less artistic
    .build();

var response = imageModel.call(
    new ImagePrompt("A professional product photo of a smartwatch", options)
);

Base64 Response Format:

// Get images as base64-encoded data instead of URLs (requires: import java.util.Base64;)
var options = OpenAiImageOptions.builder()
    .model(OpenAiImageApi.ImageModel.DALL_E_3.getValue())
    .responseFormat("b64_json")
    .build();

var response = imageModel.call(
    new ImagePrompt("An abstract digital art piece", options)
);

response.getResults().forEach(result -> {
    String base64Data = result.getOutput().getB64Json();
    // Decode and save image
    byte[] imageBytes = Base64.getDecoder().decode(base64Data);
});
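
The decode step above stops at a byte array. A minimal stdlib-only sketch of the remaining step, writing the bytes to disk and sanity-checking them, might look like the following. It assumes the payload is a PNG, which the document does not state. `Base64ImageWriter`, `write`, and `looksLikePng` are hypothetical helper names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

final class Base64ImageWriter {

    /** Decodes a base64 payload and writes the bytes to target, replacing any existing content. */
    static Path write(String base64Data, Path target) throws IOException {
        byte[] imageBytes = Base64.getDecoder().decode(base64Data);
        return Files.write(target, imageBytes);
    }

    /** True if the bytes start with the 8-byte PNG signature. */
    static boolean looksLikePng(byte[] bytes) {
        int[] signature = {0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A};
        if (bytes.length < signature.length) {
            return false;
        }
        for (int i = 0; i < signature.length; i++) {
            if ((bytes[i] & 0xFF) != signature[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in payload: only the PNG signature, base64-encoded. A real response carries a full image.
        String payload = "iVBORw0KGgo=";
        Path out = write(payload, Files.createTempFile("generated_image", ".png"));
        System.out.println(out + " looksLikePng=" + looksLikePng(Files.readAllBytes(out)));
    }
}
```

In application code the string passed to `write` would be the value returned by `getB64Json()`.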

OpenAiImageOptions

Configuration options for image generation requests.

/**
 * Configuration options for OpenAI image generation
 */
public class OpenAiImageOptions implements ImageOptions {
    /**
     * Create a new builder for image options
     * @return Builder instance
     */
    public static Builder builder();

    /**
     * Create options from existing options instance
     * @param fromOptions Source options to copy from
     * @return New OpenAiImageOptions instance
     */
    public static OpenAiImageOptions fromOptions(OpenAiImageOptions fromOptions);

    /**
     * Create a copy of these options
     * @return New OpenAiImageOptions with same values
     */
    public OpenAiImageOptions copy();

    /**
     * Get the number of images to generate
     * @return Number of images (1-10 for DALL-E 2, only 1 for DALL-E 3)
     */
    public Integer getN();
    public void setN(Integer n);

    /**
     * Get the image model identifier
     * @return Model name
     */
    public String getModel();
    public void setModel(String model);

    /**
     * Get the image width in pixels
     * @return Width in pixels
     */
    public Integer getWidth();
    public void setWidth(Integer width);

    /**
     * Get the image height in pixels
     * @return Height in pixels
     */
    public Integer getHeight();
    public void setHeight(Integer height);

    /**
     * Get the image quality
     * @return "standard" or "hd" (DALL-E 3 only)
     */
    public String getQuality();
    public void setQuality(String quality);

    /**
     * Get the response format
     * @return "url" or "b64_json"
     */
    public String getResponseFormat();
    public void setResponseFormat(String responseFormat);

    /**
     * Get the image size as a string
     * @return Size string (e.g., "1024x1024")
     */
    public String getSize();
    public void setSize(String size);

    /**
     * Get the image style
     * @return "vivid" or "natural" (DALL-E 3 only)
     */
    public String getStyle();
    public void setStyle(String style);

    /**
     * Get the user identifier for tracking
     * @return User ID
     */
    public String getUser();
    public void setUser(String user);
}

Builder Pattern:

public static class Builder {
    public Builder n(Integer n);
    public Builder model(String model);
    public Builder width(Integer width);
    public Builder height(Integer height);
    public Builder quality(String quality);
    public Builder responseFormat(String responseFormat);
    public Builder size(String size);
    public Builder style(String style);
    public Builder user(String user);
    public OpenAiImageOptions build();
}

Usage Example:

// DALL-E 3 with high quality and vivid style
var dalle3Options = OpenAiImageOptions.builder()
    .model(OpenAiImageApi.ImageModel.DALL_E_3.getValue())
    .size("1024x1024")
    .quality("hd")
    .style("vivid")
    .responseFormat("url")
    .build();

// DALL-E 2 with multiple variations
var dalle2Options = OpenAiImageOptions.builder()
    .model(OpenAiImageApi.ImageModel.DALL_E_2.getValue())
    .n(3)
    .size("512x512")
    .responseFormat("url")
    .user("user-123")
    .build();

Configuration Parameters Reference

Model (String)

Image generation model identifier:

  • dall-e-2: DALL-E 2 model, supports multiple images per request, lower cost
  • dall-e-3: DALL-E 3 model, higher quality, more accurate, only single image per request

N (Integer)

Number of images to generate per request:

  • DALL-E 2: Range 1-10, default 1
  • DALL-E 3: Only 1 image supported
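
Because of these limits, asking for more images than a single request allows means issuing several requests. The sketch below shows one way to split a desired total into per-request `n` values, using the limits listed above. `ImageBatchPlanner` is a hypothetical helper, not part of Spring AI, and it treats any model other than dall-e-2 as limited to one image.

```java
import java.util.ArrayList;
import java.util.List;

final class ImageBatchPlanner {

    /** Per-request image limit from the list above; unknown models are treated as 1. */
    static int maxPerRequest(String model) {
        return "dall-e-2".equals(model) ? 10 : 1;
    }

    /** Splits a desired image count into the n value to send with each request. */
    static List<Integer> plan(String model, int totalImages) {
        if (totalImages < 1) {
            throw new IllegalArgumentException("totalImages must be >= 1");
        }
        int limit = maxPerRequest(model);
        List<Integer> batches = new ArrayList<>();
        for (int remaining = totalImages; remaining > 0; remaining -= limit) {
            batches.add(Math.min(limit, remaining));
        }
        return batches;
    }

    public static void main(String[] args) {
        System.out.println(plan("dall-e-2", 12)); // prints [10, 2]
        System.out.println(plan("dall-e-3", 3));  // prints [1, 1, 1]
    }
}
```

Each entry in the returned list would become the `n` option of one `ImagePrompt`.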

Size (String)

Dimensions of generated images:

  • DALL-E 2: "256x256", "512x512", "1024x1024"
  • DALL-E 3: "1024x1024" (default), "1792x1024" (wide), "1024x1792" (tall)

Width / Height (Integer)

An alternative to the size string that specifies the two dimensions separately:

  • The resulting width x height must match a size available for the model
  • Use either the size string or width/height, not both
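
To make the relationship between width/height and the size string concrete, the following sketch builds the string from two integers and checks it against the per-model sizes listed above. `ImageSizes` and its methods are hypothetical names for illustration. Spring AI may perform its own conversion internally, which this sketch does not attempt to reproduce.

```java
import java.util.Map;
import java.util.Set;

final class ImageSizes {

    /** Sizes per model, copied from the Size section above. */
    static final Map<String, Set<String>> SUPPORTED = Map.of(
        "dall-e-2", Set.of("256x256", "512x512", "1024x1024"),
        "dall-e-3", Set.of("1024x1024", "1792x1024", "1024x1792"));

    /** Formats separate dimensions as a WIDTHxHEIGHT size string. */
    static String toSize(int width, int height) {
        return width + "x" + height;
    }

    /** True if the size string is listed for the model; unknown models have no listed sizes. */
    static boolean isSupported(String model, String size) {
        return SUPPORTED.getOrDefault(model, Set.of()).contains(size);
    }

    public static void main(String[] args) {
        String wide = toSize(1792, 1024);
        System.out.println(wide + " on dall-e-3: " + isSupported("dall-e-3", wide)); // true
        System.out.println(wide + " on dall-e-2: " + isSupported("dall-e-2", wide)); // false
    }
}
```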

Quality (String)

Image quality level (DALL-E 3 only):

  • "standard": Standard quality, faster generation, lower cost (default)
  • "hd": High definition, more detailed, higher cost

Style (String)

Visual style of generated images (DALL-E 3 only):

  • "vivid": Hyper-real and dramatic images (default)
  • "natural": More natural, less stylized images
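
Since quality and style are described above as DALL-E 3 only, it can help to check an option set before sending it. The sketch below is one possible pre-flight check built purely from the values listed in this section. `Dalle3OnlyOptions` is a hypothetical helper, and the actual API may be more lenient or stricter than this.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class Dalle3OnlyOptions {

    static final Set<String> QUALITIES = Set.of("standard", "hd");
    static final Set<String> STYLES = Set.of("vivid", "natural");

    /** Returns a list of human-readable problems; an empty list means nothing was flagged. */
    static List<String> problems(String model, String quality, String style) {
        List<String> problems = new ArrayList<>();
        boolean dalle3 = "dall-e-3".equals(model);
        check(problems, "quality", quality, QUALITIES, dalle3);
        check(problems, "style", style, STYLES, dalle3);
        return problems;
    }

    private static void check(List<String> problems, String name, String value,
                              Set<String> allowed, boolean dalle3) {
        if (value == null) {
            return; // unset options are always fine
        }
        if (!dalle3) {
            problems.add(name + " is documented as DALL-E 3 only");
        } else if (!allowed.contains(value)) {
            problems.add(name + " must be one of " + allowed);
        }
    }

    public static void main(String[] args) {
        System.out.println(problems("dall-e-3", "hd", "vivid")); // prints []
        System.out.println(problems("dall-e-2", "hd", null));
    }
}
```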

Response Format (String)

Format for receiving generated images:

  • "url": Returns a URL for each generated image (default); the URLs expire after 1 hour
  • "b64_json": Returns base64-encoded image data
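
Because URLs expire after an hour, code that stores them for later may want to know whether a link is still usable. The sketch below estimates this from a creation timestamp. It assumes the timestamp is in Unix epoch seconds and that the one-hour lifetime is counted from it, neither of which this document confirms. `ImageUrlExpiry` is a hypothetical helper.

```java
import java.time.Duration;
import java.time.Instant;

final class ImageUrlExpiry {

    /** Lifetime taken from the "url" description above. */
    static final Duration URL_LIFETIME = Duration.ofHours(1);

    /** Estimated expiry instant, assuming created is in epoch seconds. */
    static Instant expiresAt(long createdEpochSeconds) {
        return Instant.ofEpochSecond(createdEpochSeconds).plus(URL_LIFETIME);
    }

    /** True once now has reached the estimated expiry instant. */
    static boolean isExpired(long createdEpochSeconds, Instant now) {
        return !now.isBefore(expiresAt(createdEpochSeconds));
    }

    public static void main(String[] args) {
        long created = Instant.now().getEpochSecond();
        System.out.println("Expired now? " + isExpired(created, Instant.now())); // false
    }
}
```

If images need to outlive the link, downloading them promptly or requesting "b64_json" avoids the question entirely.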

User (String)

A unique identifier for the end user, used for tracking and abuse monitoring.
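
One common practice, assumed here rather than required by Spring AI, is to pass a stable hash of an internal account ID instead of a raw username or email address. A minimal stdlib-only sketch follows. `UserIdHasher` is a hypothetical helper name.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class UserIdHasher {

    /** Returns the lowercase hex SHA-256 digest of the given identifier. */
    static String hash(String internalUserId) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(internalUserId.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xFF));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every Java platform implementation
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    public static void main(String[] args) {
        // The result could be passed to the user(...) builder option
        System.out.println(hash("user-123"));
    }
}
```

The same input always yields the same 64-character value, so requests from one end user can still be correlated without exposing the original identifier.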


Types

Request Types

// High-level image prompt (from spring-ai-core)
public class ImagePrompt {
    public ImagePrompt(String instructions);
    public ImagePrompt(String instructions, ImageOptions options);
    public ImagePrompt(List<ImageMessage> messages);
    public ImagePrompt(List<ImageMessage> messages, ImageOptions options);

    public List<ImageMessage> getInstructions();
    public ImageOptions getOptions();
}

public class ImageMessage {
    public ImageMessage(String text);
    public String getText();
}

// Low-level image request (nested record in OpenAiImageApi)
public record OpenAiImageRequest(
    String prompt,                       // Text description of desired image(s)
    String model,                        // Model identifier
    Integer n,                           // Number of images
    String quality,                      // Image quality
    String responseFormat,               // Response format
    String size,                         // Image size
    String style,                        // Image style
    String user                          // User identifier
) {}

Response Types

// High-level image response (from spring-ai-core)
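public class ImageResponse implements ModelResponse<ImageGeneration> {
    public ImageGeneration getResult();          // First generated image
    public List<ImageGeneration> getResults();   // All generated images
    public ImageResponseMetadata getMetadata();
}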

public class ImageGeneration {
    public Image getOutput();
    public ImageGenerationMetadata getMetadata();
}

public class Image {
    public String getUrl();
    public String getB64Json();
}

// Low-level image response (nested record in OpenAiImageApi)
public record OpenAiImageResponse(
    Long created,                        // Creation timestamp
    List<Data> data                      // Generated images
) {
    public record Data(
        String url,                      // Image URL (if responseFormat=url)
        String b64Json,                  // Base64 image data (if responseFormat=b64_json)
        String revisedPrompt             // DALL-E 3 may revise prompts for safety/quality
    ) {}
}

Metadata Types

public class OpenAiImageGenerationMetadata implements ImageGenerationMetadata {
    public OpenAiImageGenerationMetadata(String revisedPrompt);

    /**
     * Get the revised prompt used for generation
     * DALL-E 3 may modify prompts for safety and quality
     * @return Revised prompt or null if not revised
     */
    public String getRevisedPrompt();
}

public interface ImageResponseMetadata {
    Long getCreated();
}

Model Enums

// Nested in OpenAiImageApi; referenced as OpenAiImageApi.ImageModel
public enum ImageModel {
    DALL_E_2("dall-e-2"),
    DALL_E_3("dall-e-3");

    public String getValue();
}

Common Use Cases

Product Visualization

// Generate product images for e-commerce
var options = OpenAiImageOptions.builder()
    .model(OpenAiImageApi.ImageModel.DALL_E_3.getValue())
    .size("1024x1024")
    .quality("hd")
    .style("natural")
    .build();

var response = imageModel.call(new ImagePrompt(
    "Professional product photo of a minimalist desk lamp with warm lighting, " +
    "white background, studio photography",
    options
));

Concept Art Generation

// Create multiple concept variations with DALL-E 2
var options = OpenAiImageOptions.builder()
    .model(OpenAiImageApi.ImageModel.DALL_E_2.getValue())
    .n(4)
    .size("1024x1024")
    .build();

var response = imageModel.call(new ImagePrompt(
    "Futuristic cyberpunk cityscape with neon lights and flying vehicles",
    options
));

// Compare variations and select best

Social Media Content

// Generate wide format image for social media banner
var options = OpenAiImageOptions.builder()
    .model(OpenAiImageApi.ImageModel.DALL_E_3.getValue())
    .size("1792x1024")  // Wide format
    .quality("hd")
    .style("vivid")
    .build();

var response = imageModel.call(new ImagePrompt(
    "Vibrant abstract background for tech company banner, " +
    "blue and purple gradient with geometric patterns",
    options
));

Educational Illustrations

// Create educational diagrams
var options = OpenAiImageOptions.builder()
    .model(OpenAiImageApi.ImageModel.DALL_E_3.getValue())
    .size("1024x1024")
    .style("natural")
    .build();

var response = imageModel.call(new ImagePrompt(
    "Simple labeled diagram showing the water cycle with clouds, rain, " +
    "rivers, and ocean, educational style",
    options
));

Saving Generated Images

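import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Base64;

// Save from URL (URLs are short-lived, so download promptly)
var response = imageModel.call(new ImagePrompt("A beautiful sunset"));
var imageUrl = response.getResult().getOutput().getUrl();

// URI.create(...).toURL() avoids the URL constructor deprecated in recent JDKs
try (InputStream in = URI.create(imageUrl).toURL().openStream()) {
    // REPLACE_EXISTING lets the example be re-run without failing on an existing file
    Files.copy(in, Paths.get("generated_image.png"), StandardCopyOption.REPLACE_EXISTING);
}

// Save from base64
var options = OpenAiImageOptions.builder()
    .model(OpenAiImageApi.ImageModel.DALL_E_3.getValue())
    .responseFormat("b64_json")
    .build();

var b64Response = imageModel.call(new ImagePrompt("A starry night sky", options));
var base64Data = b64Response.getResult().getOutput().getB64Json();
byte[] imageBytes = Base64.getDecoder().decode(base64Data);
// Separate file name so the URL download above is not overwritten
Files.write(Paths.get("generated_image_b64.png"), imageBytes);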

Checking Revised Prompts

// DALL-E 3 may revise prompts for safety/quality
var response = imageModel.call(new ImagePrompt("A robot"));

response.getResults().forEach(result -> {
    // Guard the cast: getMetadata() is declared as ImageGenerationMetadata
    if (result.getMetadata() instanceof OpenAiImageGenerationMetadata metadata
            && metadata.getRevisedPrompt() != null) {
        System.out.println("Original prompt was revised to: " +
            metadata.getRevisedPrompt());
    }
});

Model Comparison

DALL-E 2

  • Supports multiple images per request (up to 10)
  • Lower cost per image
  • Sizes: 256x256, 512x512, 1024x1024
  • Good for concept exploration and variations
  • Faster generation

DALL-E 3

  • Single image per request only
  • Higher quality and accuracy
  • Better prompt understanding
  • Sizes: 1024x1024, 1792x1024, 1024x1792
  • Quality options: standard or hd
  • Style options: vivid or natural
  • May revise prompts for safety/quality
  • Better for final production images

Install with Tessl CLI

npx tessl i tessl/maven-org-springframework-ai--spring-ai-openai@1.1.0
