Spring AI integration for Azure OpenAI services providing chat completion, text embeddings, image generation, and audio transcription with GPT, DALL-E, and Whisper models
The image generation API creates images from text descriptions using Azure OpenAI's DALL-E models.
import org.springframework.ai.azure.openai.AzureOpenAiImageModel;
import org.springframework.ai.azure.openai.AzureOpenAiImageOptions;
import org.springframework.ai.azure.openai.metadata.AzureOpenAiImageGenerationMetadata;
import org.springframework.ai.azure.openai.metadata.AzureOpenAiImageResponseMetadata;
import org.springframework.ai.image.ImagePrompt;
import org.springframework.ai.image.ImageResponse;
import org.springframework.ai.image.ImageMessage;
import org.springframework.ai.image.Image;
import com.azure.ai.openai.OpenAIClient;
import com.azure.ai.openai.OpenAIClientBuilder;
import com.azure.core.credential.AzureKeyCredential;

The main class for generating images.
Thread-Safe: AzureOpenAiImageModel is fully thread-safe and can be safely used across multiple threads concurrently. A single instance can handle multiple concurrent image generation requests.
Recommendation: Create one instance and reuse it across your application rather than creating new instances for each request.
class AzureOpenAiImageModel implements ImageModel {
AzureOpenAiImageModel(OpenAIClient openAIClient);
AzureOpenAiImageModel(
OpenAIClient microsoftOpenAiClient,
AzureOpenAiImageOptions options
);
}

Parameters:
openAIClient: Azure OpenAI client instance (required, non-null, throws NullPointerException if null)
options: Default image generation options (optional, uses model defaults if null)

Example:
OpenAIClient openAIClient = new OpenAIClientBuilder()
.credential(new AzureKeyCredential(apiKey))
.endpoint(endpoint)
.buildClient();
AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
.deploymentName("dall-e-3")
.width(1024)
.height(1024)
.build();
AzureOpenAiImageModel imageModel = new AzureOpenAiImageModel(openAIClient, options);

ImageResponse call(ImagePrompt imagePrompt);

Generate one or more images from a text prompt.
Parameters:
imagePrompt: The prompt containing the description and optional options (non-null, throws NullPointerException if null)

Returns: ImageResponse containing generated images and metadata (never null)
Throws:
HttpResponseException: HTTP errors from Azure API (400, 401, 403, 429, 500)
ResourceNotFoundException: Deployment not found (404)
NonTransientAiException: Permanent failures (invalid parameters, content filter)
TransientAiException: Temporary failures (rate limits, timeouts)
NullPointerException: If imagePrompt is null
IllegalArgumentException: If prompt text is null or empty, or invalid dimension combinations

Constraints:
Example - Basic Usage:
ImagePrompt prompt = new ImagePrompt("A futuristic city at sunset");
ImageResponse response = imageModel.call(prompt);
Image image = response.getResult().getOutput();
String imageUrl = image.getUrl();
System.out.println("Generated image URL: " + imageUrl);

Example - With Options:
AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
.width(1792)
.height(1024)
.style("vivid")
.build();
options.setQuality("hd");
ImagePrompt prompt = new ImagePrompt("A serene mountain landscape", options);
ImageResponse response = imageModel.call(prompt);

Example - Multiple Images (DALL-E 2 only):
AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
.model("dall-e-2")
.N(4) // Generate 4 images
.build();
ImagePrompt prompt = new ImagePrompt("Abstract art with vibrant colors", options);
ImageResponse response = imageModel.call(prompt);
for (Image image : response.getResults().stream()
.map(generation -> generation.getOutput())
.toList()) {
System.out.println("Image URL: " + image.getUrl());
}

Error Handling:
// ContentFilterException, InvalidParameterException, and RateLimitException are
// assumed to be application-defined exception types; this document does not define them.
try {
    ImageResponse response = imageModel.call(prompt);
} catch (HttpResponseException e) {
    int status = e.getResponse().getStatusCode();
    if (status == 400 && e.getMessage().contains("content_policy_violation")) {
        throw new ContentFilterException("Prompt blocked by content filter", e);
    } else if (status == 400 && e.getMessage().contains("invalid_image_size")) {
        throw new InvalidParameterException("Invalid image dimensions", e);
    } else if (status == 429) {
        throw new RateLimitException("Rate limit exceeded", e);
    }
    // Rethrow anything not handled above so the failure is not silently swallowed
    throw e;
} catch (IllegalArgumentException e) {
    throw new InvalidParameterException("Invalid prompt or options: " + e.getMessage(), e);
}

AzureOpenAiImageOptions getDefaultOptions();

Retrieve the default options configured for the model.
Returns: AzureOpenAiImageOptions or null if no defaults configured
Configuration class for image generation requests.
class AzureOpenAiImageOptions implements ImageOptions {
static Builder builder();
}

class Builder {
Builder N(Integer n);
Builder model(String model);
Builder deploymentName(String deploymentName);
Builder responseFormat(String responseFormat);
Builder width(Integer width);
Builder height(Integer height);
Builder user(String user);
Builder style(String style);
AzureOpenAiImageOptions build();
}

Builder Methods:
Each setter method returns this for fluent chaining (never null)
build(): Returns non-null AzureOpenAiImageOptions instance

Integer getN();
void setN(Integer n);

Number of images to generate. Only DALL-E 2 supports multiple images (up to 10 per request); DALL-E 3 supports exactly 1.
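A request can be checked against these limits before it is sent. The sketch below is a small, library-independent helper; the class name `ImageCountCheck`, its method, and the choice to treat a null count as "use the default" are illustrative assumptions and not part of Spring AI.

```java
import java.util.Map;

/** Hypothetical pre-flight check for the N option; not part of Spring AI. */
final class ImageCountCheck {

    // Per-request maximums as stated in this document.
    private static final Map<String, Integer> MAX_IMAGES = Map.of(
            "dall-e-2", 10,
            "dall-e-3", 1);

    /** Returns true when n is within the documented limit for the given model. */
    static boolean isAllowed(String model, Integer n) {
        Integer max = MAX_IMAGES.get(model);
        if (max == null) {
            return false; // unknown model name: reject rather than guess
        }
        if (n == null) {
            return true; // assumption: a null count means "use the default"
        }
        return n >= 1 && n <= max;
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("dall-e-2", 4));
        System.out.println(isAllowed("dall-e-3", 4));
    }
}
```

Running a check like this before `imageModel.call(...)` turns a remote 400 response into a local failure that is cheaper to diagnose.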
Constraints:
Example:
AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
.model("dall-e-2")
.N(3)
.build();

String getModel();
void setModel(String model);
String getDeploymentName();
void setDeploymentName(String deploymentName);

Specifies which DALL-E model to use.
Constraints:
Available Models:
"dall-e-3": Latest model, best quality, more expensive
"dall-e-2": Previous generation, good quality, more affordable

Example:
AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
.deploymentName("dall-e-3")
.build();

Integer getWidth();
void setWidth(Integer width);
Integer getHeight();
void setHeight(Integer height);

Dimensions of the generated image in pixels.
DALL-E 3 Supported Sizes: 1024x1024, 1792x1024, 1024x1792
DALL-E 2 Supported Sizes: 256x256, 512x512, 1024x1024
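The per-model sizes used throughout this document (1024x1024, 1792x1024, and 1024x1792 for DALL-E 3; 256x256, 512x512, and 1024x1024 for DALL-E 2) can be encoded as a lookup table. The following is a minimal sketch of such a check; `ImageSizeCheck` and `isSupported` are hypothetical names, not library API.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical lookup of supported image sizes per model; not part of Spring AI. */
final class ImageSizeCheck {

    // Width x height combinations taken from the examples in this document.
    private static final Map<String, Set<String>> SUPPORTED = Map.of(
            "dall-e-3", Set.of("1024x1024", "1792x1024", "1024x1792"),
            "dall-e-2", Set.of("256x256", "512x512", "1024x1024"));

    /** Returns true when the model is known and the dimensions are in its list. */
    static boolean isSupported(String model, int width, int height) {
        Set<String> sizes = SUPPORTED.get(model);
        return sizes != null && sizes.contains(width + "x" + height);
    }

    public static void main(String[] args) {
        System.out.println(isSupported("dall-e-3", 1792, 1024));
        System.out.println(isSupported("dall-e-3", 512, 512));
    }
}
```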
Constraints:
Example:
AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
.width(1792)
.height(1024)
.build();

String getSize();
void setSize(String size);

Alternative way to specify image dimensions as a string.
Valid Values:
Constraints:
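Application code that stores dimensions as numbers may need to convert to and from the string form. The sketch below assumes the `WIDTHxHEIGHT` format that appears in this document (for example `"1792x1024"`); the `ImageSize` record is a hypothetical helper, and how the library itself reconciles `size` with `width` and `height` is not covered here.

```java
/** Hypothetical value type for a "WIDTHxHEIGHT" size string; not part of Spring AI. */
record ImageSize(int width, int height) {

    /** Parses strings such as "1792x1024"; rejects anything else. */
    static ImageSize parse(String size) {
        if (size == null) {
            throw new IllegalArgumentException("size must not be null");
        }
        String[] parts = size.split("x");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected WIDTHxHEIGHT but got: " + size);
        }
        try {
            return new ImageSize(Integer.parseInt(parts[0]), Integer.parseInt(parts[1]));
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Non-numeric dimension in: " + size, e);
        }
    }

    /** Formats the dimensions back into the string form. */
    String format() {
        return width + "x" + height;
    }
}
```

For instance, `ImageSize.parse("1792x1024").width()` yields the width as an `int`, and `new ImageSize(1024, 1792).format()` produces a value suitable for `setSize`.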
Example:
AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
.deploymentName("dall-e-3")
.build();
options.setSize("1792x1024");

String getResponseFormat();
void setResponseFormat(String responseFormat);

Format of the generated image data.
Values:
"url": Returns a URL to the generated image (default)
"b64_json": Returns the image as base64-encoded data in the JSON response

Constraints:
Example:
AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
.responseFormat("b64_json")
.build();
ImageResponse response = imageModel.call(new ImagePrompt("A sunset", options));
String base64Data = response.getResult().getOutput().getB64Json();

String getQuality();
void setQuality(String quality);

Image quality level. Only supported by DALL-E 3.
Values:
"standard": Standard quality (default)
"hd": High definition, more detailed images

Constraints:
Example:
AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
.build();
options.setQuality("hd");

String getStyle();
void setStyle(String style);

Visual style of the generated image. Only supported by DALL-E 3.
Values:
"vivid": Hyper-real and dramatic images (default)
"natural": More natural, less hyper-real images

Constraints:
Style Differences:
Example:
AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
.style("natural")
.build();

String getUser();
void setUser(String user);

Optional identifier for the end-user, used for abuse monitoring.
Constraints:
Example:
AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
.user("user-123")
.build();

public static final String DEFAULT_IMAGE_MODEL = "dall-e-3";

enum ImageModel {
DALL_E_3("dall-e-3"),
DALL_E_2("dall-e-2");
String getValue();
}

Example:
String modelName = AzureOpenAiImageOptions.ImageModel.DALL_E_3.getValue();

Metadata for individual image generation.
class AzureOpenAiImageGenerationMetadata implements ImageGenerationMetadata {
AzureOpenAiImageGenerationMetadata(String revisedPrompt);
String getRevisedPrompt();
}

The revised prompt shows how DALL-E 3 interpreted and potentially modified the original prompt.
Revised Prompt Behavior:
Example:
ImageResponse response = imageModel.call(prompt);
ImageGenerationMetadata metadata = response.getResult().getMetadata();
if (metadata instanceof AzureOpenAiImageGenerationMetadata azureMetadata) {
String revisedPrompt = azureMetadata.getRevisedPrompt();
System.out.println("Revised prompt: " + revisedPrompt);
}

Metadata for the overall image response.
class AzureOpenAiImageResponseMetadata extends ImageResponseMetadata {
protected AzureOpenAiImageResponseMetadata(Long created);
Long getCreated();
static AzureOpenAiImageResponseMetadata from(ImageGenerations openAiImageResponse);
}

Properties:
created: Unix timestamp when images were generated (seconds since epoch, non-null)

Example:
ImageResponse response = imageModel.call(prompt);
AzureOpenAiImageResponseMetadata metadata =
(AzureOpenAiImageResponseMetadata) response.getMetadata();
Long timestamp = metadata.getCreated();
System.out.println("Image created at: " + timestamp);

OpenAIClient client = new OpenAIClientBuilder()
.credential(new AzureKeyCredential(apiKey))
.endpoint(endpoint)
.buildClient();
AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
.deploymentName("dall-e-3")
.build();
AzureOpenAiImageModel imageModel = new AzureOpenAiImageModel(client, options);
ImagePrompt prompt = new ImagePrompt("A majestic eagle soaring over mountains");
ImageResponse response = imageModel.call(prompt);
String imageUrl = response.getResult().getOutput().getUrl();
System.out.println("Generated: " + imageUrl);

AzureOpenAiImageOptions hdOptions = AzureOpenAiImageOptions.builder()
.deploymentName("dall-e-3")
.width(1024)
.height(1024)
.style("vivid")
.build();
hdOptions.setQuality("hd");
ImagePrompt prompt = new ImagePrompt(
"A detailed cyberpunk street scene at night with neon lights",
hdOptions
);
ImageResponse response = imageModel.call(prompt);
Image image = response.getResult().getOutput();

AzureOpenAiImageOptions landscapeOptions = AzureOpenAiImageOptions.builder()
.width(1792)
.height(1024)
.build();
ImagePrompt prompt = new ImagePrompt(
"A panoramic view of a tropical beach at sunrise",
landscapeOptions
);
ImageResponse response = imageModel.call(prompt);

AzureOpenAiImageOptions portraitOptions = AzureOpenAiImageOptions.builder()
.width(1024)
.height(1792)
.build();
ImagePrompt prompt = new ImagePrompt(
"A professional portrait of a scientist in a laboratory",
portraitOptions
);
ImageResponse response = imageModel.call(prompt);

AzureOpenAiImageOptions naturalOptions = AzureOpenAiImageOptions.builder()
.style("natural")
.build();
naturalOptions.setQuality("hd");
ImagePrompt prompt = new ImagePrompt(
"A realistic photograph of a coffee shop interior",
naturalOptions
);
ImageResponse response = imageModel.call(prompt);

AzureOpenAiImageOptions multiOptions = AzureOpenAiImageOptions.builder()
.model("dall-e-2")
.N(4)
.width(512)
.height(512)
.build();
ImagePrompt prompt = new ImagePrompt("Abstract geometric patterns", multiOptions);
ImageResponse response = imageModel.call(prompt);
System.out.println("Generated " + response.getResults().size() + " images");
for (var result : response.getResults()) {
System.out.println("URL: " + result.getOutput().getUrl());
}

AzureOpenAiImageOptions base64Options = AzureOpenAiImageOptions.builder()
.responseFormat("b64_json")
.build();
ImagePrompt prompt = new ImagePrompt("A colorful sunset", base64Options);
ImageResponse response = imageModel.call(prompt);
String base64Image = response.getResult().getOutput().getB64Json();
byte[] imageBytes = Base64.getDecoder().decode(base64Image);
// Save or process image bytes

ImagePrompt prompt = new ImagePrompt("A cat");
ImageResponse response = imageModel.call(prompt);
// DALL-E 3 may revise the prompt for better results
ImageGenerationMetadata metadata = response.getResult().getMetadata();
if (metadata instanceof AzureOpenAiImageGenerationMetadata azureMetadata) {
String original = "A cat";
String revised = azureMetadata.getRevisedPrompt();
System.out.println("Original: " + original);
System.out.println("Revised: " + revised);
// Revised might be: "A realistic photograph of a fluffy domestic cat..."
}

// Default options for most requests
AzureOpenAiImageOptions defaultOptions = AzureOpenAiImageOptions.builder()
.deploymentName("dall-e-3")
.width(1024)
.height(1024)
.build();
defaultOptions.setQuality("standard");
AzureOpenAiImageModel imageModel = new AzureOpenAiImageModel(client, defaultOptions);
// Override for specific request
AzureOpenAiImageOptions hdOptions = AzureOpenAiImageOptions.builder()
.width(1792)
.height(1024)
.build();
hdOptions.setQuality("hd");
ImagePrompt hdPrompt = new ImagePrompt("A detailed landscape", hdOptions);
ImageResponse response = imageModel.call(hdPrompt);

AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
.user("user-456")
.build();
ImagePrompt prompt = new ImagePrompt("A robot in a garden", options);
ImageResponse response = imageModel.call(prompt);

ImageResponse response = imageModel.call(prompt);
// Image-level metadata
ImageGenerationMetadata imageMetadata = response.getResult().getMetadata();
if (imageMetadata instanceof AzureOpenAiImageGenerationMetadata azureMeta) {
System.out.println("Revised prompt: " + azureMeta.getRevisedPrompt());
}
// Response-level metadata
if (response.getMetadata() instanceof AzureOpenAiImageResponseMetadata responseMeta) {
System.out.println("Created at: " + responseMeta.getCreated());
}

// Azure SDK exceptions
com.azure.core.exception.HttpResponseException // HTTP errors (400, 401, 403, 429, 500)
com.azure.core.exception.ResourceNotFoundException // Deployment not found (404)
// Spring AI exceptions
org.springframework.ai.retry.NonTransientAiException // Permanent failures
org.springframework.ai.retry.TransientAiException // Temporary failures (retry-able)
// Java exceptions
java.lang.IllegalArgumentException // Invalid parameters
java.lang.NullPointerException // Null required parameters

1. Content Policy Violation (400):
try {
    response = imageModel.call(prompt);
} catch (HttpResponseException e) {
    if (e.getResponse().getStatusCode() == 400 &&
            e.getMessage().contains("content_policy_violation")) {
        throw new ContentFilterException(
            "Prompt blocked by content policy. Avoid NSFW, violence, or harmful content.", e
        );
    }
    throw e; // Rethrow other HTTP errors instead of swallowing them
}

2. Invalid Image Size (400):
try {
AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
.deploymentName("dall-e-3")
.width(512) // Invalid for DALL-E 3
.height(512)
.build();
response = imageModel.call(new ImagePrompt("test", options));
} catch (IllegalArgumentException e) {
throw new InvalidParameterException(
"Invalid image size for DALL-E 3. Use 1024x1024, 1792x1024, or 1024x1792", e
);
}

3. Too Many Images Requested (400):
try {
    AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
        .deploymentName("dall-e-3")
        .N(4) // DALL-E 3 only supports 1
        .build();
    // Issue the call so the invalid count can surface as an error
    response = imageModel.call(new ImagePrompt("test", options));
} catch (IllegalArgumentException e) {
    throw new InvalidParameterException(
        "DALL-E 3 can only generate 1 image per request. Use DALL-E 2 for multiple images.", e
    );
}

4. Rate Limiting (429):
// Thread.sleep throws the checked InterruptedException, so the method declares it
public ImageResponse generateWithRetry(ImagePrompt prompt) throws InterruptedException {
    int maxRetries = 3;
    long baseDelayMs = 2000;
    for (int attempt = 0; attempt < maxRetries; attempt++) {
        try {
            return imageModel.call(prompt);
        } catch (HttpResponseException e) {
            if (e.getResponse().getStatusCode() == 429 && attempt < maxRetries - 1) {
                // Exponential backoff: 2 s after the first failure, 4 s after the second
                long delayMs = baseDelayMs * (1L << attempt);
                Thread.sleep(delayMs);
                continue;
            }
            throw e;
        }
    }
    throw new RuntimeException("Max retries exceeded");
}

5. Prompt Too Long (400):
try {
    String veryLongPrompt = generateLongPrompt(5000); // Exceeds 4000 char limit
    response = imageModel.call(new ImagePrompt(veryLongPrompt));
} catch (HttpResponseException e) {
    if (e.getResponse().getStatusCode() == 400 &&
            e.getMessage().contains("prompt")) {
        throw new InvalidParameterException(
            "Prompt too long. DALL-E 3 max: 4000 chars, DALL-E 2 max: 1000 chars", e
        );
    }
    throw e; // Rethrow other HTTP errors instead of swallowing them
}

Deployment Name:
N (Number of Images):
Width and Height:
Size:
Quality:
Style:
Response Format:
User:
Prompt Text:
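The prompt rules stated in this document (text must not be null or empty; at most 4000 characters for DALL-E 3 and 1000 for DALL-E 2) can be enforced locally before a request is made. This is a minimal sketch under those stated limits; `PromptCheck` is a hypothetical helper, and it measures length with `String.length()`, which counts UTF-16 code units and may differ from how the service counts characters.

```java
/** Hypothetical pre-flight prompt validation; not part of Spring AI. */
final class PromptCheck {

    /** Maximum prompt length per model, as stated in this document. */
    static int maxLength(String model) {
        return switch (model) {
            case "dall-e-3" -> 4000;
            case "dall-e-2" -> 1000;
            default -> throw new IllegalArgumentException("Unknown model: " + model);
        };
    }

    /** Throws IllegalArgumentException when the prompt is missing or too long. */
    static void validate(String model, String prompt) {
        if (prompt == null || prompt.isBlank()) {
            throw new IllegalArgumentException("Prompt text must not be null or empty");
        }
        int max = maxLength(model);
        // length() counts UTF-16 code units; treat the result as an approximation
        if (prompt.length() > max) {
            throw new IllegalArgumentException(
                    "Prompt has " + prompt.length() + " characters; limit for " + model + " is " + max);
        }
    }
}
```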
Strengths:
Limitations:
Best For: High-quality, detailed images requiring accurate prompt interpretation
Pricing: ~2-4x more expensive than DALL-E 2
Strengths:
Limitations:
Best For: Generating multiple variations, smaller images, cost-sensitive applications
Pricing: More affordable, good for high-volume use cases
// Square (social media posts, avatars, general purpose)
AzureOpenAiImageOptions.builder().width(1024).height(1024).build()
// Landscape (banners, headers, presentations, desktop wallpapers)
AzureOpenAiImageOptions.builder().width(1792).height(1024).build()
// Portrait (mobile screens, story formats, posters)
AzureOpenAiImageOptions.builder().width(1024).height(1792).build()

// Small (thumbnails, icons, previews)
AzureOpenAiImageOptions.builder().width(256).height(256).build()
// Medium (general use, web content)
AzureOpenAiImageOptions.builder().width(512).height(512).build()
// Large (detailed images, prints)
AzureOpenAiImageOptions.builder().width(1024).height(1024).build()

Good Examples:
"A serene Japanese garden in autumn with red maple leaves, stone lanterns,
and a wooden bridge over a koi pond, photographed at golden hour"
"A futuristic cityscape at night with neon signs, flying vehicles,
and rain-slicked streets, cyberpunk style, dramatic lighting"
"A cozy reading nook with a window seat, warm sunlight, bookshelves,
and a sleeping cat, watercolor painting style"

Poor Examples:
"A garden" // Too vague
"Red" // Not descriptive enough
"Something cool" // Ambiguous

Recommended:
// Create once at application startup
@Bean
public AzureOpenAiImageModel imageModel() {
return new AzureOpenAiImageModel(client, defaultOptions);
}
// Inject and reuse
@Autowired
private AzureOpenAiImageModel imageModel;

Avoid:
// Don't create new instance per request
for (String prompt : prompts) {
AzureOpenAiImageModel model = new AzureOpenAiImageModel(...);
model.call(new ImagePrompt(prompt)); // Inefficient
}

DALL-E 3:
DALL-E 2:
ExecutorService executor = Executors.newFixedThreadPool(5);
List<CompletableFuture<ImageResponse>> futures = new ArrayList<>();
for (String promptText : prompts) {
CompletableFuture<ImageResponse> future = CompletableFuture.supplyAsync(
() -> imageModel.call(new ImagePrompt(promptText)),
executor
);
futures.add(future);
}
// Wait for all images
List<ImageResponse> responses = futures.stream()
.map(CompletableFuture::join)
.collect(Collectors.toList());

Generated image URLs expire after 1 hour:
public void saveGeneratedImage(String prompt) {
ImageResponse response = imageModel.call(new ImagePrompt(prompt));
String imageUrl = response.getResult().getOutput().getUrl();
// Download and save immediately (URL expires in 1 hour)
byte[] imageBytes = downloadImage(imageUrl);
saveToStorage(imageBytes);
}
public void saveAsBase64(String prompt) {
AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
.responseFormat("b64_json")
.build();
ImageResponse response = imageModel.call(new ImagePrompt(prompt, options));
String base64Image = response.getResult().getOutput().getB64Json();
// No expiration concerns with base64
saveBase64ToStorage(base64Image);
}

Symptoms: 400 error with content_policy_violation
Common Triggers:
Solutions:
Symptoms: Blurry, artifacts, incorrect details
Solutions:
Symptoms: Images don't match prompt expectations
Solutions:
Symptoms: Unreadable text in generated images
Known Limitation: Both DALL-E 2 and DALL-E 3 struggle with text rendering
Workarounds:
Symptoms: Images take longer than expected
Causes & Solutions:
Symptoms: 429 errors
Solutions:
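One approach, consistent with the retry example earlier in this document, is exponential backoff. The sketch below generalizes that idea into a reusable helper with a capped delay; the `Backoff` class, its parameters, and the cap are illustrative assumptions rather than Spring AI or Azure SDK API.

```java
import java.util.concurrent.Callable;
import java.util.function.Predicate;

/** Hypothetical retry helper with capped exponential backoff. */
final class Backoff {

    /** Delay before retry number attempt (0-based): base * 2^attempt, capped at maxDelayMs. */
    static long delayMs(long baseDelayMs, long maxDelayMs, int attempt) {
        long delay = baseDelayMs << Math.min(attempt, 20); // bound the shift to avoid overflow
        return Math.min(delay, maxDelayMs);
    }

    /** Runs the task, retrying only failures the predicate accepts, up to maxAttempts in total. */
    static <T> T callWithRetry(Callable<T> task, Predicate<Exception> retryable,
                               int maxAttempts, long baseDelayMs, long maxDelayMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 0; ; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                boolean lastAttempt = attempt >= maxAttempts - 1;
                if (lastAttempt || !retryable.test(e)) {
                    throw e; // give up: out of attempts or not a retryable failure
                }
                Thread.sleep(delayMs(baseDelayMs, maxDelayMs, attempt));
            }
        }
    }
}
```

With the Azure client, the predicate could test for an `HttpResponseException` whose status code is 429, mirroring the check used in the rate-limiting example.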
DO:
DON'T:
Strategies:
For Best Results:
URL-based (default):
Base64-based:
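The storage examples in this document call a `downloadImage` helper that is not defined here. A minimal sketch using the JDK's `java.net.http.HttpClient` might look like the following; the class name `ImageDownloader`, the timeouts, and the treatment of any non-200 status as a failure are assumptions.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

/** Hypothetical implementation of the downloadImage helper used in the examples. */
final class ImageDownloader {

    private static final HttpClient CLIENT = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(10)) // assumed value
            .build();

    /** Builds a GET request for the image URL with an assumed 30-second timeout. */
    static HttpRequest buildRequest(String imageUrl) {
        return HttpRequest.newBuilder(URI.create(imageUrl))
                .timeout(Duration.ofSeconds(30))
                .GET()
                .build();
    }

    /** Downloads the image bytes; any status other than 200 is reported as an IOException. */
    static byte[] downloadImage(String imageUrl) throws IOException {
        HttpResponse<byte[]> response;
        try {
            response = CLIENT.send(buildRequest(imageUrl), HttpResponse.BodyHandlers.ofByteArray());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag for callers
            throw new IOException("Interrupted while downloading image", e);
        }
        if (response.statusCode() != 200) {
            throw new IOException("Download failed with HTTP status " + response.statusCode());
        }
        return response.body();
    }
}
```

Because generated URLs expire, a helper like this is best called right after `imageModel.call(...)` returns rather than deferred to a later job.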
// Download URL-based image (downloadImage is a helper this document does not define)
// Files.write throws the checked IOException, so both methods declare it
public byte[] downloadAndSave(String imageUrl) throws IOException {
    byte[] imageBytes = downloadImage(imageUrl);
    String filename = "generated_" + System.currentTimeMillis() + ".png";
    Files.write(Path.of("images/" + filename), imageBytes);
    return imageBytes;
}
// Handle base64 image
public void saveBase64Image(String base64Image) throws IOException {
    byte[] imageBytes = Base64.getDecoder().decode(base64Image);
    String filename = "generated_" + System.currentTimeMillis() + ".png";
    Files.write(Path.of("images/" + filename), imageBytes);
}

AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
.style("natural")
.width(1024)
.height(1024)
.build();
options.setQuality("hd");
ImagePrompt prompt = new ImagePrompt(
"Professional product photography of a sleek smartphone on a minimalist desk, " +
"soft studio lighting, white background, high-end commercial photography style",
options
);

AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
.width(1792)
.height(1024)
.style("vivid")
.build();
options.setQuality("hd");
ImagePrompt prompt = new ImagePrompt(
"Eye-catching banner for summer sale with vibrant tropical colors, " +
"beach theme, modern graphic design, energetic and appealing",
options
);

AzureOpenAiImageOptions options = AzureOpenAiImageOptions.builder()
.deploymentName("dall-e-3")
.style("vivid")
.build();
options.setQuality("hd");
ImagePrompt prompt = new ImagePrompt(
"Concept art of a futuristic underwater city with bioluminescent architecture, " +
"glass domes, submarines, dramatic lighting, detailed digital art",
options
);

// Square format for Instagram
AzureOpenAiImageOptions socialOptions = AzureOpenAiImageOptions.builder()
.width(1024)
.height(1024)
.style("vivid")
.build();
ImagePrompt prompt = new ImagePrompt(
"Inspiring motivational quote background with abstract geometric patterns, " +
"warm gradient colors, modern minimalist design",
socialOptions
);

// Generate multiple variations quickly
AzureOpenAiImageOptions variantOptions = AzureOpenAiImageOptions.builder()
.model("dall-e-2")
.N(4)
.width(512)
.height(512)
.build();
ImagePrompt prompt = new ImagePrompt(
"Abstract logo design with geometric shapes and blue colors",
variantOptions
);
ImageResponse response = imageModel.call(prompt);
// Select best variant from 4 options