Spring AI integration for Azure OpenAI services providing chat completion, text embeddings, image generation, and audio transcription with GPT, DALL-E, and Whisper models
Spring AI Azure OpenAI provides integration between Spring AI and Azure OpenAI services. It implements Spring AI's model interfaces for chat completion, text embeddings, image generation, and audio transcription using Azure's OpenAI platform.
Installation:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-azure-openai</artifactId>
    <version>1.1.2</version>
</dependency>

See Quick Start Guide for step-by-step setup instructions.
This package provides four main AI model implementations:
| Capability | Model Class | Use Case |
|---|---|---|
| Chat Completion | AzureOpenAiChatModel | Conversational AI with GPT models |
| Text Embeddings | AzureOpenAiEmbeddingModel | Vector embeddings for semantic search |
| Image Generation | AzureOpenAiImageModel | Image creation with DALL-E |
| Audio Transcription | AzureOpenAiAudioTranscriptionModel | Speech-to-text with Whisper |
All models support:
class AzureOpenAiChatModel implements ChatModel {
    ChatResponse call(Prompt prompt);
    Flux<ChatResponse> stream(Prompt prompt);
}

Key Features: Synchronous/streaming responses, tool calling, structured outputs, JSON schema support
Supported Models: gpt-4o, gpt-4, gpt-35-turbo, o1, o3, o4-mini
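
Streaming responses arrive as a sequence of partial chunks that the caller (or the library) folds into one complete message. The sketch below illustrates that fold on a hypothetical `Delta` record; `Delta` and `DeltaMerger` are assumptions made for illustration and are not Spring AI or Azure SDK types.

```java
import java.util.List;

/** Hypothetical stand-in for one streamed chunk; not a Spring AI or Azure SDK type. */
record Delta(String role, String content, String finishReason) {}

final class DeltaMerger {

    /** Identity element for the fold: a chunk that carries nothing. */
    static Delta empty() {
        return new Delta(null, "", null);
    }

    /** Appends the text of both chunks; keeps the first known role and the last known finish reason. */
    static Delta merge(Delta left, Delta right) {
        String role = left.role() != null ? left.role() : right.role();
        String content = nullToEmpty(left.content()) + nullToEmpty(right.content());
        String finish = right.finishReason() != null ? right.finishReason() : left.finishReason();
        return new Delta(role, content, finish);
    }

    /** Folds an ordered list of chunks into one complete message. */
    static Delta mergeAll(List<Delta> chunks) {
        Delta result = empty();
        for (Delta chunk : chunks) {
            result = merge(result, chunk);
        }
        return result;
    }

    private static String nullToEmpty(String value) {
        return value == null ? "" : value;
    }
}
```

Chunks must be merged in arrival order, since text concatenation is not commutative.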
class AzureOpenAiEmbeddingModel extends AbstractEmbeddingModel {
    EmbeddingResponse call(EmbeddingRequest request);
    float[] embed(Document document);
}

Key Features: Configurable dimensions, batch processing, metadata handling
Supported Models: text-embedding-ada-002, text-embedding-3-small, text-embedding-3-large
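
The `float[]` vectors returned by `embed` are typically compared with cosine similarity for semantic search. A minimal sketch of that comparison, using only the JDK; the `EmbeddingMath` class and its method names are illustrative assumptions, not part of this package.

```java
final class EmbeddingMath {

    /** Cosine similarity of two equal-length vectors; 1.0 means they point in the same direction. */
    static double cosineSimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("Cosine similarity is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the index of the candidate most similar to the query, or -1 if there are none. */
    static int mostSimilar(float[] query, float[][] candidates) {
        int best = -1;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < candidates.length; i++) {
            double score = cosineSimilarity(query, candidates[i]);
            if (score > bestScore) {
                bestScore = score;
                best = i;
            }
        }
        return best;
    }
}
```

For large collections a vector store would normally replace the linear scan in `mostSimilar`.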
class AzureOpenAiImageModel implements ImageModel {
    ImageResponse call(ImagePrompt prompt);
}

Key Features: DALL-E 2/3 support, size/quality/style controls, multiple formats
Supported Models: dall-e-3, dall-e-2
class AzureOpenAiAudioTranscriptionModel implements TranscriptionModel {
    AudioTranscriptionResponse call(AudioTranscriptionPrompt prompt);
    String call(Resource audioResource);
}

Key Features: Multiple formats (JSON, SRT, VTT), timestamps, language detection
Supported Models: whisper
All models require an OpenAIClient:
OpenAIClient client = new OpenAIClientBuilder()
    .credential(new AzureKeyCredential(apiKey))
    .endpoint(endpoint)
    .buildClient();

Azure OpenAI uses deployment names instead of model names:
| Type | Example Deployments |
|---|---|
| Chat | gpt-4o, gpt-4, gpt-35-turbo, o1, o3 |
| Embeddings | text-embedding-ada-002, text-embedding-3-small |
| Images | dall-e-3, dall-e-2 |
| Audio | whisper |
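
Because the endpoint, API key, and deployment name are specific to each Azure resource, they are usually read from configuration rather than hard-coded. A minimal sketch of that lookup follows; the `AzureSettings` record and the three environment variable names are assumptions chosen for illustration, so adjust them to whatever your environment actually defines.

```java
import java.net.URI;
import java.util.Map;

/** Hypothetical holder for connection settings; not part of this package. */
record AzureSettings(String endpoint, String apiKey, String chatDeployment) {

    /** Reads settings from a map such as System.getenv(); the variable names are assumptions. */
    static AzureSettings fromEnvironment(Map<String, String> env) {
        String endpoint = require(env, "AZURE_OPENAI_ENDPOINT");
        String apiKey = require(env, "AZURE_OPENAI_API_KEY");
        // The deployment name is whatever was chosen in Azure; "gpt-4o" is only a fallback here.
        String deployment = env.getOrDefault("AZURE_OPENAI_CHAT_DEPLOYMENT", "gpt-4o");
        URI uri = URI.create(endpoint);
        if (!"https".equalsIgnoreCase(uri.getScheme())) {
            throw new IllegalArgumentException("Endpoint must be an https URL: " + endpoint);
        }
        return new AzureSettings(endpoint, apiKey, deployment);
    }

    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing setting: " + name);
        }
        return value;
    }
}
```

Typical use would be `AzureSettings.fromEnvironment(System.getenv())`, passing the resulting values to the client builder and to `deploymentName(...)`.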
Builder Pattern:
AzureOpenAiChatModel chatModel = AzureOpenAiChatModel.builder()
    .openAIClientBuilder(clientBuilder)
    .defaultOptions(options)
    .observationRegistry(observationRegistry)
    .build();

Options Configuration:
AzureOpenAiChatOptions options = AzureOpenAiChatOptions.builder()
    .deploymentName("gpt-4o")
    .temperature(0.7)
    .maxTokens(1000)
    .build();

All model classes are fully thread-safe and can handle concurrent requests. Create one instance and reuse it across your application.
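
The reuse advice can be pictured as one shared instance serving many threads. In the sketch below a plain `Function<String, String>` stands in for the model so the example runs without any dependencies; `SharedModelDemo` and `askAll` are hypothetical names, not part of this package.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

final class SharedModelDemo {

    /** Sends every prompt through the same shared instance from a fixed-size thread pool. */
    static List<String> askAll(Function<String, String> sharedModel, List<String> prompts, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String prompt : prompts) {
                pending.add(pool.submit(() -> sharedModel.apply(prompt)));
            }
            // Collect results in submission order, regardless of which thread finished first.
            List<String> answers = new ArrayList<>();
            for (Future<String> future : pending) {
                answers.add(future.get());
            }
            return answers;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for responses", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A request failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

In a Spring application the same effect usually comes from declaring the model as a singleton bean and injecting it wherever it is needed.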

Chat Options:

| Parameter | Range | Default | Description |
|---|---|---|---|
| temperature | 0.0-2.0 | 0.7 | Randomness control |
| maxTokens | > 0 | - | Maximum tokens to generate (standard models) |
| maxCompletionTokens | > 0 | - | Completion limit (reasoning models) |
| topP | 0.0-1.0 | 1.0 | Nucleus sampling |
| frequencyPenalty | -2.0-2.0 | 0.0 | Repetition penalty |
| presencePenalty | -2.0-2.0 | 0.0 | Topic diversity |
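
Checking values against the ranges in the table before sending a request gives clearer errors than a rejected API call. A minimal sketch follows; `ChatOptionRanges` is a hypothetical helper written for this example, and the bounds are copied from the table above rather than read from the library.

```java
import java.util.ArrayList;
import java.util.List;

final class ChatOptionRanges {

    /** Returns one message per out-of-range value; an empty list means all values are acceptable. */
    static List<String> violations(double temperature, double topP, double frequencyPenalty,
                                   double presencePenalty, Integer maxTokens) {
        List<String> problems = new ArrayList<>();
        checkRange(problems, "temperature", temperature, 0.0, 2.0);
        checkRange(problems, "topP", topP, 0.0, 1.0);
        checkRange(problems, "frequencyPenalty", frequencyPenalty, -2.0, 2.0);
        checkRange(problems, "presencePenalty", presencePenalty, -2.0, 2.0);
        // maxTokens is optional; null means "not set".
        if (maxTokens != null && maxTokens <= 0) {
            problems.add("maxTokens must be > 0");
        }
        return problems;
    }

    private static void checkRange(List<String> problems, String name, double value, double min, double max) {
        if (Double.isNaN(value) || value < min || value > max) {
            problems.add(name + " must be between " + min + " and " + max);
        }
    }
}
```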

Embedding Dimensions:

| Model | Default Dimensions | Configurable | Max Input |
|---|---|---|---|
| text-embedding-ada-002 | 1536 | No | 8191 tokens |
| text-embedding-3-small | 1536 | Yes (up to 1536) | 8191 tokens |
| text-embedding-3-large | 3072 | Yes (up to 3072) | 8191 tokens |
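
The dimension rules in the table can be enforced with a small lookup. This sketch is keyed by the underlying model name (a deployment may be named differently); `EmbeddingDimensions` is a hypothetical helper and its limits simply mirror the table above.

```java
import java.util.Map;

final class EmbeddingDimensions {

    // Default (and maximum) dimensions per model, as listed in the table.
    private static final Map<String, Integer> DEFAULTS = Map.of(
            "text-embedding-ada-002", 1536,
            "text-embedding-3-small", 1536,
            "text-embedding-3-large", 3072);

    /** Returns the dimensions to use, or throws if the request is not valid for the model. */
    static int resolve(String model, Integer requested) {
        Integer max = DEFAULTS.get(model);
        if (max == null) {
            throw new IllegalArgumentException("Unknown embedding model: " + model);
        }
        if (requested == null) {
            return max; // nothing requested: fall back to the model default
        }
        if ("text-embedding-ada-002".equals(model) && !requested.equals(max)) {
            throw new IllegalArgumentException("text-embedding-ada-002 has fixed dimensions: " + max);
        }
        if (requested < 1 || requested > max) {
            throw new IllegalArgumentException("Dimensions must be between 1 and " + max);
        }
        return requested;
    }
}
```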

Image Sizes:

| Model | Supported Sizes |
|---|---|
| DALL-E 3 | 1024×1024, 1792×1024, 1024×1792 |
| DALL-E 2 | 256×256, 512×512, 1024×1024 |
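
Since each DALL-E version accepts a different set of sizes, a request can be checked before it is sent. A minimal sketch, assuming the sizes in the table above are complete; `ImageSizes` is a hypothetical helper keyed by model name, not by deployment name.

```java
import java.util.Map;
import java.util.Set;

final class ImageSizes {

    // Supported sizes per model, written as WIDTHxHEIGHT and taken from the table.
    private static final Map<String, Set<String>> SUPPORTED = Map.of(
            "dall-e-3", Set.of("1024x1024", "1792x1024", "1024x1792"),
            "dall-e-2", Set.of("256x256", "512x512", "1024x1024"));

    /** True if the model is known and accepts the given width and height. */
    static boolean isSupported(String model, int width, int height) {
        Set<String> sizes = SUPPORTED.get(model);
        return sizes != null && sizes.contains(width + "x" + height);
    }
}
```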

Audio Formats:

| Format | Extension | Quality | Use Case |
|---|---|---|---|
| mp3 | .mp3 | Good | General purpose |
| wav | .wav | Excellent | Best quality |
| m4a | .m4a | Good | Mobile recordings |
| webm | .webm | Good | Web recordings |
File Limit: 25 MB maximum
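
A pre-flight check on extension and size avoids uploading a file that will be rejected. The sketch below only mirrors the four formats in the table and interprets 25 MB as 25 × 1024 × 1024 bytes; both points are assumptions, and `AudioPreflight` is a hypothetical helper rather than part of this package.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.Set;

final class AudioPreflight {

    // Assumption: the 25 MB limit is read as 25 * 1024 * 1024 bytes.
    static final long MAX_BYTES = 25L * 1024 * 1024;

    // Only the formats listed in the table; the service may accept others.
    private static final Set<String> EXTENSIONS = Set.of("mp3", "wav", "m4a", "webm");

    /** True if the file name has a listed extension and the size is within the limit. */
    static boolean isAcceptable(String fileName, long sizeInBytes) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return false; // no extension
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return EXTENSIONS.contains(extension) && sizeInBytes > 0 && sizeInBytes <= MAX_BYTES;
    }

    /** Convenience overload that reads the name and size from the file system. */
    static boolean isAcceptable(Path file) throws IOException {
        return isAcceptable(file.getFileName().toString(), Files.size(file));
    }
}
```

Files over the limit generally need to be split or re-encoded at a lower bitrate before transcription.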

Imports:

// Chat
import org.springframework.ai.azure.openai.AzureOpenAiChatModel;
import org.springframework.ai.azure.openai.AzureOpenAiChatOptions;
// Embeddings
import org.springframework.ai.azure.openai.AzureOpenAiEmbeddingModel;
import org.springframework.ai.azure.openai.AzureOpenAiEmbeddingOptions;
// Images
import org.springframework.ai.azure.openai.AzureOpenAiImageModel;
import org.springframework.ai.azure.openai.AzureOpenAiImageOptions;
// Audio
import org.springframework.ai.azure.openai.AzureOpenAiAudioTranscriptionModel;
import org.springframework.ai.azure.openai.AzureOpenAiAudioTranscriptionOptions;
// Azure Client
import com.azure.ai.openai.OpenAIClient;
import com.azure.ai.openai.OpenAIClientBuilder;
import com.azure.core.credential.AzureKeyCredential;

Utility for merging chat completion responses in streaming scenarios.
class MergeUtils {
    static ChatCompletions emptyChatCompletions();
    static ChatCompletions mergeChatCompletions(ChatCompletions left, ChatCompletions right);
}

Provides GraalVM native image compilation hints. Automatically detected by Spring AOT; no manual configuration needed.
class AzureOpenAiRuntimeHints implements RuntimeHintsRegistrar {
    void registerHints(RuntimeHints hints, ClassLoader classLoader);
}

org.springframework.ai.azure.openai
├── AzureOpenAiChatModel
├── AzureOpenAiChatOptions
├── AzureOpenAiResponseFormat
├── AzureOpenAiEmbeddingModel
├── AzureOpenAiEmbeddingOptions
├── AzureOpenAiImageModel
├── AzureOpenAiImageOptions
├── AzureOpenAiAudioTranscriptionModel
├── AzureOpenAiAudioTranscriptionOptions
├── MergeUtils
├── metadata/
│   ├── AzureOpenAiAudioTranscriptionResponseMetadata
│   ├── AzureOpenAiImageGenerationMetadata
│   └── AzureOpenAiImageResponseMetadata
└── aot/
    └── AzureOpenAiRuntimeHints