LangChain4j integration for Azure OpenAI providing chat, streaming, embeddings, image generation, audio transcription, and token counting capabilities
Common configuration patterns, authentication methods, and builder options shared across all Azure OpenAI models.
import com.azure.core.credential.TokenCredential;
import com.azure.identity.DefaultAzureCredentialBuilder;
import com.azure.identity.ManagedIdentityCredentialBuilder;
import com.azure.identity.ClientSecretCredentialBuilder;
import com.azure.identity.AzureCliCredentialBuilder;
import com.azure.core.http.policy.RetryOptions;
import com.azure.core.http.policy.ExponentialBackoffOptions;
import com.azure.core.http.ProxyOptions;
import com.azure.core.http.HttpClientProvider;
import com.azure.core.http.HttpClient;
import com.azure.ai.openai.OpenAIClient;
import com.azure.ai.openai.OpenAIAsyncClient;
import com.azure.ai.openai.OpenAIClientBuilder;
import com.azure.core.credential.AzureKeyCredential;
import java.time.Duration;
import java.util.Map;
import java.util.List;
import java.util.UUID;
import java.net.InetSocketAddress;

All model builders support three authentication methods. Exactly one must be specified. Choose based on your deployment scenario and security requirements.
Standard authentication using an API key from your Azure OpenAI resource:
/**
* Configure API key authentication.
* @param apiKey 32-character hexadecimal key from Azure Portal
* @throws IllegalArgumentException if apiKey is null or empty
*/
model.builder()
.endpoint("https://your-resource.openai.azure.com/")
.apiKey("your-api-key-from-azure-portal")
.build();

Best for:
Security note: Never hardcode API keys in source code. Use environment variables or secure secret storage:
// Recommended: Load from environment
String apiKey = System.getenv("AZURE_OPENAI_API_KEY");
if (apiKey == null || apiKey.isEmpty()) {
throw new IllegalStateException("AZURE_OPENAI_API_KEY environment variable not set");
}
model.builder()
.endpoint(System.getenv("AZURE_OPENAI_ENDPOINT"))
.apiKey(apiKey)
.build();

Key format: Azure OpenAI API keys are 32-character hexadecimal strings. Validate before use:
// Validation pattern (optional but recommended)
if (!apiKey.matches("[0-9a-fA-F]{32}")) {
throw new IllegalArgumentException("Invalid Azure OpenAI API key format");
}

Authenticate with the non-Azure OpenAI service (api.openai.com):
/**
* Configure non-Azure OpenAI authentication.
* Automatically sets endpoint to https://api.openai.com/v1.
* Do NOT call endpoint() when using this method.
* @param apiKey OpenAI API key starting with "sk-"
* @throws IllegalArgumentException if apiKey is null or empty
*/
model.builder()
.nonAzureApiKey("your-openai-api-key")
.deploymentName("gpt-4") // Use OpenAI model name
.serviceVersion("2024-02-15-preview")
.build();

Note: When using nonAzureApiKey(), the endpoint is automatically set to https://api.openai.com/v1. Do not call endpoint().
Best for:
Key format: OpenAI API keys start with "sk-" prefix. Example validation:
if (!openAiKey.startsWith("sk-")) {
throw new IllegalArgumentException("OpenAI API keys must start with 'sk-'");
}

Authenticate using Azure Active Directory (Microsoft Entra ID):
import com.azure.identity.DefaultAzureCredentialBuilder;
import com.azure.core.credential.TokenCredential;
/**
* Configure Azure AD authentication.
* Provides zero-secret authentication using managed identities or service principals.
* @param credential TokenCredential implementation
* @throws IllegalArgumentException if credential is null
*/
TokenCredential credential = new DefaultAzureCredentialBuilder().build();
model.builder()
.endpoint("https://your-resource.openai.azure.com/")
.tokenCredential(credential)
.deploymentName("gpt-4")
.serviceVersion("2024-02-15-preview")
.build();Best for:
Common credential types:
// Default credential chain (tries multiple auth methods in order)
// Order: Environment -> Managed Identity -> Azure CLI -> IntelliJ -> Visual Studio Code
TokenCredential credential = new DefaultAzureCredentialBuilder()
.build();
// User-assigned managed identity (specify client ID)
TokenCredential credential = new ManagedIdentityCredentialBuilder()
.clientId("your-managed-identity-client-id")
.build();
// System-assigned managed identity (no client ID needed)
TokenCredential credential = new ManagedIdentityCredentialBuilder()
.build();
// Service principal with client secret
TokenCredential credential = new ClientSecretCredentialBuilder()
.tenantId("your-tenant-id")
.clientId("your-client-id")
.clientSecret("your-client-secret")
.build();
// Azure CLI credential (for local development)
// Uses credentials from `az login`
TokenCredential credential = new AzureCliCredentialBuilder()
.build();

Required Azure RBAC roles:
Assign roles using Azure Portal, CLI, or ARM templates:
# Assign role to managed identity
az role assignment create \
--role "Cognitive Services OpenAI User" \
--assignee <managed-identity-client-id> \
--scope /subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.CognitiveServices/accounts/<resource-name>

All models require these three configuration parameters. All are mandatory unless using nonAzureApiKey().
/**
* Mandatory configuration interface.
* @param <T> Builder type for fluent chaining
*/
interface MandatoryConfiguration<T> {
/**
* Sets the Azure OpenAI resource endpoint.
* Required: Yes (except when using nonAzureApiKey())
* Format: https://{resource-name}.openai.azure.com/
* @param endpoint Full endpoint URL with trailing slash optional
* @return Builder instance for chaining
* @throws IllegalArgumentException if endpoint is null, empty, or malformed
*/
T endpoint(String endpoint);
/**
* Sets the Azure OpenAI API version.
* Required: Yes
* Examples: "2024-02-15-preview", "2023-12-01-preview"
* Recommendation: Use latest preview for development, latest stable for production
* @param serviceVersion API version string
* @return Builder instance for chaining
* @throws IllegalArgumentException if serviceVersion is null or empty
*/
T serviceVersion(String serviceVersion);
/**
* Sets the name of the deployed model in Azure OpenAI.
* Required: Yes
* This is YOUR deployment name in Azure, not the base model name.
* Examples: "gpt-4-deployment", "my-gpt35-turbo", "dall-e-3-prod"
* @param deploymentName Your Azure deployment name
* @return Builder instance for chaining
* @throws IllegalArgumentException if deploymentName is null or empty
*/
T deploymentName(String deploymentName);
}

Example with validation:
String endpoint = System.getenv("AZURE_OPENAI_ENDPOINT");
String deployment = System.getenv("AZURE_OPENAI_DEPLOYMENT");
String version = "2024-02-15-preview";
// Validate mandatory parameters
if (endpoint == null || endpoint.isEmpty()) {
throw new IllegalStateException("AZURE_OPENAI_ENDPOINT not configured");
}
if (deployment == null || deployment.isEmpty()) {
throw new IllegalStateException("AZURE_OPENAI_DEPLOYMENT not configured");
}
model.builder()
.endpoint(endpoint)
.serviceVersion(version)
.deploymentName(deployment)
.apiKey(System.getenv("AZURE_OPENAI_API_KEY"))
.build();

Set request timeout duration to prevent indefinite hangs.
/**
* Sets the request timeout.
* @param timeout Duration, must be positive
* @return Builder instance
* @default 60 seconds (chat, embedding, language models)
* @default 120 seconds (streaming models, image models)
* @throws IllegalArgumentException if timeout is null or non-positive
*/
model.builder()
.timeout(Duration.ofSeconds(60)) // 60 second timeout
.build();

Default timeouts by model type:
Recommended timeouts:
Timeout behavior:
Throws TimeoutException when the configured duration is exceeded.

// Example: Different timeouts for different use cases
AzureOpenAiChatModel fastModel = AzureOpenAiChatModel.builder()
.timeout(Duration.ofSeconds(30)) // Short timeout for quick responses
.maxTokens(100) // Limit response length
.build();
AzureOpenAiChatModel slowModel = AzureOpenAiChatModel.builder()
.timeout(Duration.ofSeconds(180)) // Longer timeout for detailed responses
.maxTokens(4000) // Allow long responses
.build();

Configure retry behavior for failed requests with automatic exponential backoff.
Simple retry count:
/**
* Sets simple retry count with default exponential backoff.
* Mutually exclusive with retryOptions().
* @param maxRetries Number of retries, 0-10
* @return Builder instance
* @default 3 retries
* @throws IllegalArgumentException if maxRetries < 0 or > 10
*/
model.builder()
.maxRetries(3) // Retry up to 3 times
.build();

Advanced retry options:
import com.azure.core.http.policy.RetryOptions;
import com.azure.core.http.policy.ExponentialBackoffOptions;
/**
* Sets advanced retry options with custom backoff strategy.
* Mutually exclusive with maxRetries().
* @param retryOptions Azure SDK retry configuration
* @return Builder instance
* @default 3 retries with exponential backoff (1s base, 10s max)
*/
RetryOptions retryOptions = new RetryOptions(
new ExponentialBackoffOptions()
.setMaxRetries(3) // Maximum 3 retry attempts
.setBaseDelay(Duration.ofSeconds(1)) // Start with 1s delay
.setMaxDelay(Duration.ofSeconds(10)) // Maximum 10s delay
);
model.builder()
.retryOptions(retryOptions)
.build();

Default retry behavior:
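As an illustration of how the exponential backoff delays grow (assuming the delay doubles from the 1s base each attempt and is capped at the 10s maximum, per the defaults above; the real SDK policy also applies jitter, so actual delays vary):

```java
// Sketch of an exponential backoff schedule: base delay doubling per
// attempt, capped at a maximum. Illustrative only; the Azure SDK's
// actual policy also adds jitter.
final class BackoffSchedule {

    static long delayMillis(int attempt, long baseMillis, long maxMillis) {
        // attempt 0 -> base, attempt 1 -> 2 * base, attempt 2 -> 4 * base, ...
        long delay = baseMillis << Math.min(attempt, 30);
        return Math.min(delay, maxMillis);
    }

    public static void main(String[] args) {
        // Defaults from above: 1s base delay, 10s max delay, 3 retries
        for (int attempt = 0; attempt < 3; attempt++) {
            System.out.println("retry " + (attempt + 1) + " after "
                    + delayMillis(attempt, 1_000, 10_000) + "ms");
        }
    }
}
```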
Retry triggers (automatically retried):
Non-retried errors (fail immediately):
Example custom retry strategy:
// Aggressive retry for production
RetryOptions aggressive = new RetryOptions(
new ExponentialBackoffOptions()
.setMaxRetries(5) // More retries
.setBaseDelay(Duration.ofMillis(500)) // Faster initial retry
.setMaxDelay(Duration.ofSeconds(30)) // Longer max delay
);
// Conservative retry for development
RetryOptions conservative = new RetryOptions(
new ExponentialBackoffOptions()
.setMaxRetries(1) // Fail fast
.setBaseDelay(Duration.ofSeconds(2))
.setMaxDelay(Duration.ofSeconds(5))
);

Route requests through an HTTP proxy server.
Basic proxy:
import com.azure.core.http.ProxyOptions;
import java.net.InetSocketAddress;
/**
* Sets HTTP proxy configuration.
* @param proxyOptions Proxy settings including type and address
* @return Builder instance
* @default No proxy
*/
ProxyOptions proxyOptions = new ProxyOptions(
ProxyOptions.Type.HTTP, // or Type.SOCKS
new InetSocketAddress("proxy.example.com", 8080)
);
model.builder()
.proxyOptions(proxyOptions)
.build();

Proxy with authentication:
ProxyOptions proxyOptions = new ProxyOptions(
ProxyOptions.Type.HTTP,
new InetSocketAddress("proxy.example.com", 8080)
)
.setCredentials("proxy-username", "proxy-password") // Optional authentication
.setNonProxyHosts("localhost|127.0.0.1"); // Bypass proxy for these hosts
model.builder()
.proxyOptions(proxyOptions)
.build();

Proxy types:
ProxyOptions.Type.HTTP: HTTP/HTTPS proxy (most common)
ProxyOptions.Type.SOCKS: SOCKS4/SOCKS5 proxy

Common proxy scenarios:
// Corporate proxy with authentication
ProxyOptions corpProxy = new ProxyOptions(
ProxyOptions.Type.HTTP,
new InetSocketAddress("proxy.corp.example.com", 8080)
)
.setCredentials(System.getenv("PROXY_USER"), System.getenv("PROXY_PASS"))
.setNonProxyHosts("localhost|*.internal.corp");
// Development proxy (e.g., Fiddler, Charles)
ProxyOptions debugProxy = new ProxyOptions(
ProxyOptions.Type.HTTP,
new InetSocketAddress("localhost", 8888) // Fiddler default
);

Provide a custom HTTP client implementation for advanced scenarios.
import com.azure.core.http.HttpClientProvider;
import com.azure.core.http.HttpClient;
/**
* Sets custom HTTP client provider.
* Allows full control over HTTP client configuration.
* @param httpClientProvider Provider for creating HTTP client
* @return Builder instance
* @default Azure SDK default HTTP client (Netty or OkHttp)
*/
HttpClientProvider customProvider = new HttpClientProvider() {
@Override
public HttpClient createInstance() {
// Return a custom-configured HTTP client. The core HttpClient interface
// has no timeout setters, so use a concrete builder such as the Netty one
// (import com.azure.core.http.netty.NettyAsyncHttpClientBuilder)
return new NettyAsyncHttpClientBuilder()
.connectTimeout(Duration.ofSeconds(30))
.readTimeout(Duration.ofSeconds(60))
.writeTimeout(Duration.ofSeconds(30))
.build();
}
};
model.builder()
.httpClientProvider(customProvider)
.build();

Use cases for custom HTTP client:
Add custom HTTP headers to all requests for tracking, authentication, or API versioning.
/**
* Sets custom HTTP headers added to all requests.
* @param customHeaders Immutable map of header name to value
* @return Builder instance
* @default Empty map (no custom headers)
*/
Map<String, String> customHeaders = Map.of(
"X-Custom-Header", "custom-value",
"X-Request-ID", "unique-request-id",
"X-API-Version", "v1",
"X-Correlation-ID", UUID.randomUUID().toString()
);
model.builder()
.customHeaders(customHeaders)
.build();

Common use cases:
Important notes:
// Example: Production tracking headers
Map<String, String> trackingHeaders = Map.of(
"X-Application-Name", "MyApp",
"X-Application-Version", "1.2.3",
"X-Environment", "production",
"X-Request-ID", UUID.randomUUID().toString()
);

Append a custom suffix to the User-Agent header for application identification.
/**
* Sets custom User-Agent suffix for request identification.
* @param userAgentSuffix Suffix string appended to SDK user agent
* @return Builder instance
* @default None (only SDK user agent)
*/
model.builder()
.userAgentSuffix("MyApp/1.0.0")
.build();

Resulting User-Agent format:
azsdk-java-azure-ai-openai/{sdk-version} ({os-info}) MyApp/1.0.0

Best for:
Recommended format: AppName/Version
// Example: Version-aware user agent
String appVersion = "1.2.3";
String buildNumber = "456";
model.builder()
.userAgentSuffix(String.format("MyApp/%s (build %s)", appVersion, buildNumber))
.build();

For advanced scenarios, provide a pre-configured OpenAI client instance instead of using builder configuration.
import com.azure.ai.openai.OpenAIClient;
import com.azure.ai.openai.OpenAIClientBuilder;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.http.policy.HttpLogOptions;
import com.azure.core.http.policy.HttpLogDetailLevel;
/**
* Creates model using a custom OpenAI client.
* Useful for sharing a single client across multiple models or advanced configuration.
* @param client Pre-configured OpenAIClient instance
* @return Builder instance
*/
OpenAIClient customClient = new OpenAIClientBuilder()
.endpoint("https://your-resource.openai.azure.com/")
.credential(new AzureKeyCredential("your-api-key"))
.httpLogOptions(new HttpLogOptions().setLogLevel(HttpLogDetailLevel.BODY_AND_HEADERS))
.buildClient();
AzureOpenAiChatModel model = AzureOpenAiChatModel.builder()
.openAIClient(customClient)
.deploymentName("gpt-4")
.serviceVersion("2024-02-15-preview")
.build();

import com.azure.ai.openai.OpenAIAsyncClient;
import com.azure.ai.openai.OpenAIClientBuilder;
/**
* Creates streaming model using a custom async OpenAI client.
* @param client Pre-configured OpenAIAsyncClient instance
* @return Builder instance
*/
OpenAIAsyncClient asyncClient = new OpenAIClientBuilder()
.endpoint("https://your-resource.openai.azure.com/")
.credential(new AzureKeyCredential("your-api-key"))
.buildAsyncClient();
AzureOpenAiStreamingChatModel model = AzureOpenAiStreamingChatModel.builder()
.openAIAsyncClient(asyncClient)
.deploymentName("gpt-4")
.serviceVersion("2024-02-15-preview")
.build();

When to use custom client:
Benefits:
// Example: Shared client for multiple models
OpenAIClient sharedClient = new OpenAIClientBuilder()
.endpoint(endpoint)
.credential(new AzureKeyCredential(apiKey))
.buildClient();
// Create multiple models sharing the same client
AzureOpenAiChatModel chatModel = AzureOpenAiChatModel.builder()
.openAIClient(sharedClient)
.deploymentName("gpt-4")
.build();
AzureOpenAiEmbeddingModel embeddingModel = AzureOpenAiEmbeddingModel.builder()
.openAIClient(sharedClient)
.deploymentName("text-embedding-ada-002")
.build();

Enable detailed logging of HTTP requests and responses for debugging and monitoring.
/**
* Enables detailed HTTP logging of requests and responses.
* WARNING: Logs full request/response body including prompts, completions, and API keys in headers.
* Only enable in secure environments with proper log protection.
* @param logRequestsAndResponses true to enable logging
* @return Builder instance
* @default false (logging disabled)
*/
model.builder()
.logRequestsAndResponses(true)
.build();

Logs include:
Security warning:
Log output format:
--> POST https://resource.openai.azure.com/openai/deployments/gpt-4/chat/completions?api-version=2024-02-15-preview
api-key: ********************************
Content-Type: application/json
{"messages":[{"role":"user","content":"Hello"}],"temperature":0.7}
<-- 200 OK (1234ms)
Content-Type: application/json
{"choices":[{"message":{"role":"assistant","content":"Hi there!"},"finish_reason":"stop"}],"usage":{"prompt_tokens":8,"completion_tokens":4,"total_tokens":12}}

Monitor chat model events for metrics, cost tracking, and observability (chat models only).
import dev.langchain4j.model.chat.listener.ChatModelListener;
import dev.langchain4j.model.chat.listener.ChatModelRequest;
import dev.langchain4j.model.chat.listener.ChatModelResponse;
/**
* Registers chat model event listeners.
* Listeners receive callbacks for request, response, and error events.
* All listeners are called synchronously before request/after response.
* @param listeners List of listener implementations
* @return Builder instance
* @default Empty list (no listeners)
*/
ChatModelListener listener = new ChatModelListener() {
@Override
public void onRequest(ChatModelRequest request) {
// Called before sending request
System.out.println("Request: " + request.messages().size() + " messages");
System.out.println("Model: " + request.model());
System.out.println("Temperature: " + request.parameters().temperature());
}
@Override
public void onResponse(ChatModelResponse response) {
// Called after receiving successful response
System.out.println("Response: " + response.aiMessage().text());
System.out.println("Input tokens: " + response.tokenUsage().inputTokens());
System.out.println("Output tokens: " + response.tokenUsage().outputTokens());
System.out.println("Finish reason: " + response.metadata().finishReason());
}
@Override
public void onError(Throwable error) {
// Called on request failure
System.err.println("Error: " + error.getMessage());
}
};
model.builder()
.listeners(List.of(listener))
.build();

Use cases:
Multiple listeners:
// Metrics listener
ChatModelListener metricsListener = new ChatModelListener() {
@Override
public void onResponse(ChatModelResponse response) {
metrics.recordTokens(response.tokenUsage().totalTokens());
metrics.recordLatency(response.metadata().latency());
}
};
// Cost tracking listener
ChatModelListener costListener = new ChatModelListener() {
@Override
public void onResponse(ChatModelResponse response) {
double cost = calculateCost(response.tokenUsage());
costTracker.recordCost(userId, cost);
}
};
// Audit logging listener
ChatModelListener auditListener = new ChatModelListener() {
@Override
public void onRequest(ChatModelRequest request) {
auditLog.logRequest(userId, request);
}
@Override
public void onResponse(ChatModelResponse response) {
auditLog.logResponse(userId, response);
}
};
// Register all listeners
model.builder()
.listeners(List.of(metricsListener, costListener, auditListener))
.build();

Listener behavior:
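The calculateCost helper referenced in the cost tracking listener above is not provided by the library; a minimal sketch using placeholder per-1K-token prices (substitute your actual Azure OpenAI rates):

```java
// Hypothetical cost calculator for a cost tracking listener.
// The per-1K-token prices below are placeholders, not real Azure rates.
final class CostCalculator {
    static final double INPUT_PRICE_PER_1K = 0.03;   // placeholder rate
    static final double OUTPUT_PRICE_PER_1K = 0.06;  // placeholder rate

    static double calculateCost(int inputTokens, int outputTokens) {
        return inputTokens / 1000.0 * INPUT_PRICE_PER_1K
                + outputTokens / 1000.0 * OUTPUT_PRICE_PER_1K;
    }
}
```

In a listener, pass response.tokenUsage().inputTokens() and response.tokenUsage().outputTokens() to this helper.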
Secure, reliable configuration for production environments.
// Use managed identity (zero-secret authentication)
TokenCredential credential = new DefaultAzureCredentialBuilder()
.build();
// Load configuration from environment
String endpoint = System.getenv("AZURE_OPENAI_ENDPOINT");
String deployment = System.getenv("AZURE_OPENAI_DEPLOYMENT");
// Production-grade chat model
AzureOpenAiChatModel model = AzureOpenAiChatModel.builder()
// Authentication - use managed identity
.endpoint(endpoint)
.tokenCredential(credential)
.deploymentName(deployment)
.serviceVersion("2024-02-15-preview")
// Reliability - aggressive retries and reasonable timeout
.timeout(Duration.ofSeconds(60))
.maxRetries(3)
// Observability - use listeners, not full logging
.listeners(List.of(metricsListener, auditListener))
.userAgentSuffix("MyApp/1.0.0")
// Quality - set appropriate parameters
.temperature(0.7)
.maxTokens(2000)
.build();

Development-friendly configuration with enhanced debugging.
AzureOpenAiChatModel model = AzureOpenAiChatModel.builder()
// Authentication - use API key for simplicity
.endpoint("https://my-resource.openai.azure.com/")
.apiKey(System.getenv("AZURE_OPENAI_API_KEY"))
.deploymentName("gpt-4")
.serviceVersion("2024-02-15-preview")
// Debug - enable full logging, longer timeout
.logRequestsAndResponses(true)
.timeout(Duration.ofSeconds(120)) // Longer for debugging
// Conservative retries for faster failure
.maxRetries(1)
.build();

Enterprise deployment with corporate proxy and service principal.
// Corporate proxy with authentication
ProxyOptions proxy = new ProxyOptions(
ProxyOptions.Type.HTTP,
new InetSocketAddress("proxy.corp.example.com", 8080)
).setCredentials(
System.getenv("PROXY_USER"),
System.getenv("PROXY_PASSWORD")
);
// Service principal authentication
TokenCredential credential = new ClientSecretCredentialBuilder()
.tenantId(System.getenv("AZURE_TENANT_ID"))
.clientId(System.getenv("AZURE_CLIENT_ID"))
.clientSecret(System.getenv("AZURE_CLIENT_SECRET"))
.build();
AzureOpenAiChatModel model = AzureOpenAiChatModel.builder()
// Authentication
.endpoint(System.getenv("AZURE_OPENAI_ENDPOINT"))
.tokenCredential(credential)
.deploymentName(System.getenv("AZURE_OPENAI_DEPLOYMENT"))
.serviceVersion("2024-02-15-preview")
// Network - proxy and custom headers
.proxyOptions(proxy)
.customHeaders(Map.of(
"X-Corp-ID", System.getenv("CORP_ID"),
"X-Cost-Center", System.getenv("COST_CENTER")
))
// Reliability
.timeout(Duration.ofSeconds(90))
.maxRetries(5)
.build();

Share configuration across multiple model types for consistency.
// Shared configuration
String endpoint = "https://my-resource.openai.azure.com/";
String apiKey = System.getenv("AZURE_OPENAI_API_KEY");
String serviceVersion = "2024-02-15-preview";
Duration timeout = Duration.ofSeconds(60);
int maxRetries = 3;
// Chat model
AzureOpenAiChatModel chatModel = AzureOpenAiChatModel.builder()
.endpoint(endpoint)
.apiKey(apiKey)
.serviceVersion(serviceVersion)
.deploymentName("gpt-4")
.timeout(timeout)
.maxRetries(maxRetries)
.build();
// Embedding model
AzureOpenAiEmbeddingModel embeddingModel = AzureOpenAiEmbeddingModel.builder()
.endpoint(endpoint)
.apiKey(apiKey)
.serviceVersion(serviceVersion)
.deploymentName("text-embedding-ada-002")
.timeout(timeout)
.maxRetries(maxRetries)
.build();
// Image model (longer timeout)
AzureOpenAiImageModel imageModel = AzureOpenAiImageModel.builder()
.endpoint(endpoint)
.apiKey(apiKey)
.serviceVersion(serviceVersion)
.deploymentName("dall-e-3")
.timeout(Duration.ofSeconds(120)) // Longer for images
.maxRetries(maxRetries)
.build();

Standard environment variable names for configuration.
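A small fail-fast helper keeps missing variables from surfacing as confusing downstream errors. This sketch makes the lookup function injectable so it can be tested without real environment variables; the class and method names are illustrative:

```java
import java.util.function.Function;

// Fail fast when a required configuration value is missing.
// In production, pass System::getenv as the lookup.
final class EnvConfig {
    static String require(Function<String, String> lookup, String name) {
        String value = lookup.apply(name);
        if (value == null || value.isEmpty()) {
            throw new IllegalStateException(name + " environment variable not set");
        }
        return value;
    }
}
// Usage: String endpoint = EnvConfig.require(System::getenv, "AZURE_OPENAI_ENDPOINT");
```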
# Azure OpenAI configuration
export AZURE_OPENAI_ENDPOINT="https://my-resource.openai.azure.com/"
export AZURE_OPENAI_API_KEY="your-api-key"
export AZURE_OPENAI_DEPLOYMENT_NAME="gpt-4"
# Non-Azure OpenAI configuration
export OPENAI_API_KEY="your-openai-api-key"
# Azure AD / Service Principal configuration
export AZURE_TENANT_ID="your-tenant-id"
export AZURE_CLIENT_ID="your-client-id"
export AZURE_CLIENT_SECRET="your-client-secret"
# Proxy configuration
export HTTPS_PROXY="http://proxy.example.com:8080"
export HTTP_PROXY="http://proxy.example.com:8080"
export NO_PROXY="localhost,127.0.0.1,*.internal"
export PROXY_USER="proxy-username"
export PROXY_PASSWORD="proxy-password"

Usage in code:
model.builder()
.endpoint(System.getenv("AZURE_OPENAI_ENDPOINT"))
.apiKey(System.getenv("AZURE_OPENAI_API_KEY"))
.deploymentName(System.getenv("AZURE_OPENAI_DEPLOYMENT_NAME"))
.serviceVersion("2024-02-15-preview")
.build();

Azure OpenAI service versions ordered by release date (use latest for newest features).
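Because service versions are date-prefixed (YYYY-MM-DD, optionally followed by -preview), they can be compared chronologically; a hedged sketch:

```java
import java.time.LocalDate;

// Compare Azure OpenAI API version strings by their leading date.
// Assumed format: YYYY-MM-DD, optionally followed by "-preview".
final class ApiVersions {
    static LocalDate datePart(String version) {
        return LocalDate.parse(version.substring(0, 10));
    }

    static boolean isNewer(String a, String b) {
        return datePart(a).isAfter(datePart(b));
    }
}
```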
Available versions:
2024-02-15-preview - Latest preview - Newest features, GPT-4 Turbo, function calling v2
2023-12-01-preview - Stable release - GPT-4 Turbo, vision, DALL-E 3
2023-10-01-preview - Previous stable - GPT-4, function calling v1
2023-08-01-preview - Older version - GPT-3.5 Turbo 16K
2023-06-01-preview - Legacy - GPT-3.5 Turbo, function calling preview
2023-05-15 - GA version - GPT-3.5 Turbo, ChatGPT

Version selection guidelines:
Use the latest preview (e.g. 2024-02-15-preview) for newest features
Use a stable release (2023-12-01-preview or newer GA) for production

Feature availability by version:
Check Azure OpenAI API versions documentation for the current list and feature availability.
All models may throw these common exceptions during configuration and operation.
/**
* Thrown during model building if configuration is invalid.
* Common causes:
* - Missing required parameters (endpoint, deployment, auth)
* - Invalid parameter values (negative timeout, invalid URL)
* - Mutually exclusive options (maxRetries + retryOptions)
*/
class IllegalArgumentException extends RuntimeException {
// Examples:
// - "endpoint must not be null or empty"
// - "timeout must be positive"
// - "cannot specify both maxRetries and retryOptions"
}

import dev.langchain4j.exception.ContentFilteredException;
import java.util.concurrent.TimeoutException;
// Content filtered by Azure safety policies
// Not retried automatically
class ContentFilteredException extends RuntimeException {}
// Request timeout exceeded
// Automatically retried if retry policy allows
class TimeoutException extends Exception {}
// Invalid request parameters or state
// Not retried
class IllegalArgumentException extends RuntimeException {}
// Network, API, authentication errors
// Retry behavior depends on HTTP status code
class RuntimeException extends Exception {}

Error handling example:
try {
Response<?> response = model.generate(input);
} catch (ContentFilteredException e) {
// Content violated safety policy - do not retry
logger.warn("Content filtered: {}", e.getMessage());
// Prompt user to modify input or handle gracefully
} catch (TimeoutException e) {
// Request timed out - safe to retry with exponential backoff
logger.error("Request timed out after {}ms", timeout.toMillis());
// Implement retry with backoff or increase timeout
} catch (IllegalArgumentException e) {
// Invalid configuration or parameters - fix code
logger.error("Invalid configuration: {}", e.getMessage());
// Do not retry, fix the issue in code
} catch (RuntimeException e) {
// Network, API, or authentication error
logger.error("Unexpected error", e);
// Check if retryable based on cause and implement retry logic
}

Never hardcode secrets:
// BAD - Hardcoded API key
.apiKey("1234567890abcdef1234567890abcdef")
// GOOD - Load from environment
.apiKey(System.getenv("AZURE_OPENAI_API_KEY"))
// BETTER - Use managed identity (no secrets)
.tokenCredential(new DefaultAzureCredentialBuilder().build())

Recommendations:
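If a key must appear in diagnostics at all, mask it first; a minimal sketch (the helper name is illustrative):

```java
// Mask a secret for safe logging, keeping only the last 4 characters.
final class SecretMask {
    static String mask(String secret) {
        if (secret == null || secret.length() <= 4) {
            return "****";
        }
        return "*".repeat(secret.length() - 4)
                + secret.substring(secret.length() - 4);
    }
}
```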
Set appropriate timeouts:
// Match timeout to expected response time
.timeout(Duration.ofSeconds(60)) // Standard requests
.timeout(Duration.ofSeconds(180)) // Long-form generation
.timeout(Duration.ofSeconds(30)) // Quick responses

Configure retry policies:
// Production: Aggressive retries for high availability
.maxRetries(5)
// Development: Fast failure for rapid iteration
.maxRetries(1)

Recommendations:
Reuse model instances:
// GOOD - Create once, reuse across requests
private static final AzureOpenAiChatModel MODEL =
AzureOpenAiChatModel.builder()
.endpoint(endpoint)
.apiKey(apiKey)
.build();
// Use MODEL for all requests

Don't create per-request:
// BAD - Creates new instance for each request
for (String prompt : prompts) {
AzureOpenAiChatModel model = AzureOpenAiChatModel.builder()
.endpoint(endpoint)
.apiKey(apiKey)
.build();
model.generate(prompt); // Wasteful!
}

Recommendations:
Use listeners over full logging:
// PREFERRED - Structured observability
.listeners(List.of(metricsListener, costListener))
// AVOID IN PRODUCTION - Full logging with sensitive data
.logRequestsAndResponses(true) // Only for development

Recommendations:
Estimate before requesting:
// Estimate tokens before making expensive request
AzureOpenAiTokenCountEstimator estimator =
new AzureOpenAiTokenCountEstimator(AzureOpenAiChatModelName.GPT_4);
int estimatedTokens = estimator.estimateTokenCountInMessages(messages);
if (estimatedTokens > budget) {
// Trim messages or reject request
}

Recommendations:
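When the langchain4j estimator is not on the classpath, a very rough pre-check (~4 characters per token for English text) can gate obviously oversized inputs; this heuristic is an approximation, not the real tokenizer:

```java
// Very rough token estimate: ~4 characters per token for English text.
// Use the real AzureOpenAiTokenCountEstimator for accurate budgeting.
final class RoughTokenEstimate {
    static int estimate(String text) {
        return (int) Math.ceil(text.length() / 4.0);
    }

    static boolean withinBudget(String text, int budget) {
        return estimate(text) <= budget;
    }
}
```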
Install with Tessl CLI
npx tessl i tessl/maven-dev-langchain4j--langchain4j-azure-open-ai