Quarkus LangChain4j OpenAI extension provides seamless integration between Quarkus and OpenAI's Large Language Models, enabling developers to easily incorporate LLMs into their applications with support for chat, streaming, embeddings, moderation, and image generation.


Cost Estimation

Automatic cost estimation for OpenAI API usage based on token consumption. The extension provides pre-built cost estimators for common OpenAI models that automatically calculate costs when token usage information is available.

Overview

Cost estimators are CDI-managed singleton beans that implement the CostEstimator interface from the LangChain4j framework. They are registered with @Priority(Integer.MIN_VALUE) to serve as default estimators when no custom implementation is provided. When a model returns token usage information, the framework automatically finds and uses the appropriate estimator to calculate costs.

Key Features

  • Automatic Cost Calculation - Costs are calculated automatically when token usage is available
  • Model-Specific Pricing - Each estimator knows the pricing for specific OpenAI models
  • Separate Input/Output Rates - Chat models track different rates for input and output tokens
  • CDI Integration - Estimators are injectable singleton beans
  • Extensible - Custom estimators can be added by implementing the CostEstimator interface
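
The priority rule can be pictured in plain Java. The `Estimator` stand-in below is hypothetical (the real interface is `io.quarkiverse.langchain4j.cost.CostEstimator`, and selection happens inside the extension); the sketch only illustrates the rule that, among estimators whose supports check passes, the highest priority value wins, so the built-ins at `Integer.MIN_VALUE` always yield to a custom estimator:

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for the real CostEstimator interface.
interface Estimator {
    int priority();                  // mirrors the @Priority value
    boolean supports(String model);  // mirrors supports(SupportsContext)
}

public class EstimatorSelection {
    // Keep estimators that support the model, prefer the highest priority.
    static Optional<Estimator> select(List<Estimator> candidates, String model) {
        return candidates.stream()
                .filter(e -> e.supports(model))
                .max(Comparator.comparingInt(Estimator::priority));
    }
}
```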

Import Statements

import io.quarkiverse.langchain4j.openai.runtime.cost.*;
import io.quarkiverse.langchain4j.cost.CostEstimator;
import io.quarkiverse.langchain4j.cost.CostEstimator.CostContext;
import io.quarkiverse.langchain4j.cost.CostEstimator.SupportsContext;
import io.quarkiverse.langchain4j.cost.CostEstimator.CostResult;

Capabilities

GPT-4o Cost Estimator

Provides cost estimation for the GPT-4o model.

/**
 * Cost estimator for GPT-4o model.
 *
 * Supported Model: gpt-4o
 * Pricing:
 *   - Input: $5.00 per 1 million tokens
 *   - Output: $15.00 per 1 million tokens
 *
 * This estimator is automatically used when a GPT-4o chat model returns
 * token usage information.
 */
@Singleton
@Priority(Integer.MIN_VALUE)
public class BasicGpt4oCostEstimator implements CostEstimator {

    /**
     * Check if this estimator supports the given context.
     *
     * Parameters:
     *     context - Contains model name and type information
     *
     * Returns:
     *     true if model name is "gpt-4o", false otherwise
     */
    public boolean supports(SupportsContext context);

    /**
     * Estimate the cost for the given usage context.
     *
     * Parameters:
     *     context - Contains token usage (input/output tokens)
     *
     * Returns:
     *     CostResult with calculated cost in USD
     *
     * Calculation:
     *     cost = (inputTokens * 5.0 / 1_000_000) + (outputTokens * 15.0 / 1_000_000)
     */
    public CostResult estimate(CostContext context);
}
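
As a worked example of the formula above (plain arithmetic, independent of the extension), a request consuming 10,000 input and 2,000 output tokens against gpt-4o costs $0.05 + $0.03 = $0.08:

```java
public class Gpt4oCostExample {
    // gpt-4o rates in USD per token ($5.00 and $15.00 per 1M tokens)
    static final double INPUT_RATE = 5.0 / 1_000_000;
    static final double OUTPUT_RATE = 15.0 / 1_000_000;

    static double estimate(int inputTokens, int outputTokens) {
        return inputTokens * INPUT_RATE + outputTokens * OUTPUT_RATE;
    }

    public static void main(String[] args) {
        // 10_000 * $5/1M + 2_000 * $15/1M = $0.05 + $0.03 = $0.08
        System.out.printf("Estimated cost: $%.4f%n", estimate(10_000, 2_000));
    }
}
```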

GPT-4o Mini Cost Estimator

Provides cost estimation for the GPT-4o-mini model.

/**
 * Cost estimator for GPT-4o-mini model.
 *
 * Supported Model: gpt-4o-mini
 * Pricing:
 *   - Input: $0.15 per 1 million tokens
 *   - Output: $0.60 per 1 million tokens
 *
 * GPT-4o-mini is a cost-effective model suitable for many use cases,
 * with input pricing roughly 33x lower than GPT-4o's and output
 * pricing 25x lower.
 */
@Singleton
@Priority(Integer.MIN_VALUE)
public class BasicGpt4oMiniCostEstimator implements CostEstimator {

    /**
     * Check if this estimator supports the given context.
     *
     * Parameters:
     *     context - Contains model name and type information
     *
     * Returns:
     *     true if model name is "gpt-4o-mini", false otherwise
     */
    public boolean supports(SupportsContext context);

    /**
     * Estimate the cost for the given usage context.
     *
     * Parameters:
     *     context - Contains token usage (input/output tokens)
     *
     * Returns:
     *     CostResult with calculated cost in USD
     *
     * Calculation:
     *     cost = (inputTokens * 0.15 / 1_000_000) + (outputTokens * 0.60 / 1_000_000)
     */
    public CostResult estimate(CostContext context);
}
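
To see what that price gap means in practice, here is a hypothetical comparison (plain arithmetic, not extension API): the same workload of 1,000,000 input and 200,000 output tokens costs $8.00 on gpt-4o but $0.27 on gpt-4o-mini:

```java
public class ChatModelCostComparison {
    // rates are USD per 1M tokens
    static double cost(long inputTokens, long outputTokens,
                       double inputRate, double outputRate) {
        return (inputTokens * inputRate + outputTokens * outputRate) / 1_000_000;
    }

    public static void main(String[] args) {
        long in = 1_000_000, out = 200_000;   // hypothetical daily traffic
        double gpt4o = cost(in, out, 5.00, 15.00);  // $5.00 + $3.00 = $8.00
        double mini  = cost(in, out, 0.15, 0.60);   // $0.15 + $0.12 = $0.27
        System.out.printf("gpt-4o: $%.2f, gpt-4o-mini: $%.2f%n", gpt4o, mini);
    }
}
```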

O1 Mini Cost Estimator

Provides cost estimation for the O1-mini reasoning model.

/**
 * Cost estimator for O1-mini model.
 *
 * Supported Model: o1-mini
 * Pricing:
 *   - Input: $3.00 per 1 million tokens
 *   - Output: $12.00 per 1 million tokens
 *
 * The O1-mini model provides reasoning capabilities at a more accessible
 * price point than the full O1 model, suitable for applications requiring
 * structured reasoning and problem-solving.
 */
@Singleton
@Priority(Integer.MIN_VALUE)
public class BasicO1MiniCostEstimator implements CostEstimator {

    /**
     * Check if this estimator supports the given context.
     *
     * Parameters:
     *     context - Contains model name and type information
     *
     * Returns:
     *     true if model name is "o1-mini", false otherwise
     */
    public boolean supports(SupportsContext context);

    /**
     * Estimate the cost for the given usage context.
     *
     * Parameters:
     *     context - Contains token usage (input/output tokens)
     *
     * Returns:
     *     CostResult with calculated cost in USD
     *
     * Calculation:
     *     cost = (inputTokens * 3.0 / 1_000_000) + (outputTokens * 12.0 / 1_000_000)
     */
    public CostResult estimate(CostContext context);
}

O1 Preview Cost Estimator

Provides cost estimation for the O1-preview reasoning model.

/**
 * Cost estimator for O1-preview model.
 *
 * Supported Model: o1-preview
 * Pricing:
 *   - Input: $15.00 per 1 million tokens
 *   - Output: $60.00 per 1 million tokens
 *
 * The O1-preview model offers advanced reasoning capabilities with extended
 * thinking time and complex problem-solving abilities. It is OpenAI's most
 * capable reasoning model, optimized for tasks requiring multi-step reasoning
 * and deep analysis.
 */
@Singleton
@Priority(Integer.MIN_VALUE)
public class BasicO1PreviewCostEstimator implements CostEstimator {

    /**
     * Check if this estimator supports the given context.
     *
     * Parameters:
     *     context - Contains model name and type information
     *
     * Returns:
     *     true if model name is "o1-preview", false otherwise
     */
    public boolean supports(SupportsContext context);

    /**
     * Estimate the cost for the given usage context.
     *
     * Parameters:
     *     context - Contains token usage (input/output tokens)
     *
     * Returns:
     *     CostResult with calculated cost in USD
     *
     * Calculation:
     *     cost = (inputTokens * 15.0 / 1_000_000) + (outputTokens * 60.0 / 1_000_000)
     */
    public CostResult estimate(CostContext context);
}

Text Embedding 3 Small Cost Estimator

Provides cost estimation for the text-embedding-3-small embedding model.

/**
 * Cost estimator for text-embedding-3-small model.
 *
 * Supported Model: text-embedding-3-small
 * Pricing:
 *   - Cost: $0.02 per 1 million tokens
 *
 * This embedding model provides high-quality embeddings at a very low cost,
 * suitable for semantic search, similarity detection, and RAG applications
 * where cost efficiency is important.
 */
@Singleton
@Priority(Integer.MIN_VALUE)
public class BasicE3SmallCostEstimator implements CostEstimator {

    /**
     * Check if this estimator supports the given context.
     *
     * Parameters:
     *     context - Contains model name and type information
     *
     * Returns:
     *     true if model name is "text-embedding-3-small", false otherwise
     */
    public boolean supports(SupportsContext context);

    /**
     * Estimate the cost for the given usage context.
     *
     * Parameters:
     *     context - Contains token usage information
     *
     * Returns:
     *     CostResult with calculated cost in USD
     *
     * Calculation:
     *     cost = totalTokens * 0.02 / 1_000_000
     */
    public CostResult estimate(CostContext context);
}
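
Embedding cost uses a single rate on total tokens. As a worked example with a hypothetical corpus size (plain arithmetic, independent of the extension): embedding 500 documents of roughly 800 tokens each consumes 400,000 tokens and costs $0.008:

```java
public class EmbeddingCostExample {
    // text-embedding-3-small: $0.02 per 1M tokens
    static double estimate(long totalTokens) {
        return totalTokens * 0.02 / 1_000_000;
    }

    public static void main(String[] args) {
        long totalTokens = 500L * 800;  // hypothetical: 500 docs x ~800 tokens
        // 400_000 * $0.02 / 1M = $0.008
        System.out.printf("Estimated cost: $%.4f%n", estimate(totalTokens));
    }
}
```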

Text Embedding 3 Large Cost Estimator

Provides cost estimation for the text-embedding-3-large embedding model.

/**
 * Cost estimator for text-embedding-3-large model.
 *
 * Supported Model: text-embedding-3-large
 * Pricing:
 *   - Cost: $0.13 per 1 million tokens
 *
 * This embedding model provides the highest quality embeddings available
 * from OpenAI, with larger dimensionality (3072 vs 1536) and improved
 * performance on semantic understanding tasks. The higher cost reflects
 * the enhanced capabilities.
 */
@Singleton
@Priority(Integer.MIN_VALUE)
public class BasicE3BigCostEstimator implements CostEstimator {

    /**
     * Check if this estimator supports the given context.
     *
     * Parameters:
     *     context - Contains model name and type information
     *
     * Returns:
     *     true if model name is "text-embedding-3-large", false otherwise
     */
    public boolean supports(SupportsContext context);

    /**
     * Estimate the cost for the given usage context.
     *
     * Parameters:
     *     context - Contains token usage information
     *
     * Returns:
     *     CostResult with calculated cost in USD
     *
     * Calculation:
     *     cost = totalTokens * 0.13 / 1_000_000
     */
    public CostResult estimate(CostContext context);
}
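
The cost difference between the two embedding tiers is a constant 6.5x factor, which the following arithmetic sketch makes concrete for a hypothetical 10-million-token corpus:

```java
public class EmbeddingTierComparison {
    static double cost(long tokens, double ratePerMillionTokens) {
        return tokens * ratePerMillionTokens / 1_000_000;
    }

    public static void main(String[] args) {
        long tokens = 10_000_000;           // hypothetical corpus size
        double small = cost(tokens, 0.02);  // text-embedding-3-small: $0.20
        double large = cost(tokens, 0.13);  // text-embedding-3-large: $1.30
        System.out.printf("small: $%.2f, large: $%.2f%n", small, large);
    }
}
```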

Usage

Automatic Cost Estimation

Cost estimation happens automatically when models return token usage information:

import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.chat.response.ChatResponse;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;

@ApplicationScoped
public class ChatService {
    @Inject
    ChatModel chatModel;  // Uses gpt-4o-mini by default

    public void generateWithCostTracking(String prompt) {
        ChatResponse response = chatModel.chat(UserMessage.from(prompt));

        // Token usage is included in the response
        if (response.tokenUsage() != null) {
            int inputTokens = response.tokenUsage().inputTokenCount();
            int outputTokens = response.tokenUsage().outputTokenCount();

            // Cost is automatically estimated by BasicGpt4oMiniCostEstimator:
            // (inputTokens * 0.15 + outputTokens * 0.60) / 1_000_000

            System.out.println("Input tokens: " + inputTokens);
            System.out.println("Output tokens: " + outputTokens);
            System.out.println("Estimated cost: $" +
                ((inputTokens * 0.15 + outputTokens * 0.60) / 1_000_000));
        }
    }
}

Custom Cost Estimators

You can provide custom estimators for unsupported models or different pricing:

import io.quarkiverse.langchain4j.cost.CostEstimator;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.annotation.Priority;

@ApplicationScoped
@Priority(100)  // Higher priority than built-in estimators
public class CustomGpt4Estimator implements CostEstimator {

    @Override
    public boolean supports(SupportsContext context) {
        return "gpt-4".equals(context.modelName());
    }

    @Override
    public CostResult estimate(CostContext context) {
        // Custom pricing calculation
        double inputCost = context.inputTokens() * 30.0 / 1_000_000;
        double outputCost = context.outputTokens() * 60.0 / 1_000_000;
        return new CostResult(inputCost + outputCost);
    }
}

Monitoring Costs Across Models

Cost estimation works consistently across all model types:

import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.chat.response.ChatResponse;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.model.output.Response;
import jakarta.enterprise.context.ApplicationScoped;
import jakarta.inject.Inject;
import java.util.List;

@ApplicationScoped
public class MultiModelService {
    @Inject
    ChatModel chatModel;  // gpt-4o-mini: $0.15 input / $0.60 output per 1M tokens

    @Inject
    EmbeddingModel embeddingModel;  // text-embedding-3-small: $0.02 per 1M tokens

    public void processWithCostTracking(String text, List<String> documents) {
        // Chat request; cost automatically estimated by BasicGpt4oMiniCostEstimator
        ChatResponse chatResponse = chatModel.chat(UserMessage.from(text));

        // Embedding requests; cost automatically estimated by BasicE3SmallCostEstimator
        List<TextSegment> segments = documents.stream()
                .map(TextSegment::from)
                .toList();
        Response<List<Embedding>> embeddingResponse = embeddingModel.embedAll(segments);
    }
}
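
When both model types are in play, it can help to keep a running total. The ledger below is a plain-Java sketch, not extension API; the rates are the ones listed above, and the token counts fed to it would come from each response's TokenUsage:

```java
import java.util.EnumMap;
import java.util.Map;

public class CostLedger {
    enum Model { GPT_4O_MINI, TEXT_EMBEDDING_3_SMALL }

    // USD per 1M tokens: {input rate, output rate}
    static final Map<Model, double[]> RATES = new EnumMap<>(Map.of(
            Model.GPT_4O_MINI, new double[] { 0.15, 0.60 },
            Model.TEXT_EMBEDDING_3_SMALL, new double[] { 0.02, 0.0 }));

    private double totalUsd;

    // Add one request's token usage to the running total.
    void record(Model model, long inputTokens, long outputTokens) {
        double[] r = RATES.get(model);
        totalUsd += (inputTokens * r[0] + outputTokens * r[1]) / 1_000_000;
    }

    double totalUsd() {
        return totalUsd;
    }
}
```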

Notes

  • Pricing Accuracy: Prices are accurate as of the extension release date (version 1.7.4). OpenAI may change pricing over time. Always verify current pricing at https://openai.com/pricing
  • Token Counting: Costs are calculated based on token counts returned by OpenAI's API. Actual token counts may vary slightly from estimates
  • Custom Models: For custom or fine-tuned models, implement a custom CostEstimator with appropriate pricing
  • OpenAI-Compatible APIs: Cost estimators assume OpenAI pricing. Different providers may have different pricing structures
  • Observability: Cost information can be integrated with monitoring systems for budget tracking and cost optimization
