
tessl/maven-dev-langchain4j--langchain4j-anthropic

This package provides an integration layer between the LangChain4j framework and Anthropic's Claude language models, enabling Java developers to seamlessly incorporate Anthropic's AI capabilities into their applications.

Token Count Estimator

The AnthropicTokenCountEstimator class estimates token counts for text and messages using Anthropic's token counting API. This is useful for managing costs and staying within model token limits.

Note: This is an experimental feature (since 1.4.0).

Capabilities

Creating a Token Count Estimator

Build an AnthropicTokenCountEstimator using the builder pattern.

package dev.langchain4j.model.anthropic;

import dev.langchain4j.model.TokenCountEstimator;
import dev.langchain4j.data.message.ChatMessage;

/**
 * Token counter for Anthropic models using Anthropic's token counting API.
 * Makes HTTP request to count tokens accurately per model.
 * Thread-safe after construction.
 * EXPERIMENTAL: API may change in future versions.
 *
 * @since 1.4.0
 */
public class AnthropicTokenCountEstimator implements TokenCountEstimator {
    /**
     * Creates new builder for token count estimator.
     *
     * @return new builder instance, never null
     */
    public static Builder builder();

    /**
     * Estimates token count for plain text string.
     * Makes synchronous API call to Anthropic's counting endpoint.
     *
     * @param text text to count, must not be null
     * @return estimated token count, >= 0
     * @throws IllegalArgumentException if text is null
     * @throws RuntimeException if API call fails
     */
    public int estimateTokenCountInText(String text);

    /**
     * Estimates token count for single chat message.
     * Includes message metadata (role, etc.) in count.
     *
     * @param message chat message to count, must not be null
     * @return estimated token count, >= 0
     * @throws IllegalArgumentException if message is null
     * @throws RuntimeException if API call fails, or no user messages and addDummyUserMessage not configured
     */
    public int estimateTokenCountInMessage(ChatMessage message);

    /**
     * Estimates token count for message sequence.
     * Includes conversation structure overhead in count.
     *
     * @param messages messages to count, must not be null
     * @return estimated token count, >= 0
     * @throws IllegalArgumentException if messages is null
     * @throws RuntimeException if API call fails, or no user messages and addDummyUserMessage not configured
     */
    public int estimateTokenCountInMessages(Iterable<ChatMessage> messages);
}

Builder Configuration

Configure the token count estimator.

package dev.langchain4j.model.anthropic;

import dev.langchain4j.http.client.HttpClientBuilder;
import org.slf4j.Logger;
import java.time.Duration;

/**
 * Builder for AnthropicTokenCountEstimator.
 */
public static class Builder {
    /**
     * Sets API key for authentication.
     *
     * @param apiKey the API key, must not be null or empty
     * @return this builder, never null
     * @throws IllegalArgumentException if apiKey is null or empty
     * @default no default - REQUIRED parameter
     */
    public Builder apiKey(String apiKey);

    /**
     * Sets base URL for Anthropic API.
     *
     * @param baseUrl the base URL, must not be null
     * @return this builder, never null
     * @default "https://api.anthropic.com/v1/"
     */
    public Builder baseUrl(String baseUrl);

    /**
     * Sets API version header.
     *
     * @param version the API version, must not be null
     * @return this builder, never null
     * @default "2023-06-01"
     */
    public Builder version(String version);

    /**
     * Sets beta feature flags.
     *
     * @param beta beta identifiers, may be null
     * @return this builder, never null
     * @default null
     */
    public Builder beta(String beta);

    /**
     * Sets custom HTTP client builder.
     *
     * @param httpClientBuilder HTTP client builder, must not be null
     * @return this builder, never null
     * @default platform default
     */
    public Builder httpClientBuilder(HttpClientBuilder httpClientBuilder);

    /**
     * Sets request timeout.
     *
     * @param timeout timeout duration, must not be null, must be positive
     * @return this builder, never null
     * @throws IllegalArgumentException if timeout invalid
     * @default 30 seconds
     */
    public Builder timeout(Duration timeout);

    /**
     * Enables request logging.
     *
     * @param logRequests whether to log requests, may be null
     * @return this builder, never null
     * @default false
     */
    public Builder logRequests(Boolean logRequests);

    /**
     * Enables response logging.
     *
     * @param logResponses whether to log responses, may be null
     * @return this builder, never null
     * @default false
     */
    public Builder logResponses(Boolean logResponses);

    /**
     * Sets model name for token counting.
     * Token counts are model-specific; always use target model.
     *
     * @param modelName model identifier string, must not be null
     * @return this builder, never null
     * @throws IllegalArgumentException if modelName is null
     * @default "claude-sonnet-4-5-20250929"
     */
    public Builder modelName(String modelName);

    /**
     * Sets model name using enum constant.
     *
     * @param modelName model name enum, must not be null
     * @return this builder, never null
     * @throws IllegalArgumentException if modelName is null
     * @default CLAUDE_SONNET_4_5_20250929
     */
    public Builder modelName(AnthropicChatModelName modelName);

    /**
     * Automatically adds dummy user message if messages contain only system messages.
     * Anthropic API requires at least one user message for token counting.
     * Uses default dummy message "ping".
     *
     * @return this builder, never null
     * @default disabled
     */
    public Builder addDummyUserMessageIfNoUserMessages();

    /**
     * Automatically adds custom dummy user message if needed.
     *
     * @param dummyUserMessage custom dummy message text, must not be null
     * @return this builder, never null
     * @throws IllegalArgumentException if dummyUserMessage is null
     * @default disabled
     */
    public Builder addDummyUserMessageIfNoUserMessages(String dummyUserMessage);

    /**
     * Builds the token count estimator.
     * Thread-safe after construction.
     *
     * @return configured estimator, never null
     * @throws IllegalStateException if apiKey or modelName missing
     */
    public AnthropicTokenCountEstimator build();
}

Basic Usage

Create an estimator and count tokens in text.

import dev.langchain4j.model.anthropic.AnthropicTokenCountEstimator;
import dev.langchain4j.model.anthropic.AnthropicChatModelName;

AnthropicTokenCountEstimator estimator = AnthropicTokenCountEstimator.builder()
    .apiKey(System.getenv("ANTHROPIC_API_KEY"))
    .modelName(AnthropicChatModelName.CLAUDE_SONNET_4_5_20250929)
    .build();

// Count tokens in a string
String text = "What is the capital of France?";
int tokenCount = estimator.estimateTokenCountInText(text);
System.out.println("Token count: " + tokenCount);

Error Handling:

  • IllegalArgumentException: text is null
  • RuntimeException: API call fails (network, auth, rate limit)
  • RuntimeException: Invalid API key

Performance Notes:

  • Each count operation makes HTTP API call (~100-300ms latency)
  • Not suitable for real-time counting in request path
  • Consider caching results for frequently counted texts
  • Batch count operations when possible
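The caching suggestion above can be sketched as a thin memoizing wrapper. This is an illustrative pattern, not part of the library: the `ToIntFunction` stands in for a real counting call such as `estimator::estimateTokenCountInText`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.ToIntFunction;

/**
 * Illustrative caching wrapper: memoizes token counts so repeated texts
 * trigger only one (expensive) counting call. Thread-safe via ConcurrentHashMap.
 */
public class CachingTokenCounter {
    private final Map<String, Integer> cache = new ConcurrentHashMap<>();
    private final ToIntFunction<String> counter;

    public CachingTokenCounter(ToIntFunction<String> counter) {
        this.counter = counter;
    }

    public int count(String text) {
        // computeIfAbsent invokes the underlying counter only on a cache miss
        return cache.computeIfAbsent(text, counter::applyAsInt);
    }
}
```

With a real estimator, construct it as `new CachingTokenCounter(estimator::estimateTokenCountInText)`. For long-running applications, consider a bounded cache (e.g. Caffeine) instead of an unbounded map.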

Counting Tokens in Messages

Estimate tokens for conversation messages.

import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.data.message.SystemMessage;
import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.data.message.ChatMessage;
import java.util.List;

// Count tokens in a single message
UserMessage userMsg = UserMessage.from("Tell me about Paris");
int msgTokens = estimator.estimateTokenCountInMessage(userMsg);

// Count tokens in a conversation
List<ChatMessage> messages = List.of(
    SystemMessage.from("You are a helpful travel guide."),
    UserMessage.from("What should I see in Paris?"),
    AiMessage.from("Paris has many attractions including the Eiffel Tower..."),
    UserMessage.from("Tell me more about the Eiffel Tower")
);

int totalTokens = estimator.estimateTokenCountInMessages(messages);
System.out.println("Total conversation tokens: " + totalTokens);

Error Handling:

  • IllegalArgumentException: message or messages is null
  • RuntimeException: No user messages (if addDummyUserMessage not configured)
  • RuntimeException: API call fails

Common Pitfalls:

❌ DON'T count in hot path

for (String text : texts) {
    int count = estimator.estimateTokenCountInText(text);  // N API calls!
}

✅ DO batch or cache

// Combine texts for single count
String combined = String.join("\n", texts);
int count = estimator.estimateTokenCountInText(combined);

Null Safety:

  • All methods throw IllegalArgumentException if inputs null
  • Return values are never null (primitive int >= 0)
  • Message lists can be empty (returns 0)

Handling System-Only Messages

Anthropic's token counting API requires at least one user message. Use addDummyUserMessageIfNoUserMessages() to automatically add a dummy user message when counting tokens for system messages only.

AnthropicTokenCountEstimator estimator = AnthropicTokenCountEstimator.builder()
    .apiKey(apiKey)
    .modelName(AnthropicChatModelName.CLAUDE_SONNET_4_5_20250929)
    .addDummyUserMessageIfNoUserMessages()  // Auto-add dummy message
    .build();

// This will work now (dummy "ping" message added automatically)
List<ChatMessage> systemOnly = List.of(
    SystemMessage.from("You are a helpful assistant."),
    SystemMessage.from("Be concise and accurate.")
);

int tokens = estimator.estimateTokenCountInMessages(systemOnly);
System.out.println("System message tokens: " + tokens);

Error Handling:

  • Without addDummyUserMessage: RuntimeException if no user messages
  • With addDummyUserMessage: Automatically adds "ping" and succeeds
  • Custom dummy message: Use addDummyUserMessageIfNoUserMessages(String)

Common Pitfalls:

❌ DON'T count system-only without dummy message

// Throws RuntimeException
estimator.estimateTokenCountInMessages(List.of(
    SystemMessage.from("System prompt")
));

✅ DO enable dummy message feature

AnthropicTokenCountEstimator estimator = AnthropicTokenCountEstimator.builder()
    .apiKey(apiKey)
    .modelName(modelName)
    .addDummyUserMessageIfNoUserMessages()
    .build();

Token Count Accuracy:

  • Dummy message tokens included in count (~1 token for "ping")
  • Subtract the dummy message's tokens (~1) if you need the pure system message count
  • Dummy message only added if no user messages present

Custom Dummy Message

Specify a custom dummy user message.

AnthropicTokenCountEstimator estimator = AnthropicTokenCountEstimator.builder()
    .apiKey(apiKey)
    .modelName(AnthropicChatModelName.CLAUDE_SONNET_4_5_20250929)
    .addDummyUserMessageIfNoUserMessages("hello")  // Custom dummy message
    .build();

Error Handling:

  • IllegalArgumentException: dummyUserMessage is null
  • Empty string is valid (0-1 tokens)

Use Cases:

  • Match production dummy message pattern
  • Control token overhead from dummy message
  • Consistency with actual API calls

Configuration Options

Full configuration example.

import java.time.Duration;

AnthropicTokenCountEstimator estimator = AnthropicTokenCountEstimator.builder()
    .apiKey(System.getenv("ANTHROPIC_API_KEY"))  // Required
    .modelName(AnthropicChatModelName.CLAUDE_SONNET_4_5_20250929)  // Required
    .baseUrl("https://api.anthropic.com/v1/")  // Optional, default shown
    .version("2023-06-01")                     // Optional, default shown
    .timeout(Duration.ofSeconds(30))           // Optional
    .logRequests(false)                        // Optional, default false
    .logResponses(false)                       // Optional, default false
    .addDummyUserMessageIfNoUserMessages()     // Optional
    .build();

Error Handling:

  • IllegalStateException: apiKey or modelName missing
  • IllegalArgumentException: Invalid timeout or null required parameter
  • RuntimeException: Network or API errors at count time (not build time)

Thread Safety:

  • Builder NOT thread-safe
  • Built estimator IS thread-safe
  • Safe to reuse estimator across threads

Using for Cost Estimation

Estimate costs before making API calls.

// Assume pricing (example rates - check current Anthropic pricing)
double inputTokenPrice = 0.003;  // $0.003 per 1K tokens
double outputTokenPrice = 0.015;  // $0.015 per 1K tokens
int maxOutputTokens = 2048;

// Count input tokens
int inputTokens = estimator.estimateTokenCountInMessages(messages);

// Estimate cost
double estimatedInputCost = (inputTokens / 1000.0) * inputTokenPrice;
double estimatedOutputCost = (maxOutputTokens / 1000.0) * outputTokenPrice;
double totalEstimatedCost = estimatedInputCost + estimatedOutputCost;

System.out.println("Estimated cost: $" + String.format("%.4f", totalEstimatedCost));

// Proceed with request if cost is acceptable
if (totalEstimatedCost < 0.10) {
    ChatResponse response = model.chat(ChatRequest.builder()
        .messages(messages)
        .build());
}

Error Handling:

  • If the counting API call fails, fall back to cached or heuristic token estimates so cost estimation can continue
  • Log estimation failures for monitoring
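A fallback estimate for error cases can be a simple character-based heuristic. The ~4 characters-per-token ratio is an assumed average for English text, so treat the result as a coarse budgeting figure, never as an accurate count:

```java
/**
 * Rough fallback when the counting API is unavailable.
 * English text averages roughly 4 characters per token, so length/4
 * gives a coarse estimate. Use only for budgeting, never for hard limits.
 */
public final class FallbackTokenEstimate {
    private static final double CHARS_PER_TOKEN = 4.0;  // assumed average, not exact

    public static int estimate(String text) {
        if (text == null || text.isEmpty()) {
            return 0;
        }
        // Round up so the estimate errs on the conservative (higher) side
        return (int) Math.ceil(text.length() / CHARS_PER_TOKEN);
    }
}
```

Non-English text and code typically tokenize less efficiently, so widen the margin (or lower CHARS_PER_TOKEN) for those inputs.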

Common Pitfalls:

❌ DON'T use outdated pricing

double price = 0.001;  // May be stale

✅ DO fetch pricing dynamically or document update date

// Pricing as of 2025-02-27
double price = getCurrentPricing("claude-sonnet-4-5");  // hypothetical helper, not part of this library

Accuracy Notes:

  • Input token count: Accurate estimate from API
  • Output token count: Maximum possible (actual may be less)
  • Total cost: Upper bound (actual typically lower)
  • Cache savings not reflected (requires actual API metadata)

Staying Within Token Limits

Check if messages exceed model limits.

int maxModelTokens = 200000;  // Claude Sonnet 4.5 context window
int maxOutputTokens = 4096;

int inputTokens = estimator.estimateTokenCountInMessages(messages);
int availableTokens = maxModelTokens - maxOutputTokens;

if (inputTokens > availableTokens) {
    System.err.println("Input exceeds token limit!");
    System.err.println("Input tokens: " + inputTokens);
    System.err.println("Available: " + availableTokens);

    // Trim messages or use summarization
    messages = trimMessages(messages, availableTokens);  // user-supplied helper, not part of this library
}

Error Handling:

  • Exceeding limits causes RuntimeException at API call time (not at count time)
  • Implement trimming/summarization before exceeding limits
  • Monitor token usage trends to prevent limit issues

Common Pitfalls:

❌ DON'T ignore token limits

model.chat(request);  // May fail with token limit error

✅ DO check before sending

if (estimator.estimateTokenCountInMessages(messages) < maxTokens) {
    model.chat(request);
} else {
    // Trim or summarize
}

Trimming Strategies:

  • Remove oldest messages (sliding window)
  • Summarize conversation history
  • Split into multiple shorter requests
  • Prioritize recent and important messages
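The sliding-window strategy above can be sketched as follows. The `Msg` record and per-message token counts are illustrative stand-ins (in practice you would count each message with the estimator), and this `trimMessages` is a user-supplied helper, not part of the library:

```java
import java.util.ArrayList;
import java.util.List;

public final class MessageTrimmer {
    /** Minimal stand-in for a chat message paired with its token count. */
    public record Msg(String text, int tokens) {}

    /**
     * Drops the oldest messages until the total fits within the budget.
     * Preserves the order of the surviving (most recent) messages.
     */
    public static List<Msg> trimMessages(List<Msg> messages, int tokenBudget) {
        int total = messages.stream().mapToInt(Msg::tokens).sum();
        int start = 0;
        while (total > tokenBudget && start < messages.size()) {
            total -= messages.get(start).tokens();
            start++;  // slide the window forward, dropping the oldest message
        }
        return new ArrayList<>(messages.subList(start, messages.size()));
    }
}
```

A production version would usually pin the system message (and possibly the most recent user message) so they are never dropped; this sketch trims strictly oldest-first for clarity.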

Model-Specific Token Counting

Different models may have different tokenization. Always use the target model for accurate counts.

// For Opus model
AnthropicTokenCountEstimator opusEstimator = AnthropicTokenCountEstimator.builder()
    .apiKey(apiKey)
    .modelName(AnthropicChatModelName.CLAUDE_OPUS_4_5_20251101)
    .build();

// For Haiku model
AnthropicTokenCountEstimator haikuEstimator = AnthropicTokenCountEstimator.builder()
    .apiKey(apiKey)
    .modelName(AnthropicChatModelName.CLAUDE_HAIKU_4_5_20251001)
    .build();

String text = "Same text for both models";
int opusTokens = opusEstimator.estimateTokenCountInText(text);
int haikuTokens = haikuEstimator.estimateTokenCountInText(text);

// Token counts may differ slightly between models
System.out.println("Opus tokens: " + opusTokens);
System.out.println("Haiku tokens: " + haikuTokens);

Error Handling:

  • Different models may return different counts for same text
  • Always match estimator model to target chat model
  • Mismatched models lead to inaccurate budgeting

Common Pitfalls:

❌ DON'T reuse estimator across models

// Estimator configured for Sonnet
int count = sonnetEstimator.estimateTokenCountInText(text);
// Use count for Opus model - WRONG!
opusModel.chat(request);

✅ DO create estimator per model

int count = opusEstimator.estimateTokenCountInText(text);
opusModel.chat(request);  // Correct

Tokenization Differences:

  • Typically within 1-5% between models
  • Larger differences for non-English text
  • Image/PDF token counts model-dependent
  • Special tokens may vary by model

Types

TokenCountEstimator

package dev.langchain4j.model;

import dev.langchain4j.data.message.ChatMessage;

/**
 * Interface for token counting across providers.
 */
public interface TokenCountEstimator {
    /**
     * Estimates tokens in text.
     *
     * @param text text to count, must not be null
     * @return token count, >= 0
     * @throws IllegalArgumentException if text is null
     * @throws RuntimeException if counting fails
     */
    int estimateTokenCountInText(String text);

    /**
     * Estimates tokens in single message.
     *
     * @param message message to count, must not be null
     * @return token count, >= 0
     * @throws IllegalArgumentException if message is null
     * @throws RuntimeException if counting fails
     */
    int estimateTokenCountInMessage(ChatMessage message);

    /**
     * Estimates tokens in message sequence.
     *
     * @param messages messages to count, must not be null
     * @return token count, >= 0
     * @throws IllegalArgumentException if messages is null
     * @throws RuntimeException if counting fails
     */
    int estimateTokenCountInMessages(Iterable<ChatMessage> messages);
}

ChatMessage

Base interface for all message types (from langchain4j-core).

package dev.langchain4j.data.message;

/**
 * Base interface for chat messages.
 */
public interface ChatMessage {
    /**
     * Returns message type (SYSTEM, USER, AI, TOOL).
     *
     * @return message type, never null
     */
    ChatMessageType type();

    /**
     * Returns message text content.
     *
     * @return text content, may be null for some message types
     */
    String text();
}

Notes

  • Token counting requires an API call to Anthropic's service
  • The estimator returns input token counts only (no output estimation)
  • System messages are counted separately from user/assistant messages
  • Token counts are estimates and may vary slightly from actual usage
  • Always specify the same model name as your chat model for accurate counts
  • The default dummy message is "ping" when using addDummyUserMessageIfNoUserMessages()
  • Image and PDF content tokens are also counted by the API
  • This is an experimental API and may change in future versions
  • Counts include message structure overhead (role markers, etc.)
  • API rate limits apply to counting requests
  • Default timeout is 30 seconds (lower than chat model default of 60s)

Resource Lifecycle:

  • Create once and reuse (thread-safe)
  • No explicit cleanup required
  • HTTP connections managed automatically

Thread Safety:

  • Estimator instance is thread-safe
  • Safe for concurrent counting from multiple threads
  • Each count makes independent HTTP request

Performance Characteristics:

  • Latency: 100-300ms per count operation
  • Not suitable for per-request counting in hot path
  • Consider caching for frequently counted texts
  • Batch counting when possible (combine texts)

Experimental Status:

  • API stable but marked experimental
  • May add features in future (e.g., output token estimation)
  • Breaking changes possible but unlikely
  • Monitor library release notes for changes

Install with Tessl CLI

npx tessl i tessl/maven-dev-langchain4j--langchain4j-anthropic@1.11.0
