
tessl/maven-dev-langchain4j--langchain4j-anthropic

This package provides an integration layer between the LangChain4j framework and Anthropic's Claude language models, enabling Java developers to seamlessly incorporate Anthropic's AI capabilities into their applications.

Token Count Estimator

The AnthropicTokenCountEstimator class estimates token counts for text and messages using Anthropic's token counting API. This is useful for managing costs and staying within model token limits.

Note: This is an experimental feature (since 1.4.0).

Capabilities

Creating a Token Count Estimator

Build an AnthropicTokenCountEstimator using the builder pattern.

package dev.langchain4j.model.anthropic;

import dev.langchain4j.model.TokenCountEstimator;
import dev.langchain4j.data.message.ChatMessage;

/**
 * Token counter for Anthropic models using Anthropic's token counting API.
 * Makes HTTP request to count tokens accurately per model.
 * Thread-safe after construction.
 * EXPERIMENTAL: API may change in future versions.
 *
 * @since 1.4.0
 */
public class AnthropicTokenCountEstimator implements TokenCountEstimator {
    /**
     * Creates new builder for token count estimator.
     *
     * @return new builder instance, never null
     */
    public static Builder builder();

    /**
     * Estimates token count for plain text string.
     * Makes synchronous API call to Anthropic's counting endpoint.
     *
     * @param text text to count, must not be null
     * @return estimated token count, >= 0
     * @throws IllegalArgumentException if text is null
     * @throws RuntimeException if API call fails
     */
    public int estimateTokenCountInText(String text);

    /**
     * Estimates token count for single chat message.
     * Includes message metadata (role, etc.) in count.
     *
     * @param message chat message to count, must not be null
     * @return estimated token count, >= 0
     * @throws IllegalArgumentException if message is null
     * @throws RuntimeException if API call fails, or no user messages and addDummyUserMessage not configured
     */
    public int estimateTokenCountInMessage(ChatMessage message);

    /**
     * Estimates token count for message sequence.
     * Includes conversation structure overhead in count.
     *
     * @param messages messages to count, must not be null
     * @return estimated token count, >= 0
     * @throws IllegalArgumentException if messages is null
     * @throws RuntimeException if API call fails, or no user messages and addDummyUserMessage not configured
     */
    public int estimateTokenCountInMessages(Iterable<ChatMessage> messages);
}

Builder Configuration

Configure the token count estimator.

package dev.langchain4j.model.anthropic;

import dev.langchain4j.http.client.HttpClientBuilder;
import org.slf4j.Logger;
import java.time.Duration;

/**
 * Builder for AnthropicTokenCountEstimator.
 */
public static class Builder {
    /**
     * Sets API key for authentication.
     *
     * @param apiKey the API key, must not be null or empty
     * @return this builder, never null
     * @throws IllegalArgumentException if apiKey is null or empty
     * @default no default - REQUIRED parameter
     */
    public Builder apiKey(String apiKey);

    /**
     * Sets base URL for Anthropic API.
     *
     * @param baseUrl the base URL, must not be null
     * @return this builder, never null
     * @default "https://api.anthropic.com/v1/"
     */
    public Builder baseUrl(String baseUrl);

    /**
     * Sets API version header.
     *
     * @param version the API version, must not be null
     * @return this builder, never null
     * @default "2023-06-01"
     */
    public Builder version(String version);

    /**
     * Sets beta feature flags.
     *
     * @param beta beta identifiers, may be null
     * @return this builder, never null
     * @default null
     */
    public Builder beta(String beta);

    /**
     * Sets custom HTTP client builder.
     *
     * @param httpClientBuilder HTTP client builder, must not be null
     * @return this builder, never null
     * @default platform default
     */
    public Builder httpClientBuilder(HttpClientBuilder httpClientBuilder);

    /**
     * Sets request timeout.
     *
     * @param timeout timeout duration, must not be null, must be positive
     * @return this builder, never null
     * @throws IllegalArgumentException if timeout invalid
     * @default 30 seconds
     */
    public Builder timeout(Duration timeout);

    /**
     * Enables request logging.
     *
     * @param logRequests whether to log requests, may be null
     * @return this builder, never null
     * @default false
     */
    public Builder logRequests(Boolean logRequests);

    /**
     * Enables response logging.
     *
     * @param logResponses whether to log responses, may be null
     * @return this builder, never null
     * @default false
     */
    public Builder logResponses(Boolean logResponses);

    /**
     * Sets model name for token counting.
     * Token counts are model-specific; always use target model.
     *
     * @param modelName model identifier string, must not be null
     * @return this builder, never null
     * @throws IllegalArgumentException if modelName is null
     * @default "claude-sonnet-4-5-20250929"
     */
    public Builder modelName(String modelName);

    /**
     * Sets model name using enum constant.
     *
     * @param modelName model name enum, must not be null
     * @return this builder, never null
     * @throws IllegalArgumentException if modelName is null
     * @default CLAUDE_SONNET_4_5_20250929
     */
    public Builder modelName(AnthropicChatModelName modelName);

    /**
     * Automatically adds dummy user message if messages contain only system messages.
     * Anthropic API requires at least one user message for token counting.
     * Uses default dummy message "ping".
     *
     * @return this builder, never null
     * @default disabled
     */
    public Builder addDummyUserMessageIfNoUserMessages();

    /**
     * Automatically adds custom dummy user message if needed.
     *
     * @param dummyUserMessage custom dummy message text, must not be null
     * @return this builder, never null
     * @throws IllegalArgumentException if dummyUserMessage is null
     * @default disabled
     */
    public Builder addDummyUserMessageIfNoUserMessages(String dummyUserMessage);

    /**
     * Builds the token count estimator.
     * Thread-safe after construction.
     *
     * @return configured estimator, never null
     * @throws IllegalStateException if apiKey or modelName missing
     */
    public AnthropicTokenCountEstimator build();
}

Basic Usage

Create an estimator and count tokens in text.

import dev.langchain4j.model.anthropic.AnthropicTokenCountEstimator;
import dev.langchain4j.model.anthropic.AnthropicChatModelName;

AnthropicTokenCountEstimator estimator = AnthropicTokenCountEstimator.builder()
    .apiKey(System.getenv("ANTHROPIC_API_KEY"))
    .modelName(AnthropicChatModelName.CLAUDE_SONNET_4_5_20250929)
    .build();

// Count tokens in a string
String text = "What is the capital of France?";
int tokenCount = estimator.estimateTokenCountInText(text);
System.out.println("Token count: " + tokenCount);

Error Handling:

  • IllegalArgumentException: text is null
  • RuntimeException: API call fails (network, auth, rate limit)
  • RuntimeException: Invalid API key

Performance Notes:

  • Each count operation makes HTTP API call (~100-300ms latency)
  • Not suitable for real-time counting in request path
  • Consider caching results for frequently counted texts
  • Batch count operations when possible
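The caching suggestion above can be sketched as a thin memoizing wrapper. This is an illustrative pattern, not part of the library: the `ToIntFunction` stands in for a real counting call such as `estimator::estimateTokenCountInText`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.ToIntFunction;

/**
 * Illustrative caching wrapper: memoizes token counts so repeated texts
 * trigger only one (expensive) counting call. Thread-safe via ConcurrentHashMap.
 */
public class CachingTokenCounter {
    private final Map<String, Integer> cache = new ConcurrentHashMap<>();
    private final ToIntFunction<String> counter;

    public CachingTokenCounter(ToIntFunction<String> counter) {
        this.counter = counter;
    }

    public int count(String text) {
        // computeIfAbsent invokes the underlying counter only on a cache miss
        return cache.computeIfAbsent(text, counter::applyAsInt);
    }
}
```

With a real estimator, construct it as `new CachingTokenCounter(estimator::estimateTokenCountInText)`. For long-running applications, consider a bounded cache (e.g. Caffeine) instead of an unbounded map.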

Counting Tokens in Messages

Estimate tokens for conversation messages.

import dev.langchain4j.data.message.UserMessage;
import dev.langchain4j.data.message.SystemMessage;
import dev.langchain4j.data.message.AiMessage;
import dev.langchain4j.data.message.ChatMessage;
import java.util.List;

// Count tokens in a single message
UserMessage userMsg = UserMessage.from("Tell me about Paris");
int msgTokens = estimator.estimateTokenCountInMessage(userMsg);

// Count tokens in a conversation
List<ChatMessage> messages = List.of(
    SystemMessage.from("You are a helpful travel guide."),
    UserMessage.from("What should I see in Paris?"),
    AiMessage.from("Paris has many attractions including the Eiffel Tower..."),
    UserMessage.from("Tell me more about the Eiffel Tower")
);

int totalTokens = estimator.estimateTokenCountInMessages(messages);
System.out.println("Total conversation tokens: " + totalTokens);

Error Handling:

  • IllegalArgumentException: message or messages is null
  • RuntimeException: No user messages (if addDummyUserMessage not configured)
  • RuntimeException: API call fails

Common Pitfalls:

❌ DON'T count in hot path

for (String text : texts) {
    int count = estimator.estimateTokenCountInText(text);  // N API calls!
}

✅ DO batch or cache

// Combine texts for single count
String combined = String.join("\n", texts);
int count = estimator.estimateTokenCountInText(combined);

Null Safety:

  • All methods throw IllegalArgumentException if inputs null
  • Return values are never null (primitive int >= 0)
  • Message lists can be empty (returns 0)

Handling System-Only Messages

Anthropic's token counting API requires at least one user message. Use addDummyUserMessageIfNoUserMessages() to automatically add a dummy user message when counting tokens for system messages only.

AnthropicTokenCountEstimator estimator = AnthropicTokenCountEstimator.builder()
    .apiKey(apiKey)
    .modelName(AnthropicChatModelName.CLAUDE_SONNET_4_5_20250929)
    .addDummyUserMessageIfNoUserMessages()  // Auto-add dummy message
    .build();

// This will work now (dummy "ping" message added automatically)
List<ChatMessage> systemOnly = List.of(
    SystemMessage.from("You are a helpful assistant."),
    SystemMessage.from("Be concise and accurate.")
);

int tokens = estimator.estimateTokenCountInMessages(systemOnly);
System.out.println("System message tokens: " + tokens);

Error Handling:

  • Without addDummyUserMessage: RuntimeException if no user messages
  • With addDummyUserMessage: Automatically adds "ping" and succeeds
  • Custom dummy message: Use addDummyUserMessageIfNoUserMessages(String)

Common Pitfalls:

❌ DON'T count system-only without dummy message

// Throws RuntimeException
estimator.estimateTokenCountInMessages(List.of(
    SystemMessage.from("System prompt")
));

✅ DO enable dummy message feature

AnthropicTokenCountEstimator estimator = AnthropicTokenCountEstimator.builder()
    .apiKey(apiKey)
    .modelName(modelName)
    .addDummyUserMessageIfNoUserMessages()
    .build();

Token Count Accuracy:

  • Dummy message tokens included in count (~1 token for "ping")
  • Subtract the dummy message's tokens (~1) if you need the pure system message count
  • Dummy message only added if no user messages present

Custom Dummy Message

Specify a custom dummy user message.

AnthropicTokenCountEstimator estimator = AnthropicTokenCountEstimator.builder()
    .apiKey(apiKey)
    .modelName(AnthropicChatModelName.CLAUDE_SONNET_4_5_20250929)
    .addDummyUserMessageIfNoUserMessages("hello")  // Custom dummy message
    .build();

Error Handling:

  • IllegalArgumentException: dummyUserMessage is null
  • Empty string is valid (0-1 tokens)

Use Cases:

  • Match production dummy message pattern
  • Control token overhead from dummy message
  • Consistency with actual API calls

Configuration Options

Full configuration example.

import java.time.Duration;

AnthropicTokenCountEstimator estimator = AnthropicTokenCountEstimator.builder()
    .apiKey(System.getenv("ANTHROPIC_API_KEY"))  // Required
    .modelName(AnthropicChatModelName.CLAUDE_SONNET_4_5_20250929)  // Required
    .baseUrl("https://api.anthropic.com/v1/")  // Optional, default shown
    .version("2023-06-01")                     // Optional, default shown
    .timeout(Duration.ofSeconds(30))           // Optional
    .logRequests(false)                        // Optional, default false
    .logResponses(false)                       // Optional, default false
    .addDummyUserMessageIfNoUserMessages()     // Optional
    .build();

Error Handling:

  • IllegalStateException: apiKey or modelName missing
  • IllegalArgumentException: Invalid timeout or null required parameter
  • RuntimeException: Network or API errors at count time (not build time)

Thread Safety:

  • Builder NOT thread-safe
  • Built estimator IS thread-safe
  • Safe to reuse estimator across threads

Using for Cost Estimation

Estimate costs before making API calls.

// Assume pricing (example rates - check current Anthropic pricing)
double inputTokenPrice = 0.003;  // $0.003 per 1K tokens
double outputTokenPrice = 0.015;  // $0.015 per 1K tokens
int maxOutputTokens = 2048;

// Count input tokens
int inputTokens = estimator.estimateTokenCountInMessages(messages);

// Estimate cost
double estimatedInputCost = (inputTokens / 1000.0) * inputTokenPrice;
double estimatedOutputCost = (maxOutputTokens / 1000.0) * outputTokenPrice;
double totalEstimatedCost = estimatedInputCost + estimatedOutputCost;

System.out.println("Estimated cost: $" + String.format("%.4f", totalEstimatedCost));

// Proceed with request if cost is acceptable
if (totalEstimatedCost < 0.10) {
    ChatResponse response = model.chat(ChatRequest.builder()
        .messages(messages)
        .build());
}

Error Handling:

  • If the counting API call fails, fall back to cached or heuristic token estimates so cost estimation can continue
  • Log estimation failures for monitoring
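A fallback estimate for error cases can be a simple character-based heuristic. The ~4 characters-per-token ratio is an assumed average for English text, so treat the result as a coarse budgeting figure, never as an accurate count:

```java
/**
 * Rough fallback when the counting API is unavailable.
 * English text averages roughly 4 characters per token, so length/4
 * gives a coarse estimate. Use only for budgeting, never for hard limits.
 */
public final class FallbackTokenEstimate {
    private static final double CHARS_PER_TOKEN = 4.0;  // assumed average, not exact

    public static int estimate(String text) {
        if (text == null || text.isEmpty()) {
            return 0;
        }
        // Round up so the estimate errs on the conservative (higher) side
        return (int) Math.ceil(text.length() / CHARS_PER_TOKEN);
    }
}
```

Non-English text and code typically tokenize less efficiently, so widen the margin (or lower CHARS_PER_TOKEN) for those inputs.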

Common Pitfalls:

❌ DON'T use outdated pricing

double price = 0.001;  // May be stale

✅ DO fetch pricing dynamically or document update date

// Pricing as of 2025-02-27
double price = getCurrentPricing("claude-sonnet-4-5");  // hypothetical helper, not part of this library

Accuracy Notes:

  • Input token count: Accurate estimate from API
  • Output token count: Maximum possible (actual may be less)
  • Total cost: Upper bound (actual typically lower)
  • Cache savings not reflected (requires actual API metadata)

Staying Within Token Limits

Check if messages exceed model limits.

int maxModelTokens = 200000;  // Claude Sonnet 4.5 context window
int maxOutputTokens = 4096;

int inputTokens = estimator.estimateTokenCountInMessages(messages);
int availableTokens = maxModelTokens - maxOutputTokens;

if (inputTokens > availableTokens) {
    System.err.println("Input exceeds token limit!");
    System.err.println("Input tokens: " + inputTokens);
    System.err.println("Available: " + availableTokens);

    // Trim messages or use summarization
    messages = trimMessages(messages, availableTokens);  // user-supplied helper, not part of this library
}

Error Handling:

  • Exceeding limits causes RuntimeException at API call time (not at count time)
  • Implement trimming/summarization before exceeding limits
  • Monitor token usage trends to prevent limit issues

Common Pitfalls:

❌ DON'T ignore token limits

model.chat(request);  // May fail with token limit error

✅ DO check before sending

if (estimator.estimateTokenCountInMessages(messages) < maxTokens) {
    model.chat(request);
} else {
    // Trim or summarize
}

Trimming Strategies:

  • Remove oldest messages (sliding window)
  • Summarize conversation history
  • Split into multiple shorter requests
  • Prioritize recent and important messages
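The sliding-window strategy above can be sketched as follows. The `Msg` record and per-message token counts are illustrative stand-ins (in practice you would count each message with the estimator), and this `trimMessages` is a user-supplied helper, not part of the library:

```java
import java.util.ArrayList;
import java.util.List;

public final class MessageTrimmer {
    /** Minimal stand-in for a chat message paired with its token count. */
    public record Msg(String text, int tokens) {}

    /**
     * Drops the oldest messages until the total fits within the budget.
     * Preserves the order of the surviving (most recent) messages.
     */
    public static List<Msg> trimMessages(List<Msg> messages, int tokenBudget) {
        int total = messages.stream().mapToInt(Msg::tokens).sum();
        int start = 0;
        while (total > tokenBudget && start < messages.size()) {
            total -= messages.get(start).tokens();
            start++;  // slide the window forward, dropping the oldest message
        }
        return new ArrayList<>(messages.subList(start, messages.size()));
    }
}
```

A production version would usually pin the system message (and possibly the most recent user message) so they are never dropped; this sketch trims strictly oldest-first for clarity.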

Model-Specific Token Counting

Different models may have different tokenization. Always use the target model for accurate counts.

// For Opus model
AnthropicTokenCountEstimator opusEstimator = AnthropicTokenCountEstimator.builder()
    .apiKey(apiKey)
    .modelName(AnthropicChatModelName.CLAUDE_OPUS_4_5_20251101)
    .build();

// For Haiku model
AnthropicTokenCountEstimator haikuEstimator = AnthropicTokenCountEstimator.builder()
    .apiKey(apiKey)
    .modelName(AnthropicChatModelName.CLAUDE_HAIKU_4_5_20251001)
    .build();

String text = "Same text for both models";
int opusTokens = opusEstimator.estimateTokenCountInText(text);
int haikuTokens = haikuEstimator.estimateTokenCountInText(text);

// Token counts may differ slightly between models
System.out.println("Opus tokens: " + opusTokens);
System.out.println("Haiku tokens: " + haikuTokens);

Error Handling:

  • Different models may return different counts for same text
  • Always match estimator model to target chat model
  • Mismatched models lead to inaccurate budgeting

Common Pitfalls:

❌ DON'T reuse estimator across models

// Estimator configured for Sonnet
int count = sonnetEstimator.estimateTokenCountInText(text);
// Use count for Opus model - WRONG!
opusModel.chat(request);

✅ DO create estimator per model

int count = opusEstimator.estimateTokenCountInText(text);
opusModel.chat(request);  // Correct

Tokenization Differences:

  • Typically within 1-5% between models
  • Larger differences for non-English text
  • Image/PDF token counts model-dependent
  • Special tokens may vary by model

Types

TokenCountEstimator

package dev.langchain4j.model;

import dev.langchain4j.data.message.ChatMessage;

/**
 * Interface for token counting across providers.
 */
public interface TokenCountEstimator {
    /**
     * Estimates tokens in text.
     *
     * @param text text to count, must not be null
     * @return token count, >= 0
     * @throws IllegalArgumentException if text is null
     * @throws RuntimeException if counting fails
     */
    int estimateTokenCountInText(String text);

    /**
     * Estimates tokens in single message.
     *
     * @param message message to count, must not be null
     * @return token count, >= 0
     * @throws IllegalArgumentException if message is null
     * @throws RuntimeException if counting fails
     */
    int estimateTokenCountInMessage(ChatMessage message);

    /**
     * Estimates tokens in message sequence.
     *
     * @param messages messages to count, must not be null
     * @return token count, >= 0
     * @throws IllegalArgumentException if messages is null
     * @throws RuntimeException if counting fails
     */
    int estimateTokenCountInMessages(Iterable<ChatMessage> messages);
}

ChatMessage

Base interface for all message types (from langchain4j-core).

package dev.langchain4j.data.message;

/**
 * Base interface for chat messages.
 */
public interface ChatMessage {
    /**
     * Returns message type (SYSTEM, USER, AI, TOOL).
     *
     * @return message type, never null
     */
    ChatMessageType type();

    /**
     * Returns message text content.
     *
     * @return text content, may be null for some message types
     */
    String text();
}

Notes

  • Token counting requires an API call to Anthropic's service
  • The estimator returns input token counts only (no output estimation)
  • System messages are counted separately from user/assistant messages
  • Token counts are estimates and may vary slightly from actual usage
  • Always specify the same model name as your chat model for accurate counts
  • The default dummy message is "ping" when using addDummyUserMessageIfNoUserMessages()
  • Image and PDF content tokens are also counted by the API
  • This is an experimental API and may change in future versions
  • Counts include message structure overhead (role markers, etc.)
  • API rate limits apply to counting requests
  • Default timeout is 30 seconds (lower than chat model default of 60s)

Resource Lifecycle:

  • Create once and reuse (thread-safe)
  • No explicit cleanup required
  • HTTP connections managed automatically

Thread Safety:

  • Estimator instance is thread-safe
  • Safe for concurrent counting from multiple threads
  • Each count makes independent HTTP request

Performance Characteristics:

  • Latency: 100-300ms per count operation
  • Not suitable for per-request counting in hot path
  • Consider caching for frequently counted texts
  • Batch counting when possible (combine texts)

Experimental Status:

  • API stable but marked experimental
  • May add features in future (e.g., output token estimation)
  • Breaking changes possible but unlikely
  • Monitor library release notes for changes

Install with Tessl CLI

npx tessl i tessl/maven-dev-langchain4j--langchain4j-anthropic@1.11.0
