AWS Bedrock integration for LangChain4j, enabling Java applications to interact with multiple LLM providers through a unified interface.
Automatic cache point placement for entirely static content.
BedrockCachePointPlacement inserts a cache point at a predefined location in the request. Use it when everything before that point is completely static.
```java
public enum BedrockCachePointPlacement {
    AFTER_SYSTEM,       // Cache after system messages
    AFTER_USER_MESSAGE, // Cache after first user message
    AFTER_TOOLS         // Cache after tool definitions
}
```

Important: this only works with the standard SystemMessage. For BedrockSystemMessage, use granular caching instead.
Cache the system message content.
```java
BedrockChatRequestParameters params = BedrockChatRequestParameters.builder()
        .promptCaching(BedrockCachePointPlacement.AFTER_SYSTEM)
        .build();

BedrockChatModel model = BedrockChatModel.builder()
        .modelId("anthropic.claude-3-5-sonnet-20241022-v2:0")
        .defaultRequestParameters(params)
        .build();

String systemPrompt = loadLargeStaticPrompt(); // >1024 tokens

ChatResponse response = model.chat(ChatRequest.builder()
        .messages(
                SystemMessage.from(systemPrompt),
                UserMessage.from("Question")
        )
        .build());
```

Use when: you have a large, static system prompt that doesn't change between requests.
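`loadLargeStaticPrompt()` is referenced but not defined above. A minimal sketch of such a helper, assuming the prompt lives in a file and using a rough 4-characters-per-token estimate (the file-loading approach, the heuristic, and all names here are illustrative assumptions, not part of the LangChain4j API):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class PromptLoader {

    // Bedrock prompt caching requires a minimum prompt size (on the order of
    // 1024 tokens for Claude 3.5 Sonnet); smaller prompts are not cached.
    static final int MIN_CACHEABLE_TOKENS = 1024;

    // Hypothetical helper: load a static system prompt from disk and
    // fail fast if it is too small to benefit from caching.
    static String loadLargeStaticPrompt(Path promptFile) throws IOException {
        String prompt = Files.readString(promptFile);
        if (estimateTokens(prompt) < MIN_CACHEABLE_TOKENS) {
            throw new IllegalStateException("Prompt too small to benefit from caching");
        }
        return prompt;
    }

    // Rough heuristic: ~4 characters per token for English text.
    static int estimateTokens(String text) {
        return text.length() / 4;
    }
}
```

The size check matters because a cache point placed on a below-minimum prompt is silently ignored, and you pay normal input-token rates with no cache benefit.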
Cache system message plus the first user message.
```java
BedrockChatRequestParameters params = BedrockChatRequestParameters.builder()
        .promptCaching(BedrockCachePointPlacement.AFTER_USER_MESSAGE)
        .build();

ChatResponse response = model.chat(ChatRequest.builder()
        .messages(
                SystemMessage.from("Large system context..."),
                UserMessage.from("Initial user context...") // Also cached
        )
        .build());
```

Use when: you have static system instructions plus an initial context message that's consistent across requests.
Cache tool definitions.
```java
import dev.langchain4j.agent.tool.Tool;
import dev.langchain4j.agent.tool.ToolSpecifications;

class Calculator {

    @Tool("Add two numbers")
    int add(int a, int b) { return a + b; }

    @Tool("Multiply two numbers")
    int multiply(int a, int b) { return a * b; }
}

BedrockChatRequestParameters params = BedrockChatRequestParameters.builder()
        .promptCaching(BedrockCachePointPlacement.AFTER_TOOLS)
        .build();

ChatResponse response = model.chat(ChatRequest.builder()
        .messages(UserMessage.from("What is 5 + 3?"))
        .toolSpecifications(ToolSpecifications.toolSpecificationsFrom(new Calculator())) // Tool defs cached
        .build());
```

Use when: you have static tool definitions that don't change between requests.
Only one placement can be active at a time. To cache multiple sections, use granular caching.
To verify that caching is working, compare token usage across consecutive requests:

```java
import dev.langchain4j.model.bedrock.BedrockTokenUsage;

// First request writes the cache
ChatResponse response1 = model.chat(request);
BedrockTokenUsage usage1 = (BedrockTokenUsage) response1.tokenUsage();
System.out.println("Cache write: " + usage1.cacheWriteInputTokens());

// Second request (within the 5-minute cache lifetime) reads from it
ChatResponse response2 = model.chat(request);
BedrockTokenUsage usage2 = (BedrockTokenUsage) response2.tokenUsage();
System.out.println("Cache read: " + usage2.cacheReadInputTokens());
```

Next: Granular Caching for mixed static/dynamic content.
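The raw counts above can be turned into a simple effectiveness metric. A sketch of such a helper, operating on plain token counts rather than a BedrockTokenUsage (the class, its methods, and the pricing multipliers are illustrative assumptions — check the Bedrock pricing for your model):

```java
class CacheMetrics {

    // Illustrative multipliers (assumption): cache writes are typically billed
    // at a premium over normal input tokens, cache reads at a steep discount.
    static final double WRITE_MULTIPLIER = 1.25;
    static final double READ_MULTIPLIER = 0.10;

    // Fraction of input tokens served from cache on this request.
    static double hitRate(long cacheReadTokens, long totalInputTokens) {
        if (totalInputTokens == 0) return 0.0;
        return (double) cacheReadTokens / totalInputTokens;
    }

    // Billed-token equivalent: uncached input counts at 1x, cache writes
    // at the write premium, cache reads at the discounted rate.
    static double billedTokenEquivalent(long uncached, long cacheWrite, long cacheRead) {
        return uncached + cacheWrite * WRITE_MULTIPLIER + cacheRead * READ_MULTIPLIER;
    }
}
```

For example, a second request with 100 uncached input tokens and 2000 cache-read tokens has a billed equivalent of 100 + 2000 × 0.10 = 300 tokens under these multipliers, versus 2100 with no caching.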