
tessl/maven-dev-langchain4j--langchain4j-bedrock

AWS Bedrock integration for LangChain4j, enabling Java applications to interact with various LLM providers through a unified interface.


docs/features/prompt-caching/simple-caching.md

Simple Caching with BedrockCachePointPlacement

Automatic cache point placement for entirely static content.

Overview

BedrockCachePointPlacement automatically inserts a cache point at one of three predefined locations. Use it when all of the content to be cached is completely static.

public enum BedrockCachePointPlacement {
    AFTER_SYSTEM,        // Cache after system messages
    AFTER_USER_MESSAGE,  // Cache after first user message
    AFTER_TOOLS         // Cache after tool definitions
}

Important: Only works with standard SystemMessage. For BedrockSystemMessage, use granular caching instead.

AFTER_SYSTEM

Cache the system message content.

BedrockChatRequestParameters params = BedrockChatRequestParameters.builder()
    .promptCaching(BedrockCachePointPlacement.AFTER_SYSTEM)
    .build();

BedrockChatModel model = BedrockChatModel.builder()
    .modelId("anthropic.claude-3-5-sonnet-20241022-v2:0")
    .defaultRequestParameters(params)
    .build();

String systemPrompt = loadLargeStaticPrompt(); // >1024 tokens

ChatResponse response = model.chat(ChatRequest.builder()
    .messages(
        SystemMessage.from(systemPrompt),
        UserMessage.from("Question")
    )
    .build());

Use when: You have a large, static system prompt that doesn't change between requests.

AFTER_USER_MESSAGE

Cache the system message plus the first user message.

BedrockChatRequestParameters params = BedrockChatRequestParameters.builder()
    .promptCaching(BedrockCachePointPlacement.AFTER_USER_MESSAGE)
    .build();

ChatResponse response = model.chat(ChatRequest.builder()
    .messages(
        SystemMessage.from("Large system context..."),
        UserMessage.from("Initial user context...")  // Also cached
    )
    .build());

Use when: You have static system instructions plus an initial context message that's consistent across requests.

AFTER_TOOLS

Cache tool definitions.

import dev.langchain4j.agent.tool.Tool;
import dev.langchain4j.agent.tool.ToolSpecifications;

class Calculator {
    @Tool("Add two numbers")
    int add(int a, int b) { return a + b; }

    @Tool("Multiply two numbers")
    int multiply(int a, int b) { return a * b; }
}

BedrockChatRequestParameters params = BedrockChatRequestParameters.builder()
    .promptCaching(BedrockCachePointPlacement.AFTER_TOOLS)
    .build();

BedrockChatModel model = BedrockChatModel.builder()
    .modelId("anthropic.claude-3-5-sonnet-20241022-v2:0")
    .defaultRequestParameters(params)
    .build();

ChatResponse response = model.chat(ChatRequest.builder()
    .messages(UserMessage.from("What is 5 + 3?"))
    .toolSpecifications(ToolSpecifications.toolSpecificationsFrom(new Calculator()))  // Tool defs cached
    .build());

Use when: You have static tool definitions that don't change between requests.

Combining Placements

Only one placement can be active at a time. To cache multiple sections, use granular caching.

Token Usage Tracking

import dev.langchain4j.model.bedrock.BedrockTokenUsage;

// First request
ChatResponse response1 = model.chat(request);
BedrockTokenUsage usage1 = (BedrockTokenUsage) response1.tokenUsage();
System.out.println("Cache write: " + usage1.cacheWriteInputTokens());

// Second request (within 5 minutes)
ChatResponse response2 = model.chat(request);
BedrockTokenUsage usage2 = (BedrockTokenUsage) response2.tokenUsage();
System.out.println("Cache read: " + usage2.cacheReadInputTokens());
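The write and read counters are what make caching worth measuring: cache writes are billed at a premium, while cache reads are heavily discounted. A minimal sketch of the arithmetic, assuming Anthropic's published prompt-caching multipliers (writes at ~1.25x the base input-token rate, reads at ~0.1x) — these rates are illustrative assumptions, not values exposed by this library:

```java
// Illustrative cost comparison: repeatedly sending a large static prefix
// with and without caching. Multipliers (1.25x write, 0.10x read) are
// assumed from Anthropic's prompt-caching pricing, not from this API.
public class CacheSavings {
    public static void main(String[] args) {
        long promptTokens = 10_000;   // size of the static, cached prefix
        int requests = 50;            // requests made within the cache TTL

        double uncached = promptTokens * requests * 1.0;      // full price every time
        double cached = promptTokens * 1.25                   // one cache write
                      + promptTokens * (requests - 1) * 0.10; // then cache reads

        System.out.printf("uncached cost units: %.0f%n", uncached);
        System.out.printf("cached cost units:   %.0f%n", cached);
        System.out.printf("savings:             %.1f%%%n",
                100 * (1 - cached / uncached));
    }
}
```

Under these assumptions the savings grow with the number of requests that hit the cache, which is why caching only pays off for prefixes reused within the 5-minute window.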

Limitations

  • Static content only: Any change invalidates the cache
  • Standard SystemMessage only: Does not work with BedrockSystemMessage
  • One cache point: Can only cache one section
  • Minimum size: Content must exceed ~1,024 tokens
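Because content below the ~1,024-token minimum is silently not cached, it can help to sanity-check prompt size before enabling caching. A minimal sketch using the common rough heuristic of ~4 characters per token for English text — both the threshold and the ratio are approximations, and `estimateTokens`/`likelyCacheable` are hypothetical helpers, not part of this library:

```java
// Rough pre-check that a prompt is large enough to be worth caching.
// Uses the common ~4 characters-per-token approximation for English text;
// real tokenization is model-specific, so treat this as a heuristic only.
public class CacheSizeCheck {
    static final int MIN_CACHEABLE_TOKENS = 1_024; // approximate Bedrock minimum

    static long estimateTokens(String text) {
        return Math.round(text.length() / 4.0);
    }

    static boolean likelyCacheable(String prompt) {
        return estimateTokens(prompt) > MIN_CACHEABLE_TOKENS;
    }

    public static void main(String[] args) {
        String small = "You are a helpful assistant.";
        String large = "x".repeat(8_000); // stand-in for a ~2k-token prompt

        System.out.println("small cacheable? " + likelyCacheable(small)); // false
        System.out.println("large cacheable? " + likelyCacheable(large)); // true
    }
}
```

If a prompt fails this check, the cacheWriteInputTokens counter from the previous section is the authoritative way to confirm whether a cache point was actually created.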

Next: Granular Caching for mixed static/dynamic content

Install with Tessl CLI

npx tessl i tessl/maven-dev-langchain4j--langchain4j-bedrock@1.11.0
