Koog 1.0 idioms, gotchas, and scaffolding skills for Kotlin agents on the JVM
Process steps in order. Do not skip ahead.
RAG in Koog spans three concerns — each is its own module:
implementation("ai.koog:embeddings-base:1.0.0")
implementation("ai.koog:embeddings-llm:1.0.0") // LLM-backed embeddings (OpenAI/Google/etc.)
implementation("ai.koog:rag-base:1.0.0")
implementation("ai.koog:rag-vector:1.0.0") // vector store + similarity search
Reach past these only for backend-specific vector stores (Pinecone, Qdrant, etc.), which usually ship as separate rag-vector-<backend> artifacts when supported.
Proceed immediately to Step 2.
LLM-backed embeddings via embeddings-llm use the same provider you'd use for completions:
import ai.koog.embeddings.llm.OpenAIEmbedder
import ai.koog.prompt.executor.clients.openai.OpenAIModels

val embedder = OpenAIEmbedder(
    apiKey = System.getenv("OPENAI_API_KEY"),
    model = OpenAIModels.Embedding.TextEmbedding3Small,
)
Cheaper local embedders exist for self-hosted setups. Check the embeddings-* module list for what 1.0 ships beyond LLM-backed embedders (Ollama-based embedders, in-process models).
Proceed immediately to Step 3.
import ai.koog.rag.vector.InMemoryVectorStore

val store = InMemoryVectorStore(embedder = embedder)

// Index documents
documents.forEach { doc ->
    store.add(id = doc.id, text = doc.text, metadata = mapOf("source" to doc.path))
}
InMemoryVectorStore is fine for demos and small corpora. For production, swap in a persistent backend (JDBC, Pinecone, Qdrant, Weaviate) via the corresponding rag-vector-* module if shipped, or implement the VectorStore interface against your store of choice.
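To make the "vector store + similarity search" idea concrete, here is a minimal, library-free sketch of what an in-memory store does conceptually: embed each text, keep the vectors, and rank by cosine similarity at query time. Everything in it (ToyVectorStore, toyEmbed, Hit) is hypothetical and written for illustration; it is not Koog's InMemoryVectorStore or its VectorStore interface, and the letter-count "embedding" is only a stand-in for a real embedder.

```kotlin
import kotlin.math.sqrt

// Hypothetical stand-in for an embedder: counts letters a-z into a 26-dimensional vector.
// A real setup would call an embedding model instead.
fun toyEmbed(text: String): DoubleArray {
    val v = DoubleArray(26)
    for (c in text.lowercase()) if (c in 'a'..'z') v[c - 'a'] += 1.0
    return v
}

// Cosine similarity; returns 0.0 when either vector has zero length.
fun cosine(a: DoubleArray, b: DoubleArray): Double {
    var dot = 0.0
    var na = 0.0
    var nb = 0.0
    for (i in a.indices) {
        dot += a[i] * b[i]
        na += a[i] * a[i]
        nb += b[i] * b[i]
    }
    return if (na == 0.0 || nb == 0.0) 0.0 else dot / (sqrt(na) * sqrt(nb))
}

data class Hit(val id: String, val text: String, val score: Double)

// Illustrative store only: not the Koog VectorStore API.
class ToyVectorStore(private val embed: (String) -> DoubleArray) {
    private class Entry(val id: String, val text: String, val vector: DoubleArray)

    private val entries = mutableListOf<Entry>()

    // Embed once at index time so queries only embed the query string.
    fun add(id: String, text: String) {
        entries += Entry(id, text, embed(text))
    }

    // Score every entry against the query and keep the topK best.
    fun search(query: String, topK: Int = 5): List<Hit> {
        val q = embed(query)
        return entries
            .map { Hit(it.id, it.text, cosine(q, it.vector)) }
            .sortedByDescending { it.score }
            .take(topK)
    }
}
```

The linear scan over all entries is the reason an in-memory store suits small corpora: cost grows with corpus size on every query, which is what persistent backends with approximate nearest-neighbour indexes avoid.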
Proceed immediately to Step 4.
Two integration patterns:
(a) RAG as a tool — the LLM decides when to retrieve. Wrap the store in a tool:
@LLMDescription("Search the documentation corpus")
class DocsSearchTool(private val store: VectorStore) : ToolSet {
    @Tool
    @LLMDescription("Search for documents relevant to the query and return the top matches")
    suspend fun search(@LLMDescription("Search query") query: String): String {
        val results = store.search(query, topK = 5)
        return results.joinToString("\n---\n") { "${it.metadata["source"]}: ${it.text}" }
    }
}
Register in the agent's ToolRegistry. Use this pattern when retrieval is optional: some queries need it, some don't.
(b) RAG as a prompt augmenter — retrieval happens automatically before every LLM call. Use a UserPromptAugmenter (see define-prompt) that calls store.search(currentInput) and appends results to the user message. Use this pattern when retrieval is always-on for every input.
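The augmentation step in pattern (b) can be sketched without any Koog types, assuming retrieval is passed in as a plain function. The names below (RetrievedChunk, augmentUserMessage, the "Relevant context:" label) are hypothetical and chosen for illustration; how this plugs into a UserPromptAugmenter depends on the interface described in define-prompt.

```kotlin
// Hypothetical shape of a retrieved chunk; not a Koog type.
data class RetrievedChunk(val source: String, val text: String)

// Runs retrieval on the raw user input and appends the results to the message.
// `retrieve` is an assumed callback that would wrap something like store.search.
fun augmentUserMessage(
    userInput: String,
    topK: Int = 5,
    retrieve: (query: String, topK: Int) -> List<RetrievedChunk>,
): String {
    val chunks = retrieve(userInput, topK)
    // Nothing retrieved: leave the message untouched rather than appending an empty section.
    if (chunks.isEmpty()) return userInput
    val context = chunks.joinToString("\n---\n") { "${it.source}: ${it.text}" }
    return "$userInput\n\nRelevant context:\n$context"
}
```

Because this runs on every input, each call pays the retrieval latency and the extra prompt tokens, which is the trade-off against the tool pattern in (a).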
Proceed immediately to Step 5.
topK: number of documents returned. The default of 5 is reasonable; raise it for broader context, lower it for tighter prompts.
Reranking: runs between store.search and the LLM prompt. Koog doesn't ship a built-in reranker; call a cross-encoder model or a second LLM with a scoring prompt.
Finish here.
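As a rough illustration of where a reranker sits, the sketch below over-fetches candidates, rescores each one against the query, and keeps only the best few before they reach the prompt. All names (Candidate, rerank, termOverlapScore) are hypothetical, and the term-overlap scorer is only a placeholder for the cross-encoder or LLM scoring call mentioned above.

```kotlin
// Hypothetical candidate type; in practice this would be whatever store.search returns.
data class Candidate(val source: String, val text: String)

// Placeholder scorer: fraction of distinct query terms that appear in the text.
// Swap in a cross-encoder or an LLM scoring prompt for real use.
fun termOverlapScore(query: String, text: String): Double {
    val terms = query.lowercase().split(Regex("\\W+")).filter { it.isNotBlank() }.toSet()
    if (terms.isEmpty()) return 0.0
    val words = text.lowercase().split(Regex("\\W+")).toSet()
    return terms.count { it in words }.toDouble() / terms.size
}

// Rescore the retrieved candidates and keep the `keep` highest-scoring ones.
// Ties preserve the original retrieval order because the sort is stable.
fun rerank(
    query: String,
    candidates: List<Candidate>,
    keep: Int,
    score: (String, String) -> Double = ::termOverlapScore,
): List<Candidate> =
    candidates
        .map { it to score(query, it.text) }
        .sortedByDescending { it.second }
        .take(keep)
        .map { it.first }
```

A common arrangement is to retrieve with a larger topK than the prompt needs and let the reranker cut the list down, so the cheaper vector search provides recall and the costlier scorer provides precision.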