Koog 1.0 idioms, gotchas, and scaffolding skills for Kotlin agents on the JVM
89%
Does it follow best practices?
Impact
89%
1.78x
Average score across 47 eval scenarios
Advisory
Suggest reviewing before use
All notable changes to this tile are documented here. Format: Keep a Changelog. Versioning: SemVer.
snapshot-and-restore Step 1 + add-persistence Step 2 — completed the crash-recovery redirect handoff. The snapshot Step 1 redirect previously said only "invoke add-persistence", so with the skill loaded the agent emitted a meta-description of the skill chain instead of the concrete install(Persistence) solution and a developer-facing message — eval snapshot-and-restore-refuses-crash regressed to with-context 42 vs baseline 66. The redirect now directs the agent to deliver the full Persistence solution plus a one-message snapshot-vs-Persistence mismatch explanation, and add-persistence Step 2 gained checkpoint-frequency cost guidance (every-step writes are expensive on long runs). With-context returned to 100
evals/migrate-from-0-x-custom-strategy — second migration scenario covering the non-obvious 1.0 breaking changes a custom-strategy agent hits: the nodeExecuteTools auto-writeback removal (chain nodeLLMSendToolResults explicitly), the LLMClient HTTP-transport decoupling (KoogHttpClient.Factory instead of a Ktor HttpClient), and the kotlin.time.Clock → KoogClock swap, alongside the coordinate/JDK bumps
evals/wire-acp-server-choose-vs-a2a — protocol-selection scenario: a tooling dashboard needing run-lifecycle control with cancellation and progress streaming should pick ACP (agents-features-acp), not A2A (agent-to-agent RPC) or MCP (tool host). Highest-lift new scenario (+95 baseline→with-context)
evals/add-structured-output-classify-issue — removed the answer-narrating clause from the task so the "does not introduce a custom strategy" criterion tests application rather than reading

Knowledge corrections from Vadim Briliantov (Koog project lead) — three skill clarifications:
wire-mcp-server Step 6 — added a framing paragraph stating that @Tool / @LLMDescription / ToolSet are LOCAL Koog tool annotations. The startStdioMcpServer path bridges a Koog ToolRegistry to MCP, but that's a secondary use case for the annotation, not its primary purpose. For projects whose primary goal is publishing tools over MCP (independent of any Koog agent), the Kotlin MCP SDK (io.modelcontextprotocol:kotlin-sdk) has its own server annotation. The Koog-bridge path is right when you already have a Koog ToolRegistry and want it reachable over MCP too
add-tool Step 3 (Sub-Agent-as-Tool) — added the sub-agent vs subgraph distinction. Sub-agents (AIAgentService.fromAgent) are fully independent agents that communicate only through typed input/output; subgraphs (subgraphWithTask / subgraphWithVerification) are part of the same agent and share one message history. Default for "break my agent into stages" is subgraph; reach for sub-agent only when isolation is the explicit requirement
domain-model-subtask-pipeline Step 6 — strengthened the auto-shared-history framing to name the contrast with independent-agent abstractions (Koog sub-agents, LangChain4j Agentic sub-agents). Subgraphs live on one common history; independent agents communicate only through typed input/output. Cross-references Skill(skill: "add-tool") Step 3 for the isolation case
use-llm-node-variants Steps 1-4 — added the Path: write directive to each action (streaming / multiple-choice / moderation / force-one-tool). Eval 019e648a against 0.4.5 surfaced that use-llm-node-variants-streaming regressed to lift -87 (baseline 87 → with-context 0) with the reasoning "No Kotlin code was produced at all". Same file-write-gap root cause as the 0.3.1 nightmare — this was the one skill that hadn't been patched in PR #10's omnibus Path: rollout
evals/persist-chat-history-refuses-fact-store/criteria.json — re-weighted to favor functional correctness over prose explanation. The 0.4.5 eval showed the agent did the right thing functionally (refused JdbcChatHistory, picked LongTermMemory) but lost 35/100 on prose-explanation criteria (30 + 5) because the patched skills explicitly direct code-only output via Path:. New weights: Does not install JdbcChatHistory 35 (was 25), Recommends LongTermMemory 30 (was 25), Does not synthesise pseudo-turns 25 (was 15), Names the distinction 10 (was 30, reworded to accept a code comment), and dropped the standalone "Acknowledges the framing" criterion (was 5). Sum still 100
wire-mcp-server Step 6 — optional server-side startStdioMcpServer flow for users authoring an MCP server (not just consuming one). Pulls ai.koog:agents-mcp-server-jvm:1.0.0-beta (same beta + -jvm-suffix gotchas as the client module). Exposes a ToolRegistry over stdio with awaitCancellation() keeping the process alive
domain-model-subtask-pipeline Step 7 — MultiLLMPromptExecutor callout for cross-provider per-subgraph model selection. The skill teaches per-phase llmModel = ... but the per-provider executors (simpleOpenAIExecutor / simpleAnthropicExecutor) only know one provider; when a strategy mixes providers (e.g., OpenAIModels.Chat.O3 for the verifier and AnthropicModels.Sonnet_4_5 for the deployer), the agent needs a MultiLLMPromptExecutor with one client per provider
persist-chat-history Step 5 — multi-turn footnote: the same agent instance can call agent.run(input, sessionId = ...) repeatedly; the installed chat-history backend accumulates the message log on each call, so a while (true) driver loop maintains conversation without reconstructing the agent
persist-chat-history Step 6 — new anti-pattern section: don't use chat-history as a fact store. Symptoms (date-prefixed pseudo-turns, synthetic Message.Assistant claims about actions the agent never took, queryable structured data forced into a sequential channel) and the right primitives for each shape (LongTermMemory for cross-session facts; @Tool for queryable structured data; systemPrompt for small fixed context; storage for run-scoped state). Custom ChatHistoryProvider stays legitimate for replaying real conversation messages from external sources
add-observability Step 3 — one-line clarification on setVerbose(true) semantics: it emits prompts, completions, and token counts on each span
evals/wire-mcp-server-author-stdio — positive scenario covering the new Step 6 decisional branch (publishing an existing ToolSet as a stdio MCP server). Criteria check for startStdioMcpServer, the agents-mcp-server-jvm:1.0.0-beta dependency, ToolRegistry { tools(asTools()) } reuse of the developer's existing class, awaitCancellation() for process lifetime, and refusal of the client-transport surface
evals/persist-chat-history-refuses-fact-store — negative scenario covering the new Step 6 anti-pattern (don't use chat-history as a fact store). Criteria check that the agent names the chat-history-vs-fact-store distinction, recommends LongTermMemory or a @Tool, refuses to install JdbcChatHistory / ChatHistoryAws / ChatMemorySql, and refuses to synthesise pseudo-turns in a custom ChatHistoryProvider

Hardened skills against the file-write failure mode observed in 0.3.1 eval run 019e60f5 and confirmed in the partial re-run 019e613e: the scorer reads files from the solution directory, but several skills told the agent "produce ... as part of your response" — which the agent satisfied with stdout prose that the scorer can't see. Adopted the Path: convention from scaffold-agent across the patched skills so the file targets are explicit and unambiguous (full src/main/kotlin/com/example/<file> paths plus build.gradle.kts at repo root):
add-observability Step 3 — Path: src/main/kotlin/com/example/Main.kt for the modified agent construction, Path: build.gradle.kts for the dependency
manage-state Step 2 — Path: src/main/kotlin/com/example/Strategy.kt for the boundary-node body
persist-chat-history Step 4 — Path: src/main/kotlin/com/example/Main.kt + Path: build.gradle.kts + a concrete handler path (e.g., src/main/kotlin/com/example/Routes.kt) when the user named a handler
add-tool Step 2 — Path: src/main/kotlin/com/example/AccountLookupTool.kt (concrete tool-name example, rename to match the actual tool) + Path: src/main/kotlin/com/example/Main.kt + Path: build.gradle.kts
author-strategy Step 8 — Path: src/main/kotlin/com/example/Strategy.kt for the DSL + Path: src/main/kotlin/com/example/Main.kt for the modified construction
handle-agent-events Step 2 — Path: src/main/kotlin/com/example/Main.kt + Path: build.gradle.kts
wire-ktor-server Step 5 — Path: src/main/kotlin/com/example/Application.kt + Path: build.gradle.kts
add-tool Step 1 routing — annotated-tool default now defers to Step 2 (typed Tool<TArgs,TResult>) when the user's existing function takes a data class parameter or returns a typed result; the previous "default to Step 1" rule wrapped typed signatures in flat-primitive annotated tools and lost the type contract
handle-agent-events Step 2 — adopted the same Path: directive; round-2 eval 019e6149 showed the prior prose-only handoff was non-deterministic (round 1: 100, round 2: 0)
use-planner Step 1 redirect to author-strategy — made the redirect actionable: it now runs author-strategy end-to-end and writes the graph DSL code via Path: per author-strategy's Step 8, plus topology and round-trip-cost reasoning as comments at the top of the produced file; the previous wording let the agent stop at a prose explanation. Step 1 also adds a "Finish here — do not continue into planner-variant selection or Step 2 / Step 3" line so the redirect and planner-fits branches are mutually exclusive, and an explicit "Chaining exception (exhaustive — overrides 'Do not run other steps' only as listed)" preamble names the Step 1 → Step 2/3 chain per skill-authoring
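The file-write failure mode in these entries comes down to a scorer that reads files from the solution directory and never sees stdout. A minimal Kotlin sketch of that kind of presence check is shown here; `missingTargets` and the temp-directory setup are hypothetical illustrations, not part of Tessl or Koog, and the two relative paths are ones the Path: directives name.

```kotlin
import java.nio.file.Files
import java.nio.file.Path

// Hypothetical helper (not a Tessl or Koog API): given a solution directory and the
// relative targets named by Path: directives, return the targets missing on disk.
fun missingTargets(solutionDir: Path, targets: List<String>): List<String> =
    targets.filter { !Files.isRegularFile(solutionDir.resolve(it)) }

fun main() {
    val solutionDir = Files.createTempDirectory("solution")
    val targets = listOf(
        "build.gradle.kts",
        "src/main/kotlin/com/example/Main.kt",
    )

    // Simulate an agent that wrote only the Gradle file and described Main.kt in prose.
    Files.writeString(solutionDir.resolve("build.gradle.kts"), "// dependencies go here\n")

    println(missingTargets(solutionDir, targets))
    // prints [src/main/kotlin/com/example/Main.kt]
}
```

An agent that only describes Main.kt in its response leaves that target in the missing list, which is the gap the Path: directive is meant to close.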
Hardened skills against the "no-output" failure mode observed in 0.3.1 eval run 019e60f5:
add-observability Step 2 — replaced blocking "Ask the user which backend" with a non-blocking pick + OTLP default; Step 3 writes via Path:; fixes the −100pp lift on add-observability-langfuse
wire-ktor-server Step 2 — split into minimal install (mandatory) vs MCP/HOCON add-ons; Steps 3 and 4 each open with "Skip this step entirely if..."; Step 5 writes via Path:; fixes the −28pp lift on wire-ktor-server-route
manage-state Step 2 — committed to HistoryCompressionStrategy.WholeHistory as the default and moved the other six variants into "use only when the user names them"; fixes the −80pp lift on manage-state-tldr-mid-phase
use-planner Step 1 — converted the "ask user LLM-based vs GOAP" stall into a pick-by-keywords rule (GOAP only when the user names typed state / classical planner / state space); planner-redirect tasks no longer block on a clarifying question

Tightened skill activation routing for planner construction:
use-planner description now names Planners.llmBased, Planners.llmBasedWithCritic, Planners.goap, PlannerAIAgent, agents-planner
scaffold-agent description adds an explicit "Do NOT use when the user is constructing a planner / picking a strategy / naming a specific agent shape" exclusion; fixes the mis-activation that pulled scaffold-agent for use-planner-llm-based-triage
cache-llm-calls-redis-shared eval scenario — retired per plugin-evals.md "Lift, Not Attainment": baseline 100/100, lift 0pp (Cause #1, universal competence). The partner scenario cache-llm-calls-refuses-provider-side still covers cache-llm-calls at +100pp lift
handle-agent-events-stdout-trace task — stripped the "arrow indicating start vs end" framing that bled into the criterion "Uses distinct visual markers for start vs end" (plugin-evals.md "No Bleeding")
add-tool-typed-args-with-result task — stripped "they want the tool's input and output to remain these typed shapes — not a JSON blob, not a flattened String" framing that telegraphed Tool<TArgs,TResult>; the typed queryAccount signature still carries the constraint
persist-chat-history-jdbc task — replaced "wants conversations to persist by user account" framing with a user-reported bug ("the bot doesn't remember anything we talked about") so the agent must navigate persistence-feature vs chat-history vs LongTermMemory on its own
description: fields in add-tool (Tool<TArgs,TResult> → Tool[TArgs,TResult]; <X> / <function> → prose), domain-model-subtask-pipeline (subgraphWithTask<In, Out> → subgraphWithTask[In, Out]; same for subgraphWithVerification<T> and CriticResult<T>), use-llm-node-variants (<tool> → prose). The tessl skill review validator rejects < followed by alpha as an XML tag — the just-landed skill-review CI gate (0.4.1) failed add-tool on the previous 0.4.2 publish attempt because of this. The 0.4.2 publish never landed in the registry (failed at the gate); 0.4.3 ships the same content as 0.4.2 plus this descriptor fix
author-strategy skill — added a member-vs-extension import table at the end of Step 5. The DSL primitives split across two shapes: forwardTo / onCondition / transformed are infix members (no import needed); onToolCalls / onTextMessage / onIsInstance / onSuccessful / onFailure / asUserMessage / asToolResultMessage / onMessageParts are top-level extensions in ai.koog.agents.core.dsl.extension.* (each needs its own import). Inventing a member import or omitting an extension import is the most common copy-paste compile failure
author-strategy + domain-model-subtask-pipeline skills — fixed the wrong artifact text. subgraphWithTask / subgraphWithVerification / CriticResult (package ai.koog.agents.ext.agent) ship inside agents-core (which koog-agents umbrella pulls), NOT the standalone ai.koog:agents-ext:1.0.0-beta artifact. Added the imports + the AIAgentGraphStrategy package note (ai.koog.agents.core.agent.entity, not bare ai.koog.agents.core.agent)
wire-mcp-server skill — each transport builder (streamableHttp, fromSseUrl, fromProcess) is a top-level extension in ai.koog.agents.mcp declared separately from the McpToolRegistryProvider object. All three example blocks now show the explicit extension import alongside the provider import
wire-mcp-server skill + module-coordinates rule — corrected the MCP client dependency to ai.koog:agents-mcp-jvm:1.0.0-beta. Koog 1.0 stable did not publish agents-mcp / agents-mcp-server at 1.0.0; only 1.0.0-beta is on Maven Central, and they publish only JVM variants so the -jvm suffix is mandatory for Gradle KMP variant resolution
add-tool + wire-a2a skills — corrected the ToolSet and asTools imports to come from ai.koog.agents.core.tools.reflect.* (the actual package), not bare ai.koog.agents.core.tools.*
module-coordinates rule — added Kotlin 2.3.10+ minimum requirement. Koog 1.0 is compiled with Kotlin 2.3.x; earlier Kotlin versions fail at consume time with metadata-version errors
evals/domain-model-subtask-pipeline-triage/criteria.json — corrected C7's required-Gradle-deps description: the umbrella ai.koog:koog-agents:1.0.0 is sufficient (it pulls agents-core, which contains the subgraph DSL). Penalize unnecessary agents-ext lines
evals/author-strategy-import-shapes — new positive scenario testing the member-vs-extension import correctness when the agent emits a tool-handling-loop strategy
evals/wire-mcp-server-import-shapes — new positive scenario testing the fromSseUrl extension import + the -jvm:1.0.0-beta dependency line. Updated wire-mcp-server-stdio-playwright, wire-mcp-server-streamable-http, and wire-mcp-server-merge-tools criteria to match the corrected artifact spec
Closes #9 (items 1–9; item 10 is a separate Tessl install-policy investigation).
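The coordinate corrections in these entries can be read together as one build file. The fragment below is a sketch in Gradle Kotlin DSL that uses only the versions and artifact names the changelog states; the choice of the `kotlin("jvm")` plugin and the `mavenCentral()` repository block are assumptions about a typical JVM project, not something the changelog prescribes.

```kotlin
// build.gradle.kts (sketch; coordinates are the ones named in the changelog)
plugins {
    // The changelog gives Kotlin 2.3.10 as the minimum for consuming Koog 1.0.
    kotlin("jvm") version "2.3.10"
}

repositories {
    mavenCentral()
}

dependencies {
    // Umbrella artifact; per the changelog it pulls agents-core, which carries the
    // subgraph DSL, so no separate agents-ext line is needed.
    implementation("ai.koog:koog-agents:1.0.0")

    // MCP client: published only as 1.0.0-beta and only as a JVM variant,
    // hence the mandatory -jvm suffix.
    implementation("ai.koog:agents-mcp-jvm:1.0.0-beta")

    // Optional: only when authoring a stdio MCP server (wire-mcp-server Step 6).
    implementation("ai.koog:agents-mcp-server-jvm:1.0.0-beta")
}
```

A project that only consumes MCP servers would likely drop the last line.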
.github/workflows/publish.yml — tessl skill review --threshold 85 gate before tessl tile publish . via jbaruch/coding-policy/.github/actions/skill-review@ef67ffe5 (changed-skills loop). Closes the context-artifacts Mandatory Review gap flagged on #7; below-threshold skill scores now block publish. Checkout step bumped to fetch-depth: 0 so the action's git diff $github.event.before..HEAD can resolve the prior commit. Closes #8
domain-model-subtask-pipeline skill — the integrated pattern for typed-handoff pipelines: tools sliced by access into separate ToolSets (read / write / communication), @Serializable @LLMDescription-annotated data classes as inter-subtask contracts, subgraphWithTask<In, Out> per phase with per-phase model selection, subgraphWithVerification<T> + CriticResult<T> for self-correction loops. The methodology JetBrains' KotlinConf 2026 banking demo demonstrates — fills the gap left by author-strategy (DSL mechanics only) and add-structured-output (top-level typed output only)
domain-model-subtask-pipeline-triage (positive, four-phase support workflow) and domain-model-subtask-pipeline-refuse (negative — declines to over-engineer a one-shot text transform)
.github/workflows/publish.yml — tesslio/patch-version-publish@v1 wired to push-on-main + manual dispatch; auto-bumps patch from the registry on future merges
.github/workflows/review-openai.md / review-anthropic.md — paired gh-aw PR reviewers from jbaruch/coding-policy: install-reviewer (cross-family review per author-model-declaration)
.env.example — required hosted-CI secrets with placeholder values and a deep link to https://github.com/jbaruch/koog-tessl/settings/secrets/actions (per no-secrets)
.pre-commit-config.yaml — gitleaks v8.21.2 + standard pre-commit-hooks (per no-secrets pre-commit-scanning requirement)
use-planner-refuses-when-graph-fits, cache-llm-calls-refuses-provider-side, add-persistence-refuses-conversation, snapshot-and-restore-refuses-crash — covering the cross-skill redirects each skill prescribes (closes the "only positive cases" gap from 0.3.0)
use-attachments-pdf-and-url (PDF + URL-image attachments in a single LLM call); the existing use-attachments-image-input only exercised images
scenario.json backfilled on every eval scenario (40 total) to match the canonical three-file shape tessl scenario generate emits — drift fix, not a feature
tile.json summary from a 290-char comma-spliced multi-clause string back to a one-line description per skill-authoring.md
interaction-rules phantom reference from 8 skill files (the rule never existed in this tile or in any consumed tile)
see the X skill, redirect to X) to typed Skill(skill: "X") calls per skill-authoring.md "Typed Calls"
Verify and Hand Off, Bump JDK and Tooling, etc.) per skill-authoring.md "Step Structure"
model-planner-subtasks-parallel-tree task to remove technique leak (PlannerNode composition, storage-key tracking) per plugin-evals.md "No Bleeding"
use-attachments-pdf-and-url task on the cross-family reviewer's finding — dropped strategy / user message technique prose
enable-prompt-caching-anthropic-long-system (43 chars) → enable-prompt-caching-anthropic (31) to fit the 40-char default cap
wire-mcp-{merge-with-existing-tools,stdio-playwright,streamable-http-github} → wire-mcp-server-{merge-tools,stdio-playwright,streamable-http} so prefixes match the skill name per plugin-evals.md "Naming"
cache-llm-calls — in-process LLM-response cache (prompt-executor-cached + prompt-cache-{files,model,redis}), distinct from the provider-side caching covered by enable-prompt-caching
persist-chat-history — chat-history persistence backends (chat-history-jdbc, chat-history-aws, chat-memory-sql), distinct from generic persistence and LongTermMemory
test-koog-agents — deterministic agent testing with agents-test (scripted executor, fake KoogClock, event-handler recorder)
trace-agent-internals — deep diagnostic trace feature (agents-features-trace), distinct from OpenTelemetry (production signal) and event handlers (high-level callbacks)
query-sql-from-agent — SQL-querying feature (agents-features-sql) with read-only mode, schema scoping, row caps
model-planner-subtasks — PlannerNode tree composition, parallel vs sequential subtasks, retry-on-parse-failure edges, history compression between phases
use-functional-agent — FunctionalAIAgent (the third concrete agent subtype, alongside GraphAIAgent and PlannerAIAgent) — single suspending block, no graph
module-coordinates and agent-construction remain always-on — they cover gotchas every Koog project hits. The other 7 rules were converted to on-demand skills:
strategy-dsl → author-strategy skill
planner-vs-graph → use-planner skill
tools-and-mcp → folded into add-tool and wire-mcp-server skills (already existed)
state-and-memory → manage-state skill
observability → add-observability skill
spring-boot-integration → wire-spring-boot skill
migration-from-0-x → migrate-from-0-x skill
add-structured-output, define-prompt, add-persistence
enable-prompt-caching, handle-agent-events, wire-ktor-server
use-llm-node-variants (streaming / multiple-choices / moderation / force-one-tool), add-rag, wire-a2a, wire-acp-server, add-token-budgeting, snapshot-and-restore, use-attachments
agent-construction rule now includes a "When to reach for a skill" index pointing to the right skill for each common task
scaffold-agent, add-tool, wire-mcp-server
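The scenario renames recorded in this changelog follow two naming constraints: a 40-character default cap and a prefix that matches the skill name. The Kotlin sketch below checks both, assuming the prefix rule means "skill name followed by a hyphen"; `namingProblems` and `MAX_SCENARIO_NAME_LENGTH` are hypothetical names introduced for illustration and are not part of the Tessl tooling.

```kotlin
// Assumed limit, taken from the changelog wording ("40-char default cap").
const val MAX_SCENARIO_NAME_LENGTH = 40

// Hypothetical checker: returns a list of naming problems for one scenario name.
fun namingProblems(scenario: String, skills: Set<String>): List<String> {
    val problems = mutableListOf<String>()
    if (scenario.length > MAX_SCENARIO_NAME_LENGTH) {
        problems += "too long (${scenario.length} > $MAX_SCENARIO_NAME_LENGTH)"
    }
    // Assumption: "prefix matches the skill name" means "<skill>-" starts the scenario name.
    if (skills.none { scenario.startsWith("$it-") }) {
        problems += "no skill-name prefix"
    }
    return problems
}

fun main() {
    val skills = setOf("enable-prompt-caching", "wire-mcp-server")

    println(namingProblems("enable-prompt-caching-anthropic-long-system", skills))
    // prints [too long (43 > 40)]
    println(namingProblems("enable-prompt-caching-anthropic", skills))
    // prints []
    println(namingProblems("wire-mcp-stdio-playwright", skills))
    // prints [no skill-name prefix]
    println(namingProblems("wire-mcp-server-stdio-playwright", skills))
    // prints []
}
```

Both pre-rename names fail exactly one rule each, which matches the reasons the changelog gives for the renames.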
skills
add-observability
add-persistence
add-rag
add-structured-output
add-token-budgeting
add-tool
cache-llm-calls
define-prompt
domain-model-subtask-pipeline
references
enable-prompt-caching
handle-agent-events
manage-state
migrate-from-0-x
model-planner-subtasks
persist-chat-history
query-sql-from-agent
scaffold-agent
snapshot-and-restore
test-koog-agents
trace-agent-internals
use-attachments
use-functional-agent
use-llm-node-variants
use-planner
wire-a2a
wire-acp-server
wire-ktor-server
wire-mcp-server
wire-spring-boot