Use when a debugging thread needs to be compressed into a reusable investigation ledger. Capture the target, evidence, attempted fixes, ruled-out hypotheses, viable hypotheses, and next experiments. Good triggers include "compact this debugging session", "summarize what we've tried", and "turn this into a debugging ledger".
A backend team has been chasing a memory leak in a Node.js service for two days. The Slack thread is now 200+ messages long and unmanageable. You've been asked to produce a compact version of the investigation so that the on-call engineer coming on shift tonight can pick up exactly where the team left off.
Read the session transcript below and produce a compact investigation record. Save it as investigation_summary.md.
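A minimal skeleton for investigation_summary.md, using the section names from the skill description above. This is a sketch of one reasonable layout, not a format mandated by the task:

```markdown
# Investigation Ledger: <issue title>

## Target
One-line statement of the bug and its impact.

## Evidence
- Observed facts, with timestamps and measurements.

## Attempted fixes
- What was changed, where it was deployed, and the measured effect.

## Ruled-out hypotheses
- Hypothesis — how it was disproved.

## Viable hypotheses
- Hypothesis — supporting evidence so far.

## Next experiments
- Concrete next step, with the question it answers.
```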
The following file is provided as input. Extract it before beginning.
=============== FILE: inputs/slack_thread.md ===============
Day 1, 10:00 — Issue filed: Node.js API service memory grows from ~200MB to ~1.4GB over 6 hours, then OOM-kills. Restarts every night around 3am.
Day 1, 10:45 — Checked whether it was a known library issue. Ran npm audit. No relevant advisories found. Dead end.
Day 1, 11:30 — Added heap snapshot collection every 30 min via clinic.js. Heap snapshots confirmed: string objects accumulating in a cache layer.
Day 1, 12:15 — Hypothesis: Express session middleware not expiring sessions. Checked session TTL config — TTL is set to 1 hour and working correctly. Hypothesis disproved.
Day 1, 14:00 — Hypothesis: Log buffer not being flushed. Added flush call after each request. Memory growth rate unchanged after 1 hour observation. This fix had no effect.
Day 1, 15:30 — Added per-endpoint memory tracking via custom middleware. Isolated the leak to the /search endpoint.
Day 1, 17:00 — Examined /search handler code. Found an in-process cache (searchCache) that stores raw query strings and result sets, keyed by query. Cache has no eviction policy and no size limit.
Day 1, 17:45 — Added LRU cache with max 500 entries to replace unbounded cache. Deployed to staging. Memory stable for 2 hours on staging. Looks promising.
Day 2, 09:00 — Deployed LRU fix to production. Monitoring overnight: memory still grew but much more slowly — from 200MB to ~450MB over 6 hours (was 1.4GB before).
Day 2, 10:30 — Heap snapshots show a second, smaller accumulation: event listeners on the EventEmitter instance inside the search handler are not being removed after request completion. Classic listener leak.
Day 2, 11:15 — Added emitter.removeAllListeners() in the request cleanup path. Deployed to staging. Memory stayed flat for 3 hours on staging.
Day 2, 12:00 — Hypothesis: Third-party elastic-client library leaks connections. Checked connection pool metrics — pool is healthy, no leaked connections detected. Hypothesis disproved.
Day 2, 13:30 — Listener fix not yet deployed to production. Need to verify it handles the edge case where a request is aborted mid-flight before merging.
Day 2, 14:00 — Still need to check: does the EventEmitter leak also exist in the /autocomplete endpoint which uses similar code?