coding-agent-helpers/compact-debug-ledger

Use when a debugging thread needs to be compressed into a reusable investigation ledger. Capture the target, evidence, attempted fixes, ruled-out hypotheses, viable hypotheses, and next experiments. Good triggers include "compact this debugging session", "summarize what we've tried", and "turn this into a debugging ledger".
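
The six parts named above (target, evidence, attempted fixes, ruled-out hypotheses, viable hypotheses, next experiments) map naturally onto a small record type. The following is a minimal sketch, assuming a Markdown output with one section per part. The `Ledger` type, its field names, the section titles, and the sample entries are all hypothetical illustrations, not the skill's actual output format.

```go
package main

import (
	"fmt"
	"strings"
)

// Ledger is a hypothetical container for the six ledger parts
// named in the skill description.
type Ledger struct {
	Target          string
	Evidence        []string
	AttemptedFixes  []string
	RuledOut        []string
	Viable          []string
	NextExperiments []string
}

// Markdown renders the ledger with one section per part.
// Empty list sections get an explicit placeholder bullet so that
// a missing entry is visible rather than silently dropped.
func (l Ledger) Markdown() string {
	var b strings.Builder
	fmt.Fprintf(&b, "# Debug Ledger\n\n## Target\n\n%s\n", l.Target)

	sections := []struct {
		title string
		items []string
	}{
		{"Evidence", l.Evidence},
		{"Attempted fixes", l.AttemptedFixes},
		{"Ruled-out hypotheses", l.RuledOut},
		{"Viable hypotheses", l.Viable},
		{"Next experiments", l.NextExperiments},
	}
	for _, s := range sections {
		fmt.Fprintf(&b, "\n## %s\n\n", s.title)
		if len(s.items) == 0 {
			b.WriteString("- (none recorded)\n")
			continue
		}
		for _, item := range s.items {
			fmt.Fprintf(&b, "- %s\n", item)
		}
	}
	return b.String()
}

func main() {
	// Illustrative placeholder entries only.
	l := Ledger{
		Target:          "Example: elevated error rate on one endpoint",
		Evidence:        []string{"Example: error message seen in application logs"},
		RuledOut:        []string{"Example: a dependency whose metrics are at baseline"},
		NextExperiments: []string{"Example: verify the candidate fix in a test environment"},
	}
	fmt.Print(l.Markdown())
}
```

Keeping ruled-out and viable hypotheses in separate lists is the design choice that matters most here: it lets a newcomer see at a glance which dead ends not to revisit.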

Quality: 100%. Does it follow best practices?

Impact: 99% (3.66x). Average score across 8 eval scenarios.

Security: Passed. No known issues.

Debugging Session Summary

Name: coding-agent-helpers/compact-debug-ledger
Rating: 99.3 (1 review)
Author: coding-agent-helpers

Problem Description

Your team has been investigating a production incident for the past 3 hours. The conversation has grown long and messy — people have gone back and forth, explored dead ends, and made progress that's now buried in chat history. A new engineer is joining the investigation and needs to get up to speed quickly without reading the entire conversation.

Produce a concise investigation summary from the transcript below that captures where things stand. Save the summary to a file called debug_ledger.md.

Input Files

The following file is provided as input. Extract it before beginning.

=============== FILE: inputs/session.md ===============

Debug Session Transcript

[09:02] Alice: Hey team, we're getting a flood of 500 errors on the /checkout endpoint starting around 8:55am. Error rate jumped from <0.1% to about 12%.

[09:04] Bob: Let me check the logs. Yeah I see a lot of "connection pool exhausted" errors in the app logs.

[09:05] Alice: Could be the database. Did we deploy anything this morning?

[09:06] Bob: Yes, we deployed v2.3.1 at 8:50am. That release only changed the product recommendation engine though, shouldn't touch checkout.

[09:08] Alice: Let me check if it's the DB. I'll query the slow query log. Wait, actually let me grab coffee first.

[09:12] Bob: While you were gone I checked — DB CPU is at 22%, totally normal. Query latency is also fine, p99 is 45ms which is baseline.

[09:14] Alice: Oh interesting. So not the DB itself. Could be connection pool misconfiguration in the new deploy?

[09:15] Carol: I joined late, what's the issue?

[09:16] Alice: 500s on checkout since 8:55, connection pool exhausted errors.

[09:17] Carol: Have you checked thread pool? The new recommendation engine runs async workers.

[09:18] Bob: Good call. I see the recommendation service is spawning workers but not releasing them. The thread pool is at 98% capacity.

[09:20] Alice: Is it related to the deploy? Was this worker leak there before?

[09:22] Bob: I checked git blame. The async worker lifecycle code was changed in v2.3.1. There's a code path where if the recommendation API times out, the worker goroutine is never cleaned up.

[09:23] Carol: So the recommendation API is timing out? Let me check its latency... yeah it's showing p99 of 12 seconds, way above the 2 second timeout.

[09:25] Alice: So the recommendation API is slow → causes timeout → worker goroutines leak → thread pool exhausts → checkout requests fail. That's the chain.

[09:26] Bob: Makes sense. We could fix by: (a) rolling back v2.3.1, (b) patching the goroutine leak with a defer cleanup, or (c) increasing the thread pool size as a bandaid.

[09:27] Carol: Why is the recommendation API slow though? That seems like root cause.

[09:28] Alice: Recommendation API queries a Redis cache that may have been flushed. Let me check... yes, Redis was restarted at 8:53am for a routine maintenance window. Caused cache miss storm.

[09:30] Bob: So the Redis restart triggered the slowness, but the goroutine leak in v2.3.1 is what turned a slow API into total checkout failure.

[09:31] Carol: Redis should be fully warmed up again in a few minutes based on traffic patterns. But we still have the goroutine leak.

[09:32] Alice: Agreed. Immediate options: rollback v2.3.1 or deploy a hotfix. What's the recommended fix?

[09:33] Bob: I'm writing the hotfix now — adding a defer statement to clean up goroutines on the timeout path.

[09:35] Carol: Should be a one-line fix. I'll review it. Also, should we increase alerting on thread pool saturation? This would've caught it faster.

[09:37] Alice: Yes but let's fix first. Is the hotfix ready?

[09:38] Bob: Almost. Testing it locally now.
