CtrlK
BlogDocsLog inGet started
Tessl Logo

pantheon-ai/troubleshoot

Search-first troubleshooting with a diagnostic phase — use when an error, bug, or unexpected behaviour is reported.

75

Quality

75%

Does it follow best practices?

Impact

Pending

No eval scenarios have been run

SecuritybySnyk

Advisory

Suggest reviewing before use

Overview
Quality
Evals
Security
Files

reference.mdreferences/

Troubleshoot Reference

1. Isolation Techniques

Wolf Fence (Binary Search)

Divide and conquer. Cut search space in half each step.

StepAction
1Comment out / disable half the code
2Bug gone? → It's in removed half. Bug persists? → Other half
3Repeat on guilty half until single line/block found

Git variant: git bisect startgit bisect badgit bisect good <commit> → Git finds guilty commit in O(log n)

Swap One Variable

When multiple suspects, change only one at a time:

  • Works in env A but not B? → Diff A and B
  • Works with data X but not Y? → Diff X and Y
  • Works before commit Z? → Diff before/after Z

Minimal Reproduction

Reduce to smallest case that still fails:

  1. Remove code/config until bug disappears
  2. Add back last removed piece
  3. That's your culprit

2. Root Cause Drilling

5 Whys

Keep asking "why?" until you hit root cause (usually 3-5 levels):

Problem: API returns 500
→ Why? Unhandled exception
→ Why? Null pointer on user.email
→ Why? User record has no email
→ Why? Migration didn't backfill
→ Why? Script assumed all users have email ← ROOT CAUSE

Rule: Stop when answer is actionable fix.

Fishbone (6 M's)

For complex bugs with unclear cause, brainstorm across categories:

CategoryQuestions
ManWho touched it last? Training gap? Handoff issue?
MachineHardware? Memory? CPU? Disk? Network?
MethodProcess flaw? Missing step? Wrong order?
MaterialBad input data? Corrupt file? Wrong format?
MeasurementMissing logs? Wrong metric? Alert gap?
MilieuEnvironment diff? Config drift? External dep?

Pick 2-3 most likely categories, investigate those first.

3. Assumption Surfacing

Rubber Duck

Explain the problem out loud, line by line:

  1. "This function should do X..."
  2. "First it gets Y from Z..."
  3. "Then it— wait, that's not what it does"

Why it works: Forces you to articulate assumptions. Bug often surfaces mid-explanation.

Verify → Shorten → Reveal

PhaseAction
VerifyDoes bug actually exist as reported? Reproduce first.
ShortenMake feedback loop fast (seconds not minutes). Write a test.
RevealAdd logging at boundaries. Test each assumption explicitly.

4. Investigation Loop (OODA)

When isolation + drilling doesn't resolve:

OBSERVE → ORIENT → DECIDE → ACT → repeat
PhaseActionExample
ObserveGather data"Run X and paste output"
OrientForm hypothesis"This suggests Y because..."
DecidePick next action"Let's verify by checking Z"
ActUser executes"Run this command: ..."

Exit when: Root cause confirmed + fix verified

Stuck? Go back to Search (step 1 in SKILL.md) with new keywords from investigation.

references

reference.md

troubleshoot-to-frame-problem-llm.md

SKILL.md

tile.json