
coding-agent-helpers/compact-debug-ledger

Use when a debugging thread needs to be compressed into a reusable investigation ledger. Capture the target, evidence, attempted fixes, ruled-out hypotheses, viable hypotheses, and next experiments. Good triggers include "compact this debugging session", "summarize what we've tried", and "turn this into a debugging ledger".



Compressing a Code-Heavy Debugging Session

Problem Description

A backend engineer has been debugging a tricky serialization bug in a Rust service. The investigation notes contain extensive code snippets, stack traces, and implementation details from exploratory dead ends. You need to turn these notes into a compact debugging ledger that a colleague can pick up and use to run the next experiment immediately, without wading through all the code history.

Produce a compact investigation record from the notes below and save it as rust_debug.md.

Input Files

The following file is provided as input. Extract it before beginning.

=============== FILE: inputs/rust_notes.md ===============

Rust Serialization Bug - Investigation Notes

Problem

Intermittent panic in production: called Result::unwrap() on an Err value: Error("missing field 'user_id'", line: 1, column: 847) in the deserialize_event function.

Affects roughly 1 in 10,000 events. Only in production, never in staging or local dev.

Attempt 1: Check if it's a schema mismatch

Added debug logging to dump the raw JSON before deserialization. Captured a failing payload.

The raw JSON looked like this (abbreviated — actual payload was 2.8KB):

{
  "event_type": "purchase",
  "timestamp": "2024-04-01T10:22:33Z",
  "metadata": {
    "session_id": "abc123",
    "client_version": "3.2.1",
    ... (200 fields of metadata)
  },
  "items": [
    {"sku": "PROD-001", "quantity": 2, "price": 29.99},
    ... (47 items)
  ]
}

Note: user_id is completely absent from this payload. The payload itself is missing the field, so this is not a parsing issue.

Result: confirmed — some events arrive without user_id. This is the direct cause of the panic.

Tried to understand why by checking the event source. The event producer is a Go service. Checked the Go service code:

type PurchaseEvent struct {
    EventType string    `json:"event_type"`
    Timestamp time.Time `json:"timestamp"`
    UserID    *string   `json:"user_id,omitempty"`  // <-- omitempty!
    Metadata  map[string]interface{} `json:"metadata"`
    Items     []Item    `json:"items"`
}

The Go struct uses omitempty on UserID, and UserID is a pointer — so if the user is a guest (not logged in), UserID is nil and gets omitted from the JSON. This is legitimate behavior for guest checkout events.

Attempt 2: Check why the Rust deserializer panics on missing optional field

The Rust struct was:

#[derive(Deserialize)]
struct PurchaseEvent {
    event_type: String,
    timestamp: DateTime<Utc>,
    user_id: String,  // <-- NOT Option<String>!
    metadata: HashMap<String, Value>,
    items: Vec<Item>,
}

Confirmed: user_id is String not Option<String> in the Rust struct. Serde requires the field to be present unless it's Option<T> or has a #[serde(default)] attribute.

Fixed by changing to:

user_id: Option<String>,

Result: worked — no more panics in staging after the fix. But wait...

Attempt 3: Check downstream code that uses user_id

After changing to Option<String>, compiler errors appeared in 14 places in the codebase that accessed event.user_id directly and expected a String.

Had to update all 14 call sites. This took 2 hours. Changes included:

  • analytics_service.rs line 234: event.user_id → event.user_id.as_deref().unwrap_or("guest")
  • billing_service.rs lines 89, 112, 156: similar pattern
  • audit_log.rs lines 33, 67, 99, 145, 178, 203, 211, 288, 312: used event.user_id.as_deref().unwrap_or_default()
  • user_activity.rs lines 445, 501: used if let Some(uid) = &event.user_id

All 14 changes were made. Full test suite passes. Changes are in PR #4821.
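The three call-site patterns listed above behave differently for guest events. The sketch below uses only the standard library; the struct is a hypothetical one-field stand-in, and the function names and the "activity:" key format are assumptions made for illustration, not code from PR #4821.

```rust
// Hypothetical, trimmed-down event: only the field whose type changed.
struct PurchaseEvent {
    user_id: Option<String>,
}

/// analytics_service.rs pattern: guests are grouped under the literal "guest".
fn analytics_user(event: &PurchaseEvent) -> &str {
    event.user_id.as_deref().unwrap_or("guest")
}

/// audit_log.rs pattern: guests are recorded with an empty string.
fn audit_user(event: &PurchaseEvent) -> &str {
    event.user_id.as_deref().unwrap_or_default()
}

/// user_activity.rs pattern: guests are skipped entirely.
/// The "activity:" key format is an assumption for this sketch.
fn activity_key(event: &PurchaseEvent) -> Option<String> {
    if let Some(uid) = &event.user_id {
        Some(format!("activity:{uid}"))
    } else {
        None
    }
}

fn main() {
    let guest = PurchaseEvent { user_id: None };
    let member = PurchaseEvent { user_id: Some("user123".to_string()) };

    assert_eq!(analytics_user(&guest), "guest");
    assert_eq!(analytics_user(&member), "user123");

    assert_eq!(audit_user(&guest), "");
    assert_eq!(audit_user(&member), "user123");

    assert_eq!(activity_key(&guest), None);
    assert_eq!(activity_key(&member), Some("activity:user123".to_string()));
}
```

The fallback matters downstream: "guest" and "" are both valid strings, so any consumer that treats user_id as a real identifier needs to know which sentinel each service writes.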

Attempt 4: Validate the fix handles edge cases

Wrote a property-based test using proptest:

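// Brings the proptest! macro, prop_assert!, and strategy helpers into scope.
use proptest::prelude::*;

proptest! {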
    #[test]
    fn test_deserialize_with_missing_user_id(
        event_type in "[a-z]{5,20}",
        has_user_id in proptest::bool::ANY,
    ) {
        let json = if has_user_id {
            format!(r#"{{"event_type":"{}","timestamp":"2024-01-01T00:00:00Z","user_id":"user123","metadata":{{}},"items":[]}}"#, event_type)
        } else {
            format!(r#"{{"event_type":"{}","timestamp":"2024-01-01T00:00:00Z","metadata":{{}},"items":[]}}"#, event_type)
        };
        let result: Result<PurchaseEvent, _> = serde_json::from_str(&json);
        prop_assert!(result.is_ok());
    }
}

Test passes. 10,000 random cases, no panics.

Open Questions

  • Are there other event types (not just PurchaseEvent) that have the same omitempty/Option mismatch problem?
  • Should we add a lint rule or schema validation layer to catch this class of bug at the producer-consumer boundary?

Status

PR #4821 ready for review. Root cause confirmed and fixed. Two open questions above remain.
