Name: markusdowne/memory-roundtrip-guard
Rating: 0.928 (1 review)
Author: markusdowne

markusdowne/memory-roundtrip-guard

Tests memory writes, confirms read-back accuracy, and validates retrieval success to ensure saved information can actually be recovered. Use when you need to verify memory was saved correctly, check if stored data can be retrieved, confirm a memory entry is discoverable, or escalate when saved information appears lost or corrupted. Covers write confirmation, read-back comparison, retrieval smoke testing, and failure escalation. Includes explicit untrusted-content/prompt-injection guardrails for third-party inputs.
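The flow described above (write, read back, smoke-test retrieval, classify, escalate) can be sketched roughly as follows. This is only an illustration, not the skill's actual implementation: `InMemoryStore`, `CheckResult`, `roundtrip_check`, and the `escalate_after` threshold are all hypothetical names, and the substring search stands in for whatever retrieval a real memory backend provides. The labels clean/operational/critical are taken from the eval rubric.

```python
from dataclasses import dataclass


class InMemoryStore:
    """Hypothetical stand-in for a real memory backend."""

    def __init__(self):
        self._data = {}

    def write(self, key, text):
        self._data[key] = text

    def read(self, key):
        return self._data.get(key)

    def search(self, phrase):
        # Naive substring search; returns the keys of matching entries.
        return [k for k, v in self._data.items() if phrase.lower() in v.lower()]


@dataclass
class CheckResult:
    key: str
    read_back_ok: bool
    retrieval_status: str  # "discoverable" or "retrieval miss"
    classification: str    # "clean", "operational", or "critical"
    action: str


def roundtrip_check(store, entries, escalate_after=2):
    """Run the guard over a list of (key, text, key_phrase) tuples."""
    results = []
    for key, text, phrase in entries:
        store.write(key, text)
        # Read-back comparison happens before the retrieval smoke test.
        read_back_ok = store.read(key) == text
        # Assert that this specific entry is in the results, not just any hit.
        found = key in store.search(phrase)
        ok = read_back_ok and found
        results.append(CheckResult(
            key=key,
            read_back_ok=read_back_ok,
            retrieval_status="discoverable" if found else "retrieval miss",
            classification="clean" if ok else "operational",
            action="none" if ok else "re-save the entry and re-run the check",
        ))
    # Assumed escalation rule: several failures in one run become critical.
    failures = [r for r in results if r.classification != "clean"]
    if len(failures) >= escalate_after:
        for r in failures:
            r.classification = "critical"
            r.action = "escalate: repeated round-trip failures"
    return results
```

A call such as `roundtrip_check(InMemoryStore(), [("pref-1", "Alice prefers tea", "prefers tea")])` returns one `CheckResult` per entry, so a report can show the retrieval status, classification, and recommended action side by side.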

1.19x

Quality

90%

Does it follow best practices?

Impact

97%

1.19x

Average score across 5 eval scenarios

{
  "context": "Tests whether the agent runs a retrieval smoke test after saving entries (querying by key phrase and confirming discoverability), reads back entries before the smoke test, classifies retrieval failures rather than suppressing them, and escalates repeated failures.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Read-back before smoke test",
      "description": "The code reads back each entry after writing before running the retrieval/search step",
      "max_score": 10
    },
    {
      "name": "Retrieval query performed",
      "description": "A search or query operation is performed using a key phrase from each saved entry (not just a direct key lookup)",
      "max_score": 18
    },
    {
      "name": "Discoverability asserted",
      "description": "The code explicitly checks that the saved entry appears in the search/query results (not just that results are non-empty)",
      "max_score": 18
    },
    {
      "name": "Retrieval status reported",
      "description": "The output includes a retrieval status for each entry (e.g., 'discoverable', 'retrieval miss')",
      "max_score": 10
    },
    {
      "name": "Retrieval miss classified",
      "description": "When an entry is not discoverable, the outcome is classified (e.g., as 'operational') rather than silently skipped",
      "max_score": 15
    },
    {
      "name": "Repeated failure escalated",
      "description": "When multiple entries fail retrieval (or a repeated failure pattern is detected), the classification escalates to 'critical' or the output explicitly escalates the issue",
      "max_score": 15
    },
    {
      "name": "Classification in output",
      "description": "Each entry or the overall report includes a classification label (clean/operational/critical or equivalent)",
      "max_score": 8
    },
    {
      "name": "Remediation action present",
      "description": "The output includes a recommended action or escalation step for any failed checks",
      "max_score": 6
    }
  ]
}
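The `max_score` values in this checklist sum to 100, so each one can be read as a percentage weight. How the registry actually aggregates a `weighted_checklist` is not stated here; the sketch below assumes the simplest reading, where the points awarded per item are clamped to `max_score`, summed, and divided by the total. `score_checklist` and the `awarded` mapping are hypothetical names.

```python
def score_checklist(rubric, awarded):
    """Return the fraction of available points earned (0.0 to 1.0).

    rubric: dict parsed from a weighted_checklist JSON file.
    awarded: mapping of checklist item name to points given (assumed input).
    """
    items = rubric["checklist"]
    total = sum(item["max_score"] for item in items)
    earned = 0
    for item in items:
        points = awarded.get(item["name"], 0)
        # Clamp so an item can never contribute more than its weight.
        earned += max(0, min(points, item["max_score"]))
    return earned / total


# Weights copied from the rubric above; descriptions omitted for brevity.
rubric = {
    "type": "weighted_checklist",
    "checklist": [
        {"name": "Read-back before smoke test", "max_score": 10},
        {"name": "Retrieval query performed", "max_score": 18},
        {"name": "Discoverability asserted", "max_score": 18},
        {"name": "Retrieval status reported", "max_score": 10},
        {"name": "Retrieval miss classified", "max_score": 15},
        {"name": "Repeated failure escalated", "max_score": 15},
        {"name": "Classification in output", "max_score": 8},
        {"name": "Remediation action present", "max_score": 6},
    ],
}
```

Under this assumption, a run that satisfies only the two 18-point retrieval items would score 36 of 100.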

Install with Tessl CLI

npx tessl i markusdowne/memory-roundtrip-guard

evals

scenario-1

scenario-2

scenario-3

scenario-4

scenario-5

evals/scenario-4/rubric.json