Name: markusdowne/memory-roundtrip-guard
Rating: 0.928 (1 review)
Author: markusdowne

Tests memory writes, confirms read-back accuracy, and validates retrieval success to ensure saved information can actually be recovered. Use when you need to verify memory was saved correctly, check if stored data can be retrieved, confirm a memory entry is discoverable, or escalate when saved information appears lost or corrupted. Covers write confirmation, read-back comparison, retrieval smoke testing, and failure escalation. Includes explicit untrusted-content/prompt-injection guardrails for third-party inputs.
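
The three checks named above (write confirmation, read-back comparison, retrieval smoke test) can be pictured with a minimal sketch. It assumes a plain Python dict as the memory store and a caller-supplied search function. The names `roundtrip_check` and `find_by_keyword` are hypothetical and are not taken from the skill itself.

```python
def roundtrip_check(store, key, value, search):
    """Write a value, read it back, then confirm it is discoverable."""
    report = {"key": key, "write": "fail", "read_back": "fail", "retrieval": "fail"}

    # 1. Write confirmation: the write must complete without raising.
    try:
        store[key] = value
        report["write"] = "ok"
    except Exception:
        return report  # nothing to read back if the write itself failed

    # 2. Read-back comparison: the stored value must equal what was written.
    if store.get(key) == value:
        report["read_back"] = "ok"

    # 3. Retrieval smoke test: the entry must be found by search, not only by key.
    if key in search(store):
        report["retrieval"] = "ok"

    return report


def find_by_keyword(keyword):
    """Toy search (an assumption): return keys whose value mentions the keyword."""
    return lambda store: [k for k, v in store.items() if keyword in str(v)]


memory = {}
result = roundtrip_check(memory, "user.timezone", "Europe/Berlin", find_by_keyword("Berlin"))
print(result)
```

Keeping retrieval separate from read-back matters because an entry can be stored correctly under its key and still be missed by the search path that later lookups depend on.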

Quality: 90% (Does it follow best practices?)

Impact: 97%, 1.19x (Average score across 5 eval scenarios)

{
  "context": "Tests whether the agent correctly classifies memory verification failures by severity (clean/operational/critical), escalates repeated or suspected data-loss failures rather than suppressing them, records unresolved failures in a daily digest, and includes all required output sections.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Clean classification",
      "description": "Entries where all checks pass are classified as 'clean' (or equivalent passing label)",
      "max_score": 8
    },
    {
      "name": "Operational classification",
      "description": "Entries with a single write/read mismatch or a single retrieval miss are classified as 'operational' (or equivalent single-failure label)",
      "max_score": 12
    },
    {
      "name": "Critical classification",
      "description": "Entries with repeated mismatches or suspected data loss are classified as 'critical' (or equivalent high-severity label)",
      "max_score": 15
    },
    {
      "name": "Repeated failure not suppressed",
      "description": "When multiple verification cycles fail for the same or different entries, the failures are explicitly surfaced in the output — not silently counted or omitted",
      "max_score": 15
    },
    {
      "name": "Daily digest section",
      "description": "The output includes a section or field explicitly labelled as a digest, summary, or daily report that lists unresolved failures",
      "max_score": 15
    },
    {
      "name": "Unresolved failures in digest",
      "description": "The daily digest section contains the entries that were classified as 'operational' or 'critical' and remain unresolved",
      "max_score": 12
    },
    {
      "name": "Remediation/escalation per entry",
      "description": "Each entry with a non-clean classification includes a recommended remediation or escalation action",
      "max_score": 12
    },
    {
      "name": "Write status reported",
      "description": "The output includes a write status for each verification entry",
      "max_score": 5
    },
    {
      "name": "Read-back status reported",
      "description": "The output includes a read-back status for each verification entry",
      "max_score": 3
    },
    {
      "name": "Retrieval status reported",
      "description": "The output includes a retrieval status for each verification entry",
      "max_score": 3
    }
  ]
}
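
The severity rules in this rubric can be sketched in a few lines. This is an illustration under stated assumptions, not the skill's own logic: the `Check` fields, the `ACTIONS` wording, and the choice to treat any failure in a first failing cycle as 'operational' are all hypothetical readings of the checklist descriptions.

```python
from dataclasses import dataclass


@dataclass
class Check:
    entry_id: str
    write_ok: bool
    read_back_ok: bool
    retrieval_ok: bool
    prior_failed_cycles: int = 0      # assumed: failed cycles before this one
    suspected_data_loss: bool = False


def classify(check):
    """Map one verification result to clean / operational / critical."""
    failed_now = not (check.write_ok and check.read_back_ok and check.retrieval_ok)
    if check.suspected_data_loss or (failed_now and check.prior_failed_cycles > 0):
        return "critical"      # repeated mismatch or suspected data loss
    if failed_now:
        return "operational"   # first failing cycle for this entry
    return "clean"


# Assumed remediation text; the rubric only requires that some action is given.
ACTIONS = {
    "operational": "retry the write and re-verify",
    "critical": "escalate and flag the entry for manual review",
}


def build_report(checks):
    """Report every entry, and list unresolved failures in a daily digest."""
    def status(ok):
        return "ok" if ok else "fail"

    entries = []
    for c in checks:
        severity = classify(c)
        entries.append({
            "entry_id": c.entry_id,
            "write": status(c.write_ok),
            "read_back": status(c.read_back_ok),
            "retrieval": status(c.retrieval_ok),
            "severity": severity,
            "action": ACTIONS.get(severity),  # None for clean entries
        })
    # Failures are listed individually rather than reduced to a count.
    digest = [e for e in entries if e["severity"] != "clean"]
    return {"entries": entries, "daily_digest": digest}


report = build_report([
    Check("a", True, True, True),
    Check("b", True, False, True),
    Check("c", True, True, False, prior_failed_cycles=2),
])
print([e["severity"] for e in report["entries"]])
```

Each entry carries its write, read-back, and retrieval status alongside the severity, so the digest can be read without going back to the per-entry section.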

Install with Tessl CLI

npx tessl i markusdowne/memory-roundtrip-guard@0.1.2

evals

scenario-1

scenario-2

scenario-3

scenario-4

scenario-5

evals/scenario-5/rubric.json