
sharaf/llm-learning-system-auditor

Use when the user wants to review, audit, or check safety for an AI memory system, agent learning pipeline, prompt-tuning workflow, skill builder, trace-mining tool, or eval/feedback loop. Produces an evidence-led audit report with learning-loop map, evidence inventory, maturity scorecard, severity-ranked findings, privacy/provenance gaps, counterfactual/eval coverage, and Stabilize/Standardize/Scale roadmap.

Quality: 100% (Does it follow best practices?)
Impact: 100%, 1.28x (average score across 3 eval scenarios)
Security (by Snyk): Passed, no known issues


evals/scenario-1/task.md

Audit Report: MemoryHarvester Rule and Memory Learning Loop

Background

The platform team built MemoryHarvester to turn agent session history into durable memories and global operating rules. It has been running for a month on internal support and engineering-agent traffic.

The team wants an audit before enabling MemoryHarvester for customer-facing agents. They provided only the files in inputs/. You do not have live dashboards, raw session samples, annotation queues, production incidents, or access-control policies.

Output Specification

Produce a single file called audit_report.md with a complete audit of MemoryHarvester.

Your report should:

  • Map the learning loop from session event to promoted memory or rule
  • Score relevant maturity areas on the 0-4 scale
  • Identify evidence-backed risks and gaps with file and line references
  • Prioritize fixes before customer-facing rollout

Where evidence is missing, label it as a gap rather than assuming the system is safe.
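
One way to keep findings honest about their evidence is to record each one with an optional file and line reference, and to render anything without a reference as an explicit gap. The sketch below is only an illustration of that idea: the `Finding` class, the severity labels, the `rank` helper, and the placeholder path `inputs/example.yaml` are all assumptions, not part of the task or of MemoryHarvester.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical severity labels; the task does not prescribe a specific set.
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass
class Finding:
    title: str
    severity: str
    file: Optional[str] = None  # path under inputs/, if evidence exists
    line: Optional[int] = None  # line number within that file, if known

    @property
    def is_gap(self) -> bool:
        # No file reference means the claim cannot be backed by the inputs.
        return self.file is None

    def render(self) -> str:
        if self.is_gap:
            evidence = "GAP: no evidence in inputs/"
        else:
            loc = self.file if self.line is None else f"{self.file}:{self.line}"
            evidence = f"evidence: {loc}"
        return f"- [{self.severity}] {self.title} ({evidence})"


def rank(findings: List[Finding]) -> List[Finding]:
    # Most severe first; Python's sort is stable, so ties keep input order.
    return sorted(findings, key=lambda f: SEVERITY_ORDER[f.severity])


# Placeholder entries for illustration only, not real audit results.
findings = [
    Finding("Placeholder finding without evidence", "medium"),
    Finding("Placeholder finding with evidence", "high", "inputs/example.yaml", 12),
]

for finding in rank(findings):
    print(finding.render())
```

Rendering the gap label at the point of output, rather than leaving it to the author of each finding, makes it harder for an unsupported claim to slip into the report looking like a confirmed one.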
