Use when the user wants to review, audit, or safety-check an AI memory system, agent learning pipeline, prompt-tuning workflow, skill builder, trace-mining tool, or eval/feedback loop. Produces an evidence-led audit report with a learning-loop map, evidence inventory, maturity scorecard, severity-ranked findings, privacy/provenance gaps, counterfactual/eval coverage, and a Stabilize/Standardize/Scale roadmap.
100
100%
Does it follow best practices?
Impact
100%
1.28x
Average score across 3 eval scenarios
Passed
No known issues
Memory extraction, rule induction, provenance, and rollout audit for session-history learning
Learning-loop map: 90%, 100%
0-4 maturity scorecard: 100%, 100%
Trace provenance and clean-evidence gap: 50%, 100%
Memory governance finding: 83%, 100%
Rule induction guardrails: 58%, 100%
Review, promotion, and rollback: 75%, 100%
Privacy and retention risk: 80%, 100%
Counterfactual eval coverage: 50%, 100%
Severity-ranked findings: 62%, 100%
Sequenced roadmap: 100%, 100%
Generated executable skill verification, provenance, sandboxing, and registry promotion audit
Lifecycle and maturity map: 100%, 100%
Executable sandbox risk: 93%, 100%
Provenance metadata: 83%, 100%
Registry is not verification: 90%, 100%
Eval gate coverage: 76%, 100%
Human review and rollback: 83%, 100%
Trigger and activation safety: 37%, 100%
Deployment controls: 87%, 100%
Severity-ranked findings: 66%, 100%
Stabilize before scale roadmap: 100%, 100%
Maturity scoring, report structure, and LLM-judge guardrails in a prompt optimization audit
0-4 maturity scores: 16%, 100%
No collapsed average: 100%, 100%
Required report headings: 58%, 100%
LLM-as-judge calibration flag: 91%, 100%
Post-hoc redaction guardrail: 100%, 100%
Transcript evidence gap: 70%, 100%
Validation split and optimize-on-gate finding: 88%, 100%
Roadmap with buckets: 90%, 100%
No safe-learning claim: 100%, 100%