Use when the user wants to review, audit, or check safety for an AI memory system, agent learning pipeline, prompt-tuning workflow, skill builder, trace-mining tool, or eval/feedback loop. Produces an evidence-led audit report with learning-loop map, evidence inventory, maturity scorecard, severity-ranked findings, privacy/provenance gaps, counterfactual/eval coverage, and Stabilize/Standardize/Scale roadmap.
Score: 100. Quality: 100% (does it follow best practices?). Impact: 100%, 1.28x average score across 3 eval scenarios. Passed: no known issues.
Use this Tessl skill to audit systems that learn from LLM session history, tool-use traces, feedback, evals, or production failures. It focuses on whether raw experience can safely become durable memories, rules, prompts, skills, evals, or policy changes.
Use when reviewing an agent memory system, trace store, skill builder, rule learner, prompt optimizer, eval pipeline, session-history mining workflow, or agent observability stack. It is also suited to symptoms such as stale memories, bad learned rules, unsafe skill activation, weak evals, missing provenance, cost spikes, context bloat, unreviewed prompt changes, or unclear rollback.
tessl install sharaf/llm-learning-system-auditor

| Path | Purpose |
|---|---|
| skills/llm-learning-system-auditor/SKILL.md | Activation, first actions, workflow, maturity scoring, finding contract |
| skills/llm-learning-system-auditor/references/evidence-inventory.md | Evidence table and inventory rules |
| skills/llm-learning-system-auditor/references/audit-domains.md | Domain-by-domain audit checks |
| skills/llm-learning-system-auditor/references/generated-skill-checks.md | Generated executable skill and registry rollout checks |
| skills/llm-learning-system-auditor/references/findings-and-roadmap.md | Severity classification and roadmap sequencing |
| skills/llm-learning-system-auditor/references/report-template-and-guardrails.md | Final report template, guardrails, and success criteria |

The skill produces an evidence-led audit with a learning-loop brief, evidence inventory, maturity scorecard, severity-ranked findings, domain assessment, privacy/provenance notes, counterfactual coverage, observability notes, failure mode review, prioritized roadmap, and open questions.
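To make "severity-ranked findings" concrete, here is a minimal sketch of how a finding record might be modeled and ordered. Everything in it is an assumption for illustration: the `Severity` levels, the `Finding` fields, and the `rank_findings` helper are hypothetical names, not the skill's actual finding contract, which is defined in SKILL.md and references/findings-and-roadmap.md.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical scale; the skill's real levels may differ.
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass(frozen=True)
class Finding:
    title: str
    severity: Severity
    evidence: tuple[str, ...]  # pointers into the evidence inventory (assumed shape)


def rank_findings(findings):
    """Return evidence-backed findings, most severe first.

    Findings with no evidence are left out of the ranking, on the
    assumption that an evidence-led report would list them as open
    questions instead.
    """
    backed = [f for f in findings if f.evidence]
    return sorted(backed, key=lambda f: f.severity, reverse=True)


findings = [
    Finding("Stale memories are never expired", Severity.MEDIUM, ("memory-store config",)),
    Finding("Prompt changes ship unreviewed", Severity.CRITICAL, ("deploy log",)),
    Finding("Rollback path is unclear", Severity.HIGH, ()),
]

ranked = rank_findings(findings)
for f in ranked:
    print(f"[{f.severity.name}] {f.title}")
```

Using an `IntEnum` keeps the ordering explicit and lets the sort key be the severity itself; a real audit would likely carry more fields, such as affected domain and recommended fix.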
Full-tile verification run: 019e5647-40cf-77ef-83aa-9a32d780c092.

| Scenario | Baseline | With Skill |
|---|---|---|
| MemoryHarvester rule and memory audit | 74% | 100% |
| SkillForge generated skill audit | 83% | 100% |
| PromptTuner optimization audit | 78% | 100% |
With-context average: 100%.
Registry search reports aggregate 100%, Quality 100%, Impact 100%, and 3 eval scenarios.
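The averages can be checked with a few lines of arithmetic. The sketch below recomputes them from the scenario table; it assumes the 1.28x figure is the with-skill average divided by the baseline average, which matches the numbers but is not stated by the registry.

```python
from statistics import mean

# (baseline %, with-skill %) per scenario, copied from the table.
scores = {
    "MemoryHarvester rule and memory audit": (74, 100),
    "SkillForge generated skill audit": (83, 100),
    "PromptTuner optimization audit": (78, 100),
}

baseline_avg = mean(b for b, _ in scores.values())
with_skill_avg = mean(w for _, w in scores.values())

# Assumption: the multiplier is the ratio of the two averages.
uplift = with_skill_avg / baseline_avg

print(f"baseline average:   {baseline_avg:.1f}%")
print(f"with-skill average: {with_skill_avg:.1f}%")
print(f"uplift:             {uplift:.2f}x")
```

Under that assumption the baseline average is about 78.3%, and 100 / 78.3 rounds to 1.28.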