sharaf/llm-learning-system-auditor

Use when the user wants to review, audit, or check safety for an AI memory system, agent learning pipeline, prompt-tuning workflow, skill builder, trace-mining tool, or eval/feedback loop. Produces an evidence-led audit report with learning-loop map, evidence inventory, maturity scorecard, severity-ranked findings, privacy/provenance gaps, counterfactual/eval coverage, and Stabilize/Standardize/Scale roadmap.

Quality: 100% (Does it follow best practices?)

Impact: 100%, 1.28x (average score across 3 eval scenarios)

Security (by Snyk): Passed, no known issues

evals/scenario-3/task.md

Audit Report: PromptTuner Automated Optimization System

Background

A product team has been running PromptTuner for six weeks, automatically improving their customer support agent's system prompt every week. The optimization loop has promoted a new prompt in each of its four completed runs, and the team is pleased with the improving scores in the log.

The head of AI is now asking for a full audit before expanding the system to additional product lines. She wants to understand how mature the current setup is, what risks exist, and what needs to be fixed before scaling further.

The source code and supporting files for PromptTuner are in the inputs/ directory. No access to live dashboards, session logs, or deployed infrastructure has been provided — only the files in that directory.

Output Specification

Produce a single file called audit_report.md with a complete audit of the PromptTuner system.

Your report should:

  • Assess how mature each relevant aspect of the learning pipeline is
  • Identify risks and issues with supporting evidence from the provided files
  • Give a prioritized roadmap the team can act on before scaling

Where you reference specific code behavior, cite the relevant file and line. Where information was not made available, note it as a gap rather than assuming it is in order.
