tessl-labs/eval-improve

Analyze eval results, diagnose low-scoring criteria, fix tile content, and re-run evals — the full improvement loop automated

94

1.30x

Quality: 89% (Does it follow best practices?)

Impact: 98%, 1.30x (average score across 7 eval scenarios)

Security by Snyk: Passed, no known issues

evals/scenario-2/task.md

You have eval results for a tile. The user wants you to analyze them.

The user says: "My eval scores look bad. Can you look at the latest results and tell me what needs to be fixed and in what priority order?"

Analyze the eval results by running the appropriate commands, classify each criterion into the correct bucket, and present a clear summary to the user.
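
As a rough illustration of the classify-and-prioritize step described above, here is a minimal Python sketch. It makes several assumptions that the task text does not confirm: that each criterion carries a single score on a 0-100 scale, that three buckets are enough, and that 50 and 80 are sensible cut-offs. The bucket names (`fix_first`, `improve`, `ok`), the `Criterion` type, and both thresholds are hypothetical; substitute whatever buckets the tile itself defines.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    """One eval criterion and its score (hypothetical shape, 0-100 scale)."""
    name: str
    score: float


# Hypothetical bucket names, listed from most to least urgent.
BUCKET_ORDER = ["fix_first", "improve", "ok"]


def classify(criterion: Criterion, low: float = 50.0, high: float = 80.0) -> str:
    """Assign a criterion to a bucket using assumed thresholds."""
    if criterion.score < low:
        return "fix_first"
    if criterion.score < high:
        return "improve"
    return "ok"


def prioritize(criteria: list[Criterion]) -> list[Criterion]:
    """Sort by bucket urgency first, then by lowest score within a bucket."""
    return sorted(
        criteria,
        key=lambda c: (BUCKET_ORDER.index(classify(c)), c.score),
    )


def summarize(criteria: list[Criterion]) -> str:
    """Render a plain-text summary, one line per criterion, in priority order."""
    lines = []
    for rank, c in enumerate(prioritize(criteria), start=1):
        lines.append(f"{rank}. [{classify(c)}] {c.name}: {c.score:.0f}")
    return "\n".join(lines)
```

Sorting on the bucket index before the raw score keeps the most urgent bucket at the top of the summary even if thresholds are later adjusted, which matches the user's request for a priority order rather than a flat list.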

evals

README.md

tile.json