
tessl-labs/eval-setup

Generate eval scenarios from repo commits, configure multi-agent runs, execute baseline + with-context evals, and compare results — the full setup pipeline before improvement begins

Quality: 90% (Does it follow best practices?)

Impact: 91%, 3.37x (average score across 2 eval scenarios)

Security by Snyk: Advisory. Suggest reviewing before use.


Evaluation results

79%

50%

Scenario 1

| Criteria | Without context | With context |
| --- | --- | --- |
| checks_prerequisites | 50% | 100% |
| browses_commits | 0% | 16% |
| auto_detects_context_files | 0% | 100% |
| uses_context_flag | 50% | 100% |
| workspace_in_eval_run | 0% | 100% |
| explains_baseline_vs_context | 100% | 100% |

95%

70%

Scenario 2

| Criteria | Without context | With context |
| --- | --- | --- |
| does_not_use_last_only | 0% | 100% |
| finds_generation_ids | 75% | 100% |
| downloads_each_separately | 33% | 100% |
| explains_why | 0% | 75% |
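
The per-criterion scores above can be compared directly. The following is a minimal sketch, not Tessl's scoring method: it copies the criterion scores from the two scenarios and reports the change in percentage points, plus the criteria that still fall short with context. The names `SCENARIOS`, `gains`, and `weakest_with_context` are hypothetical, and the page does not state how its headline figures are aggregated, so no aggregate is computed here.

```python
# Per-criterion scores (percent) copied from the two scenarios above,
# stored as (without context, with context).
SCENARIOS = {
    "Scenario 1": {
        "checks_prerequisites": (50, 100),
        "browses_commits": (0, 16),
        "auto_detects_context_files": (0, 100),
        "uses_context_flag": (50, 100),
        "workspace_in_eval_run": (0, 100),
        "explains_baseline_vs_context": (100, 100),
    },
    "Scenario 2": {
        "does_not_use_last_only": (0, 100),
        "finds_generation_ids": (75, 100),
        "downloads_each_separately": (33, 100),
        "explains_why": (0, 75),
    },
}


def gains(criteria):
    """Return {criterion: with_context - without_context} in percentage points."""
    return {name: with_ctx - without for name, (without, with_ctx) in criteria.items()}


def weakest_with_context(criteria, threshold=100):
    """List criteria that still score below `threshold` percent with context."""
    return [name for name, (_, with_ctx) in criteria.items() if with_ctx < threshold]


for scenario, criteria in SCENARIOS.items():
    print(scenario)
    for name, delta in gains(criteria).items():
        print(f"  {name}: {delta:+d} points")
    print("  below 100% with context:", weakest_with_context(criteria))
```

Read this way, the criteria that remain below 100% with context are `browses_commits` in Scenario 1 and `explains_why` in Scenario 2.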

Evaluated
Agent: Claude Code
Model: Claude Sonnet 4.6
