Analyse human-AI collaboration patterns and compute quality metrics from captured session data.
88%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Risky
Do not use without reviewing