Head-to-head comparison of coding agents (Claude Code, Aider, Codex, etc.) on custom tasks with pass rate, cost, time, and consistency metrics
Does it follow best practices? 72%
Impact: Pending. No eval scenarios have been run.
Passed: No known issues.
{
  "name": "tdg-personal/agent-eval",
  "version": "0.1.0",
  "private": false,
  "summary": "Head-to-head comparison of coding agents (Claude Code, Aider, Codex, etc.) on custom tasks with pass rate, cost, time, and consistency metrics",
  "skills": {
    "agent-eval": {
      "path": "SKILL.md"
    }
  }
}
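
To make the manifest's shape concrete, here is a minimal Python sketch that parses it and lists where each skill lives. The manifest schema is not documented here, so the helper name `skill_paths` is hypothetical. The code assumes `skills` is an object keyed by skill name, with a `path` string on each entry, which is inferred from this one manifest only.

```python
import json

# Manifest text copied from the listing above.
MANIFEST = """
{
  "name": "tdg-personal/agent-eval",
  "version": "0.1.0",
  "private": false,
  "summary": "Head-to-head comparison of coding agents (Claude Code, Aider, Codex, etc.) on custom tasks with pass rate, cost, time, and consistency metrics",
  "skills": {
    "agent-eval": {
      "path": "SKILL.md"
    }
  }
}
"""


def skill_paths(manifest_text: str) -> dict[str, str]:
    """Map each skill name to its declared path.

    Assumption: "skills" is an object keyed by skill name, and each value
    carries a "path" string. A manifest without "skills" yields an empty dict.
    """
    manifest = json.loads(manifest_text)
    skills = manifest.get("skills", {})
    return {name: entry["path"] for name, entry in skills.items()}


if __name__ == "__main__":
    for name, path in skill_paths(MANIFEST).items():
        print(f"{name}: {path}")
```

For this manifest the result has a single entry, `agent-eval`, pointing at `SKILL.md`.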