Auto-syncs stale docstrings and README when function signatures change. Detects documentation drift after refactors, parameter additions, or return type changes. Dry-run by default — proposes before writing.
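As a rough illustration of what detecting documentation drift after a parameter addition could look like, the sketch below compares the `@param` tags in a Javadoc comment against a method's parameter names and reports the ones left undocumented. The class `DocDrift` and the method `missingParams` are hypothetical names made up for this example; they are not taken from the skill itself, whose actual detection logic is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the skill: finds undocumented parameters.
class DocDrift {
    // Matches "@param name" and captures the parameter name.
    private static final Pattern PARAM_TAG = Pattern.compile("@param\\s+(\\w+)");

    /** Returns the signature parameters that have no matching @param tag in the Javadoc text. */
    static List<String> missingParams(String javadoc, List<String> signatureParams) {
        // Collect every parameter name that the comment already documents.
        List<String> documented = new ArrayList<>();
        Matcher matcher = PARAM_TAG.matcher(javadoc);
        while (matcher.find()) {
            documented.add(matcher.group(1));
        }
        // Keep the signature order so the report reads like the method declaration.
        List<String> missing = new ArrayList<>();
        for (String name : signatureParams) {
            if (!documented.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }
}
```

A dry-run tool would presumably list the returned names as proposed changes rather than editing the file straight away.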
87
100%
Does it follow best practices?
Impact
86%
1.59x
Average score across 17 eval scenarios
Passed
No known issues
{
  "context": "Tests whether the agent supports Java's Javadoc format and correctly adds a @param entry for the new maxRetries parameter while proposing (not auto-writing) a README update for the code-span mention.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "New @param added",
      "description": "DataService.java Javadoc contains a `@param maxRetries` line documenting the new parameter",
      "max_score": 30
    },
    {
      "name": "Javadoc format used",
      "description": "The added documentation uses `@param maxRetries` format (Javadoc style) — NOT Python docstring or YARD format",
      "max_score": 25
    },
    {
      "name": "Existing @param lines untouched",
      "description": "The existing `@param category` and `@param limit` lines in DataService.java are unchanged",
      "max_score": 45
    },
    {
      "name": "README update proposed",
      "description": "doc-sync-report.md contains a 'Proposed' entry referencing README.md for the fetchByCategory code-span mention",
      "max_score": 30
    },
    {
      "name": "README not auto-written",
      "description": "README.md is unchanged — the agent did NOT auto-write to the markdown file",
      "max_score": 55
    },
    {
      "name": "Unified report format",
      "description": "doc-sync-report.md contains '## Doc Sync Report' as a top-level heading",
      "max_score": 45
    }
  ]
}
evals
scenario-1
scenario-2
scenario-3
scenario-4
scenario-5
scenario-6
scenario-7
scenario-8
scenario-9
scenario-10
scenario-11
scenario-12
scenario-13
scenario-14
scenario-15
scenario-16
scenario-17
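For reference, the first three checklist items in the scenario above describe a Javadoc block roughly like the one sketched here. Only the names `DataService`, `fetchByCategory`, `category`, `limit`, and `maxRetries` come from the scenario; the return type, the tag descriptions, and the method body are assumptions made so that the example compiles.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the DataService.java file named in the checklist.
class DataService {

    /**
     * Fetches items that belong to a category.
     *
     * @param category   category to filter by (pre-existing line, left as it was)
     * @param limit      maximum number of items to return (pre-existing line, left as it was)
     * @param maxRetries number of retry attempts allowed (the newly added line)
     * @return up to {@code limit} placeholder item labels
     */
    List<String> fetchByCategory(String category, int limit, int maxRetries) {
        if (limit < 0 || maxRetries < 0) {
            throw new IllegalArgumentException("limit and maxRetries must be non-negative");
        }
        // Placeholder body: a real service would query a data source and
        // retry up to maxRetries times on failure.
        List<String> items = new ArrayList<>();
        for (int i = 1; i <= limit; i++) {
            items.add(category + "-" + i);
        }
        return items;
    }
}
```

The new entry uses the Javadoc `@param` tag rather than a Python docstring or YARD style, and the two existing `@param` lines keep their original wording, which is what the "Javadoc format used" and "Existing @param lines untouched" items check.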