Teaches AI agents to write idiomatic Kotlin instead of Java-in-a-.kt-file.
Does it follow best practices? 98%
Impact: 99% (1.20x), average score across 8 eval scenarios
Passed: No known issues
{
  "context": "Checks whether the agent implements a working todo app: addTask adds a task to the list, markComplete marks a previously-added task as complete.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "addTask works",
      "description": "Calling addTask(\"buy milk\") followed by reading the list returns a single task with title 'buy milk'",
      "max_score": 50
    },
    {
      "name": "markComplete works",
      "description": "Calling markComplete(id) on a previously-added task flips its completed flag (or equivalent state representation) to true; the task remains in the list",
      "max_score": 50
    }
  ]
}
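
For illustration, here is a minimal Kotlin sketch of a todo app that would plausibly satisfy both checklist items. Only addTask and markComplete come from the checklist; the Task and TodoList types, the tasks property, the Int id scheme, and the return values are assumptions made for this example and are not part of the eval itself.

```kotlin
// Hypothetical sketch: only addTask and markComplete are named in the checklist.
// Task, TodoList, tasks, and the Int id scheme are assumptions for illustration.

// Immutable task; completion is expressed by copying with completed = true.
data class Task(val id: Int, val title: String, val completed: Boolean = false)

class TodoList {
    private val items = mutableListOf<Task>()
    private var nextId = 1

    // Read-only snapshot of the current tasks.
    val tasks: List<Task>
        get() = items.toList()

    // Adds a task with the given title and returns its generated id.
    fun addTask(title: String): Int {
        val task = Task(id = nextId++, title = title)
        items += task
        return task.id
    }

    // Marks the task with the given id as complete and keeps it in the list.
    // Returns false if no task with that id exists.
    fun markComplete(id: Int): Boolean {
        val index = items.indexOfFirst { it.id == id }
        if (index == -1) return false
        items[index] = items[index].copy(completed = true)
        return true
    }
}

fun main() {
    val todo = TodoList()
    val id = todo.addTask("buy milk")
    todo.markComplete(id)
    println(todo.tasks) // prints [Task(id=1, title=buy milk, completed=true)]
}
```

The sketch uses a data class with copy rather than a mutable completed field, which is one common way to meet the "or equivalent state representation" wording while keeping Task immutable; a mutable var flag would satisfy the checklist equally well.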