Use when the user asks you to calculate, compute, evaluate, or solve a math expression or equation. Triggers on arithmetic, order of operations (PEMDAS), fractions, percentages, exponents, and multi-step math problems.
84
78%
Does it follow best practices?
Impact
94%
1.00x
Average score across 5 eval scenarios
Passed
No known issues
{
"context": "Tests whether the agent correctly handles fraction arithmetic — finding common denominators, performing addition and subtraction on fractions — and presents both fraction and decimal forms of results. Also checks that step-by-step format and PEMDAS conventions are applied.",
"type": "weighted_checklist",
"checklist": [
{
"name": "Correct result calc 1",
"description": "Calculation 1: 3/4 + 1/2 is correctly solved as 5/4 (or equivalently 1.25 or 1 and 1/4)",
"max_score": 10
},
{
"name": "Correct result calc 2",
"description": "Calculation 2: 2/3 + 1/4 + 1/6 is correctly solved as 13/12 (or equivalently 1 and 1/12 or approximately 1.083)",
"max_score": 10
},
{
"name": "Correct result calc 3",
"description": "Calculation 3: 5/8 - 1/4 is correctly solved as 3/8 (or equivalently 0.375)",
"max_score": 10
},
{
"name": "Common denominator shown",
"description": "For at least two calculations, the common denominator is identified and shown as an intermediate step before adding/subtracting",
"max_score": 12
},
{
"name": "Fraction form provided",
"description": "Each result is expressed as a fraction (not only as a decimal)",
"max_score": 10
},
{
"name": "Decimal form provided",
"description": "At least two results also include the decimal equivalent",
"max_score": 8
},
{
"name": "Step-by-step work shown",
"description": "Each calculation includes intermediate steps rather than only stating the final answer",
"max_score": 10
},
{
"name": "Expression restated",
"description": "Each calculation restates or clearly identifies the problem being solved before showing the steps",
"max_score": 8
},
{
"name": "PEMDAS steps labeled",
"description": "At least one calculation uses labeled step headings or PEMDAS stage labels",
"max_score": 8
},
{
"name": "Final answer clearly stated",
"description": "Each calculation ends with a clearly labeled or distinctly presented final answer",
"max_score": 14
}
]
}
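The three expected results in the checklist can be checked independently. The following is a minimal sketch, assuming Python 3.9+ and only the standard library. The helper name `combine_fractions` is hypothetical and is not part of the skill or its eval harness. It exposes the common denominator as an intermediate step, mirroring the "Common denominator shown" item.

```python
from fractions import Fraction
from math import lcm  # available in Python 3.9+


def combine_fractions(*terms):
    """Sum the given fractions (hypothetical helper, for illustration only).

    Returns (common_denominator, scaled_numerators, result) so the
    intermediate step is visible. Pass a negative Fraction to subtract.
    """
    common = lcm(*(t.denominator for t in terms))
    scaled = [t.numerator * (common // t.denominator) for t in terms]
    return common, scaled, Fraction(sum(scaled), common)


# The three calculations named in the checklist descriptions.
calc1 = combine_fractions(Fraction(3, 4), Fraction(1, 2))
calc2 = combine_fractions(Fraction(2, 3), Fraction(1, 4), Fraction(1, 6))
calc3 = combine_fractions(Fraction(5, 8), -Fraction(1, 4))

for label, (common, scaled, result) in [
    ("3/4 + 1/2", calc1),
    ("2/3 + 1/4 + 1/6", calc2),
    ("5/8 - 1/4", calc3),
]:
    print(f"{label}: common denominator {common}, "
          f"numerators {scaled}, result {result} = {float(result):.3f}")
```

Subtraction is modeled as adding a negated fraction, which keeps the helper to a single code path. `Fraction` reduces the result to lowest terms automatically, so the printed fraction can be compared directly against the forms listed in the checklist.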