Use when the user asks you to calculate, compute, evaluate, or solve a math expression or equation. Triggers on arithmetic, order of operations (PEMDAS), fractions, percentages, exponents, and multi-step math problems.
84
Does it follow best practices? 78%
Impact: 94% (1.00x), average score across 5 eval scenarios
Passed: No known issues
{
"context": "Tests whether the agent correctly handles percentage operations in math expressions — computing percentages of values and chaining percentage and arithmetic steps with correct order of operations. Also checks that the step-by-step format and clear final answer presentation are followed.",
"type": "weighted_checklist",
"checklist": [
{
"name": "Correct result item 1",
"description": "Item 1: 25% of $120 is correctly calculated as $30, and discounted price stated as $90",
"max_score": 10
},
{
"name": "Correct discount item 2",
"description": "Item 2: 15% of $85 is correctly calculated as $12.75, discounted price is $72.25",
"max_score": 10
},
{
"name": "Correct tax item 2",
"description": "Item 2: 8% of $72.25 is correctly calculated as $5.78, and final price is $78.03",
"max_score": 10
},
{
"name": "Correct result item 3",
"description": "Item 3: 30% of $340 is correctly calculated as $102, and final price is stated as $238",
"max_score": 10
},
{
"name": "Percentage step shown",
"description": "For at least two items, the percentage conversion step is shown explicitly (e.g., converting '25%' to 0.25 or showing '25/100 × 120')",
"max_score": 12
},
{
"name": "Step-by-step work shown",
"description": "Each item includes intermediate steps — not just a direct final answer",
"max_score": 10
},
{
"name": "Expression restated",
"description": "Each item restates or quotes the problem being solved before the solution steps begin",
"max_score": 8
},
{
"name": "Final answer clearly stated",
"description": "Each item ends with a clearly labeled final answer or result",
"max_score": 10
},
{
"name": "No left-to-right error",
"description": "Item 2 does NOT skip the percentage discount step before applying tax (i.e., tax is applied to the discounted price, not the original price)",
"max_score": 10
},
{
"name": "PEMDAS steps labeled",
"description": "At least one item uses labeled step headings corresponding to PEMDAS stages (e.g., 'Step 1', 'Step 2', or named operation labels)",
"max_score": 10
}
]
}
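The expected values in the checklist can be reproduced with a short script. The following is a minimal sketch, not the evaluator's actual implementation: the helper names `percent_of` and `discounted_price` are hypothetical, rounding half-up to the cent is an assumption, and the item prices and rates are taken from the checklist descriptions because the original task prompt is not shown here. It also contrasts the correct order for item 2 (tax on the discounted price) with the mistaken one (tax on the original price).

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def percent_of(rate_percent: str, amount: str) -> Decimal:
    """Return rate_percent% of amount, rounded to the cent (assumed half-up)."""
    fraction = Decimal(rate_percent) / Decimal(100)  # e.g. 25% -> 0.25
    return (fraction * Decimal(amount)).quantize(CENT, rounding=ROUND_HALF_UP)


def discounted_price(amount: str, discount_percent: str) -> Decimal:
    """Return amount minus discount_percent% of amount."""
    return Decimal(amount) - percent_of(discount_percent, amount)


# Item 1: 25% off $120
item1_discount = percent_of("25", "120")
item1_price = discounted_price("120", "25")

# Item 2: 15% off $85, then 8% tax on the discounted price
item2_discount = percent_of("15", "85")
item2_discounted = discounted_price("85", "15")
item2_tax = percent_of("8", str(item2_discounted))
item2_final = item2_discounted + item2_tax

# The mistake the "No left-to-right error" check guards against:
# computing tax on the original $85 instead of the discounted price.
item2_wrong_final = item2_discounted + percent_of("8", "85")

# Item 3: 30% off $340
item3_discount = percent_of("30", "340")
item3_price = discounted_price("340", "30")

if __name__ == "__main__":
    print("Item 1:", item1_discount, item1_price)
    print("Item 2:", item2_discount, item2_discounted, item2_tax, item2_final)
    print("Item 2 (tax on original price, incorrect):", item2_wrong_final)
    print("Item 3:", item3_discount, item3_price)
```

`Decimal` is used instead of `float` so that values such as 12.75 and 5.78 stay exact to the cent and compare cleanly against the rubric.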