Syncs TripIt travel itineraries to Reclaim.ai timezone segments and Google Calendar OOO blocks.
91
Best practices (does it follow best practices?): 97%
Impact (average score across 4 eval scenarios): 80%, 1.31x
Advisory: suggest reviewing before use
{
"context": "Tests whether the agent correctly interprets a sync JSON output that contains changes, conflicts, AND errors simultaneously — without being told what structure to use for the report.",
"type": "weighted_checklist",
"checklist": [
{
"name": "Report is not empty",
"description": "sync-report.md is produced and contains a substantive report (noChanges is false and there are conflicts and errors, so an empty or silent report would be wrong)",
"max_score": 8
},
{
"name": "Created timezones reported with context",
"description": "Report mentions both new timezone segments (America/Chicago and Europe/Berlin) with their dates or trip labels",
"max_score": 10
},
{
"name": "Deleted timezone reported separately",
"description": "Report clearly distinguishes the deleted Asia/Singapore segment from the created ones — not listed as just another change",
"max_score": 12
},
{
"name": "OOO block counts reported",
"description": "Report mentions OOO block activity (2 created, 1 deleted)",
"max_score": 8
},
{
"name": "Conflict warning with both trip names",
"description": "Report warns about the overlap and names both trips: Open Source Summit and PlatformCon",
"max_score": 12
},
{
"name": "Conflict date included",
"description": "Report includes the overlap date 2026-04-24 in the conflict warning",
"max_score": 8
},
{
"name": "Error reported",
"description": "Report includes the error about failing to set priority for PlatformCon. The errors array is non-empty, so the error must not be silently dropped",
"max_score": 14
},
{
"name": "Error contextualized",
"description": "Report connects the error (PlatformCon priority failure) to the conflict (PlatformCon overlaps with Open Source Summit) or at least presents them in a way that makes the relationship apparent",
"max_score": 10
},
{
"name": "Uses trip labels from segments",
"description": "Report uses human-readable trip labels (e.g., 'KubeCon - Austin') from the segments array rather than only raw timezone names",
"max_score": 8
},
{
"name": "No fabricated data",
"description": "Report does not invent details not present in the JSON (no made-up trip names, dates, or counts beyond what the data contains)",
"max_score": 10
}
]
}
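
The checklist above has ten items whose `max_score` values sum to 100. As a rough illustration of how a `weighted_checklist` result could be aggregated, the sketch below assumes the final score is simply points awarded divided by points available. That aggregation rule, the `weighted_score` function, and the shape of the `awarded` mapping are assumptions for illustration, not the evaluator's documented behavior.

```python
# Item names and max_score values copied from the checklist above.
CHECKLIST = [
    ("Report is not empty", 8),
    ("Created timezones reported with context", 10),
    ("Deleted timezone reported separately", 12),
    ("OOO block counts reported", 8),
    ("Conflict warning with both trip names", 12),
    ("Conflict date included", 8),
    ("Error reported", 14),
    ("Error contextualized", 10),
    ("Uses trip labels from segments", 8),
    ("No fabricated data", 10),
]

TOTAL_POINTS = sum(max_score for _, max_score in CHECKLIST)


def weighted_score(awarded: dict[str, float]) -> float:
    """Return the fraction of available points earned, between 0 and 1.

    `awarded` maps checklist item names to the points a grader gave
    (hypothetical input shape). Missing items count as zero.
    """
    earned = 0.0
    for name, max_score in CHECKLIST:
        points = awarded.get(name, 0)
        # Clamp so no item contributes less than 0 or more than its max_score.
        earned += min(max(points, 0), max_score)
    return earned / TOTAL_POINTS


if __name__ == "__main__":
    # A report that covers everything except the "Error reported" item.
    full_marks = {name: max_score for name, max_score in CHECKLIST}
    without_error = {**full_marks, "Error reported": 0}
    print(f"{weighted_score(without_error):.0%}")
```

Under this assumed rule, the weights make the two error-related items worth 24 of the 100 points, so a report that silently drops the PlatformCon priority failure loses more than one that omits any other single topic.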