
jbaruch/frequent-flyer-advocate

Write professional, persuasive complaint letters to US airlines emphasizing loyalty status, DOT regulations, and airline commitments.

Quality: 94% (Does it follow best practices?)

Impact: 93%, 1.38x (average score across 10 eval scenarios)

Security (by Snyk): Advisory. Suggest reviewing before use.


evals/scenario-10/criteria.json

{
  "context": "Tests whether the agent checks for Playwright MCP availability before researching, checks the credits inventory for prior compensation history, researches all required policy areas before writing, documents findings comprehensively, and produces a letter grounded in actual researched quotes rather than fabricated or generic content.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Playwright availability assessed and acted on",
      "description": "research-notes.md explicitly documents checking for Playwright MCP availability AND acts on the result: if Playwright was unavailable, includes the install command 'claude mcp add playwright -- npx @playwright/mcp@latest'; if Playwright was available, confirms it was detected and describes how it was used for research",
      "max_score": 18
    },
    {
      "name": "Credits inventory consulted",
      "description": "research-notes.md documents running credits-tracker.py list for prior compensation from United Airlines for Marcus Webb — either shows results found or explicitly notes no prior credits exist",
      "max_score": 8
    },
    {
      "name": "Customer Service Plan researched",
      "description": "research-notes.md contains a section documenting findings from United's Customer Service Plan or Customer Commitment document, with at least one quote or specific finding attributed to that source",
      "max_score": 15
    },
    {
      "name": "Contract of Carriage researched",
      "description": "research-notes.md contains a section documenting findings from United's Contract of Carriage, referencing at least one relevant provision or section",
      "max_score": 10
    },
    {
      "name": "Mission/vision/CEO statements researched",
      "description": "research-notes.md contains findings from United's mission, vision, values, or CEO communications — includes at least one aspirational or commitments-related quote",
      "max_score": 10
    },
    {
      "name": "DOT passenger rights researched",
      "description": "research-notes.md documents findings on DOT passenger rights relevant to denied boarding, citing a DOT source or transportation.gov",
      "max_score": 10
    },
    {
      "name": "FAA Reauthorization Act researched",
      "description": "research-notes.md documents findings on FAA Reauthorization Act of 2024 provisions relevant to the complaint (e.g., refund requirements, denied boarding protections)",
      "max_score": 7
    },
    {
      "name": "Executive contacts researched",
      "description": "research-notes.md includes at least one executive customer relations contact for United Airlines (email address, department name, or specific channel)",
      "max_score": 5
    },
    {
      "name": "DOT enforcement actions researched",
      "description": "research-notes.md mentions at least one recent DOT enforcement action or consent order involving United Airlines",
      "max_score": 5
    },
    {
      "name": "Letter quotes a named source",
      "description": "letter.md includes at least one specific quote that is explicitly attributed to a named United policy document (Customer Service Plan, Contract of Carriage, mission statement, etc.) — not generic paraphrasing",
      "max_score": 12
    }
  ]
}
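
The `weighted_checklist` above assigns each criterion a `max_score`, and the ten weights add up to 100. How the eval harness actually combines per-criterion results is not shown here, so the following is only a minimal sketch under an assumed rule: sum the points awarded per criterion, clamp each to its `max_score`, and divide by the total. The helper names `total_max_score` and `score_percent`, and the `awarded` mapping, are hypothetical and not part of the tile.

```python
import json

# Abbreviated copy of the checklist above: names and max_score values only.
CRITERIA_JSON = """
{
  "type": "weighted_checklist",
  "checklist": [
    {"name": "Playwright availability assessed and acted on", "max_score": 18},
    {"name": "Credits inventory consulted", "max_score": 8},
    {"name": "Customer Service Plan researched", "max_score": 15},
    {"name": "Contract of Carriage researched", "max_score": 10},
    {"name": "Mission/vision/CEO statements researched", "max_score": 10},
    {"name": "DOT passenger rights researched", "max_score": 10},
    {"name": "FAA Reauthorization Act researched", "max_score": 7},
    {"name": "Executive contacts researched", "max_score": 5},
    {"name": "DOT enforcement actions researched", "max_score": 5},
    {"name": "Letter quotes a named source", "max_score": 12}
  ]
}
"""


def total_max_score(criteria):
    """Sum of all max_score weights in the checklist."""
    return sum(item["max_score"] for item in criteria["checklist"])


def score_percent(criteria, awarded):
    """Assumed aggregation rule (hypothetical, not the harness's documented one).

    `awarded` maps criterion name -> points given by a grader. Missing
    criteria count as 0, and each value is clamped to [0, max_score].
    """
    earned = 0
    for item in criteria["checklist"]:
        points = awarded.get(item["name"], 0)
        earned += max(0, min(points, item["max_score"]))
    return 100 * earned / total_max_score(criteria)


criteria = json.loads(CRITERIA_JSON)
print(total_max_score(criteria))  # 100
```

Under this assumed rule, a run that satisfies only the Playwright criterion would score 18%, which shows how heavily that single check is weighted relative to, say, the 5-point executive contacts check.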
