Name: markusdowne/agentmail
Rating: 100 (1 review)
Author: markusdowne

Give AI agents their own email inboxes using the AgentMail API. Use when building email agents, sending/receiving emails programmatically, managing inboxes, handling attachments, organizing with labels, creating drafts for human approval, or setting up real-time notifications via webhooks/websockets. Supports multi-tenant isolation with pods.

Overall: 100 (1.20x)

Quality: 100% (Does it follow best practices?)

Impact: 100%, 1.20x (average score across 1 eval scenario)

Security: Advisory (suggest reviewing before use)

{
  "context": "Tests whether the agent sends emails with both text and HTML bodies, validates API responses at each step, handles per-recipient failures independently without aborting the batch, uses idempotency keys, and applies appropriate error handling and labels.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Both text and html fields",
      "description": "Each send call includes both a 'text' (plain text) field AND an 'html' field — not one or the other",
      "max_score": 15
    },
    {
      "name": "Per-recipient error isolation",
      "description": "Failures for individual recipients are caught and logged without aborting the remaining sends — other recipients are still attempted after a failure",
      "max_score": 15
    },
    {
      "name": "Try/catch around send operations",
      "description": "Each send (or the loop body) is wrapped in a try/catch (TS) or try/except (Python) block",
      "max_score": 10
    },
    {
      "name": "Inbox creation validation",
      "description": "After creating the inbox, the script checks that inbox.inboxId (TS) or inbox.inbox_id (Python) is present before proceeding to send",
      "max_score": 10
    },
    {
      "name": "Send return value checked",
      "description": "After each send, the script checks that sent.messageId (TS) or sent.message_id (Python) is present to confirm the message was accepted",
      "max_score": 10
    },
    {
      "name": "Idempotency key on inbox creation",
      "description": "The inbox creation call includes a clientId (TS) or client_id (Python) parameter to ensure safe retries don't create duplicate inboxes",
      "max_score": 15
    },
    {
      "name": "Correct package import",
      "description": "The script imports from 'agentmail' (not from any other package name) and uses AgentMailClient (TS) or AgentMail (Python) as the client class",
      "max_score": 10
    },
    {
      "name": "Success/failure summary printed",
      "description": "The script prints (or logs) a summary at the end showing the count of succeeded sends and the count or list of failed addresses",
      "max_score": 10
    },
    {
      "name": "Labels applied to sent messages",
      "description": "The send call includes a labels field (e.g. a campaign or outreach label) on the outgoing messages",
      "max_score": 5
    }
  ]
}
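
The checklist above can be read as a recipe for a batch send. The following is a minimal Python sketch of that shape, not the skill's actual code. The names `client_id`, `inbox_id`, `message_id`, `text`, `html`, and `labels` come from the checklist; the method paths `client.inboxes.create` and `client.inboxes.messages.send`, the `to` and `subject` parameters, the `send_batch` helper, and the `FakeClient` stand-in are assumptions made for illustration so the block runs without network access.

```python
from dataclasses import dataclass, field
from types import SimpleNamespace


@dataclass
class BatchResult:
    succeeded: list = field(default_factory=list)
    failed: dict = field(default_factory=dict)  # address -> error text


def send_batch(client, recipients, subject, text, html, label="outreach"):
    """Send one message per recipient, isolating failures per address."""
    # client_id serves as the idempotency key: a retry should reuse the inbox.
    inbox = client.inboxes.create(client_id="outreach-inbox-v1")
    if not getattr(inbox, "inbox_id", None):
        raise RuntimeError("inbox creation returned no inbox_id")

    result = BatchResult()
    for address in recipients:
        try:
            sent = client.inboxes.messages.send(  # assumed method path
                inbox_id=inbox.inbox_id,
                to=address,
                subject=subject,
                text=text,      # plain-text body
                html=html,      # HTML body, sent alongside the text body
                labels=[label],
            )
            if not getattr(sent, "message_id", None):
                raise RuntimeError("send returned no message_id")
            result.succeeded.append(address)
        except Exception as exc:
            # One bad recipient is recorded and the loop continues.
            result.failed[address] = str(exc)

    print(f"succeeded: {len(result.succeeded)}, failed: {sorted(result.failed)}")
    return result


class _FakeMessages:
    """Hypothetical stand-in that rejects any address containing 'bad'."""

    def send(self, *, inbox_id, to, subject, text, html, labels):
        if "bad" in to:
            raise ValueError(f"rejected: {to}")
        return SimpleNamespace(message_id=f"msg-{to}")


class FakeClient:
    """Hypothetical stand-in for the real client, for offline runs only."""

    def __init__(self):
        self.inboxes = SimpleNamespace(
            create=lambda client_id: SimpleNamespace(inbox_id="inbox-1"),
            messages=_FakeMessages(),
        )
```

Calling `send_batch(FakeClient(), ["a@example.com", "bad@example.com"], "Hi", "Hello", "<p>Hello</p>")` records the rejected address and still attempts the rest. In real use the client would be the `AgentMail` class imported from `agentmail`, as the checklist requires; the call signatures should be checked against the SDK before relying on this sketch.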

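The file does not say how a `weighted_checklist` is aggregated. One plausible reading, sketched below as an assumption, is that each item earns between 0 and its `max_score`, and the result is the earned total as a percentage of the maximum total. In this file the `max_score` values sum to 100, so under that reading the earned total is already a percentage. The `score_checklist` helper and the `awarded` mapping are hypothetical names.

```python
def score_checklist(criteria: dict, awarded: dict) -> float:
    """Return the earned score as a percentage of the checklist maximum.

    `awarded` maps checklist item names to points earned; missing items
    count as 0, and values are clamped to the range [0, max_score].
    """
    earned = 0.0
    maximum = 0.0
    for item in criteria["checklist"]:
        cap = item["max_score"]
        maximum += cap
        points = awarded.get(item["name"], 0)
        earned += min(max(points, 0), cap)
    return 100.0 * earned / maximum if maximum else 0.0


# Trimmed example using two items from the checklist above.
example = {
    "type": "weighted_checklist",
    "checklist": [
        {"name": "Both text and html fields", "max_score": 15},
        {"name": "Labels applied to sent messages", "max_score": 5},
    ],
}
percent = score_checklist(example, {"Both text and html fields": 15})
```

Here `percent` is 75.0: 15 of a possible 20 points were earned, because the labels item was not awarded.
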
evals/scenario-1/criteria.json