Agent-native E2E runtime with verifiable safety. 13 MCP tools, including alethia_propose_tests (the agent generates tests from a URL), alethia_assert_safety (proves destructive actions are blocked), and the expect block, an NLP primitive unique to Alethia. Zero-IPC, ~45x faster than Playwright, signed evidence packs. Works with Claude Code, Cursor, and Cline.
95
94%
Does it follow best practices?
Impact
97%
2.77x
Average score across 5 eval scenarios
Advisory
Suggest reviewing before use
{
"context": "Tests whether the agent correctly withholds allowSensitiveInput:true from the smoke test (which doesn't need it), applies it only in auth_flow_script.json where the QA lead explicitly requested auth flow testing, and includes a 'name' parameter in all alethia_tell calls for audit trail clarity.",
"type": "weighted_checklist",
"checklist": [
{
"name": "No allowSensitiveInput in smoke test",
"description": "smoke_test_script.json does NOT include allowSensitiveInput: true in any alethia_tell params",
"max_score": 15
},
{
"name": "allowSensitiveInput in auth flow",
"description": "auth_flow_script.json includes allowSensitiveInput: true in the alethia_tell params for the step that types into the password field",
"max_score": 12
},
{
"name": "name param in smoke test alethia_tell",
"description": "The alethia_tell step in smoke_test_script.json includes a 'name' parameter",
"max_score": 10
},
{
"name": "name param in auth flow alethia_tell",
"description": "The alethia_tell step in auth_flow_script.json includes a 'name' parameter",
"max_score": 10
},
{
"name": "decisions.md explains allowSensitiveInput",
"description": "decisions.md explains that allowSensitiveInput is only used when the user/QA lead explicitly requested auth/payment flow testing",
"max_score": 12
},
{
"name": "decisions.md mentions name for audit",
"description": "decisions.md mentions that the name parameter is for audit trail identification",
"max_score": 8
},
{
"name": "alethia_status in both scripts",
"description": "Both script files include an alethia_status step before alethia_tell",
"max_score": 12
},
{
"name": "alethia_compile in both scripts",
"description": "Both script files include an alethia_compile step before alethia_tell",
"max_score": 10
},
{
"name": "file:// navigation",
"description": "Navigation steps in both scripts use the file:// URL (file:///workspace/holloway/register.html)",
"max_score": 11
}
]
}
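The rubric above is a weighted checklist whose `max_score` values act as weights. The page does not say how the eval harness aggregates them, so the following is only a minimal sketch, assuming the final score is the sum of awarded points divided by the sum of `max_score`. The function names `total_weight` and `weighted_score`, and the `awarded` mapping, are hypothetical and not part of Alethia.

```python
import json

# Abbreviated copy of the rubric above: only the fields needed for scoring.
RUBRIC = json.loads("""
{
  "type": "weighted_checklist",
  "checklist": [
    {"name": "No allowSensitiveInput in smoke test", "max_score": 15},
    {"name": "allowSensitiveInput in auth flow", "max_score": 12},
    {"name": "name param in smoke test alethia_tell", "max_score": 10},
    {"name": "name param in auth flow alethia_tell", "max_score": 10},
    {"name": "decisions.md explains allowSensitiveInput", "max_score": 12},
    {"name": "decisions.md mentions name for audit", "max_score": 8},
    {"name": "alethia_status in both scripts", "max_score": 12},
    {"name": "alethia_compile in both scripts", "max_score": 10},
    {"name": "file:// navigation", "max_score": 11}
  ]
}
""")


def total_weight(rubric: dict) -> int:
    """Sum of max_score over all checklist items."""
    return sum(item["max_score"] for item in rubric["checklist"])


def weighted_score(rubric: dict, awarded: dict) -> float:
    """Return the percentage score for a grading result.

    `awarded` (hypothetical shape) maps a checklist item name to the points
    a grader gave it. Missing items count as zero, and points are clamped
    to the range [0, max_score] so a bad grade cannot skew the total.
    """
    earned = 0
    for item in rubric["checklist"]:
        points = awarded.get(item["name"], 0)
        earned += max(0, min(points, item["max_score"]))
    return 100.0 * earned / total_weight(rubric)


# Example: full marks everywhere except the highest-weighted criterion.
full_marks = {item["name"]: item["max_score"] for item in RUBRIC["checklist"]}
one_miss = dict(full_marks, **{"No allowSensitiveInput in smoke test": 0})
print(total_weight(RUBRIC))
print(weighted_score(RUBRIC, one_miss))
```

Under this assumed aggregation the nine weights total 100, so each `max_score` reads directly as a percentage: adding `allowSensitiveInput: true` to the smoke test would cost 15 points, the largest single penalty in the rubric.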