Agent-native E2E runtime with verifiable safety. 13 MCP tools, including alethia_propose_tests (the agent generates tests from a URL), alethia_assert_safety (proves destructive actions are blocked), and the expect block, an NLP primitive unique to Alethia. Zero-IPC, ~45x faster than Playwright, with signed evidence packs. Works with Claude Code, Cursor, and Cline.
95
94%: Does it follow best practices?
Impact: 97% (2.77x), average score across 5 eval scenarios
Advisory: Suggest reviewing before use
No allowSensitiveInput in smoke test | 100% | 100%
allowSensitiveInput in auth flow | 0% | 100%
name param in smoke test alethia_tell | 0% | 100%
name param in auth flow alethia_tell | 0% | 100%
decisions.md explains allowSensitiveInput | 0% | 100%
decisions.md mentions name for audit | 0% | 100%
alethia_status in both scripts | 0% | 100%
alethia_compile in both scripts | 0% | 0%
file:// navigation | 100% | 100%
DENY_WRITE_HIGH named | 100% | 100%
write-high safety class | 100% | 100%
Correct behavior framing | 100% | 100%
Explain to user, not bypass | 100% | 100%
No bypass suggestion | 100% | 100%
EA1 policy named | 100% | 100%
policyAudits field mentioned | 25% | 100%
Assertion NLP in checkout_test.txt | 30% | 100%
file:// navigation in checkout_test.txt | 100% | 100%
Correct npm package name | 0% | 100%
Global npm install command | 0% | 100%
mcpServers key in config | 100% | 100%
alethia server key | 100% | 100%
alethia-mcp command | 0% | 100%
file:// URL in sample test | 100% | 100%
Assertion phrasing in sample test | 50% | 100%
Click phrasing in sample test | 100% | 100%
Auto-install runtime mentioned | 0% | 100%
No signup/gate mentioned | 100% | 100%
Assertion avoids descriptor prefix | 0% | 100%
'assert X is visible' phrasing | 0% | 100%
Type phrasing pattern | 0% | 100%
Click phrasing pattern | 0% | 100%
Newline-separated instructions | 100% | 100%
Navigation step uses file:// URL | 0% | 100%
Confirmation assertion avoids descriptor prefix | 0% | 100%
Notes explain phrasing | 0% | 100%
alethia_status first | 0% | 100%
alethia_compile second | 0% | 66%
alethia_tell present | 0% | 100%
name parameter in alethia_tell | 0% | 100%
run.ok check documented | 0% | 100%
stepRuns inspection documented | 0% | 100%
policyAudits documented | 0% | 100%
kill switch check documented | 0% | 100%
compile purpose documented | 50% | 100%
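
Several of the criteria above describe a call order (alethia_status first, alethia_compile second, then alethia_tell with a name parameter) followed by checks on run.ok, stepRuns, and policyAudits. The sketch below illustrates that order only. It is not Alethia's client API: the call_tool callable, the "instructions" argument key, the shape of the returned dictionary, the navigation phrasing, and the stub responses are all assumptions made for illustration. Only the tool names, the name parameter, and the field names come from the listing.

```python
from typing import Any, Callable, Dict, List

# Hypothetical signature: (tool name, arguments) -> result dictionary.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_named_test(call_tool: ToolCaller, name: str,
                   instructions: List[str]) -> Dict[str, Any]:
    """Call status, compile, then tell, and summarize the fields to inspect."""
    steps = "\n".join(instructions)  # newline-separated instructions
    call_tool("alethia_status", {})  # status first
    call_tool("alethia_compile", {"instructions": steps})  # assumed argument key
    result = call_tool("alethia_tell", {"name": name, "instructions": steps})
    run = result.get("run", {})  # assumed result shape
    return {
        "ok": bool(run.get("ok")),
        "stepRuns": run.get("stepRuns", []),
        "policyAudits": run.get("policyAudits", []),
    }


# Stub standing in for a real MCP client, so the sketch runs on its own.
calls: List[str] = []


def fake_call_tool(tool: str, args: Dict[str, Any]) -> Dict[str, Any]:
    calls.append(tool)
    if tool == "alethia_tell":
        return {"run": {"ok": True, "stepRuns": [{"ok": True}], "policyAudits": []}}
    return {}


summary = run_named_test(
    fake_call_tool,
    "checkout-smoke",  # hypothetical test name
    [
        "navigate to file:///tmp/checkout.html",  # assumed phrasing and path
        "click Place order",
        "assert Order confirmed is visible",
    ],
)
print(calls, summary["ok"])
```

With a real MCP client, call_tool would be replaced by whatever tool-invocation method that client exposes, and the field lookups would need to match the actual response.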