Name: vitron-ai/alethia
Rating: 95.6 (1 review)
Author: vitron-ai

Agent-native E2E runtime with verifiable safety. 16 MCP tools including alethia_propose_tests (agent generates tests from a URL), alethia_assert_safety (proves destructive actions are blocked), and the expect block: NLP primitive unique to Alethia. Zero-IPC; 2-5x faster than Playwright MCP per flow; signed evidence packs. Works with Claude Code, Cursor, Cline.
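As a rough illustration of how an agent might invoke one of these tools, the sketch below builds a JSON-RPC request in the shape the Model Context Protocol uses for tool calls (`tools/call` with `name` and `arguments`). The tool name comes from the description above. The `url` argument key and the helper `build_tool_call` are assumptions made for illustration, not documented Alethia parameters.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP-style tools/call request as a JSON-RPC 2.0 string."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# Hypothetical argument key "url"; the tool's real input schema may differ.
payload = build_tool_call(1, "alethia_propose_tests", {"url": "https://example.com"})
```

In practice an MCP client such as Claude Code, Cursor, or Cline constructs this request itself, so the sketch is only meant to show what travels over the wire.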

Quality

94%

Does it follow best practices?

Impact

98%

2.80x

Average score across 5 eval scenarios

Security

Advisory

Suggest reviewing before use

Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong skill description that clearly communicates both capabilities and trigger conditions. It opens with an explicit 'Use when...' clause covering diverse user intents, lists concrete actions and specific output types, and occupies a distinct niche combining browser automation with safety auditing. The description is well-structured, concise, and uses appropriate third-person voice throughout.

Dimension	Reasoning	Score
Specificity	Lists multiple specific concrete actions: run E2E tests, verify a web page, generate tests, prove destructive actions are blocked, check UI element visibility, fill out a form, drive a browser. Also specifies concrete outputs: per-step results, safety classifications, policy decisions, DOM diffs, structured page context, signed audit trail.	3 / 3
Completeness	Clearly answers both 'what' (runs E2E tests, verifies web pages, generates tests, checks UI elements, fills forms, drives browser; returns step results with safety classifications, DOM diffs, audit trail) and 'when' (explicit 'Use when...' clause listing multiple trigger scenarios).	3 / 3
Trigger Term Quality	Excellent coverage of natural terms users would say: 'E2E tests', 'web page', 'generate tests', 'UI element', 'fill out a form', 'drive a browser', 'natural language'. These are terms users would naturally use when requesting browser automation or end-to-end testing.	3 / 3
Distinctiveness Conflict Risk	Highly distinctive with a clear niche: browser-driven E2E testing with safety classifications and audit trails. The combination of browser automation, testing, safety/policy decisions, and DOM diffs creates a unique profile unlikely to conflict with generic testing or web scraping skills.	3 / 3
	Total	12 / 12 Passed

Implementation

85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-crafted skill that provides highly actionable, clearly structured guidance for driving the Alethia MCP server. The three primary workflows are well-sequenced with validation checkpoints, the NLP phrasing guide gives exact syntax, and structured response fields are documented for agent self-repair. Minor verbosity in the trigger terms list and a few explanatory asides prevent a perfect conciseness score, but overall token efficiency is good.

Dimension	Reasoning	Score
Conciseness	The skill is mostly efficient and avoids explaining basic concepts, but some sections are slightly verbose — e.g., the 'Use when' trigger terms list is long, and some explanatory prose (like 'This is the primitive that makes Alethia a verifiable-safety framework, not just a test runner') could be trimmed. The safety classifications table and reason codes are well-structured and earn their tokens.	2 / 3
Actionability	The skill provides concrete, copy-paste-ready NLP phrasings, executable JSON config, specific tool call sequences, and structured response field names. Every workflow has specific tool names and clear expected outputs. The NLP phrasing guide is particularly actionable with exact syntax examples.	3 / 3
Workflow Clarity	Three distinct workflows (A, B, C) are clearly sequenced with numbered steps. Workflow A includes a self-repair loop (read suggestedFix, retry). Workflow B has explicit validation (passed: true/false) with clear guidance on what to do when validation fails. The status check as step 1 serves as a liveness checkpoint.	3 / 3
Progressive Disclosure	The skill provides a concise overview with clear one-level-deep references to docs/index.md and rules/alethia.md. Content is well-organized into logical sections (workflows, NLP guide, response fields, safety, limitations) without being monolithic. The inline content is appropriately scoped for what an agent needs during execution.	3 / 3
	Total	11 / 12 Passed
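The self-repair loop noted under Workflow Clarity (read `suggestedFix`, retry) could look roughly like the sketch below. The field names `suggestedFix` and `passed` appear in the review above. Treating both as keys of a single step result, and the helper names `run_with_self_repair` and `fake_run_step`, are assumptions made for illustration. The `run_step` callable stands in for whatever tool call actually executes a step.

```python
from typing import Callable


def run_with_self_repair(
    run_step: Callable[[str], dict],
    instruction: str,
    max_attempts: int = 3,
) -> dict:
    """Run a step, retrying with the suggested fix until it passes or attempts run out."""
    result: dict = {}
    for _ in range(max_attempts):
        result = run_step(instruction)
        if result.get("passed"):
            return result
        fix = result.get("suggestedFix")
        if not fix:
            break  # nothing to retry with; surface the failure as-is
        instruction = fix
    return result


def fake_run_step(instruction: str) -> dict:
    """Stand-in for a real tool call: only one exact phrasing succeeds."""
    if instruction == "click the Submit button":
        return {"passed": True}
    return {"passed": False, "suggestedFix": "click the Submit button"}


outcome = run_with_self_repair(fake_run_step, "click Submit")
```

Capping the number of attempts keeps a step that never converges from looping indefinitely. The final failed result is returned so the caller can report it.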

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Reviewed

21 days ago
