Name: markusdowne/detectability-contract
Rating: 96.4 (1 review)
Author: markusdowne


Creates boundary-point validation contracts, defines invariant-based success criteria, and sets up automated verification probes so reliability workflows trigger on objective evidence rather than intuition. Use when designing robust handoff, memory-persistence, or tool-call reliability workflows; when you need to verify handoffs work, check memory persistence, validate tool calls succeeded, or convert vague reliability goals into concrete, testable checks at each boundary point with explicit failure-class mapping (operational vs. critical); or when you want to test your workflow end-to-end, make sure it works, or verify your automation runs correctly using read-back probes and escalation triggers rather than agent confidence. Includes explicit untrusted-content/prompt-injection guardrails for third-party inputs.
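To make the description's terms concrete, here is a minimal sketch of what a boundary contract with a read-back probe might look like. It is not taken from the skill itself: the names `BoundaryContract`, `FailureClass`, `make_readback_probe`, and `evaluate`, and the use of a plain dictionary as the store, are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict


class FailureClass(Enum):
    """Hypothetical failure classes mirroring the description's wording."""
    OPERATIONAL = "operational"  # recoverable: retry the step
    CRITICAL = "critical"        # not recoverable here: escalate


@dataclass(frozen=True)
class BoundaryContract:
    """One illustrative contract row: a boundary, its invariant, its failure class."""
    boundary: str                  # e.g. "memory-persistence"
    invariant: Callable[[], bool]  # objective check; True when the invariant holds
    failure_class: FailureClass


def make_readback_probe(store: Dict[str, str], key: str, expected: str) -> Callable[[], bool]:
    """Build a probe that reads the value back and compares it with what was written."""
    return lambda: store.get(key) == expected


def evaluate(contract: BoundaryContract) -> str:
    """Return "ok", "retry", or "escalate" from evidence rather than agent confidence."""
    try:
        holds = contract.invariant()
    except Exception:
        holds = False  # a probe that cannot run is treated as a failed invariant
    if holds:
        return "ok"
    if contract.failure_class is FailureClass.CRITICAL:
        return "escalate"
    return "retry"


# Assumed scenario: a handoff summary is written, then verified by read-back.
store: Dict[str, str] = {}
store["handoff:42"] = "summary-v1"
contract = BoundaryContract(
    boundary="memory-persistence",
    invariant=make_readback_probe(store, "handoff:42", "summary-v1"),
    failure_class=FailureClass.CRITICAL,
)
print(evaluate(contract))  # prints "ok"
```

The point of the sketch is that the decision to retry or escalate depends only on the probe's result and the declared failure class, never on whether the writing step reported success.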


Quality

90%

Does it follow best practices?

Impact

98%

1.25x

Average score across 9 eval scenarios

Security

Passed

No known issues

Quality

Content

92%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a high-quality skill that provides clear, actionable guidance for creating detectability contracts. The workflow is well-structured with explicit validation steps, concrete code examples, and a useful contract table template. The only minor weakness is the lengthy inline guardrails section that could benefit from being referenced externally.

Dimension	Reasoning	Score
Conciseness	The skill is lean and efficient, presenting only actionable information without explaining concepts Claude already knows. Every section serves a clear purpose with no padding or unnecessary context.	3 / 3
Actionability	Provides concrete, executable Python code examples for invariant checks, a clear workflow with numbered steps, and a detailed example contract table that serves as a copy-paste template for implementation.	3 / 3
Workflow Clarity	The 5-step workflow is clearly sequenced with explicit validation checkpoints. The contract table includes failure class mapping and escalation triggers, providing clear feedback loops for error recovery.	3 / 3
Progressive Disclosure	Content is well-organized with clear sections, but the guardrails section (especially W011 mitigation) is quite lengthy inline. For a skill of this complexity, some content could be split into referenced files for better navigation.	2 / 3
	Total	11 / 12 Passed

Description

85%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This description excels at specificity and completeness, clearly articulating what the skill does and when to use it with an explicit 'Use when' clause. However, the heavy reliance on technical jargon ('boundary-point validation contracts', 'invariant-based success criteria', 'failure-class mapping') may reduce discoverability since users are unlikely to naturally use these exact terms when seeking this functionality.

Suggestions

Add simpler, more natural trigger terms alongside technical ones (e.g., 'test if my workflow works', 'check if handoffs succeed', 'verify automation reliability')

Include common user phrasings like 'debug workflow', 'troubleshoot automation', or 'ensure reliability' to improve trigger term coverage

Dimension	Reasoning	Score
Specificity	Lists multiple specific concrete actions: 'Creates boundary-point validation contracts', 'defines invariant-based success criteria', 'sets up automated verification probes'. These are distinct, actionable capabilities.	3 / 3
Completeness	Clearly answers both what (creates validation contracts, defines success criteria, sets up verification probes) AND when with explicit 'Use when' clause covering multiple trigger scenarios (designing reliability workflows, verifying handoffs, testing end-to-end).	3 / 3
Trigger Term Quality	Contains some relevant keywords like 'handoff', 'memory persistence', 'tool calls', 'workflow', 'verification', but uses heavy technical jargon ('invariant-based', 'boundary-point validation contracts', 'failure-class mapping') that users are unlikely to naturally say. Missing simpler variations users might use.	2 / 3
Distinctiveness Conflict Risk	Highly specific niche around boundary-point validation and reliability verification with distinct technical domain. Unlikely to conflict with general testing or workflow skills due to specific focus on 'validation contracts', 'invariant-based criteria', and 'verification probes'.	3 / 3
	Total	11 / 12 Passed

Validation

100%


Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Reviewed

4 months ago
