Test a Paddle integration end-to-end using the sandbox environment, test cards, the webhook simulator, and local tunnels — without taking real money.
67
80%: Does it follow best practices?
Impact: not available (no eval scenarios have been run)
Advisory: Suggest reviewing before use
Optimize this skill with Tessl:
npx tessl skill review --optimize ./skills/sandbox-testing/SKILL.md
Quality
Discovery: 82%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong, specific description that clearly identifies the domain (Paddle payment integration testing) and lists concrete tools and techniques involved. Its main weakness is the absence of an explicit 'Use when...' clause, which would help Claude know exactly when to select this skill. The trigger terms are excellent and highly specific to the domain.
Suggestions
Add an explicit 'Use when...' clause, e.g., 'Use when the user wants to test Paddle payments, simulate webhooks, or verify a Paddle checkout integration in sandbox mode.'
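As a rough illustration of that suggestion, the frontmatter of SKILL.md could look something like the sketch below. The `name` value is an assumption taken from the skill's directory name, and the exact set of frontmatter fields the skill currently uses is not shown on this page; only the description text and the suggested 'Use when...' sentence come from the review itself.

```yaml
---
# Hypothetical frontmatter sketch; the name is assumed from ./skills/sandbox-testing/
name: sandbox-testing
description: >-
  Test a Paddle integration end-to-end using the sandbox environment, test
  cards, the webhook simulator, and local tunnels, without taking real money.
  Use when the user wants to test Paddle payments, simulate webhooks, or
  verify a Paddle checkout integration in sandbox mode.
---
```

Appending the trigger sentence to the existing description keeps the strong keywords that scored well on Specificity and Trigger Term Quality and adds the explicit selection guidance the Completeness dimension asks for.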
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: testing Paddle integration end-to-end, using sandbox environment, test cards, webhook simulator, and local tunnels. Also specifies the constraint 'without taking real money.' | 3 / 3 |
| Completeness | Clearly answers 'what does this do' (test a Paddle integration using sandbox, test cards, webhook simulator, local tunnels), but lacks an explicit 'Use when...' clause or equivalent trigger guidance, which caps this at 2 per the rubric. | 2 / 3 |
| Trigger Term Quality | Includes strong natural keywords users would say: 'Paddle', 'sandbox', 'test cards', 'webhook simulator', 'local tunnels', 'end-to-end', and 'integration'. These are terms a developer working with Paddle payments would naturally use. | 3 / 3 |
| Distinctiveness / Conflict Risk | A very distinct niche: Paddle payment integration testing with sandbox tools. Unlikely to conflict with other skills due to the specificity of 'Paddle', 'sandbox environment', 'test cards', and 'webhook simulator'. | 3 / 3 |
| Total | Passed | 11 / 12 |
Implementation: 77%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a strong, actionable skill that provides a clear end-to-end testing workflow with concrete card numbers, commands, and verification steps. Its main weakness is length: the sandbox-vs-live comparison tables, MCP server details, and common pitfalls section add bulk that could be offloaded to supporting files. Workflow clarity is excellent, with explicit checkpoints at each stage.
Suggestions
Move the MCP server usage block (the large blockquote about search/execute/report_missing_tool) to a separate reference file; it is a tangent from the core testing workflow.
Consider moving the 'Sandbox vs live: differences that may catch you out' table to a separate file and linking to it, keeping only a brief summary inline.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is generally well-structured but includes some unnecessary verbosity: the sandbox vs production table is quite detailed for things Claude would infer, the MCP server block is a large tangent, and the 'Common pitfalls' section restates some points already covered. The comparison tables are useful but could be tighter. | 2 / 3 |
| Actionability | Provides concrete test card numbers, exact env variable names with prefixes, executable code for the simulator API, specific CLI commands for tunneling, and a step-by-step end-to-end test flow with explicit confirmation checks. Highly copy-paste ready. | 3 / 3 |
| Workflow Clarity | The end-to-end test in Step 5 is a clear numbered sequence with explicit validation checkpoints (confirm browser redirect, check server logs, verify DB rows, check dashboard). The overall 5-step structure flows logically from setup through testing, with feedback loops for verification at each stage. | 3 / 3 |
| Progressive Disclosure | The skill references related docs and other skills (catalog-setup, checkout-web, webhooks, subscription-sync), which is good, but the content itself is quite long and monolithic. The MCP server usage block and the sandbox-vs-live differences table could be split into separate reference files. No bundle files exist to offload detail into. | 2 / 3 |
| Total | Passed | 10 / 12 |
Validation: 100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
62438cd