
arn-spark-clickable-prototype-teams

This skill should be used when the user says "clickable prototype teams", "arn clickable prototype teams", "team clickable prototype", "debate clickable prototype", "collaborative interaction review", "clickable prototype with debate", "team-based interaction review", "interaction debate", "review interactions as a team", "interactive prototype teams", "team prototype review", or wants to create a clickable interactive prototype with linked screens and validate it through iterative expert debate cycles where product strategist and UX specialist discuss their scores and findings before producing a combined review, with Playwright-based interaction testing, per-criterion scoring, an independent judge verdict, and versioned output. Supports Agent Teams for parallel debate or sequential simulation as fallback. For standard lower-of-two-scores interaction review, use /arn-spark-clickable-prototype instead.


Quality: 66% (Does it follow best practices?)

Impact: No eval scenarios have been run

Security (by Snyk): Advisory. Suggest reviewing before use.

Optimize this skill with Tessl

npx tessl skill review --optimize ./plugins/arn-spark/skills/arn-spark-clickable-prototype-teams/SKILL.md

Quality

Discovery

85%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is thorough in specifying what the skill does and when to use it, with strong disambiguation from a related skill. However, the long enumeration of trigger phrases feels mechanical and unnatural, and the description is overly verbose—it reads more like a specification than a concise, scannable description. The trigger terms are heavily curated rather than reflecting natural user language patterns.

Suggestions

Reduce the enumerated trigger phrase list to 3-4 representative examples and instead describe the category of requests more naturally (e.g., 'Use when the user wants team-based or debate-driven review of clickable prototypes').

Add more naturally occurring trigger terms users might actually say, such as 'expert review', 'UX debate', 'multi-reviewer prototype testing', or 'collaborative prototype validation'.
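For instance, a condensed description along these lines (the wording below is illustrative only, not the skill's actual frontmatter) keeps the disambiguation note while replacing the exhaustive phrase list with a natural category plus a few representative triggers:

```
description: >
  Create a clickable interactive prototype with linked screens and validate it
  through iterative expert debate: a product strategist and UX specialist score
  each criterion and discuss findings before a combined review, with
  Playwright-based interaction testing, an independent judge verdict, and
  versioned output. Use when the user wants a team-based or debate-driven
  review of a clickable prototype (e.g. "team prototype review", "clickable
  prototype with debate", "collaborative interaction review"). For standard
  lower-of-two-scores interaction review, use /arn-spark-clickable-prototype.
```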

Dimension | Reasoning | Score

Specificity

Lists multiple specific concrete actions: create clickable interactive prototype with linked screens, validate through iterative expert debate cycles, Playwright-based interaction testing, per-criterion scoring, independent judge verdict, and versioned output.

3 / 3

Completeness

Clearly answers both 'what' (create clickable prototypes validated through expert debate cycles with Playwright testing, scoring, judge verdict, versioned output) and 'when' (explicit trigger phrases plus a 'Use when...' equivalent with the 'wants to create...' clause, and even a disambiguation note pointing to an alternative skill).

3 / 3

Trigger Term Quality

While it exhaustively lists exact trigger phrases users might say (e.g., 'clickable prototype teams', 'team-based interaction review'), these feel artificially enumerated rather than naturally occurring terms. Many are unlikely phrases a user would organically say, and common natural variations like 'prototype review with team debate' or 'multi-expert UX review' are missing.

2 / 3

Distinctiveness Conflict Risk

Highly distinctive with its specific niche of team-based debate-driven prototype review. It even explicitly disambiguates from a related skill (/arn-spark-clickable-prototype) for standard interaction review, reducing conflict risk.

3 / 3

Total: 11 / 12 (Passed)

Implementation

47%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill has excellent workflow clarity with well-structured multi-step processes, validation gates, and error recovery paths. However, it is severely over-verbose — the content could likely be cut by 50-60% without losing actionable information. The inline error handling table, repeated fallback explanations, and conversational prompt templates bloat the token cost significantly, which is ironic for a skill that discusses token cost trade-offs between debate modes.

Suggestions

Extract the Error Handling section, Agent Invocation Guide table, and prerequisite detection logic into separate reference files (e.g., references/error-handling.md, references/agent-guide.md) to reduce the main skill body by ~40%.

Remove redundant explanations — for example, the sequential debate invocation pattern is explained in Step 5c AND re-summarized in the Agent Invocation Guide table. Keep one authoritative location.

Condense the conversational prompt templates (the quoted blocks shown to users) — Claude can generate appropriate prompts from brief instructions rather than needing exact wording prescribed.

Add concrete agent invocation syntax examples showing the actual tool call format rather than abstract descriptions like 'invoke arn-spark-prototype-builder with: screen list, style brief, ...'
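A concrete invocation example in SKILL.md might look like the sketch below; the tool name and parameter names are assumptions about the agent runtime, not taken from the skill itself:

```
Task (hypothetical parameter names):
  subagent_type: "arn-spark-prototype-builder"
  prompt: |
    Build the clickable prototype.
    Inputs: screen list (from Step 2), style brief, criteria file
    clickable-prototype-criteria.md. Write output to the versioned
    prototype directory.
```

Even a single example of this shape gives Claude an unambiguous pattern to imitate for the other agent invocations.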

Dimension | Reasoning | Score

Conciseness

This skill is extremely verbose at over 500 lines. It extensively explains orchestration logic, fallback paths, and conversational prompts that could be dramatically condensed. Many sections repeat information (e.g., error handling restates what's already covered in the workflow steps). The Agent Teams check instructions, prerequisite resolution, and debate protocol explanations are far more detailed than needed for Claude.

1 / 3

Actionability

The skill provides concrete agent invocation patterns, file paths, and structured outputs (tables, file names, directory structures). However, it lacks executable code examples — the actual agent invocations are described abstractly ('invoke arn-spark-prototype-builder with...') rather than showing exact invocation syntax. The Bash commands are minimal (just echo for env var check).

2 / 3
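As an illustration of what a fuller Bash example could look like, the sketch below hardens the env-var check the reasoning mentions into an explicit mode decision; the variable name `AGENT_TEAMS_ENABLED` is a placeholder, since the skill's actual prerequisite variable is not shown here:

```shell
# Decide debate mode based on Agent Teams availability.
# AGENT_TEAMS_ENABLED is a hypothetical name; substitute the variable
# the skill's prerequisite section actually defines.
if [ -n "${AGENT_TEAMS_ENABLED:-}" ]; then
  MODE="parallel-debate"          # Agent Teams present: run experts in parallel
else
  MODE="sequential-simulation"    # fallback: simulate the debate sequentially
fi
echo "Debate mode: $MODE"
```

Encoding the fallback decision in one place like this avoids re-explaining it at each workflow step.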

Workflow Clarity

The multi-step workflow is thoroughly sequenced with clear step numbering (Steps 1-9), sub-steps (5a-5d), and explicit phases within the debate (Phases 1-4). Validation checkpoints are present throughout: divergence checks, file existence verification after Agent Teams, threshold checks after scoring, judge review as independent gate, and error recovery paths for each failure mode.

3 / 3

Progressive Disclosure

The skill references several external files (debate-protocol.md, journey-template.md, expert-interaction-review-template.md, debate-review-report-template.md, clickable-prototype-criteria.md, showcase-capture-guide.md) which is good progressive disclosure design. However, no bundle files were provided to verify these exist, and the SKILL.md itself is monolithic — the massive error handling section, agent invocation guide, and detailed prerequisite checks could be split into reference files rather than inlined.

2 / 3

Total: 8 / 12 (Passed)

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 Passed

Validation for skill structure

Criteria | Description | Result

skill_md_line_count

SKILL.md is long (658 lines); consider splitting into references/ and linking

Warning

frontmatter_unknown_keys

Unknown frontmatter key(s) found; consider removing or moving to metadata

Warning

Total: 9 / 11 (Passed)

Repository: AppsVortex/arness (Reviewed)

