ElevenLabs text-to-speech with mac-style say UX.
67
56%
Does it follow best practices?
Impact
91%
3.50x
Average score across 3 eval scenarios
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./skills/sag/SKILL.md
Quality
Discovery
32%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description is extremely terse and reads more like a tagline than a functional skill description. While it identifies the core technology (ElevenLabs TTS) and a UX paradigm (mac-style say), it fails to list specific capabilities, lacks a 'Use when...' clause, and omits common trigger terms users would naturally use when requesting speech generation.
Suggestions
Add a 'Use when...' clause with explicit triggers, e.g., 'Use when the user asks to convert text to speech, generate audio, read text aloud, or mentions ElevenLabs, TTS, or voice synthesis.'
List specific concrete actions such as 'Converts text to speech audio using ElevenLabs API, supports voice selection, adjusts speech parameters, and saves audio output files.'
Include common natural language trigger terms like 'TTS', 'voice generation', 'read aloud', 'speak', 'audio output', and 'generate speech'.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (text-to-speech) and a specific tool (ElevenLabs) with a UX reference (mac-style say), but doesn't list concrete actions like 'convert text to audio files', 'adjust voice parameters', or 'list available voices'. | 2 / 3 |
| Completeness | Provides a brief 'what' (text-to-speech with ElevenLabs) but completely lacks any 'when' clause or explicit trigger guidance. Per rubric guidelines, a missing 'Use when...' clause caps completeness at 2, and the 'what' is also quite thin, warranting a 1. | 1 / 3 |
| Trigger Term Quality | Includes 'text-to-speech', 'ElevenLabs', and 'say' which are relevant keywords, but misses common variations like 'TTS', 'voice generation', 'audio', 'speak', 'read aloud', or 'generate speech'. | 2 / 3 |
| Distinctiveness Conflict Risk | ElevenLabs and 'mac-style say' provide some distinctiveness, but 'text-to-speech' is broad enough to potentially overlap with other TTS or audio skills. The mention of a specific API (ElevenLabs) helps but the description is too terse to clearly carve out a niche. | 2 / 3 |
| Total | 7 / 12 Passed | |
Implementation
79%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-crafted, concise skill that provides highly actionable CLI commands and model-specific details without unnecessary explanation. Its main weaknesses are the lack of validation and error-handling steps (especially for the multi-step chat voice workflow) and the absence of progressive disclosure to external references for advanced topics such as the full voice catalog or a detailed model comparison.
Suggestions
Add a validation step to the chat voice workflow, e.g., check that the output file exists and is non-empty before including the MEDIA reference.
Consider linking to an external reference file for the full list of available voices or advanced pronunciation/normalization options rather than expanding the main skill.
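The validation step suggested above could look roughly like the following shell sketch. It reuses the `sag -v Clawd -o /tmp/voice-reply.mp3` invocation quoted in this review; the `voice_reply` function name, the `ELEVENLABS_API_KEY` variable name, and the `MEDIA:<path>` output format are assumptions made for illustration and should be checked against the skill itself.

```shell
# Sketch only: generate a voice reply and print the media reference
# only when the output file is actually usable.
# Assumptions (not confirmed by the skill): the API key lives in
# ELEVENLABS_API_KEY, and the media reference is written as "MEDIA:<path>".

voice_reply() {
  text=$1
  out=${2:-/tmp/voice-reply.mp3}

  # Fail early with a clear message if the API key is missing.
  if [ -z "${ELEVENLABS_API_KEY:-}" ]; then
    echo "error: ELEVENLABS_API_KEY is not set" >&2
    return 1
  fi

  # Remove any stale file so an old result cannot pass the check below.
  rm -f "$out"

  # Generate the audio; stop if the CLI reports a failure.
  if ! sag -v Clawd -o "$out" "$text"; then
    echo "error: sag failed to generate audio" >&2
    return 1
  fi

  # Confirm the file exists and is non-empty before referencing it.
  if [ ! -s "$out" ]; then
    echo "error: $out is missing or empty" >&2
    return 1
  fi

  echo "MEDIA:$out"
}
```

Calling `voice_reply "Your message here"` would print the reference on success and return a non-zero status otherwise, so an agent can fall back to a text-only reply.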
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The content is lean and efficient. It uses terse bullet points, avoids explaining what TTS or ElevenLabs is, and every line provides actionable information Claude wouldn't already know (CLI flags, model names, audio tags, voice IDs). | 3 / 3 |
| Actionability | Provides concrete, copy-paste-ready commands throughout (e.g., `sag "Hello there"`, `sag -v Clawd -o /tmp/voice-reply.mp3 "Your message here"`), specific model names, specific audio tags with examples, and exact environment variable names. | 3 / 3 |
| Workflow Clarity | The chat voice response section has a clear two-step sequence (generate then include), and pronunciation fixes are ordered by priority. However, there's no validation or error handling guidance, e.g., what to do if the API key is missing, if voice generation fails, or how to verify the output file was created before referencing it. | 2 / 3 |
| Progressive Disclosure | The content is well-organized with clear sections and headers, but it's somewhat flat: the chat voice responses section and the v3 audio tags could benefit from being separated or more clearly signaled. The `sag prompting` command hint is a nice touch for discovery, but there are no references to external files for deeper content. | 2 / 3 |
| Total | 10 / 12 Passed | |
Validation
72%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 8 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| metadata_version | 'metadata.version' is missing | Warning |
| metadata_field | 'metadata' should map string keys to string values | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 8 / 11 Passed | |
b3cef5f