AI voice generation, text-to-speech, and voice synthesis via inference.sh CLI. Models: Kokoro TTS, DIA, Chatterbox, Higgs, VibeVoice for natural speech. Capabilities: multiple voices, emotions, accents, long-form narration, conversation. Use for: voiceovers, audiobooks, podcasts, video narration, accessibility. Triggers: voice cloning, tts, text to speech, ai voice, voice generation, voice synthesis, voice over, narration, speech synthesis, ai narrator, elevenlabs alternative, natural voice, realistic speech, voice ai
Install with Tessl CLI
npx tessl i github:inference-sh/skills --skill ai-voice-cloning
93
Does it follow best practices?
Validation for skill structure
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that hits all the key criteria. It provides specific capabilities, comprehensive trigger terms that users would naturally use, clear guidance on both what the skill does and when to use it, and occupies a distinct niche that won't conflict with other skills. The description is information-dense without being verbose.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions and capabilities: 'multiple voices, emotions, accents, long-form narration, conversation' and specific use cases like 'voiceovers, audiobooks, podcasts, video narration, accessibility'. Also names specific models (Kokoro TTS, DIA, Chatterbox, Higgs, VibeVoice). | 3 / 3 |
| Completeness | Clearly answers both what ('AI voice generation, text-to-speech, and voice synthesis via inference.sh CLI' with specific capabilities) AND when ('Use for: voiceovers, audiobooks, podcasts, video narration, accessibility' plus explicit 'Triggers:' section). The 'Use for:' and 'Triggers:' clauses serve as explicit guidance for when to select this skill. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural terms users would say including variations: 'voice cloning, tts, text to speech, ai voice, voice generation, voice synthesis, voice over, narration, speech synthesis, ai narrator, elevenlabs alternative, natural voice, realistic speech, voice ai'. These are terms users would naturally use when seeking this functionality. | 3 / 3 |
| Distinctiveness / Conflict Risk | Clear niche focused specifically on voice/audio generation via a specific CLI tool (inference.sh). The combination of TTS-specific terminology, named models, and audio-focused use cases makes it highly unlikely to conflict with other skills like general document processing or code generation. | 3 / 3 |
| Total | 12 / 12 Passed | |
Implementation
87%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-crafted skill with excellent actionability and conciseness. The voice library tables and executable examples make it immediately useful. The main weakness is the lack of validation steps in multi-step workflows—users should verify audio generation succeeded before proceeding to merge operations.
Suggestions
Add validation steps to multi-step workflows, e.g., 'Check that the JSON response contains an audio URL before proceeding to merge'
Include error handling guidance for common failures (API errors, invalid voice IDs, text too long)
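As a rough illustration of the two suggestions above, the sketch below checks each generation response for an audio URL before any merge step runs. It is only a sketch under stated assumptions: the `audio` and `error` field names, the `AudioGenerationError` class, and both helper functions are hypothetical and are not taken from the inference.sh documentation, so the real response shape should be confirmed before adapting it.

```python
import json
from urllib.parse import urlparse


class AudioGenerationError(Exception):
    """Raised when a response does not contain a usable audio URL."""


def extract_audio_url(raw_response: str, url_key: str = "audio") -> str:
    """Return the audio URL from one JSON response, or raise.

    ASSUMPTION: the response is a JSON object with the URL under `url_key`
    and failures reported under an "error" key. Both names are hypothetical.
    """
    try:
        payload = json.loads(raw_response)
    except json.JSONDecodeError as exc:
        raise AudioGenerationError(f"response is not valid JSON: {exc}") from exc

    if not isinstance(payload, dict):
        raise AudioGenerationError("expected a JSON object")

    # Surface reported failures (e.g. invalid voice ID, text too long).
    if "error" in payload:
        raise AudioGenerationError(f"generation failed: {payload['error']}")

    url = payload.get(url_key)
    if not isinstance(url, str):
        raise AudioGenerationError(f"missing '{url_key}' field")

    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise AudioGenerationError(f"'{url_key}' is not an http(s) URL: {url!r}")
    return url


def collect_audio_urls(raw_responses: list[str], url_key: str = "audio") -> list[str]:
    """Validate every segment up front so a merge never starts on bad input."""
    urls = []
    for index, raw in enumerate(raw_responses):
        try:
            urls.append(extract_audio_url(raw, url_key))
        except AudioGenerationError as exc:
            # Report which segment failed so it can be regenerated.
            raise AudioGenerationError(f"segment {index}: {exc}") from exc
    return urls
```

Validating all segments before merging means a failed chunk is caught with its index, rather than surfacing later as a broken or truncated merged file.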
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is lean and efficient, using tables for voice libraries and parameters rather than verbose explanations. It assumes Claude understands CLI usage and audio concepts without over-explaining. | 3 / 3 |
| Actionability | Every section provides copy-paste ready bash commands with complete JSON input structures. Examples cover diverse use cases (narration, conversation, video workflows) with executable code. | 3 / 3 |
| Workflow Clarity | Multi-step workflows (multi-voice conversation, long-form chunking, video workflows) show clear sequences but lack validation checkpoints. No error handling or verification steps for confirming audio generation succeeded before merging. | 2 / 3 |
| Progressive Disclosure | Well-organized with clear sections progressing from quick start to advanced workflows. Related skills are linked at the end, and content is appropriately structured with tables for reference data and code blocks for examples. | 3 / 3 |
| Total | 11 / 12 Passed | |
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
| Criteria | Description | Result |
|---|---|---|
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| Total | 10 / 11 Passed | |