ai-voice-cloning

AI voice generation, text-to-speech, and voice synthesis via inference.sh CLI. Models: Kokoro TTS, DIA, Chatterbox, Higgs, VibeVoice for natural speech. Capabilities: multiple voices, emotions, accents, long-form narration, conversation. Use for: voiceovers, audiobooks, podcasts, video narration, accessibility. Triggers: voice cloning, tts, text to speech, ai voice, voice generation, voice synthesis, voice over, narration, speech synthesis, ai narrator, elevenlabs alternative, natural voice, realistic speech, voice ai

Install with Tessl CLI

npx tessl i github:inference-sh/skills --skill ai-voice-cloning
93

Does it follow best practices?

Validation for skill structure

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that hits all the key criteria. It provides specific capabilities, comprehensive trigger terms that users would naturally use, clear guidance on both what the skill does and when to use it, and occupies a distinct niche that won't conflict with other skills. The description is information-dense without being verbose.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific concrete actions and capabilities: 'multiple voices, emotions, accents, long-form narration, conversation' and specific use cases like 'voiceovers, audiobooks, podcasts, video narration, accessibility'. Also names specific models (Kokoro TTS, DIA, Chatterbox, Higgs, VibeVoice). | 3 / 3 |
| Completeness | Clearly answers both what ('AI voice generation, text-to-speech, and voice synthesis via inference.sh CLI' with specific capabilities) AND when ('Use for: voiceovers, audiobooks, podcasts, video narration, accessibility' plus explicit 'Triggers:' section). The 'Use for:' and 'Triggers:' clauses serve as explicit guidance for when to select this skill. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural terms users would say including variations: 'voice cloning, tts, text to speech, ai voice, voice generation, voice synthesis, voice over, narration, speech synthesis, ai narrator, elevenlabs alternative, natural voice, realistic speech, voice ai'. These are terms users would naturally use when seeking this functionality. | 3 / 3 |
| Distinctiveness / Conflict Risk | Clear niche focused specifically on voice/audio generation via a specific CLI tool (inference.sh). The combination of TTS-specific terminology, named models, and audio-focused use cases makes it highly unlikely to conflict with other skills like general document processing or code generation. | 3 / 3 |
| Total | | 12 / 12 |

Passed

Implementation

87%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-crafted skill with excellent actionability and conciseness. The voice library tables and executable examples make it immediately useful. The main weakness is the lack of validation steps in multi-step workflows: users should verify that audio generation succeeded before proceeding to merge operations.

Suggestions

Add validation steps to multi-step workflows, e.g., 'Check that the JSON response contains an audio URL before proceeding to merge'

Include error handling guidance for common failures (API errors, invalid voice IDs, text too long)
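
The two suggestions above could be sketched roughly as follows. This is a minimal illustration, not the skill's actual method: the response field names `audio` and `error`, and the helper names `extract_audio_url` and `collect_urls_for_merge`, are hypothetical and would need to match whatever JSON the CLI really returns.

```python
from __future__ import annotations

import json
from urllib.parse import urlparse


class AudioGenerationError(Exception):
    """Raised when a TTS response cannot safely feed a later merge step."""


def extract_audio_url(raw_response: str, key: str = "audio") -> str:
    """Return the audio URL from a JSON response, or raise with a reason.

    Assumption: the URL lives under a top-level `audio` key and failures
    are reported under a top-level `error` key. Both names are hypothetical.
    """
    try:
        payload = json.loads(raw_response)
    except json.JSONDecodeError as exc:
        raise AudioGenerationError(f"response is not valid JSON: {exc}") from exc

    if not isinstance(payload, dict):
        raise AudioGenerationError("response is not a JSON object")

    # Surface reported failures (invalid voice ID, text too long, API error).
    if payload.get("error"):
        raise AudioGenerationError(f"generation failed: {payload['error']}")

    url = payload.get(key)
    if not isinstance(url, str) or urlparse(url).scheme not in ("http", "https"):
        raise AudioGenerationError(f"no usable audio URL under key {key!r}")
    return url


def collect_urls_for_merge(raw_responses: list[str]) -> list[str]:
    """Validate every segment first, so a merge never starts with a gap."""
    return [extract_audio_url(raw) for raw in raw_responses]
```

Validating all segments before merging means a single failed chunk stops the workflow early instead of producing a merged file with missing audio.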

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill is lean and efficient, using tables for voice libraries and parameters rather than verbose explanations. It assumes Claude understands CLI usage and audio concepts without over-explaining. | 3 / 3 |
| Actionability | Every section provides copy-paste ready bash commands with complete JSON input structures. Examples cover diverse use cases (narration, conversation, video workflows) with executable code. | 3 / 3 |
| Workflow Clarity | Multi-step workflows (multi-voice conversation, long-form chunking, video workflows) show clear sequences but lack validation checkpoints. No error handling or verification steps for confirming audio generation succeeded before merging. | 2 / 3 |
| Progressive Disclosure | Well-organized with clear sections progressing from quick start to advanced workflows. Related skills are linked at the end, and content is appropriately structured with tables for reference data and code blocks for examples. | 3 / 3 |
| Total | | 11 / 12 |

Passed

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 10 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| Total | | 10 / 11 |

Passed
