tessl/npm-elevenlabs

The official JavaScript/Node.js SDK for the ElevenLabs text-to-speech API, enabling developers to integrate AI-powered voice synthesis into their applications.

Workspace: tessl
Visibility: Public
Describes: pkg:npm/elevenlabs@1.59.x

To install, run

npx @tessl/cli install tessl/npm-elevenlabs@1.59.0


ElevenLabs JavaScript SDK

Overview

The ElevenLabs JavaScript SDK provides a comprehensive interface for accessing ElevenLabs' advanced AI-powered text-to-speech, voice cloning, dubbing, and conversational AI services. Built with TypeScript, the SDK offers both CommonJS and ESM support for use in Node.js environments and web browsers.

The SDK enables developers to:

  • Convert text to natural-sounding speech using AI voices
  • Create and manage custom voice clones
  • Stream real-time audio for low-latency applications
  • Build conversational AI agents with telephony integration
  • Process audio for dubbing, isolation, and sound effects
  • Manage audio generation history and usage analytics

Package Information

// Package: elevenlabs
// Version: v1.59.0
// License: MIT
// Repository: https://github.com/elevenlabs/elevenlabs-js
npm install elevenlabs

Core Imports

// Main namespace and client imports
import { 
  ElevenLabs,           // Complete API namespace
  ElevenLabsClient,     // Enhanced client with convenience methods
  play,                 // Audio playback utility (requires ffplay)
  stream,               // Audio streaming utility (requires mpv)
  ElevenLabsEnvironment, // Environment configuration
  ElevenLabsError,      // Base error class
  ElevenLabsTimeoutError // Timeout-specific error class
} from 'elevenlabs';

// Core types for text-to-speech
import type {
  TextToSpeechRequest,
  StreamTextToSpeechRequest,
  VoiceSettings,
  Voice,
  OutputFormat
} from 'elevenlabs';

Basic Usage

Initialize Client

import { ElevenLabsClient, ElevenLabsEnvironment } from 'elevenlabs';

// Using the ELEVENLABS_API_KEY environment variable
const client = new ElevenLabsClient();

// Or specify the API key directly
const clientWithKey = new ElevenLabsClient({
  apiKey: 'your-api-key-here'
});

// Or target a specific environment (e.g. EU data residency)
const euClient = new ElevenLabsClient({
  apiKey: 'your-api-key',
  environment: ElevenLabsEnvironment.ProductionEu
});

Quick Text-to-Speech

import { ElevenLabsClient, play } from 'elevenlabs';

const client = new ElevenLabsClient();

// Basic text-to-speech conversion
const audio = await client.textToSpeech.convert(
  "21m00Tcm4TlvDq8ikWAM", // voice ID
  {
    text: "Hello world! This is ElevenLabs speaking.",
    model_id: "eleven_multilingual_v2"
  }
);

// Play the audio (requires ffplay installation)
await play(audio);

Voice Management

// Get all available voices
const voicesResponse = await client.voices.getAll();
console.log(voicesResponse.voices);

// Search voices with filters
const searchResults = await client.voices.search({
  search: "deep",
  voice_type: "premade",
  category: "professional"
});

Architecture

The SDK is structured around several key components:

Client Architecture

interface ElevenLabsClient {
  // Text-to-speech conversion
  textToSpeech: TextToSpeech;
  
  // Voice management and cloning
  voices: Voices;
  
  // Audio generation history
  history: History;
  
  // Speech-to-speech conversion
  speechToSpeech: SpeechToSpeech;
  
  // Text-to-voice conversion
  textToVoice: TextToVoice;
  
  // Video/audio dubbing
  dubbing: Dubbing;
  
  // Audio project management
  studio: Studio;
  
  // Conversational AI agents
  conversationalAi: ConversationalAi;
  
  // Audio processing utilities
  audioIsolation: AudioIsolation;
  textToSoundEffects: TextToSoundEffects;
  
  // Account and usage management
  user: User;
  usage: Usage;
  workspace: Workspace;
  
  // Additional services
  models: Models;
  audioNative: AudioNative;
  pronunciationDictionary: PronunciationDictionary;
  speechToText: SpeechToText;
  forcedAlignment: ForcedAlignment;
  samples: Samples;
}

Environment Configuration

interface ElevenLabsEnvironment {
  Production: {
    base: "https://api.elevenlabs.io";
    wss: "wss://api.elevenlabs.io";
  };
  ProductionUs: {
    base: "https://api.us.elevenlabs.io";
    wss: "wss://api.elevenlabs.io";
  };
  ProductionEu: {
    base: "https://api.eu.residency.elevenlabs.io";
    wss: "wss://api.elevenlabs.io";
  };
}

Request Options Interface

interface RequestOptions {
  /** Request timeout in seconds */
  timeoutInSeconds?: number;
  /** Maximum retry attempts (default: 2) */
  maxRetries?: number;
  /** Abort signal for cancellation */
  abortSignal?: AbortSignal;
  /** Override API key for this request */
  apiKey?: string;
  /** Additional request headers */
  headers?: Record<string, string>;
}
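As a sketch, a per-request options object for a latency-sensitive call might look like this (the custom header name is illustrative, not an SDK requirement):

```typescript
// Build a RequestOptions object for a latency-sensitive call.
// AbortController is built into Node.js 15+ and modern browsers.
const controller = new AbortController();

const options = {
  timeoutInSeconds: 30,           // fail fast instead of hanging
  maxRetries: 1,                  // override the default of 2
  abortSignal: controller.signal, // allows external cancellation
  headers: { "x-request-tag": "tts-demo" } // illustrative custom header
};

// Cancel the in-flight request from elsewhere
// (e.g. the user navigates away):
controller.abort();
```

The object is passed as the final argument of a client call, e.g. `client.textToSpeech.convert(voiceId, request, options)`.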

Core Capabilities

Text-to-Speech Generation

Convert text to natural speech with advanced voice options and streaming support.

// Basic conversion
const audio = await client.textToSpeech.convert(voiceId, {
  text: "Your text here",
  model_id: "eleven_multilingual_v2",
  voice_settings: {
    stability: 0.5,
    similarity_boost: 0.8
  }
});

// Streaming for real-time applications
const audioStream = await client.textToSpeech.convertAsStream(voiceId, {
  text: "Streaming text-to-speech",
  optimize_streaming_latency: 2
});

Complete Text-to-Speech Documentation

Voice Management and Cloning

Create, manage, and customize AI voices for your applications.

// Create voice clone from audio files
const newVoice = await client.voices.add({
  name: "My Custom Voice",
  files: [audioFile1, audioFile2],
  description: "A custom voice clone"
});

// Professional voice cloning
const pvcVoice = await client.voices.pvc.createVoiceGeneration({
  voice_name: "Professional Voice",
  voice_description: "High-quality professional voice"
});

Complete Voice Management Documentation

Real-time Streaming

Low-latency streaming for interactive applications and real-time speech synthesis.

// WebSocket streaming with timestamps
const timestampStream = await client.textToSpeech.streamWithTimestamps(voiceId, {
  text: "Real-time streaming with character timing",
  enable_logging: false
});

// Process streaming audio chunks
for await (const chunk of timestampStream) {
  console.log(`Audio chunk: ${chunk.audio}`);
  console.log(`Character range: ${chunk.start_char_idx}-${chunk.end_char_idx}`);
}

Complete Streaming Documentation

Audio Generation History

Track and manage your audio generation history with comprehensive metadata.

// Get generation history
const historyResponse = await client.history.getAll({
  page_size: 100,
  start_after_history_item_id: "last_item_id"
});

// Download specific audio
const historyAudio = await client.history.getAudio("history_item_id");

Complete History Management Documentation

Conversational AI Agents

Build intelligent voice agents with SIP integration and knowledge base support.

// Create conversational agent
const agent = await client.conversationalAi.agents.createAgent({
  name: "Customer Support Agent",
  voice_id: voiceId,
  language: "en",
  system_prompt: "You are a helpful customer support agent."
});

// Manage knowledge base
const knowledgeBase = await client.conversationalAi.knowledgeBase.createKnowledgeBase({
  name: "Support Knowledge Base",
  description: "Customer support documentation"
});

Complete Conversational AI Documentation

Advanced Audio Processing

Professional audio processing including dubbing, isolation, and sound effects generation.

// Create dubbing project
const dubbingProject = await client.dubbing.createDubbing({
  name: "Video Dubbing Project",
  source_url: "https://example.com/video.mp4",
  target_lang: "es"
});

// Audio isolation
const isolatedAudio = await client.audioIsolation.isolate({
  audio: audioFile
});

// Generate sound effects
const soundEffect = await client.textToSoundEffects.generate({
  text: "Rain falling on leaves",
  duration_seconds: 10
});

Complete Audio Processing Documentation

Studio Project Management

Manage complex audio projects with chapters, collaboration, and professional workflows.

// Create studio project
const project = await client.studio.projects.createProject({
  name: "Audiobook Project",
  default_title_voice_id: voiceId,
  default_paragraph_voice_id: voiceId
});

// Add chapters
const chapter = await client.studio.chapters.createChapter(projectId, {
  name: "Chapter 1: Introduction",
  text: "Chapter content here..."
});

Complete Projects & Studio Documentation

Error Handling

import { ElevenLabsError, ElevenLabsTimeoutError } from 'elevenlabs';

try {
  const audio = await client.textToSpeech.convert(voiceId, request);
  await play(audio);
} catch (error) {
  if (error instanceof ElevenLabsTimeoutError) {
    console.error('Request timed out:', error.message);
  } else if (error instanceof ElevenLabsError) {
    console.error('API Error:', error.statusCode, error.body);
  } else {
    console.error('Unexpected error:', error);
  }
}

Utility Functions

The SDK includes helpful utility functions for audio playback and streaming.

import { play, stream } from 'elevenlabs';

// Play audio using ffplay (requires ffmpeg installation)
await play(audioStream);

// Stream audio using mpv (requires mpv installation, Node.js only)
await stream(audioStream);

Complete Utilities Documentation

Platform Support

  • Node.js: Full feature support including file operations and system utilities
  • Browser: Core API functionality (audio playback utilities require Node.js)
  • TypeScript: Complete type definitions included
  • CommonJS & ESM: Dual package support

Authentication

API authentication uses the xi-api-key header with your ElevenLabs API key:

# Environment variable (recommended; set in your shell)
export ELEVENLABS_API_KEY="your-api-key"

// Or pass directly to client
const client = new ElevenLabsClient({
  apiKey: "your-api-key"
});

// Or override per request
const audio = await client.textToSpeech.convert(voiceId, request, {
  apiKey: "different-api-key"
});

Rate Limits and Best Practices

  • The SDK includes automatic retry logic with exponential backoff
  • Configure timeouts and retry limits based on your application needs
  • Use streaming for real-time applications to reduce latency
  • Enable zero retention mode (enable_logging: false) for privacy-sensitive applications
  • Consider voice caching and reuse for improved performance
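The SDK's automatic retry behavior can be approximated by a sketch like the following. This is a standalone illustration of exponential backoff with jitter, not the SDK's actual implementation; the helper name and delay schedule are assumptions:

```typescript
// Sketch: retry a failing async call with exponential backoff and jitter,
// similar in spirit to the SDK's built-in retry logic.
async function withRetries<T>(
  attemptFn: () => Promise<T>,
  maxRetries = 2,
  baseDelayMs = 500
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await attemptFn();
    } catch (err) {
      lastError = err;
      if (attempt === maxRetries) break;
      // Backoff doubles each attempt: 500ms, 1000ms, 2000ms, ... plus jitter.
      const delay = baseDelayMs * 2 ** attempt + Math.random() * 100;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

A wrapper like this could guard a one-off call, e.g. `withRetries(() => client.textToSpeech.convert(voiceId, request))`, though for SDK calls the built-in `maxRetries` option is usually sufficient.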

Next Steps

Choose the documentation section that matches your use case: