Official Node.js SDK for ElevenLabs text-to-speech API with voice synthesis, real-time transcription, music generation, and conversational AI
```bash
npx @tessl/cli install tessl/npm-elevenlabs--elevenlabs-js@2.30.0
```

The official Node.js SDK for the ElevenLabs API, providing comprehensive access to text-to-speech, voice management, music generation, real-time transcription, conversational AI, and more. Built with TypeScript for full type safety, with support for multiple JavaScript runtimes.
```bash
npm install @elevenlabs/elevenlabs-js
```

```typescript
import {
  ElevenLabsClient,
  // Enhanced wrapper classes
  Music,        // Advanced music generation with metadata parsing
  SpeechToText, // Speech-to-text with realtime WebSocket support
  // Error classes
  ElevenLabsError,
  ElevenLabsTimeoutError,
  // Environment configuration
  ElevenLabsEnvironment,
  // Utility functions (Node.js only)
  play,
  stream
} from "@elevenlabs/elevenlabs-js";
```

Note: The WebhooksClient wrapper is automatically used when accessing client.webhooks and provides enhanced functionality, including HMAC-SHA256 signature verification via constructEvent().
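For instance, a webhook endpoint can verify incoming events before trusting them. A minimal Express sketch, assuming the signature arrives in an elevenlabs-signature header and the secret lives in an environment variable (both assumptions for illustration):

```typescript
import express from "express";
import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";

const client = new ElevenLabsClient();
const app = express();

// express.raw() preserves the unparsed body, which signature checks require.
app.post("/webhooks/elevenlabs", express.raw({ type: "*/*" }), async (req, res) => {
  try {
    const event = await client.webhooks.constructEvent(
      req.body.toString("utf8"),                     // raw request body
      req.headers["elevenlabs-signature"] as string, // assumed header name
      process.env.ELEVENLABS_WEBHOOK_SECRET!,        // assumed secret source
    );
    console.log("Verified webhook event:", event);
    res.sendStatus(200);
  } catch {
    res.sendStatus(400); // signature verification failed
  }
});
```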
The SDK exports enhanced types and classes for specific functionality:
```typescript
import {
  // Music generation wrapper and types
  Music,                  // Enhanced music generation client class
  type SongMetadata,      // Music composition metadata interface
  type MultipartResponse, // Multipart music response with metadata
  // Speech-to-text wrapper and real-time transcription
  SpeechToText,           // Enhanced STT client with WebSocket support
  RealtimeConnection,     // WebSocket connection manager for real-time STT
  RealtimeEvents,         // Real-time transcription event types
  AudioFormat,            // Audio format enum (PCM_16000, PCM_22050, etc.)
  CommitStrategy,         // Commit strategy enum (VAD, MANUAL)
  type AudioOptions,      // Audio configuration options for real-time STT
  type UrlOptions,        // URL configuration options for real-time STT
} from "@elevenlabs/elevenlabs-js";
```

Note: Most API request/response types are available under the ElevenLabs namespace (see below). The types listed above are wrapper-specific enhancements.
Most API request/response types are available under the ElevenLabs namespace:
```typescript
import { ElevenLabs } from "@elevenlabs/elevenlabs-js";

// Use types from the namespace
type Voice = ElevenLabs.Voice;
type Model = ElevenLabs.Model;
type MusicPrompt = ElevenLabs.MusicPrompt;
type TextRequest = ElevenLabs.BodyTextToSpeechFull;
```

Basic usage:

```typescript
import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";

// Initialize client
const client = new ElevenLabsClient({
  apiKey: "your-api-key", // or use the ELEVENLABS_API_KEY env var
});

// Convert text to speech
const audio = await client.textToSpeech.convert("voice-id", {
  text: "Hello world!",
});

// Stream the audio
for await (const chunk of audio) {
  // Process audio chunks
}
```

Note: This quick reference shows top-level API methods. Many clients have nested sub-resources (e.g., client.voices.samples, client.conversationalAi.analytics.liveCount), which are fully documented in their respective capability sections below.
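For example, per-voice settings live under the nested client.voices.settings resource (a minimal sketch; the VoiceSettings field names here are assumptions):

```typescript
// Read the current settings for a voice, then write back a tweak.
const settings = await client.voices.settings.get("voice-id");
console.log(settings);

await client.voices.settings.update("voice-id", {
  stability: 0.5,        // assumed VoiceSettings field
  similarityBoost: 0.75, // assumed camelCase, matching this SDK's conventions
});
```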
Client constructor options:

```
new ElevenLabsClient(options?: {
  apiKey?: string;
  environment?: string; // Production, ProductionUs, ProductionEu, ProductionIndia
  baseUrl?: string;
  headers?: Record<string, string | Supplier<string>>;
  timeoutInSeconds?: number; // default: 240
  maxRetries?: number;       // default: 2
  fetch?: typeof fetch;
  logging?: LogConfig | ILogger;
});
```
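A configuration sketch (the ElevenLabsEnvironment member name below mirrors the environment values listed above and is an assumption):

```typescript
import { ElevenLabsClient, ElevenLabsEnvironment } from "@elevenlabs/elevenlabs-js";

const client = new ElevenLabsClient({
  apiKey: process.env.ELEVENLABS_API_KEY,
  environment: ElevenLabsEnvironment.ProductionEu, // assumed enum member
  timeoutInSeconds: 120, // override the 240s default
  maxRetries: 3,         // override the default of 2
});
```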
Quick reference:

```
// Text-to-Speech
client.textToSpeech.convert(voice_id, request) → ReadableStream<Uint8Array>
client.textToSpeech.stream(voice_id, request) → ReadableStream<Uint8Array>
client.textToSpeech.convertWithTimestamps(voice_id, request) → AudioWithTimestampsResponse
client.textToSpeech.streamWithTimestamps(voice_id, request) → Stream<StreamingAudioChunkWithTimestampsResponse>
// Speech-to-Text
client.speechToText.convert(request) → SpeechToTextConvertResponse
client.speechToText.transcripts.get(transcript_id) → TranscriptResponse
client.speechToText.transcripts.delete(transcription_id) → unknown
// Real-time Transcription (Node.js only)
client.speechToText.realtime.connect(options) → Promise<RealtimeConnection>
// Speech-to-Speech
client.speechToSpeech.convert(voice_id, request) → ReadableStream<Uint8Array>
// Voice Management
client.voices.getAll(request?) → GetVoicesResponse
client.voices.search(request?) → GetVoicesV2Response
client.voices.get(voice_id, request?) → Voice
client.voices.update(voice_id, request) → EditVoiceResponseModel
client.voices.delete(voice_id) → DeleteVoiceResponseModel
client.voices.share(public_user_id, voice_id, request) → AddVoiceResponseModel
client.voices.getShared(request?) → GetLibraryVoicesResponse
client.voices.findSimilarVoices(request) → GetLibraryVoicesResponse
// Voice Settings
client.voices.settings.getDefault() → VoiceSettings
client.voices.settings.get(voice_id) → VoiceSettings
client.voices.settings.update(voice_id, request) → EditVoiceSettingsResponseModel
// Voice Cloning
client.voices.ivc.create(request) → AddVoiceIvcResponseModel
client.voices.pvc.create(request) → AddVoiceResponseModel
client.voices.pvc.update(voice_id, request?) → AddVoiceResponseModel
client.voices.pvc.train(voice_id, request?) → StartPvcVoiceTrainingResponseModel
client.voices.pvc.samples.create(voice_id, request) → VoiceSample[]
client.voices.pvc.samples.update(voice_id, sample_id, request?) → AddVoiceResponseModel
client.voices.pvc.samples.delete(voice_id, sample_id) → DeleteVoiceSampleResponseModel
// Voice Design
client.textToVoice.design(request) → VoiceDesignPreviewResponse
client.saveAVoicePreview() → void
// Sample Management
client.samples.delete(voice_id, sample_id) → DeleteSampleResponse
// Music Generation
client.music.compose(request?) → ReadableStream<Uint8Array>
client.music.composeDetailed(request?) → MultipartResponse
client.music.stream(request?) → ReadableStream<Uint8Array>
client.music.separateStems(request) → ReadableStream<Uint8Array>
client.music.compositionPlan.create(request) → MusicPrompt
// Sound Effects
client.textToSoundEffects.convert(request) → ReadableStream<Uint8Array>
// Text-to-Dialogue
client.textToDialogue.convert(request) → ReadableStream<Uint8Array>
// Audio Processing
client.audioIsolation.convert(request) → ReadableStream<Uint8Array>
// Conversational AI - Agents
client.conversationalAi.agents.create(request) → CreateAgentResponseModel
client.conversationalAi.agents.list(request?) → GetAgentsPageResponseModel
client.conversationalAi.agents.get(agent_id) → GetAgentResponseModel
client.conversationalAi.agents.update(agent_id, request) → GetAgentResponseModel
client.conversationalAi.agents.delete(agent_id) → void
client.conversationalAi.agents.duplicate(agent_id, request) → CreateAgentResponseModel
client.conversationalAi.agents.widget.get(agent_id, request?) → GetAgentEmbedResponseModel
client.conversationalAi.agents.widget.avatar.create(agent_id, request) → PostAgentAvatarResponseModel
client.conversationalAi.agents.knowledgeBase.size(agent_id) → GetAgentKnowledgebaseSizeResponseModel
client.conversationalAi.agents.link.get(agent_id) → GetAgentLinkResponseModel
client.conversationalAi.agents.llmUsage.calculate(agent_id, request?) → GetAgentLlmUsageCalculationResponseModel
// Conversational AI - Knowledge Base
client.conversationalAi.addToKnowledgeBase(request) → AddKnowledgeBaseResponseModel
client.conversationalAi.knowledgeBase.list(request?) → GetKnowledgeBaseListResponseModel
client.conversationalAi.knowledgeBase.documents.get(documentation_id, request?) → DocumentsGetResponse
client.conversationalAi.knowledgeBase.documents.delete(documentation_id, request?) → unknown
// Conversational AI - Tools
client.conversationalAi.tools.create(request) → CreateToolResponseModel
client.conversationalAi.tools.list() → GetToolsResponseModel
client.conversationalAi.tools.get(tool_id) → ToolResponseModel
client.conversationalAi.tools.update(tool_id, request) → void
client.conversationalAi.tools.delete(tool_id) → void
// Conversational AI - Conversations
client.conversationalAi.conversations.list(request?) → GetConversationsPageResponseModel
client.conversationalAi.conversations.get(conversation_id) → GetConversationResponseModel
client.conversationalAi.conversations.delete(conversation_id) → unknown
client.conversationalAi.conversations.audio.get(conversation_id) → ReadableStream<Uint8Array>
// Conversational AI - Phone
client.conversationalAi.phoneNumbers.list() → PhoneNumbersListResponseItem[]
client.conversationalAi.batchCalls.create(request) → SubmitBatchCallResponseModel
client.conversationalAi.batchCalls.list(request?) → WorkspaceBatchCallsResponse
client.conversationalAi.batchCalls.cancel(batch_id) → BatchCallResponse
client.conversationalAi.twilio.outboundCall(request) → TwilioOutboundCallResponse
client.conversationalAi.sipTrunk.outboundCall(request) → SipTrunkOutboundCallResponse
// Conversational AI - WhatsApp
client.conversationalAi.whatsappAccounts.list() → GetWhatsappAccountsResponseModel
client.conversationalAi.whatsappAccounts.import(request) → ImportWhatsAppAccountResponse
// Conversational AI - MCP Servers
client.conversationalAi.mcpServers.list() → McpServersResponseModel
client.conversationalAi.mcpServers.create(request) → McpServerResponseModel
client.conversationalAi.mcpServers.update(mcp_server_id, request?) → McpServerResponseModel
client.conversationalAi.mcpServers.delete(mcp_server_id) → unknown
// Dubbing
client.dubbing.create(request) → DoDubbingResponse
client.dubbing.get(dubbing_id) → DubbingMetadataResponse
client.dubbing.delete(dubbing_id) → void
// Studio
client.studio.createPodcast(request) → PodcastProjectResponseModel
// Audio Native
client.audioNative.create(request) → AudioNativeCreateProjectResponseModel
// Forced Alignment
client.forcedAlignment.create(request) → ForcedAlignmentResponseModel
// History
client.history.list(request?) → GetSpeechHistoryResponse
client.history.get(history_item_id) → SpeechHistoryItemResponse
client.history.delete(history_item_id) → void
client.history.download(history_item_id) → ReadableStream<Uint8Array>
// Usage
client.usage.get(request) → UsageCharactersResponseModel
// User
client.user.get() → User
client.user.subscription.get() → GetSubscriptionResponseModel
// Workspace
client.workspace.members.update(request) → UpdateWorkspaceMemberResponseModel
client.workspace.invites.create(request) → AddWorkspaceInviteResponseModel
client.workspace.invites.createBatch(request) → AddWorkspaceInviteResponseModel
client.workspace.invites.delete(invite_id) → string
client.workspace.groups.search(request) → WorkspaceGroupByNameResponseModel[]
client.workspace.groups.members.add(group_id, request) → AddWorkspaceGroupMemberResponseModel
client.workspace.groups.members.remove(group_id, request) → DeleteWorkspaceGroupMemberResponseModel
client.workspace.resources.get(resource_id, request) → ResourceMetadataResponseModel
client.workspace.resources.share(resource_id, request) → unknown
client.workspace.resources.unshare(resource_id, request) → unknown
client.workspace.resources.copyToWorkspace(resource_id, request) → unknown
// Webhooks
client.webhooks.create(request) → WorkspaceCreateWebhookResponseModel
client.webhooks.list() → WorkspaceWebhooksListResponseModel
client.webhooks.update(webhook_id, request) → WorkspaceUpdateWebhookResponseModel
client.webhooks.delete(webhook_id) → void
client.webhooks.constructEvent(rawBody, sigHeader, secret) → Promise<any>
// Pronunciation Dictionaries
client.pronunciationDictionaries.createFromRules(request) → AddPronunciationDictionaryResponseModel
client.pronunciationDictionaries.createFromFile(request) → AddPronunciationDictionaryResponseModel
client.pronunciationDictionaries.list(request?) → GetPronunciationDictionariesMetadataResponseModel
// Models
client.models.list() → Model[]
// Tokens
client.tokens.singleUse.create(token_type) → SingleUseTokenResponseModel
// Service Accounts
client.serviceAccounts.list() → WorkspaceServiceAccountListResponseModel
```

All audio generation methods return a ReadableStream<Uint8Array> that can be iterated:

```typescript
const audio = await client.textToSpeech.convert(voiceId, { text: "Hello" });

for await (const chunk of audio) {
  // Process audio chunk (Uint8Array)
}
```
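Continuing the snippet above, one common pattern is to buffer the whole generation to disk in Node (a minimal sketch; output is typically MP3 unless you request another format):

```typescript
import { writeFile } from "node:fs/promises";

const chunks: Uint8Array[] = [];
for await (const chunk of audio) {
  chunks.push(chunk);
}

// Concatenate the chunks and write them out.
await writeFile("output.mp3", Buffer.concat(chunks));
```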
Error handling:

```typescript
import { ElevenLabs, ElevenLabsError, ElevenLabsTimeoutError } from "@elevenlabs/elevenlabs-js";

try {
  const audio = await client.textToSpeech.convert(voiceId, { text: "Hello" });
} catch (error) {
  if (error instanceof ElevenLabsTimeoutError) {
    console.error('Request timed out');
  } else if (error instanceof ElevenLabs.UnauthorizedError) {
    console.error('Invalid API key');
  } else if (error instanceof ElevenLabs.UnprocessableEntityError) {
    console.error('Validation failed:', error.body);
  } else if (error instanceof ElevenLabsError) {
    console.error(`API Error ${error.statusCode}:`, error.body);
  }
}
```

All methods accept optional request-specific configuration:

```typescript
// Define the AbortController referenced by abortSignal below.
const controller = new AbortController();

const audio = await client.textToSpeech.convert(
  voiceId,
  { text: "Hello" },
  {
    timeoutInSeconds: 300,
    maxRetries: 3,
    abortSignal: controller.signal,
    apiKey: "override-key",
  }
);
```

Access raw HTTP responses with withRawResponse():

```typescript
const { data, rawResponse } = await client.textToSpeech
  .convert(voiceId, request)
  .withRawResponse();

console.log('Status:', rawResponse.status);
console.log('Headers:', rawResponse.headers);
```

Real-time transcription (Node.js only):

```typescript
const connection = await client.speechToText.realtime.connect({
  apiKey: "your-api-key",
  format: AudioFormat.PCM_16000,
  strategy: CommitStrategy.VAD,
});

connection.on("transcript", (transcript) => {
  console.log(transcript.text);
});

// Send audio data
connection.send(audioBuffer);

// Close connection
connection.close();
```

The SDK uses a hybrid architecture combining auto-generated API clients with enhanced wrapper classes:
- client.music - automatic multipart response parsing for detailed metadata
- client.speechToText - real-time WebSocket transcription support
- client.webhooks - HMAC-SHA256 signature verification

Wrapper classes are transparent - they provide the same methods as the base clients plus additional features. See the individual capability documentation for wrapper-specific features.
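For example, the Music wrapper's composeDetailed() returns a parsed MultipartResponse rather than a bare stream (a sketch; the request field and the shape of MultipartResponse are assumptions, so check the music capability docs for the exact types):

```typescript
// Hypothetical composeDetailed() call; field names are assumptions.
const detailed = await client.music.composeDetailed({
  prompt: "a calm lo-fi beat for studying", // assumed request field
});

// Assumed shape: parsed SongMetadata alongside the raw audio payload.
console.log(detailed.json);
```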
All types are available under the ElevenLabs namespace:

```typescript
import { ElevenLabs } from "@elevenlabs/elevenlabs-js";

type TextRequest = ElevenLabs.BodyTextToSpeechFull;
type Voice = ElevenLabs.Voice;
type Model = ElevenLabs.Model;
```

For detailed type definitions, see Common Types.
Additional notes:

- Real-time transcription is Node.js only (client.speechToText.realtime)
- Audio playback utilities are Node.js only (play(), stream())
- Conversational AI session helpers (Conversation, ClientTools, AudioInterface) - See Beta SDK
- Pass previousText, nextText, previousRequestIds, nextRequestIds for better prosody across consecutive generations (see the sketch below)
- Use optimizeStreamingLatency (0-4) for TTS
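A request-stitching sketch (assuming previousText and nextText are accepted directly on the convert request body, as the note above suggests):

```typescript
// Generate the middle sentence with its surrounding context so the model
// keeps prosody consistent across consecutive generations.
const audio = await client.textToSpeech.convert("voice-id", {
  text: "This is the second sentence.",
  previousText: "This is the first sentence.",
  nextText: "This is the third sentence.",
});
```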