The official TypeScript library for the OpenAI API
The OpenAI SDK provides Node.js-specific helper functions for playing and recording audio. These utilities use ffmpeg and ffplay to handle audio streams, making it easy to work with audio from the OpenAI API.
Platform Support: Node.js only - these helpers are not available in browser environments.
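In code that may also run in a browser, it can help to gate on the runtime before touching these helpers. A minimal sketch of such a guard — `isNode` is a hypothetical convenience function, not part of the SDK:

```typescript
// Hypothetical guard: true only when running under Node.js
function isNode(): boolean {
  return typeof process !== 'undefined' && Boolean(process.versions?.node);
}

console.log(isNode()); // prints "true" under Node.js
```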
Both helpers are exported from the `openai/helpers/audio` module (requires ffmpeg and ffplay to be installed):

```typescript
import { playAudio, recordAudio } from 'openai/helpers/audio';
```

To use these helpers, you need to have ffmpeg and ffplay installed on your system:
macOS (via Homebrew):

```shell
brew install ffmpeg
```

Ubuntu/Debian:

```shell
sudo apt-get install ffmpeg
```

Windows: Download from https://ffmpeg.org/download.html
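Once ffmpeg is installed, it can also enumerate capture devices, which is useful later when passing a `device` index to `recordAudio`. These are standard ffmpeg/alsa-utils invocations, shown as a sketch since device names and indices vary per machine:

```shell
# macOS (avfoundation provider): list audio and video capture devices
ffmpeg -f avfoundation -list_devices true -i ""

# Windows (dshow provider): list DirectShow capture devices
ffmpeg -list_devices true -f dshow -i dummy

# Linux (alsa provider): list ALSA capture devices
arecord -l
```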
Plays audio from a stream, Response object, or File using ffplay. This is useful for immediately playing audio generated by the text-to-speech API.
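Because all three accepted input shapes are ultimately byte sources, in-memory audio can be adapted as well; for example, raw bytes can be wrapped in a standard Node `Readable` before being handed to `playAudio`. This adapter is our own sketch, not an SDK API:

```typescript
import { Readable } from 'node:stream';

// Wrap raw audio bytes in a Readable stream; Readable satisfies the
// NodeJS.ReadableStream input type that playAudio accepts. Note that
// Readable.from() emits a Buffer as a single chunk rather than
// iterating it byte by byte.
function bufferToStream(buffer: Buffer): Readable {
  return Readable.from(buffer);
}
```

You could then call `await playAudio(bufferToStream(myAudioBytes))`.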
```typescript
/**
 * Plays audio from a stream, Response, or File using ffplay
 * @param input - Audio source (ReadableStream, fetch Response, or File)
 * @returns Promise that resolves when playback completes
 * @throws Error if not running in Node.js or if ffplay fails
 */
function playAudio(
  input: NodeJS.ReadableStream | Response | File
): Promise<void>;
```

Usage with Text-to-Speech:
```typescript
import OpenAI from 'openai';
import { playAudio } from 'openai/helpers/audio';

const client = new OpenAI();

// Generate speech and play it immediately
const response = await client.audio.speech.create({
  model: 'tts-1',
  voice: 'alloy',
  input: 'Hello! This is a test of the text-to-speech API.',
});

// Play the audio
await playAudio(response);
console.log('Playback complete');
```

Usage with Streaming:
```typescript
import { playAudio } from 'openai/helpers/audio';
import fs from 'fs';

// Play from a file stream
const audioStream = fs.createReadStream('./audio.mp3');
await playAudio(audioStream);
```

Usage with File Object:
```typescript
import { playAudio } from 'openai/helpers/audio';
import fs from 'fs';

// Play from a File object (audioBuffer holds raw audio bytes,
// e.g. read from disk or received over the network)
const audioBuffer = fs.readFileSync('./speech.mp3');
const audioFile = new File([audioBuffer], 'speech.mp3', { type: 'audio/mpeg' });
await playAudio(audioFile);
```

Error Handling:
```typescript
try {
  await playAudio(audioResponse);
} catch (error) {
  console.error('Playback error:', error);
  // ffplay may not be installed or audio format is unsupported
}
```

Records audio from the system's default audio input device using ffmpeg. Returns a WAV `File` that can be passed to the transcription or translation APIs.
```typescript
/**
 * Records audio from the system's default input device
 * @param options - Recording options
 * @param options.signal - AbortSignal to cancel recording early
 * @param options.device - Device index (default: 0)
 * @param options.timeout - Maximum recording duration in milliseconds
 * @returns Promise resolving to a File with recorded audio
 * @throws Error if not running in Node.js or if ffmpeg fails
 */
function recordAudio(options?: {
  signal?: AbortSignal;
  device?: number;
  timeout?: number;
}): Promise<File>;
```

Basic Recording with Timeout:
```typescript
import OpenAI from 'openai';
import { recordAudio } from 'openai/helpers/audio';

const client = new OpenAI();

// Record for 5 seconds
const audioFile = await recordAudio({ timeout: 5000 });

// Transcribe the recording
const transcription = await client.audio.transcriptions.create({
  file: audioFile,
  model: 'whisper-1',
});
console.log('Transcription:', transcription.text);
```

Recording with Manual Abort:
```typescript
import { recordAudio } from 'openai/helpers/audio';

// Create an abort controller
const controller = new AbortController();

// Start recording
const recordingPromise = recordAudio({ signal: controller.signal });

// Stop recording after user input
setTimeout(() => {
  controller.abort();
  console.log('Recording stopped');
}, 10000);

const audioFile = await recordingPromise;
```

Recording from Specific Device:
```typescript
// Record from device index 1 instead of the default (0)
const audioFile = await recordAudio({
  device: 1,
  timeout: 5000,
});
```

Complete Example - Record and Transcribe:
```typescript
import OpenAI from 'openai';
import { recordAudio } from 'openai/helpers/audio';

const client = new OpenAI();

async function recordAndTranscribe() {
  console.log('Recording... Speak now!');

  // Record for 10 seconds
  const audioFile = await recordAudio({ timeout: 10000 });
  console.log('Recording complete. Transcribing...');

  // Transcribe the audio
  const transcription = await client.audio.transcriptions.create({
    file: audioFile,
    model: 'whisper-1',
    language: 'en', // optional
  });

  console.log('You said:', transcription.text);
  return transcription.text;
}

recordAndTranscribe();
```

Recordings are captured in WAV format, with settings optimized for OpenAI's Whisper API.
The recordAudio function uses a different ffmpeg input provider depending on the operating system:

- macOS: `avfoundation`
- Windows: `dshow` (DirectShow)
- Linux: `alsa` (Advanced Linux Sound Architecture)

```typescript
interface RecordAudioOptions {
  /**
   * AbortSignal to stop recording before timeout
   * Call controller.abort() to stop recording early
   */
  signal?: AbortSignal;

  /**
   * Audio input device index
   * @default 0 (system default device)
   */
  device?: number;

  /**
   * Maximum recording duration in milliseconds
   * Recording stops automatically after this duration
   * If not specified, recording continues until manually aborted
   */
  timeout?: number;
}
```

Missing ffmpeg/ffplay:
```typescript
try {
  await playAudio(audioResponse);
} catch (error) {
  console.error('Error:', error.message);
  // "ffplay process exited with code 1"
  // Ensure ffmpeg is installed: brew install ffmpeg
}
```

Browser Environment:
```typescript
import { playAudio } from 'openai/helpers/audio';

try {
  await playAudio(audioResponse);
} catch (error) {
  console.error(error.message);
  // "Play audio is not supported in the browser yet.
  //  Check out https://npm.im/wavtools as an alternative."
}
```

Recording Errors:
```typescript
try {
  const audio = await recordAudio({ device: 99 });
} catch (error) {
  console.error('Recording error:', error);
  // May indicate invalid device index or permission issues
}
```

You can verify that ffmpeg is installed and on your PATH with:

```shell
ffmpeg -version
```

Complete Example - Voice Conversation:

```typescript
import OpenAI from 'openai';
import { recordAudio, playAudio } from 'openai/helpers/audio';

const client = new OpenAI();

async function voiceConversation() {
  // 1. Record user input
  console.log('Listening... (5 seconds)');
  const userAudio = await recordAudio({ timeout: 5000 });

  // 2. Transcribe to text
  console.log('Transcribing...');
  const transcription = await client.audio.transcriptions.create({
    file: userAudio,
    model: 'whisper-1',
  });
  console.log('You said:', transcription.text);

  // 3. Generate a response with chat
  const completion = await client.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: transcription.text }],
  });
  const responseText = completion.choices[0].message.content ?? '';
  console.log('AI response:', responseText);

  // 4. Convert the response to speech
  console.log('Generating speech...');
  const speech = await client.audio.speech.create({
    model: 'tts-1',
    voice: 'alloy',
    input: responseText,
  });

  // 5. Play the response
  console.log('Playing response...');
  await playAudio(speech);
  console.log('Conversation complete!');
}

voiceConversation();
```
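One pattern worth noting for the recording options above: a manual stop control and a hard deadline can be merged into a single signal with `AbortSignal.any` (Node 20+), rather than choosing between `signal` and `timeout`. A sketch using only standard APIs:

```typescript
// Merge a manual controller with a 30-second deadline; whichever
// fires first aborts the combined signal.
const controller = new AbortController();
const combined = AbortSignal.any([controller.signal, AbortSignal.timeout(30_000)]);

// `combined` can be passed as the `signal` option to recordAudio().
controller.abort(); // e.g. triggered by a keypress
console.log(combined.aborted); // prints "true"
```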