deepgram-go-text-to-speech

Use when writing or reviewing Go code in this repo that synthesizes audio with Speak v1 REST or Speak WebSockets. Route transcription work to deepgram-go-speech-to-text, voice conversation runtime work to deepgram-go-voice-agent, and repository maintenance work to deepgram-go-maintaining-sdk.

Using Deepgram Text-to-Speech from the Go SDK

Name: deepgram-go-text-to-speech
Rating: 68 (1 review)
Author: deepgram

When to use this product

Use this skill for pkg/client/speak work:

- file or stream synthesis over REST
- low-latency synthesis over WebSockets
- callback-based or channel-based audio playback pipelines

Use a different skill when:

- you need STT (deepgram-go-speech-to-text)
- you need live voice-agent orchestration (deepgram-go-voice-agent)
- you need repo workflow guidance (deepgram-go-maintaining-sdk)

Authentication

Set DEEPGRAM_API_KEY before creating Speak clients.

export DEEPGRAM_API_KEY="your_api_key"

Use the repo's env-backed client defaults instead of embedding secrets in code.
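
A preflight check along these lines can fail fast when the key is missing, before any Speak client is constructed. This is a minimal sketch using only the standard library: `requireAPIKey` and `errMissingKey` are hypothetical names, not SDK functions, and the lookup function is injected so the check can be exercised without touching the real environment.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// errMissingKey is returned when DEEPGRAM_API_KEY is unset or blank.
var errMissingKey = errors.New("DEEPGRAM_API_KEY is not set")

// requireAPIKey is a hypothetical helper (not part of the SDK). It returns
// an error when the variable is absent or contains only whitespace. The
// lookup argument has the same shape as os.LookupEnv.
func requireAPIKey(lookup func(string) (string, bool)) error {
	v, ok := lookup("DEEPGRAM_API_KEY")
	if !ok || strings.TrimSpace(v) == "" {
		return errMissingKey
	}
	return nil
}

func main() {
	if err := requireAPIKey(os.LookupEnv); err != nil {
		fmt.Println("preflight failed:", err)
		return
	}
	fmt.Println("preflight ok")
}
```

The check only confirms that a value is present; it does not validate the key against the API.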

Quick start

REST synthesis to file:

package main

import (
	"context"
	"log"

	api "github.com/deepgram/deepgram-go-sdk/v3/pkg/api/speak/v1/rest"
	speak "github.com/deepgram/deepgram-go-sdk/v3/pkg/client/speak"
	interfaces "github.com/deepgram/deepgram-go-sdk/v3/pkg/client/interfaces"
)

func main() {
	if err := run(); err != nil {
		log.Fatal(err)
	}
}

func run() error {
	ctx := context.Background()

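	// Build a REST client from the env-backed defaults (DEEPGRAM_API_KEY).
	client := speak.NewRESTWithDefaults()
	dg := api.New(client)

	// Synthesize the text and save the audio to disk. No Encoding or
	// Container is set, so the API's default output format applies; the
	// .mp3 extension assumes that default is MP3.
	if _, err := dg.ToSave(
		ctx,
		"hello.mp3",
		"Hello from the Deepgram Go SDK.",
		&interfaces.SpeakOptions{Model: "aura-2-thalia-en"},
	); err != nil {
		return err
	}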

	return nil
}

Streaming synthesis over WebSockets (this example uses the channel-based handler; a callback-based constructor is also available):

package main

import (
	"context"
	"fmt"
	"log"

	speakws "github.com/deepgram/deepgram-go-sdk/v3/pkg/api/speak/v1/websocket"
	speak "github.com/deepgram/deepgram-go-sdk/v3/pkg/client/speak"
	interfaces "github.com/deepgram/deepgram-go-sdk/v3/pkg/client/interfaces"
)

func main() {
	if err := run(); err != nil {
		log.Fatal(err)
	}
}

func run() error {
	ctx := context.Background()
	handler := speakws.NewDefaultChanHandler()

	conn, err := speak.NewWSUsingChanWithDefaults(
		ctx,
		&interfaces.WSSpeakOptions{Model: "aura-2-thalia-en"},
		handler,
	)
	if err != nil {
		return err
	}
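	// Stop closes the connection when run returns.
	defer conn.Stop()

	// Connect reports success as a bool rather than an error.
	if ok := conn.Connect(); !ok {
		return fmt.Errorf("connect failed")
	}

	conn.Start()

	// Queue text for synthesis.
	if err := conn.SpeakWithText("Streaming TTS from Go."); err != nil {
		return err
	}

	// Flush asks the server to synthesize the queued text. The handler
	// receives the resulting binary audio and flow-control events.
	if err := conn.Flush(); err != nil {
		return err
	}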

	return nil
}
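
The example above returns right after Flush, so the deferred Stop may run before any audio reaches the handler. One way to hold the connection open is to collect frames until a flushed notification arrives or a timeout elapses. The sketch below shows that pattern with the standard library only. The `audio` and `flushed` channels, `collectUntilFlushed`, and `errFlushTimeout` are hypothetical stand-ins, not SDK names; check the in-repo channel example for the channels the real handler exposes.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errFlushTimeout is returned when no flushed signal arrives in time.
var errFlushTimeout = errors.New("timed out waiting for flushed signal")

// collectUntilFlushed appends audio frames until a flushed signal arrives,
// the audio channel is closed, or the timeout elapses. Both channels are
// hypothetical stand-ins for whatever the handler in use provides.
func collectUntilFlushed(audio <-chan []byte, flushed <-chan struct{}, timeout time.Duration) ([]byte, error) {
	var out []byte
	timer := time.NewTimer(timeout)
	defer timer.Stop()

	for {
		select {
		case frame, ok := <-audio:
			if !ok {
				return out, nil // producer closed the stream
			}
			out = append(out, frame...)
		case <-flushed:
			// Pick up frames that were already queued when the signal arrived.
			for {
				select {
				case frame, ok := <-audio:
					if !ok {
						return out, nil
					}
					out = append(out, frame...)
				default:
					return out, nil
				}
			}
		case <-timer.C:
			return out, errFlushTimeout
		}
	}
}

func main() {
	audio := make(chan []byte)
	flushed := make(chan struct{})

	// Simulated producer standing in for the SDK handler.
	go func() {
		audio <- []byte{0x01, 0x02}
		audio <- []byte{0x03, 0x04}
		close(flushed)
	}()

	pcm, err := collectUntilFlushed(audio, flushed, 2*time.Second)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("collected %d bytes\n", len(pcm))
}
```

In a real program the collection step would sit between Flush and the return from run, so Stop only fires after the audio has been consumed.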

Key parameters

interfaces.SpeakOptions
- typical fields: Model, Encoding, Container, SampleRate
interfaces.WSSpeakOptions
- typical fields: Model, streaming audio format settings
REST methods
- via pkg/api/speak/v1/rest: api.New(client).ToStream, ToFile, ToSave
WS methods
- SpeakWithText, Speak, Flush, Reset
constructors
- speak.NewRESTWithDefaults() / speak.NewREST(...)
- speak.NewWSUsingCallback...
- speak.NewWSUsingChan...

API reference (layered)

In-repo reference
- README.md
- docs.go
- pkg/client/speak/client.go
- pkg/client/speak/v1/rest/client.go
- pkg/client/speak/v1/websocket/client_callback.go
- pkg/client/speak/v1/websocket/client_channel.go
- pkg/client/interfaces/v1/types-speak.go
OpenAPI
- https://developers.deepgram.com/openapi.yaml
AsyncAPI
- https://developers.deepgram.com/asyncapi.yaml
Context7
- /llmstxt/developers_deepgram_llms_txt
Product docs
- https://developers.deepgram.com/reference/text-to-speech/speak-request
- https://developers.deepgram.com/reference/text-to-speech/speak-streaming
- https://developers.deepgram.com/docs/tts-models

Gotchas

- REST and WebSocket Speak clients use different option structs.
- REST methods live on pkg/api/speak/v1/rest; build them with api.New(client).
- WebSocket flows use a handler passed at construction, Connect() returns bool, and shutdown is Stop().
- Keep the playback or message-consumer goroutine running while audio frames arrive.
- Follow the examples for file handling and output format selection instead of guessing encodings.
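
To illustrate why the output format matters for file handling: if a stream is configured to deliver raw 16-bit PCM with no container, the collected bytes are not a playable .wav file until a RIFF header is prepended. The sketch below builds the canonical 44-byte header with the standard library. `wavHeader` is a hypothetical helper, and the 24000 Hz mono values are assumed example settings; they must match whatever format was actually requested, which the repo examples are the authority on.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// wavHeader returns a canonical 44-byte RIFF/WAVE header for 16-bit
// little-endian PCM. It is a hypothetical helper, not part of the SDK.
func wavHeader(dataLen, sampleRate, channels int) [44]byte {
	const bitsPerSample = 16
	blockAlign := channels * bitsPerSample / 8

	var h [44]byte
	copy(h[0:4], "RIFF")
	binary.LittleEndian.PutUint32(h[4:8], uint32(36+dataLen)) // remaining file size
	copy(h[8:12], "WAVE")
	copy(h[12:16], "fmt ")
	binary.LittleEndian.PutUint32(h[16:20], 16) // fmt chunk size
	binary.LittleEndian.PutUint16(h[20:22], 1)  // audio format 1 = PCM
	binary.LittleEndian.PutUint16(h[22:24], uint16(channels))
	binary.LittleEndian.PutUint32(h[24:28], uint32(sampleRate))
	binary.LittleEndian.PutUint32(h[28:32], uint32(sampleRate*blockAlign)) // byte rate
	binary.LittleEndian.PutUint16(h[32:34], uint16(blockAlign))
	binary.LittleEndian.PutUint16(h[34:36], bitsPerSample)
	copy(h[36:40], "data")
	binary.LittleEndian.PutUint32(h[40:44], uint32(dataLen))
	return h
}

func main() {
	pcm := make([]byte, 48000) // one second of silence at 24000 Hz mono (assumed format)
	h := wavHeader(len(pcm), 24000, 1)

	// A playable file would be the header followed by the PCM bytes.
	wav := append(h[:], pcm...)
	fmt.Printf("header %d bytes, file %d bytes\n", len(h), len(wav))
}
```

When the REST path is used with a container set in the options, the service is expected to return a complete file and no such wrapping should be needed.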

Example files in this repo

- examples/text-to-speech/rest/file/hello-world/main.go
- examples/text-to-speech/websocket/simple_channel/main.go
- examples/text-to-speech/websocket/simple_callback/main.go

Central product skills

For cross-language Deepgram product knowledge — the consolidated API reference, documentation finder, focused runnable recipes, third-party integration examples, and MCP setup — install the central skills:

npx skills add deepgram/skills

This SDK ships language-idiomatic code skills; deepgram/skills ships cross-language product knowledge (see api, docs, recipes, examples, starters, setup-mcp).

Repository: deepgram/deepgram-go-sdk
Commit: b7c92f4

Last updated: 21 days ago
Created: 21 days ago
