CtrlK
BlogDocsLog inGet started
Tessl Logo

deepgram-prod-checklist

Execute Deepgram production deployment checklist. Use when preparing for production launch, auditing production readiness, or verifying deployment configurations. Trigger: "deepgram production", "deploy deepgram", "deepgram prod checklist", "deepgram go-live", "production ready deepgram".

80

Quality

77%

Does it follow best practices?

Impact

Pending

No eval scenarios have been run

SecuritybySnyk

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./plugins/saas-packs/deepgram-pack/skills/deepgram-prod-checklist/SKILL.md
SKILL.md
Quality
Evals
Security

Deepgram Production Checklist

Overview

Comprehensive go-live checklist for Deepgram integrations. Covers singleton client, health checks, Prometheus metrics, alert rules, error handling, and a phased go-live timeline.

Production Readiness Matrix

CategoryItemStatus
AuthProduction API key with scoped permissions[ ]
AuthKey stored in secret manager (not env file)[ ]
AuthKey rotation schedule (90-day) configured[ ]
AuthFallback key provisioned and tested[ ]
ResilienceRetry with exponential backoff on 429/5xx[ ]
ResilienceCircuit breaker for cascade failure prevention[ ]
ResilienceRequest timeout set (30s pre-recorded, 10s TTS)[ ]
ResilienceGraceful degradation when API unavailable[ ]
PerformanceSingleton client (not creating per-request)[ ]
PerformanceConcurrency limited (50-80% of plan limit)[ ]
PerformanceAudio preprocessed (16kHz mono for best results)[ ]
PerformanceLarge files use callback URL (async)[ ]
MonitoringHealth check endpoint testing Deepgram API[ ]
MonitoringPrometheus metrics: latency, error rate, usage[ ]
MonitoringAlerts: error rate >5%, latency >10s, circuit open[ ]
SecurityPII redaction enabled if handling sensitive audio[ ]
SecurityAudio URLs validated (HTTPS, no private IPs)[ ]
SecurityAudit logging on all operations[ ]

Instructions

Step 1: Production Singleton Client

import { createClient, DeepgramClient } from '@deepgram/sdk';

class ProductionDeepgram {
  private static client: DeepgramClient | null = null;

  static getClient(): DeepgramClient {
    if (!this.client) {
      const key = process.env.DEEPGRAM_API_KEY;
      if (!key) throw new Error('DEEPGRAM_API_KEY required for production');
      this.client = createClient(key);
    }
    return this.client;
  }

  // Force re-init (for key rotation)
  static reset() { this.client = null; }
}

Step 2: Health Check Endpoint

import express from 'express';
import { createClient } from '@deepgram/sdk';

const app = express();
const deepgram = createClient(process.env.DEEPGRAM_API_KEY!);

app.get('/health', async (req, res) => {
  const start = Date.now();
  try {
    // Test API connectivity by listing projects
    const { error } = await deepgram.manage.getProjects();
    const latency = Date.now() - start;

    if (error) {
      return res.status(503).json({
        status: 'unhealthy',
        deepgram: 'error',
        error: error.message,
        latency_ms: latency,
      });
    }

    res.json({
      status: 'healthy',
      deepgram: 'connected',
      latency_ms: latency,
      timestamp: new Date().toISOString(),
    });
  } catch (err: any) {
    res.status(503).json({
      status: 'unhealthy',
      deepgram: 'unreachable',
      error: err.message,
      latency_ms: Date.now() - start,
    });
  }
});

Step 3: Prometheus Metrics

import { Counter, Histogram, Gauge, Registry } from 'prom-client';

const registry = new Registry();

const transcriptionRequests = new Counter({
  name: 'deepgram_requests_total',
  help: 'Total Deepgram API requests',
  labelNames: ['method', 'model', 'status'],
  registers: [registry],
});

const transcriptionLatency = new Histogram({
  name: 'deepgram_latency_seconds',
  help: 'Deepgram API request latency',
  labelNames: ['method', 'model'],
  buckets: [0.5, 1, 2, 5, 10, 30],
  registers: [registry],
});

const audioProcessed = new Counter({
  name: 'deepgram_audio_seconds_total',
  help: 'Total audio seconds processed',
  labelNames: ['model'],
  registers: [registry],
});

const activeConnections = new Gauge({
  name: 'deepgram_active_connections',
  help: 'Active WebSocket connections',
  registers: [registry],
});

// Instrumented transcription
async function instrumentedTranscribe(url: string, model = 'nova-3') {
  const timer = transcriptionLatency.startTimer({ method: 'prerecorded', model });
  try {
    const { result, error } = await deepgram.listen.prerecorded.transcribeUrl(
      { url }, { model, smart_format: true }
    );
    timer();
    transcriptionRequests.inc({ method: 'prerecorded', model, status: error ? 'error' : 'ok' });
    if (result?.metadata?.duration) {
      audioProcessed.inc({ model }, result.metadata.duration);
    }
    if (error) throw error;
    return result;
  } catch (err) {
    timer();
    transcriptionRequests.inc({ method: 'prerecorded', model, status: 'error' });
    throw err;
  }
}

// Expose metrics endpoint
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', registry.contentType);
  res.send(await registry.metrics());
});

Step 4: Alert Rules (Prometheus/AlertManager)

groups:
  - name: deepgram
    rules:
      - alert: DeepgramHighErrorRate
        expr: rate(deepgram_requests_total{status="error"}[5m]) / rate(deepgram_requests_total[5m]) > 0.05
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Deepgram error rate > 5%"

      - alert: DeepgramHighLatency
        expr: histogram_quantile(0.95, rate(deepgram_latency_seconds_bucket[5m])) > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Deepgram P95 latency > 10s"

      - alert: DeepgramHealthCheckFailed
        expr: up{job="deepgram-service"} == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Deepgram health check failed for 2+ minutes"

Step 5: Error Handling Wrapper

async function safeTranscribe(url: string, options: Record<string, any> = {}) {
  const timeout = options.timeout ?? 30000;

  const controller = new AbortController();
  const timeoutId = setTimeout(() => controller.abort(), timeout);

  try {
    const result = await Promise.race([
      instrumentedTranscribe(url, options.model ?? 'nova-3'),
      new Promise((_, reject) =>
        setTimeout(() => reject(new Error('Transcription timeout')), timeout)
      ),
    ]);
    clearTimeout(timeoutId);
    return result;
  } catch (err: any) {
    clearTimeout(timeoutId);
    // Log structured error
    console.error(JSON.stringify({
      level: 'error',
      service: 'deepgram',
      message: err.message,
      url: url.substring(0, 100),
      timestamp: new Date().toISOString(),
    }));
    throw err;
  }
}

Step 6: Go-Live Timeline

PhaseWhenActions
D-71 week beforeLoad test at 2x expected volume, security review
D-33 days beforeSmoke test with production key, verify all alerts fire
D-1Day beforeConfirm on-call rotation, validate dashboards
D-0LaunchShadow mode (10% traffic), monitoring open
D+1Day afterReview error rate, latency, verify no anomalies
D+71 week afterFull traffic, tune alert thresholds based on baselines

Output

  • Singleton client with reset capability
  • Health check endpoint with latency reporting
  • Prometheus metrics (requests, latency, audio, connections)
  • AlertManager rules for error rate, latency, availability
  • Timeout-safe transcription wrapper
  • Phased go-live timeline

Error Handling

IssueCauseSolution
Health check 503API key expiredRotate key, check secret manager
Metrics not scrapedWrong port/pathVerify Prometheus target config
Alert stormsThresholds too tightAdd for: duration, tune values
Timeout on large filesSync mode too slowSwitch to callback URL pattern

Resources

  • Deepgram Production Guide
  • Prometheus Best Practices
  • Deepgram SLA
Repository
jeremylongshore/claude-code-plugins-plus-skills
Last updated
Created

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.