Use when the user wants to design, size, audit, or choose a self-hosted speech recognition or streaming ASR stack, including Whisper, Parakeet, Canary, Riva, NIM, Triton ASR, faster-whisper, sherpa-onnx, voice-agent transcription, Romanian or Moldovan ASR, contact-center transcription, GPU sizing, latency budgets, multilingual routing, VAD, diarization, or production evaluation.
Architect, size, and audit self-hosted speech recognition and streaming ASR systems for production workloads.
This Tessl skill creates architecture blueprints for local or on-prem ASR systems. It covers language and model routing, streaming architecture, serving topology, multi-GPU scaling, hardware sizing, runtime and quantization choices, layered processing, audio I/O, evaluation plans, and rollout risk.
It is especially detailed for Romanian, Moldovan, and Romanian-English streaming ASR, where model language coverage, licensing, and latency tradeoffs are easy to get wrong.
Use this skill for prompts such as:
tessl install sharaf/speech-recognition-architect

| Layer | Where | Purpose |
|---|---|---|
| Entry point | SKILL.md | Required inputs, decision triage, output contract, guardrails |
| Model selection | references/model-selection.md | Inputs, model routing, licenses, RO/EN handling |
| Streaming architecture | references/streaming-architecture.md | Latency bands, architecture families, chunking, commit policy |
| Serving and scaling | references/serving-and-scaling.md | Triton, Riva, NIM, OSS stacks, stream affinity, autoscaling |
| Hardware sizing | references/hardware-sizing.md | GPU math, per-stream memory, cost and capacity formulas |
| Runtime and evaluation | references/runtime-processing-evaluation.md | Quantization, layered processing, audio I/O, metrics |
| Output and guardrails | references/blueprint-output-and-guardrails.md | Required blueprint structure and must-not-do rules |
| Full migrated source | references/streaming-asr-architect-source.md | Complete legacy skill text and source research index |
The entrypoint is intentionally short. Open the focused bundled references only when the task needs deeper tables or decision detail.
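As a rough illustration of that "open only what you need" layout, the sketch below maps a topic keyword to the matching bundled reference file from the table. The `REFERENCES` keywords and the `references_for` helper are hypothetical names chosen for this example; the skill itself does not necessarily select references this way.

```python
# Hypothetical lookup: keyword -> bundled reference file from the table above.
# The keywords are illustrative assumptions, not part of the skill.
REFERENCES = {
    "model": "references/model-selection.md",
    "streaming": "references/streaming-architecture.md",
    "serving": "references/serving-and-scaling.md",
    "hardware": "references/hardware-sizing.md",
    "runtime": "references/runtime-processing-evaluation.md",
    "output": "references/blueprint-output-and-guardrails.md",
}


def references_for(task: str) -> list[str]:
    """Return the reference paths whose keyword appears in the task text."""
    text = task.lower()
    return [path for key, path in REFERENCES.items() if key in text]


# A sizing question about a streaming deployment touches two references.
print(references_for("Hardware sizing for a streaming Whisper deployment"))
```

A task that matches no keyword returns an empty list, which corresponds to answering from the entry point alone.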
For full designs, the skill produces:
# ASR Stack Blueprint - <project name>
## Requirements Recap
## Model Selection
## Streaming Architecture
## Serving Topology
## Hardware Sizing
## Quantization & Runtime
## Layered Processing
## Audio I/O & Protocols
## Evaluation Plan
## Rollout & Risks
## Open Questions

Initial Sonnet eval on 3 generated ASR architecture scenarios:
| Metric | Result |
|---|---|
| Activation | 3/3 scenarios |
| Baseline average | 51% |
| With-context average | 100% |
| Uplift | 1.96x |
Scenario coverage:
| Scenario | Baseline | With context |
|---|---|---|
| Romanian contact-center real-time ASR blueprint | 32% | 100% |
| High-concurrency Whisper serving architecture | 63% | 100% |
| RO+EN code-switching with English translation | 58% | 100% |
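
The summary figures can be reproduced from the per-scenario scores. The short check below assumes that each average is a plain mean over the three scenarios and that uplift is the ratio of the with-context average to the baseline average; that reading is an assumption, though it matches the reported values.

```python
# Per-scenario scores (percent), copied from the scenario coverage table.
baseline = {
    "Romanian contact-center real-time ASR blueprint": 32,
    "High-concurrency Whisper serving architecture": 63,
    "RO+EN code-switching with English translation": 58,
}
with_context = {name: 100 for name in baseline}

# Assumption: averages are unweighted means across scenarios.
baseline_avg = sum(baseline.values()) / len(baseline)
with_context_avg = sum(with_context.values()) / len(with_context)

# Assumption: uplift is the ratio of the two averages.
uplift = with_context_avg / baseline_avg

print(f"Baseline average: {baseline_avg:.0f}%")
print(f"With-context average: {with_context_avg:.0f}%")
print(f"Uplift: {uplift:.2f}x")
```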