
Soniox Speech-to-Text

freemium

Soniox Speech-to-Text focuses on high-accuracy, real-time speech recognition and translation across more than 60 languages. It targets developers, product teams, and enterprises that need production-ready transcription, streaming, and any-to-any speech translation in a single API. Instead of stitching together separate models for recognition, diarization, and translation, Soniox provides one universal speech API plus a companion app, aiming for native-speaker fluency, strong accent handling, and code-switching support in real conversational audio.


Key Features

Universal Multilingual Model
Single API for speech recognition and any-to-any translation between 60+ languages, including mixed-language utterances and dialects.
Real-Time Token-Level Streaming
Returns token-level output within milliseconds, keeping captions, voicebots, and assistants tightly in sync with live speech.
Context and Domain Adaptation
Accepts hints such as domain, topic, custom vocabulary, and reference documents to improve recognition of medical, legal, financial, or branded terminology.
Conversation Intelligence Built In
Handles automatic language detection, speaker diarization, endpointing, timestamps, and confidence scores in a single unified stream.
Privacy and Compliance Controls
Offers regional data residency (US, EU, Japan), keeps audio in memory only by default, and is SOC 2 Type II, HIPAA, and GDPR compliant.
Soniox App Companion
iOS and Android app for live transcription, translation, summaries, and insights, powered by the same universal speech AI.

Pricing

Speech-to-Text API
Async (file)
  • $1.50 per 1M input audio tokens
  • $3.50 per 1M input text tokens
  • $3.50 per 1M output text tokens
  • Equivalent to about $0.10 per hour of transcription
Real-time (streaming)
  • $2.00 per 1M input audio tokens
  • $4.00 per 1M input text tokens
  • $4.00 per 1M output text tokens
  • Equivalent to about $0.12 per hour of transcription
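The per-token API rates can be turned into a rough bill with a few lines of arithmetic. The sketch below is only an illustration: `Rates`, `ASYNC`, `REALTIME`, and `estimate_cost` are hypothetical names, not part of any Soniox SDK, and the rates are copied from this listing and may be out of date. The listing does not say how many tokens an hour of audio produces, so the caller supplies the token counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """USD price per 1M tokens, as quoted in this listing (may be outdated)."""
    audio_in: float
    text_in: float
    text_out: float


# Hypothetical constants holding the two API price tiers from the listing.
ASYNC = Rates(audio_in=1.50, text_in=3.50, text_out=3.50)
REALTIME = Rates(audio_in=2.00, text_in=4.00, text_out=4.00)


def estimate_cost(rates: Rates, audio_tokens: int,
                  text_in_tokens: int = 0, text_out_tokens: int = 0) -> float:
    """Return the estimated USD cost for the given token counts."""
    total = (audio_tokens * rates.audio_in
             + text_in_tokens * rates.text_in
             + text_out_tokens * rates.text_out)
    return total / 1_000_000  # rates are quoted per 1M tokens
```

For example, `estimate_cost(REALTIME, audio_tokens=2_000_000, text_in_tokens=500_000, text_out_tokens=250_000)` returns `7.0`: $4.00 for audio input, $2.00 for text input, and $1.00 for text output.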
Free
  • $0.00 per month
  • Real-time transcription and translation in 60+ languages
  • Summaries and insights
  • Project organization
  • Online/offline recording
  • 10 free credits weekly
  • 100 bonus credits per referral
Pro
  • $19.99 per month
  • Unlimited transcription, translation, summaries, and insights
  • Priority processing
  • Early access to new features
Business
  • $25.00 per user per month (billed annually)
  • All Pro features plus multi-user team support
  • Centralized management
  • Shared projects
  • Team-wide access
  • Collaboration tools
  • Region selection
  • Discounts for additional members
  • Advanced admin controls

Disclaimer: Pricing information may not be up to date. For the most accurate and current pricing details, refer to the official Soniox Speech-to-Text website.

Categories: Summarizer, Translator, Transcriber
Stats

Rating: 4.7
API: Available
Last update: 12 days