
Soniox Speech-to-Text
Freemium
Soniox Speech-to-Text focuses on high-accuracy, real-time speech recognition and translation across more than 60 languages. It targets developers, product teams, and enterprises that need production-ready transcription, streaming, and any-to-any speech translation in a single API. Instead of stitching together separate models for recognition, diarization, and translation, Soniox provides one universal speech API plus a companion app, aiming for native-speaker fluency, strong accent handling, and code-switching support in real conversational audio.
Key Features
Pricing
- Async (file): $1.50 per 1M input audio tokens, $3.50 per 1M input text tokens, and $3.50 per 1M output text tokens.
- Real-time (streaming): $2.00 per 1M input audio tokens, $4.00 per 1M input text tokens, and $4.00 per 1M output text tokens.
- Equivalent to about $0.10 per hour for async and $0.12 per hour for real-time transcription.
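
To make the per-token rates easier to reason about, here is a minimal Python sketch that turns token counts into an estimated cost. The rates are the ones listed above; the names `Rates`, `ASYNC`, `REALTIME`, and `estimate_cost` are hypothetical helpers and not part of any Soniox SDK, and the token counts in the usage line are made-up inputs, since the listing does not say how many tokens a given amount of audio produces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """USD per 1M tokens, copied from the price list above."""
    input_audio: float
    input_text: float
    output_text: float


# Hypothetical constants holding the listed prices.
ASYNC = Rates(input_audio=1.50, input_text=3.50, output_text=3.50)
REALTIME = Rates(input_audio=2.00, input_text=4.00, output_text=4.00)


def estimate_cost(rates: Rates, audio_tokens: int,
                  input_text_tokens: int = 0,
                  output_text_tokens: int = 0) -> float:
    """Return the estimated cost in USD for the given token counts."""
    total = (audio_tokens * rates.input_audio
             + input_text_tokens * rates.input_text
             + output_text_tokens * rates.output_text)
    return total / 1_000_000  # rates are quoted per 1M tokens


# Illustrative token counts only (assumed, not measured).
print(f"${estimate_cost(ASYNC, 1_000_000, output_text_tokens=100_000):.2f}")
```

With the same token counts, swapping `ASYNC` for `REALTIME` shows the streaming premium directly. Actual token counts should be taken from the usage reported for your own requests.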
