November 30, 2022

Soniox Speech-to-Text Benchmarks November 2022

by Soniox Team

Soniox has conducted an extensive evaluation of the word recognition accuracy of different speech-to-text providers in the industry. The benchmarks are summarized as follows:

  • Providers evaluated: Soniox, Google, AWS, Azure, Rev AI, Deepgram, AssemblyAI, OpenAI and Speechmatics.
  • Processing modes evaluated: asynchronous transcription (file) and streaming transcription.
  • Evaluation datasets: 4 different real-world datasets varying in acoustic conditions, speaking styles, accents and topics in the English language.
  • Ground truth transcriptions were produced and double-reviewed by humans, then normalized to ensure a fair evaluation across different providers.
  • Results:
    • Overall, Soniox achieved the most accurate speech recognition results in both async and streaming modes across all 4 datasets, followed by Azure and Speechmatics. The lowest overall performance was obtained by Deepgram, AssemblyAI and Google. In the middle of the pack are Rev AI and AWS.
    • In streaming mode, Soniox leads the other providers by a wider margin than in async mode. AssemblyAI and Deepgram had the lowest performance in streaming mode.
  • The benchmarks were conducted with a high level of professionalism. Hundreds of hours of human time were invested to develop a benchmarking framework that tries to fairly evaluate the accuracy of different speech-to-text providers.
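
To illustrate how normalized ground truth and word recognition accuracy fit together, here is a minimal sketch in Python. It assumes accuracy is scored as word error rate (WER), which the summary above does not state explicitly. The normalization rules (lowercasing, stripping punctuation) and the function names `normalize`, `word_errors` and `wer` are hypothetical simplifications, not the actual Soniox benchmarking framework.

```python
import re


def normalize(text: str) -> list[str]:
    """Hypothetical normalization: lowercase, drop punctuation, split into words."""
    text = text.lower()
    # Keep letters, digits, apostrophes and spaces; replace everything else with a space.
    text = re.sub(r"[^a-z0-9' ]+", " ", text)
    return text.split()


def word_errors(ref: list[str], hyp: list[str]) -> int:
    """Word-level edit distance (substitutions + deletions + insertions)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,                           # reference word missing from hypothesis
                cur[j - 1] + 1,                        # extra word in hypothesis
                prev[j - 1] + (ref_word != hyp_word),  # match or substitution
            ))
        prev = cur
    return prev[-1]


def wer(pairs: list[tuple[str, str]]) -> float:
    """Assumed metric: total word errors divided by total reference words."""
    errors = 0
    words = 0
    for reference, hypothesis in pairs:
        ref_words = normalize(reference)
        errors += word_errors(ref_words, normalize(hypothesis))
        words += len(ref_words)
    if words == 0:
        raise ValueError("reference transcripts contain no words")
    return errors / words
```

Applying the same normalization to both the reference and each provider's output means that differences in casing or punctuation do not count as recognition errors, so `wer([("Hello, world!", "hello world")])` scores as a perfect match.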