Speech Recognition Benchmarks for German and Spanish, March 2023

March 23, 2023 by Soniox Team

We conducted an extensive evaluation of the word recognition accuracy of speech recognition providers across the industry. The benchmarks are summarized as follows:

  • Providers evaluated: Soniox, Google, AWS, Azure, Rev AI, Deepgram, AssemblyAI, OpenAI Whisper, Speechmatics and NVIDIA Riva.
  • Languages evaluated: German and Spanish.
  • Processing modes evaluated: asynchronous transcription (file) and streaming transcription.
  • Evaluation datasets: real-world datasets varying in acoustic conditions, speaking styles, accents and topics.
  • Ground truth: transcriptions were produced and double-reviewed by humans, then normalized to ensure a fair evaluation across providers.
  • Results:
    • Overall, Soniox achieved the most accurate speech recognition results in both async and streaming modes across all German and Spanish datasets. Second place went to Speechmatics (German) and Azure (Spanish).
    • Soniox achieved 23% higher accuracy on Spanish and 27% higher accuracy on German than the second-place provider in each language; that is, about every fourth word misrecognized by Azure (Spanish) or Speechmatics (German) was correctly recognized by Soniox.
    • Google and AWS had the lowest overall performance. The remaining providers fell in the middle of the pack.
  • The benchmarks were conducted with a high level of professionalism. We invested significant engineering resources in developing a benchmarking framework designed to evaluate the accuracy of different speech recognition providers fairly.
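
To make the summary above more concrete, here is a minimal sketch of how word recognition accuracy is commonly scored and how a statement like "every fourth misrecognized word was recognized correctly" can be read as a relative reduction in errors. It is an illustration under stated assumptions, not the benchmarking framework itself: the post does not spell out the metric or the normalization rules, so the sketch assumes word error rate (WER) as the metric, and the function names (`normalize`, `word_errors`, `word_error_rate`, `relative_error_reduction`) and the normalization steps (lowercasing and punctuation stripping) are hypothetical.

```python
import re
import unicodedata


def normalize(text: str) -> list[str]:
    """Hypothetical normalization: lowercase, drop punctuation, split into words."""
    text = unicodedata.normalize("NFC", text).lower()
    # \w is Unicode-aware in Python 3, so letters such as ä, ß, ñ and í are kept.
    text = re.sub(r"[^\w\s]", " ", text)
    return text.split()


def word_errors(reference: list[str], hypothesis: list[str]) -> int:
    """Word-level edit distance: substitutions + deletions + insertions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # reference word deleted
                current[j - 1] + 1,      # extra word inserted
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word errors / number of reference words (assumed metric)."""
    ref_words = normalize(reference)
    hyp_words = normalize(hypothesis)
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return word_errors(ref_words, hyp_words) / len(ref_words)


def relative_error_reduction(baseline_wer: float, improved_wer: float) -> float:
    """Fraction of the baseline's errors that the improved system avoids."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - improved_wer) / baseline_wer
```

With purely illustrative numbers (not taken from the benchmark), a runner-up at 8% WER and a leader at 6% WER would give `relative_error_reduction(0.08, 0.06)` of about 0.25, meaning roughly one in four of the runner-up's errors is avoided. Normalizing both the reference and the hypothesis before scoring keeps differences in casing or punctuation from being counted as recognition errors.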