March 6, 2023

Soniox and OpenAI Speech Recognition Benchmarks March 2023

by Soniox Team

We conducted an extensive evaluation of the word recognition accuracy of the Soniox and OpenAI Whisper speech recognition systems. The benchmarks are summarized as follows:

  • Evaluation datasets: 5 real-world English-language datasets varying in acoustic conditions, speaking styles, accents, and topics.
  • Ground truth: reference transcriptions were produced and double-reviewed by humans, then normalized to ensure a fair evaluation across different providers.
  • Results:
    • Soniox achieved the most accurate speech recognition results across all 5 datasets.
    • Soniox was 32.61% more accurate than Whisper on average, meaning that almost every third word incorrectly recognized by Whisper was correctly recognized by Soniox.
    • Whisper sometimes had a high insertion rate, recognizing (hallucinating) words that were not spoken in the audio. It sometimes also had a high deletion rate, failing to recognize words that were clearly spoken in the audio.
  • The benchmarks were conducted with a high level of professionalism. We invested significant engineering resources in developing a benchmarking framework that aims to evaluate the accuracy of different speech recognition providers fairly.
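
The summary above refers to insertion and deletion rates and to a relative accuracy gain, but does not spell out how they were computed. A common convention, assumed here, is word error rate (WER): substitutions, deletions, and insertions from a minimum edit-distance alignment, divided by the number of reference words. The Python sketch below illustrates that convention and how a relative error reduction between two systems could be derived from it. The names `ErrorCounts`, `count_errors`, and `relative_error_reduction` are hypothetical, the inputs are assumed to be already normalized, and this is not the benchmarking framework used for the results above.

```python
from dataclasses import dataclass


@dataclass
class ErrorCounts:
    substitutions: int
    deletions: int
    insertions: int
    ref_words: int

    @property
    def wer(self) -> float:
        # Assumes a non-empty reference transcript.
        errors = self.substitutions + self.deletions + self.insertions
        return errors / self.ref_words


def count_errors(reference: str, hypothesis: str) -> ErrorCounts:
    """Align two whitespace-tokenized transcripts by minimum edit distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # cost[i][j] = (total, subs, dels, ins) for ref[:i] versus hyp[:j]
    cost = [[(0, 0, 0, 0)] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        cost[i][0] = (i, 0, i, 0)  # every reference word deleted
    for j in range(1, len(hyp) + 1):
        cost[0][j] = (j, 0, 0, j)  # every hypothesis word inserted
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            t, s, d, n = cost[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (t, s, d, n)          # match, no new error
            else:
                diagonal = (t + 1, s + 1, d, n)  # substitution
            t, s, d, n = cost[i - 1][j]
            deletion = (t + 1, s, d + 1, n)      # reference word missing
            t, s, d, n = cost[i][j - 1]
            insertion = (t + 1, s, d, n + 1)     # extra hypothesis word
            # Tuples compare by total errors first, so the total is minimal.
            cost[i][j] = min(diagonal, deletion, insertion)
    _, s, d, n = cost[len(ref)][len(hyp)]
    return ErrorCounts(s, d, n, len(ref))


def relative_error_reduction(baseline_wer: float, system_wer: float) -> float:
    """Fraction of the baseline's errors that the other system avoids."""
    return (baseline_wer - system_wer) / baseline_wer


if __name__ == "__main__":
    # Made-up transcripts, for illustration only.
    counts = count_errors("the cat sat on the mat", "the cat sat on mat")
    print(counts, counts.wer)
```

Under this reading, a relative error reduction of about one third corresponds to "almost every third word incorrectly recognized" by the baseline being recognized correctly by the other system. When several alignments have the same minimum cost, the split between substitutions, deletions, and insertions can differ, so per-type rates depend on the tie-breaking rule.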