New Soniox SDKs

At Soniox, we're not just building a speech API; we're building the future of voice-powered applications. We believe integrating advanced speech AI should be effortless and intuitive, which is why we're excited to introduce our new Python, Node, Web, React, and React Native SDKs. They make it faster and simpler to build powerful voice experiences, and the React SDK works with Next.js, so you can add real-time speech capabilities to modern web applications.

From API to SDK

We know that working with raw API calls, while powerful, can be complex and time-consuming. That's why we designed our SDKs to abstract away that complexity behind a developer-friendly interface that simplifies every aspect of integration. Think of it as the Stripe or Supabase experience for voice: a commitment to developer excellence that lets you focus on innovation, not implementation.

With our new SDKs, integrating Soniox's industry-leading speech and voice capabilities is as simple as importing a package and calling a few intuitive methods. In just a few lines of code, you can unlock the full potential of our platform.

Let's dive into some of the most-used features.

Async transcription

Need to transcribe a single file, buffer, or URL? Our transcribe() method handles it all. This single, asynchronous call manages the entire lifecycle: upload, transcription, waiting, fetching the transcript, and even cleanup. No more juggling multiple requests or managing complex state; you get a clean, efficient path from audio to accurate text.
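To make that lifecycle concrete, here is a minimal sketch of the steps a single call like transcribe() has to sequence. It does not use the Soniox SDK: FakeSpeechBackend, its methods, and the poll_interval and timeout parameters are all hypothetical stand-ins invented for this illustration, and the sketch is synchronous for simplicity.

```python
import itertools
import time


class FakeSpeechBackend:
    """In-memory stand-in for a speech service (hypothetical, NOT the Soniox API)."""

    def __init__(self, polls_until_done=2):
        self._ids = itertools.count(1)
        self._polls_until_done = polls_until_done
        self.files = {}
        self.jobs = {}

    def upload(self, audio: bytes) -> str:
        file_id = f"file-{next(self._ids)}"
        self.files[file_id] = audio
        return file_id

    def create_job(self, file_id: str) -> str:
        job_id = f"job-{next(self._ids)}"
        self.jobs[job_id] = {"file_id": file_id, "polls": 0}
        return job_id

    def status(self, job_id: str) -> str:
        # Pretend the job completes after a fixed number of status checks.
        job = self.jobs[job_id]
        job["polls"] += 1
        return "completed" if job["polls"] >= self._polls_until_done else "processing"

    def transcript(self, job_id: str) -> str:
        size = len(self.files[self.jobs[job_id]["file_id"]])
        return f"<transcript of {size} bytes>"

    def delete(self, file_id, job_id) -> None:
        self.files.pop(file_id, None)
        self.jobs.pop(job_id, None)


def transcribe(backend, audio: bytes, poll_interval: float = 0.0, timeout: float = 5.0) -> str:
    """Upload, start a job, wait for it, fetch the text, and always clean up."""
    file_id = backend.upload(audio)
    job_id = None
    try:
        job_id = backend.create_job(file_id)
        deadline = time.monotonic() + timeout
        while backend.status(job_id) != "completed":
            if time.monotonic() > deadline:
                raise TimeoutError(f"{job_id} did not finish within {timeout}s")
            time.sleep(poll_interval)
        return backend.transcript(job_id)
    finally:
        # Cleanup runs on success, timeout, or any other error.
        backend.delete(file_id, job_id)
```

The try/finally is the part that is easy to forget when wiring the requests together by hand: the uploaded file and the job are removed whether the call succeeds, times out, or fails.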

Real-time transcription

For real-time applications, our event-driven session simplifies WebSocket handling dramatically. Methods like connect(), sendAudio(), pause()/resume(), finalize(), and finish() give you granular control over each interaction, and built-in keepAlive support keeps your sessions active and reliable.
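As a rough mental model of such a session, the sketch below implements a toy state machine with the same method names, written in Python's snake_case. It is not the Soniox SDK and opens no WebSocket: the class name, the event names, the "frames" it records, and the behavior of each method (for example, dropping audio while paused) are assumptions made purely for illustration.

```python
class RealtimeSessionSketch:
    """Toy event-driven session (hypothetical). Outgoing frames are recorded
    in a list instead of being written to a WebSocket."""

    def __init__(self):
        self.state = "idle"
        self.sent = []        # frames that would go over the wire
        self._handlers = {}   # event name -> list of callbacks

    def on(self, event, handler):
        self._handlers.setdefault(event, []).append(handler)

    def _emit(self, event):
        for handler in self._handlers.get(event, []):
            handler()

    def _require(self, *allowed):
        if self.state not in allowed:
            raise RuntimeError(f"operation not valid in state {self.state!r}")

    def connect(self):
        self._require("idle")
        self.state = "streaming"
        self._emit("connected")

    def send_audio(self, chunk: bytes):
        if self.state == "paused":
            return  # assumption: audio sent while paused is dropped
        self._require("streaming")
        self.sent.append(("audio", chunk))

    def pause(self):
        self._require("streaming")
        self.state = "paused"

    def resume(self):
        self._require("paused")
        self.state = "streaming"

    def keep_alive(self):
        # Assumption: a timer would call this periodically while no audio flows.
        self._require("streaming", "paused")
        self.sent.append(("keepalive", None))

    def finalize(self):
        # Assumption: asks the server to finalize the text recognized so far.
        self._require("streaming")
        self.sent.append(("finalize", None))

    def finish(self):
        self._require("streaming", "paused")
        self.sent.append(("finish", None))
        self.state = "finished"
        self._emit("finished")
```

A typical flow under these assumptions is connect(), a series of send_audio() calls with optional pause()/resume() in between, then finalize() and finish(). Calling a method in the wrong state raises immediately instead of failing silently on the wire.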

Utilities

Beyond raw transcription, our SDKs offer valuable utilities to enrich your voice applications. Easily segment transcripts into utterances by speaker, detect language switches, and identify endpoints for cleaner, more readable output. These helpers make it easier to present transcripts to your users in a clear, readable form.
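To illustrate the kind of work these helpers take off your hands, here is a small, self-contained sketch that groups a flat token stream into utterances, starting a new one whenever the speaker or the language changes. The token shape (dicts with text, speaker, and language keys) and the names Utterance and group_into_utterances are assumptions for this example, not the SDK's actual types; endpoint detection is left out.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str
    language: str
    text: str


def group_into_utterances(tokens):
    """Merge consecutive tokens that share a speaker and a language."""
    utterances = []
    for token in tokens:
        last = utterances[-1] if utterances else None
        if (
            last is not None
            and last.speaker == token["speaker"]
            and last.language == token["language"]
        ):
            last.text += token["text"]
        else:
            # Speaker change or language switch: start a new utterance.
            utterances.append(
                Utterance(token["speaker"], token["language"], token["text"])
            )
    for utterance in utterances:
        utterance.text = utterance.text.strip()
    return utterances


tokens = [
    {"text": "Hello", "speaker": "1", "language": "en"},
    {"text": " there", "speaker": "1", "language": "en"},
    {"text": " Hola", "speaker": "1", "language": "es"},
    {"text": " Hi", "speaker": "2", "language": "en"},
]
for utterance in group_into_utterances(tokens):
    print(f"[speaker {utterance.speaker}, {utterance.language}] {utterance.text}")
```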

Unified error handling

We understand that robust error handling is crucial for any production-ready application. Our SDKs provide unified error handling across all languages, so errors surface in a consistent, predictable way whichever SDK you use. You can build and deploy with confidence, knowing that failures will be reported clearly and can be handled in one place.
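One common way to get this kind of consistency is a single base exception with a few specific subclasses, so callers can catch everything in one place or react to particular failures. The sketch below shows that pattern only; every class name, the status-code mapping, and the retry rule are hypothetical and are not taken from the Soniox SDKs.

```python
class SpeechSDKError(Exception):
    """Base class for every error the (hypothetical) SDK raises."""

    def __init__(self, message, *, status=None):
        super().__init__(message)
        self.status = status


class AuthError(SpeechSDKError):
    """Missing or invalid credentials."""


class RateLimitError(SpeechSDKError):
    """Too many requests in a given window."""


class ConnectionLostError(SpeechSDKError):
    """The network connection dropped mid-request or mid-session."""


def error_from_status(status: int, message: str) -> SpeechSDKError:
    """Map an HTTP-style status code to the most specific error class."""
    mapping = {401: AuthError, 403: AuthError, 429: RateLimitError}
    error_class = mapping.get(status, SpeechSDKError)
    return error_class(message, status=status)


def is_retryable(error: SpeechSDKError) -> bool:
    """Assumed policy: retry rate limits, dropped connections, and 5xx responses."""
    if isinstance(error, (RateLimitError, ConnectionLostError)):
        return True
    return error.status is not None and error.status >= 500


try:
    raise error_from_status(429, "slow down")
except SpeechSDKError as error:
    # One except clause covers every SDK failure; details stay available.
    print(type(error).__name__, error.status, is_retryable(error))
```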

Beautiful documentation

To complement our exceptional SDKs, we've crafted comprehensive and beautifully designed documentation for each language. From quickstart guides that will have you running your first real-time session or transcribing a file in minutes, to a full SDK reference, we've got you covered every step of the way.

Committed to being the most developer-friendly AI platform for speech and voice

Our goal at Soniox is clear: to be the best AI platform for speech and voice, and the most developer-friendly ecosystem for building voice-powered applications. We are committed to providing the tools and resources that empower you to innovate, experiment, and, ultimately, create groundbreaking voice experiences that captivate and delight your users.

Whether you're building with Python, Node, Web, React, or React Native, our new SDKs are your gateway to unlocking the full potential of voice AI. We invite you to explore our documentation, dive into the code, and discover how Soniox can transform your development workflow.

As you explore what’s possible with Soniox, please share your feedback and join our Discord community.