Web library
Use the Soniox Speech-to-Text Web Library to transcribe microphone audio in your web application.
Soniox provides a web library for transcribing audio in web applications. See the GitHub repository for source code, examples, and more.
Examples
Minimal example in a single HTML file.
Transcribe microphone audio in a JavaScript app using Vite.
Transcribe microphone audio in a Next.js app.
Quick start
Install the library by running:
If you prefer not to install the package, you can use unpkg to include it directly in your HTML file:
Initializing RecordTranscribe
To start transcribing audio from a microphone, create an instance of RecordTranscribe with an API key:
Exposing your API key in client-side code is not good practice. Instead, generate a temporary API key on your backend and use it to authenticate the WebSocket connection.
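A minimal sketch of the construction step, assuming the constructor takes an options object with an apiKey field. The createTranscriber helper is hypothetical; it receives the RecordTranscribe class as a parameter so the same code works whether the library was installed as a package or loaded from unpkg. Check the repository for the exact constructor options.

```javascript
// Hypothetical helper: builds a transcriber from a key issued by your backend.
// Assumption: the constructor accepts an options object with an `apiKey` field.
function createTranscriber(RecordTranscribe, temporaryApiKey) {
  if (typeof temporaryApiKey !== 'string' || temporaryApiKey.trim() === '') {
    throw new Error('Expected a non-empty temporary API key');
  }
  return new RecordTranscribe({ apiKey: temporaryApiKey });
}
```

Validating the key before construction surfaces a missing backend response early, rather than as a failed WebSocket connection later.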
Starting the transcription
To start transcribing microphone audio:
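As a rough sketch, starting comes down to one call to start() on the instance. The startTranscription helper is hypothetical, and the model option is an assumption about what start() accepts; the model name is left to the caller because this page does not specify one.

```javascript
// Hypothetical helper: starts microphone transcription on an existing instance.
// Assumption: start() takes an options object and `model` selects the
// speech-to-text model to use.
function startTranscription(transcriber, model) {
  if (!model) {
    throw new Error('A model name is required');
  }
  transcriber.start({ model });
}
```

In a page, this would typically run from a click handler, since browsers generally ask for microphone permission in response to a user action.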
Stopping the transcription
To stop the transcription, call stop() or cancel().
Calling stop() waits until all final results are received before stopping, while cancel() immediately stops transcription without waiting for final results.
We suggest using stop() when a user manually stops transcription and cancel() when an immediate stop is needed (e.g., on component unmount).
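The two shutdown paths can be wired up along these lines. The createShutdownHandlers helper and its handler names are hypothetical; only stop() and cancel() come from the library.

```javascript
// Hypothetical wiring for the two ways of ending a transcription session.
function createShutdownHandlers(transcriber) {
  return {
    // User pressed "Stop": let pending final results arrive before stopping.
    onUserStop: () => transcriber.stop(),
    // The view is going away: stop immediately and discard pending results.
    onUnmount: () => transcriber.cancel(),
  };
}
```

In a component-based framework, onUnmount would be called from the component's cleanup hook and onUserStop from the stop button's click handler.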
Buffering and temporary API keys
If you generate a temporary API key when the user clicks "Start Transcription," you may not receive the key immediately. The Soniox Speech-to-Text Web Library buffers recorded audio in memory until the WebSocket connection is established. This allows recording to start as soon as the user clicks the button without needing to generate a temporary API key in advance.
To achieve this, pass an ApiKeyGetter function to the RecordTranscribe constructor instead of a static API key string:
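A sketch of such a getter, assuming it is an async function that resolves to the key string. The makeApiKeyGetter factory, the backend endpoint, and the { "apiKey": "..." } response shape are all hypothetical; they describe your own backend, not anything defined by the library.

```javascript
// Hypothetical factory for an API key getter.
// Assumptions: your backend exposes `endpoint` and replies with JSON of the
// form { "apiKey": "..." }. `fetchImpl` is injectable to simplify testing.
function makeApiKeyGetter(endpoint, fetchImpl = globalThis.fetch) {
  return async function getApiKey() {
    const response = await fetchImpl(endpoint, { method: 'POST' });
    if (!response.ok) {
      throw new Error(`Temporary key request failed: ${response.status}`);
    }
    const body = await response.json();
    if (typeof body.apiKey !== 'string') {
      throw new Error('Backend response did not contain an apiKey string');
    }
    return body.apiKey;
  };
}
```

The returned function would then be passed where the static key string would otherwise go, so the key is requested only once transcription starts.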
Callbacks
You can provide callbacks to handle different events during transcription.
Callbacks can be passed to either the RecordTranscribe constructor or the start() method.
For a list of all available callbacks and their descriptions, see the documentation in the GitHub repository.
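As an illustration, a callback set that accumulates finalized text might look like the sketch below. The callback names (onPartialResult, onError, onFinished), their arguments, and the result shape (a tokens array of { text, is_final } objects) are assumptions here; confirm them against the repository documentation. The createCallbacks helper and the sink object are hypothetical.

```javascript
// Hypothetical callback set that separates finalized text from pending text.
// Assumptions: callback names and the `tokens` / `is_final` result shape.
function createCallbacks(sink) {
  let finalText = '';
  return {
    onPartialResult(result) {
      let pendingText = '';
      for (const token of result.tokens || []) {
        if (token.is_final) finalText += token.text;
        else pendingText += token.text;
      }
      sink.render(finalText, pendingText);
    },
    onError(status, message) {
      sink.fail(`${status}: ${message}`);
    },
    onFinished() {
      sink.done(finalText);
    },
  };
}
```

The returned object can be spread into the options passed to the constructor or to start().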
Custom audio streams
To transcribe audio from a custom source, you can pass a custom MediaStream to the stream option.
If you provide a custom MediaStream to the stream option, you are responsible for managing its lifecycle, including starting and stopping the stream. For instance, when using an HTML5 <audio> element (as shown below), you may want to pause playback when transcription is complete or an error occurs.
Example of transcribing audio from an HTML5 <audio> element:
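One way to obtain a MediaStream from an <audio> element is to route it through the Web Audio API, sketched below. Both helper functions are hypothetical, and the onFinished and onError callback names are assumptions; the stream option is the one described above. The AudioContext is passed in so the caller controls its lifecycle.

```javascript
// Routes an <audio> element through Web Audio to obtain a MediaStream.
function streamFromAudioElement(audioContext, audioElement) {
  const source = audioContext.createMediaElementSource(audioElement);
  const destination = audioContext.createMediaStreamDestination();
  source.connect(destination);              // feeds the stream to transcribe
  source.connect(audioContext.destination); // keeps playback audible
  return destination.stream;
}

// Hypothetical wiring: pause playback when transcription ends or fails.
// Assumption: onFinished and onError are valid callback names for start().
function transcribeAudioElement(transcriber, audioContext, audioElement, options = {}) {
  const stream = streamFromAudioElement(audioContext, audioElement);
  transcriber.start({
    ...options,
    stream,
    onFinished: () => audioElement.pause(),
    onError: () => audioElement.pause(),
  });
  return audioElement.play();
}
```

If the audio file is served from another origin, the element usually needs its crossOrigin attribute set (and the server must allow it); otherwise the browser may output silence on the captured stream.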
Complete example
Here's a complete example of transcribing microphone audio in a single HTML file.
You can view all examples in the GitHub repository.