Web library
How to use the Soniox Speech-to-Text Web Library to transcribe microphone audio in your web application.
Transcribe audio directly in your web application
Transcribing audio in a web application is a common use case — whether you're building live captioning, searchable audio interfaces, or voice-powered tools. To make this easy, Soniox provides a lightweight Web SDK that allows you to stream audio from the browser and receive real-time transcriptions with minimal setup.
The Soniox Web Library handles:
- Capturing audio from the user's microphone
- Streaming it to the Soniox WebSocket API
- Receiving and displaying transcription results in real time
- Optional features such as speaker diarization
The library is framework-agnostic and works with plain JavaScript, as well as modern frontend frameworks like React or Vue.
Installation
Install via your preferred package manager:
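For example, with npm (the package name shown is an assumption; use the exact name from the Soniox README):

```bash
# Package name assumed to be @soniox/speech-to-text-web; verify it in the README.
npm install @soniox/speech-to-text-web
```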
Or use the module directly from a CDN:
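A minimal sketch of loading the library as an ES module in the browser; the CDN host and package name here are assumptions, so substitute the URL given in the Soniox README:

```html
<script type="module">
  // CDN URL and package name are assumptions; replace with the URL from the README.
  import { RecordTranscribe } from "https://esm.sh/@soniox/speech-to-text-web";
</script>
```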
Starting the transcription
To transcribe microphone audio, create an instance of the RecordTranscribe class and call the start() method.
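A minimal sketch, assuming RecordTranscribe is a named export, that the API key is accepted by the constructor, and that the remaining options go to start(); the callback signatures and result shapes are illustrative:

```js
import { RecordTranscribe } from "@soniox/speech-to-text-web";

// Create the transcriber with a static API key
// (or an async function that returns a temporary key).
const recordTranscribe = new RecordTranscribe({
  apiKey: "<SONIOX_API_KEY>",
});

// Start capturing microphone audio and streaming it to the Soniox WebSocket API.
recordTranscribe.start({
  model: "stt-rt-preview",
  onPartialResult: (result) => {
    // The exact result shape depends on the library version; inspect it while prototyping.
    console.log("partial result:", result);
  },
  onError: (error) => {
    console.error("transcription error:", error);
  },
});
```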
Parameters
apiKey
Required · string | function
A static SONIOX_API_KEY string, or an async function that returns a temporary API key.

model
Required · string
The transcription model to use. Example: "stt-rt-preview". Use the GET /models endpoint to retrieve a list of available models.

languageHints
Optional · Array<string>
Hints to guide transcription toward specific languages. See supported languages for the list of available ISO language codes.

context
Optional · string
Domain-specific terms or phrases to improve recognition accuracy. Max length: 10,000 characters.

enableSpeakerDiarization
Optional · boolean
Enables automatic speaker separation.

onStarted
Optional · function
Called on transcription start.

onFinished
Optional · function
Called on transcription finish.

onPartialResult
Optional · function
Called when partial results are received.

onFinalResult
Optional · function
Called when a final result is received.

onError
Optional · function
Called when an error occurs.

stream
Optional · MediaStream
Provide a custom audio stream source.
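The sketch below combines several of these options. The option names come from the list above; where each option is passed (constructor vs. start()) and the callback signatures are assumptions:

```js
import { RecordTranscribe } from "@soniox/speech-to-text-web";

const recordTranscribe = new RecordTranscribe({
  apiKey: "<SONIOX_API_KEY>",
});

recordTranscribe.start({
  model: "stt-rt-preview",
  // Bias recognition toward the languages you expect.
  languageHints: ["en", "es"],
  // Domain-specific vocabulary (up to 10,000 characters).
  context: "Soniox, speaker diarization, WebSocket API",
  // Separate speakers automatically.
  enableSpeakerDiarization: true,
  onStarted: () => console.log("transcription started"),
  onPartialResult: (result) => console.log("partial:", result),
  onFinalResult: (result) => console.log("final:", result),
  onFinished: () => console.log("transcription finished"),
  onError: (error) => console.error(error),
});
```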
Stopping the transcription
Use stop() for graceful exits and cancel() for abrupt stops, e.g. on component unmount.
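For example, wiring both to a page that reuses the recordTranscribe instance from the earlier examples (the button id and unload handling are illustrative):

```js
// Graceful exit: finish the current transcription cleanly.
document.getElementById("stop-button").addEventListener("click", () => {
  recordTranscribe.stop();
});

// Abrupt stop, e.g. when the page or component is being torn down.
window.addEventListener("beforeunload", () => {
  recordTranscribe.cancel();
});
```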
Using temporary API keys
You can defer API key generation until after the user initiates transcription:
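A sketch, assuming a hypothetical backend endpoint /api/get-temporary-api-key that mints a short-lived key (the endpoint path and response shape are assumptions):

```js
import { RecordTranscribe } from "@soniox/speech-to-text-web";

const recordTranscribe = new RecordTranscribe({
  // Only invoked once the user starts transcription,
  // so no key is generated or exposed ahead of time.
  apiKey: async () => {
    // Hypothetical backend endpoint returning { "apiKey": "..." }.
    const response = await fetch("/api/get-temporary-api-key", { method: "POST" });
    const { apiKey } = await response.json();
    return apiKey;
  },
});

recordTranscribe.start({ model: "stt-rt-preview" });
```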
Audio is buffered during WebSocket connection setup, so no audio is lost.
Event callbacks
Callbacks can be passed to either the constructor or the start method:
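For example, both placements are shown below; which one you use is a matter of preference:

```js
import { RecordTranscribe } from "@soniox/speech-to-text-web";

// Callbacks passed to the constructor...
const recordTranscribe = new RecordTranscribe({
  apiKey: "<SONIOX_API_KEY>",
  onError: (error) => console.error(error),
});

// ...or to start(), alongside the other options.
recordTranscribe.start({
  model: "stt-rt-preview",
  onPartialResult: (result) => console.log("partial:", result),
  onFinalResult: (result) => console.log("final:", result),
});
```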
View the full list of supported callbacks in the GitHub README.
Transcribing custom audio streams
To transcribe audio from sources like an <audio> or <video> element:
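A sketch using HTMLMediaElement.captureStream() to feed an <audio> element's output to the library via the stream option (the element id is illustrative, and captureStream() is not available in every browser):

```js
import { RecordTranscribe } from "@soniox/speech-to-text-web";

const recordTranscribe = new RecordTranscribe({ apiKey: "<SONIOX_API_KEY>" });

const audioElement = document.getElementById("my-audio");

// Capture the element's output as a MediaStream.
const stream = audioElement.captureStream();

recordTranscribe.start({
  model: "stt-rt-preview",
  // Transcribe this stream instead of the microphone.
  stream,
});

audioElement.play();
```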
You are responsible for managing the audio stream lifecycle.