Transcribe stream
In this example, we will transcribe a stream in bidirectional streaming mode. We will simulate the stream by reading a file in small chunks. This demonstrates how to transcribe any stream of data, including real-time streams.
To transcribe any stream, you need to provide an iterable over successive audio chunks. In our example, we define a generator function iter_audio that reads audio chunks from a file.
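As an illustration of the idea, here is a minimal sketch of such a generator. The signature (a binary stream plus a chunk_size parameter) and the default chunk size are assumptions made for this sketch and may differ from the iter_audio used in the original example. An in-memory buffer stands in for the audio file so the snippet runs on its own.

```python
import io
from typing import BinaryIO, Iterator


def iter_audio(stream: BinaryIO, chunk_size: int = 1024) -> Iterator[bytes]:
    """Yield successive chunks of at most chunk_size bytes until the stream ends.

    The parameters are assumptions for this sketch, not the original signature.
    """
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # An empty read means the stream is exhausted.
            break
        yield chunk


# Simulate a 2500-byte audio file with an in-memory buffer.
fake_audio = io.BytesIO(b"\x00" * 2500)
sizes = [len(chunk) for chunk in iter_audio(fake_audio, chunk_size=1024)]
print(sizes)  # [1024, 1024, 452]
```

With a real file, the same generator could be driven by a file object opened in binary mode, for example open(path, "rb").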
We start transcription by calling transcribe_stream(), which returns an iterable over transcription results. We iterate over it to obtain the results as soon as they become available.
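The consumption pattern can be sketched without the real service. The function fake_transcribe_stream below is a hypothetical stand-in, not the library's transcribe_stream(); it only mimics the shape described above (an iterable of audio chunks in, an iterable of results out) so that the interleaving of reads and results is visible.

```python
from typing import Iterable, Iterator, List

events: List[str] = []


def audio_chunks() -> Iterator[bytes]:
    """Produce three dummy 100-byte chunks, logging each read."""
    for i in range(3):
        events.append(f"read {i}")
        yield bytes(100)


def fake_transcribe_stream(chunks: Iterable[bytes]) -> Iterator[str]:
    """Hypothetical stand-in for transcribe_stream(): one result per chunk."""
    total = 0
    for chunk in chunks:
        total += len(chunk)
        yield f"received {total} bytes"


results = []
for i, result in enumerate(fake_transcribe_stream(audio_chunks())):
    # Each result is handled before the next chunk is read.
    events.append(f"result {i}")
    results.append(result)

print(results)
print(events)
```

Because both sides are lazy iterables, the event log alternates between reads and results instead of listing all reads first. A real service may return results at a different cadence than one per chunk.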
Minimizing latency
When transcribing a real-time stream, the lowest latency is achieved with raw audio encoded as PCM 16-bit little-endian (pcm_s16le) at a 16 kHz sample rate. The example below shows how to transcribe such audio.
transcribe_any_stream_audio_format.py
It is possible to use other PCM formats or configurations, as listed here, at the cost of a small increase in latency.
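For the pcm_s16le format at 16 kHz, the relationship between chunk duration and chunk size is simple arithmetic: each sample occupies 2 bytes. The sketch below works this out and shows how integer samples map to little-endian bytes. The mono channel count and the helper names chunk_size_bytes and encode_pcm_s16le are assumptions introduced for illustration only.

```python
import struct
from typing import Sequence

SAMPLE_RATE_HZ = 16_000   # 16 kHz, as recommended above
BYTES_PER_SAMPLE = 2      # 16-bit samples
NUM_CHANNELS = 1          # assumption: mono audio


def chunk_size_bytes(chunk_ms: int) -> int:
    """Number of bytes needed to hold chunk_ms milliseconds of audio."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * NUM_CHANNELS * chunk_ms // 1000


def encode_pcm_s16le(samples: Sequence[int]) -> bytes:
    """Pack signed 16-bit integer samples as little-endian bytes."""
    return struct.pack(f"<{len(samples)}h", *samples)


print(chunk_size_bytes(100))            # 3200
print(encode_pcm_s16le([1, -1]).hex())  # 0100ffff
```

Smaller chunks mean audio is sent sooner after it is captured, at the price of more frequent sends; the right chunk duration depends on the application.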