Transcribe stream
In this example, we will transcribe a stream in bidirectional streaming mode with non-final words. We will simulate the stream by reading a file in small chunks. This serves as a demonstration of how to transcribe any stream of data, including real-time streams.
To transcribe any stream, you need to provide an iterable over successive audio chunks. In our example, we define a generator function iter_audio that reads audio chunks from a file. We start transcription by calling transcribe_stream(), which returns an iterable over transcription results. We iterate this to obtain the results as soon as they become available.
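The chunk generator described above can be sketched as follows. This is a minimal illustration rather than the full example: the path argument, the chunk_size parameter, and the 1024-byte default are assumptions. The call to transcribe_stream() appears only as a comment because its exact arguments are not given in this text.

```python
from typing import Iterator


def iter_audio(path: str, chunk_size: int = 1024) -> Iterator[bytes]:
    """Yield successive audio chunks read from a file.

    Reading a file in small pieces simulates a stream: the consumer only
    ever sees one chunk at a time, as it would with live audio.
    """
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:  # an empty read means end of file
                break
            yield chunk


# Hypothetical usage (arguments are assumptions, not the documented signature):
#
#     for result in transcribe_stream(iter_audio("audio.flac"), ...):
#         ...  # handle each result as soon as it becomes available
```

Because iter_audio is a generator, no chunk is read until the consumer asks for it. The same iterable interface therefore works for a source that produces audio in real time.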
To transcribe any stream, first start the stream transcription by calling speechClient.transcribeStream(), providing the transcription configuration and requisite user-defined callbacks. This returns an object representing the stream (stream). Then, call await stream.writeAsync(chunk) for successive audio chunks as they become available. At the end, call stream.end() to indicate the end of audio. Consecutive transcription results are returned by calling the onDataHandler callback. When the transcription has finished, the user-supplied onEndHandler callback is called. Any error will be indicated using the error argument of this callback.
To transcribe any stream, you need to provide an async generator (IAsyncEnumerable<byte[]>) over successive audio chunks. In our example, we define a generator function EnumerateAudioChunks that reads audio chunks from a file. We start transcription by calling TranscribeStream(), which returns an async iterable over transcription results (IAsyncEnumerable<Result>). We iterate this to obtain the results as soon as they become available.
Minimizing latency
When transcribing a real-time stream, the lowest latency is achieved with raw audio encoded as PCM 16-bit little-endian (pcm_s16le) at a 16 kHz sample rate. The example below shows how to transcribe such audio.
transcribe_any_stream_audio_format.py
transcribe_any_stream_audio_format.js
TranscribeAnyStreamAudioFormat.cs
Other supported PCM formats or configurations can also be used, at the cost of a small increase in latency.
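To make the pcm_s16le format concrete, the sketch below encodes samples as signed 16-bit little-endian integers and splits the resulting bytes into fixed-duration chunks. It is only an illustration under stated assumptions: mono audio, a 100 ms chunk duration, and all helper names are hypothetical. It does not show how the audio format is passed to the transcription call.

```python
import struct
from typing import Iterator, Sequence

SAMPLE_RATE_HZ = 16_000  # 16 kHz, as stated in the text
BYTES_PER_SAMPLE = 2     # 16-bit samples
NUM_CHANNELS = 1         # assumption: mono audio


def chunk_size_bytes(chunk_ms: int) -> int:
    """Bytes of pcm_s16le audio that cover chunk_ms milliseconds."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * NUM_CHANNELS * chunk_ms // 1000


def floats_to_pcm_s16le(samples: Sequence[float]) -> bytes:
    """Encode samples in [-1.0, 1.0] as signed 16-bit little-endian PCM."""
    ints = [max(-32768, min(32767, round(s * 32767))) for s in samples]
    # "<" selects little-endian byte order, "h" a signed 16-bit integer.
    return struct.pack(f"<{len(ints)}h", *ints)


def iter_pcm_chunks(pcm: bytes, chunk_ms: int = 100) -> Iterator[bytes]:
    """Yield consecutive chunks of raw PCM, each covering chunk_ms of audio."""
    size = chunk_size_bytes(chunk_ms)
    for start in range(0, len(pcm), size):
        yield pcm[start:start + size]
```

With these assumptions, one second of audio is 32,000 bytes and a 100 ms chunk is 3,200 bytes. Smaller chunks are sent sooner after the audio is captured, but they produce more writes per second.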