Audio Format

Automatic Audio-Format Detection

Soniox automatically detects the most common audio formats from file headers, so you don't need to set any audio configuration fields when you use a supported file format.

Supported File Formats

  • mp3, wav, flac, ogg, aac, aiff, amr, and asf.

  • When using supported file formats, the audio_format, sample_rate_hertz and num_audio_channels fields should not be set in the TranscriptionConfig object.
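As a rough illustration of the rule above, the following sketch decides on the client side whether the audio fields should be passed at all. The helper name audio_config_kwargs and the idea of choosing by file extension are assumptions made for this example and are not part of the Soniox API; the service itself detects the format from the file header.

```python
from pathlib import Path

# Extensions of the container formats listed above.
CONTAINER_EXTENSIONS = {".mp3", ".wav", ".flac", ".ogg", ".aac", ".aiff", ".amr", ".asf"}


def audio_config_kwargs(path, raw_pcm_fields):
    """Return the audio fields to pass for `path` (hypothetical helper).

    For a supported container format, return no fields so that format
    detection is left to the service. Otherwise treat the file as raw
    samples and return a copy of the caller-supplied PCM fields.
    """
    if Path(path).suffix.lower() in CONTAINER_EXTENSIONS:
        return {}
    return dict(raw_pcm_fields)
```

With this helper, a call for "talk.mp3" yields an empty dictionary, while a call for a raw file passes the PCM fields through unchanged.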

Raw Audio Samples

It is possible to send raw audio samples instead of a container format.

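The supported PCM formats are: pcm_f32le, pcm_f32be, pcm_s32le, pcm_s32be, pcm_s16le, pcm_s16be. For example, pcm_f32le means 32-bit floating-point samples in little-endian byte order, and pcm_s16le means signed 16-bit integer samples in little-endian byte order.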
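To make the naming scheme concrete, here is a minimal sketch that decodes a PCM format name and packs samples accordingly, using Python's standard struct module. The function names describe_pcm_format and encode_samples are hypothetical and exist only for this example; they are not part of any Soniox library.

```python
import struct

# Maps the sample-type part of the name to (struct type code, bytes per sample).
_SAMPLE_TYPES = {"f32": ("f", 4), "s32": ("i", 4), "s16": ("h", 2)}


def describe_pcm_format(name):
    """Split a name such as 'pcm_s16le' into a struct format and a sample size."""
    prefix, body = name.split("_", 1)            # 'pcm', 's16le'
    sample_type, endian = body[:-2], body[-2:]   # 's16', 'le'
    if prefix != "pcm" or sample_type not in _SAMPLE_TYPES or endian not in ("le", "be"):
        raise ValueError(f"unsupported PCM format: {name}")
    code, size = _SAMPLE_TYPES[sample_type]
    byte_order = "<" if endian == "le" else ">"  # little- or big-endian
    return byte_order + code, size


def encode_samples(samples, name):
    """Pack a sequence of numeric samples into raw bytes for the given format."""
    fmt, _ = describe_pcm_format(name)
    return struct.pack(f"{fmt[0]}{len(samples)}{fmt[1]}", *samples)
```

For instance, encode_samples([0, 1, -1], "pcm_s16le") produces the six bytes 00 00 01 00 ff ff, two bytes per sample with the least significant byte first.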

When using a PCM format, the following fields must be set:

Field               Type    Permitted Values
audio_format        string  pcm_f32le, pcm_f32be, pcm_s32le, pcm_s32be, pcm_s16le, pcm_s16be
sample_rate_hertz   int32   2000 to 96000 Hz
num_audio_channels  int32   1 to 8
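The permitted values above can be checked before a request is sent. The sketch below is one possible client-side check; validate_pcm_config is a hypothetical helper written for this example, and the service may apply its own validation independently.

```python
PCM_FORMATS = {
    "pcm_f32le", "pcm_f32be",
    "pcm_s32le", "pcm_s32be",
    "pcm_s16le", "pcm_s16be",
}


def validate_pcm_config(audio_format, sample_rate_hertz, num_audio_channels):
    """Return the three raw-audio fields as a dict, or raise ValueError."""
    if audio_format not in PCM_FORMATS:
        raise ValueError(f"audio_format must be one of {sorted(PCM_FORMATS)}")
    if not 2000 <= sample_rate_hertz <= 96000:
        raise ValueError("sample_rate_hertz must be between 2000 and 96000")
    if not 1 <= num_audio_channels <= 8:
        raise ValueError("num_audio_channels must be between 1 and 8")
    return {
        "audio_format": audio_format,
        "sample_rate_hertz": sample_rate_hertz,
        "num_audio_channels": num_audio_channels,
    }
```

The returned dictionary can then be unpacked as keyword arguments into a transcription call, assuming the call accepts these field names as shown in the example on this page.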

Example

This example shows how to transcribe raw audio encoded as 16-bit little-endian PCM (pcm_s16le) at a 16 kHz sample rate with one channel.

transcribe_any_stream_audio_format.py

for result in transcribe_stream(
        iter_audio(),
        client,
        audio_format="pcm_s16le",
        sample_rate_hertz=16000,
        num_audio_channels=1):
    # Handle each transcription result as it arrives.
    print(result)
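The Python snippet calls iter_audio() without showing it. A minimal sketch of such a generator is given below, on the assumption that the stream call accepts an iterable of raw byte chunks; the name iter_audio_chunks, the chunk size of 1024 bytes, and the in-memory buffer standing in for a raw file are all choices made for this illustration.

```python
import io
from typing import BinaryIO, Iterator


def iter_audio_chunks(stream: BinaryIO, chunk_size: int = 1024) -> Iterator[bytes]:
    """Yield successive chunks of raw bytes until the stream is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Stand-in for a raw PCM file opened in binary mode: one second of
# silence at 16 kHz, 16-bit, mono (16000 samples * 2 bytes each).
silence = io.BytesIO(bytes(16000 * 2))
chunk_sizes = [len(chunk) for chunk in iter_audio_chunks(silence)]
```

In practice the stream would typically be a file opened with mode "rb" or a microphone capture buffer, and the final chunk may be shorter than chunk_size.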

transcribe_any_stream_audio_format.js

const stream = speechClient.transcribeStream(
    { 
        audio_format: "pcm_s16le",
        sample_rate_hertz: 16000,
        num_audio_channels: 1,
        include_nonfinal: true
    },
    onDataHandler,
    onEndHandler
);