Audio Format

Automatic Audio-Format Detection

Soniox automatically detects the most common audio formats from their file headers, so you don’t need to set any audio configuration manually when using a supported file format.
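To illustrate what header-based detection means, here is a minimal sketch that guesses a container format from the first bytes of a file. It is not Soniox's implementation, which is not described here. The function name `sniff_container` and the table of magic numbers are assumptions chosen for illustration, and formats without a fixed signature at offset 0 (such as ADTS aac or mp3 without an ID3 tag) are deliberately left out.

```python
from typing import Optional

# Well-known magic numbers at byte offset 0 (illustrative, not exhaustive).
_MAGIC_PREFIXES = {
    b"fLaC": "flac",
    b"OggS": "ogg",
    b"#!AMR": "amr",
    b"ID3": "mp3",  # mp3 files that begin with an ID3v2 tag
    b"\x30\x26\xb2\x75": "asf",  # first four bytes of the ASF header GUID
}


def sniff_container(header: bytes) -> Optional[str]:
    """Guess a container format from the leading bytes of a file.

    Hypothetical helper: returns a format name, or None when the
    bytes carry no recognizable header (as with raw samples).
    """
    # RIFF and IFF containers store their form type at offset 8.
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    if header[:4] == b"FORM" and header[8:12] in (b"AIFF", b"AIFC"):
        return "aiff"
    for magic, name in _MAGIC_PREFIXES.items():
        if header.startswith(magic):
            return name
    return None


print(sniff_container(b"RIFF\x24\x00\x00\x00WAVEfmt "))  # wav
print(sniff_container(b"\x00\x01" * 8))                  # None
```

Raw samples have no such signature, which is why they need explicit configuration.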

Supported File Formats

  • mp3, wav, flac, ogg, aac, aiff, amr, and asf.

  • When using supported file formats, the audio_format, sample_rate_hertz and num_audio_channels TranscriptionConfig fields should not be set.
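The rule above can be sketched as a small helper that decides which keyword arguments to pass. The name `audio_config_kwargs` is hypothetical, and using the file extension is only a convenience for this sketch, since detection itself is based on file headers rather than file names.

```python
import os
from typing import Dict, Optional

# Extensions for the container formats listed above.
CONTAINER_EXTENSIONS = {
    ".mp3", ".wav", ".flac", ".ogg", ".aac", ".aiff", ".amr", ".asf",
}


def audio_config_kwargs(path: str, raw_config: Optional[Dict] = None) -> Dict:
    """Return the audio-related config fields to send for `path`.

    Hypothetical helper: container formats get no audio fields at all,
    while anything else is treated as raw audio and needs all three.
    """
    ext = os.path.splitext(path)[1].lower()
    if ext in CONTAINER_EXTENSIONS:
        # Leave audio_format, sample_rate_hertz and num_audio_channels unset.
        return {}
    if raw_config is None:
        raise ValueError(f"{path!r} looks like raw audio; a raw config is required")
    return dict(raw_config)


print(audio_config_kwargs("meeting.flac"))  # {}
```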

Raw Audio Samples

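You can send raw audio samples (PCM, mu-law, or A-law) instead of a container format. The supported raw formats are listed below. Each PCM format name encodes the sample type and byte order: for example, pcm_f32le means 32-bit floating-point samples in little-endian byte order, and pcm_s16be means 16-bit signed integer samples in big-endian byte order.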
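To make the naming concrete, the sketch below packs a single sample in each PCM layout using Python's standard `struct` module. The mapping `STRUCT_CODES` and the function `encode_sample` are illustrative names, not part of any Soniox API. mulaw and alaw are omitted because they are companded 8-bit encodings rather than plain integer or float layouts.

```python
import struct

# struct format codes for each PCM name: "<" little endian, ">" big endian,
# "h" 16-bit signed int, "i" 32-bit signed int, "f" 32-bit float.
STRUCT_CODES = {
    "pcm_s16le": "<h",
    "pcm_s16be": ">h",
    "pcm_s32le": "<i",
    "pcm_s32be": ">i",
    "pcm_f32le": "<f",
    "pcm_f32be": ">f",
}


def encode_sample(value, audio_format: str) -> bytes:
    """Encode one sample value in the given raw PCM layout."""
    return struct.pack(STRUCT_CODES[audio_format], value)


print(encode_sample(1000, "pcm_s16le").hex())  # e803
print(encode_sample(1000, "pcm_s16be").hex())  # 03e8
print(encode_sample(0.5, "pcm_f32le").hex())   # 0000003f
print(encode_sample(0.5, "pcm_f32be").hex())   # 3f000000
```

The same value produces different bytes in each layout, so a mismatch between the declared audio_format and the actual data turns the audio into noise.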

When using a raw format, the following TranscriptionConfig fields must be set.

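| Field              | Type   | Permitted Values                                                              |
|--------------------|--------|-------------------------------------------------------------------------------|
| audio_format       | string | pcm_f32le, pcm_f32be, pcm_s32le, pcm_s32be, pcm_s16le, pcm_s16be, mulaw, alaw |
| sample_rate_hertz  | int32  | 2000 to 96000 Hz                                                              |
| num_audio_channels | int32  | 1 to 8                                                                        |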
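As a sketch, the permitted values can be checked on the client side before a request is sent. The function `validate_raw_audio_config` is a hypothetical helper, and treating both range limits as inclusive is an assumption, since the table does not say so explicitly.

```python
RAW_AUDIO_FORMATS = {
    "pcm_f32le", "pcm_f32be", "pcm_s32le", "pcm_s32be",
    "pcm_s16le", "pcm_s16be", "mulaw", "alaw",
}


def validate_raw_audio_config(
    audio_format: str, sample_rate_hertz: int, num_audio_channels: int
) -> None:
    """Raise ValueError if any raw-audio field is outside the permitted values."""
    if audio_format not in RAW_AUDIO_FORMATS:
        raise ValueError(f"unsupported audio_format: {audio_format!r}")
    # Assumption: the documented ranges include both endpoints.
    if not 2000 <= sample_rate_hertz <= 96000:
        raise ValueError(f"sample_rate_hertz out of range: {sample_rate_hertz}")
    if not 1 <= num_audio_channels <= 8:
        raise ValueError(f"num_audio_channels out of range: {num_audio_channels}")


validate_raw_audio_config("pcm_s16le", 16000, 1)  # passes silently
```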

Example

This example shows how to transcribe a stream of raw audio encoded as 16-bit signed little-endian PCM (pcm_s16le) with a 16 kHz sample rate and 1 channel.

transcribe_any_stream_audio_format.py

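# iter_audio() yields chunks of raw audio bytes; client is an open Soniox speech client.
for result in transcribe_stream(
    iter_audio(),
    client,
    model="en_v2_lowlatency",
    include_nonfinal=True,
    # Raw samples carry no header, so the format must be stated explicitly.
    audio_format="pcm_s16le",
    sample_rate_hertz=16000,
    num_audio_channels=1,
):
    # Print the words recognized so far (assumes each result exposes `words` with a `text` field).
    print(" ".join(w.text for w in result.words))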
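The example relies on an `iter_audio()` generator that is not shown. A minimal sketch of one possibility follows, assuming the goal is simply to produce chunks of pcm_s16le mono audio at 16 kHz. It synthesizes a test tone instead of reading a microphone or file, and the parameters `duration_s`, `chunk_ms`, and `tone_hz` are hypothetical names introduced for illustration.

```python
import math
import struct
from typing import Iterator

SAMPLE_RATE_HERTZ = 16000
NUM_AUDIO_CHANNELS = 1
BYTES_PER_SAMPLE = 2  # pcm_s16le uses 16-bit (2-byte) samples


def iter_audio(
    duration_s: float = 1.0, chunk_ms: int = 100, tone_hz: float = 440.0
) -> Iterator[bytes]:
    """Yield chunks of a sine tone encoded as pcm_s16le, mono, 16 kHz."""
    total_samples = int(duration_s * SAMPLE_RATE_HERTZ)
    samples_per_chunk = SAMPLE_RATE_HERTZ * chunk_ms // 1000
    for start in range(0, total_samples, samples_per_chunk):
        end = min(start + samples_per_chunk, total_samples)
        # Scale to 30% of full range to stay well inside the int16 limits.
        values = [
            int(0.3 * 32767 * math.sin(2 * math.pi * tone_hz * n / SAMPLE_RATE_HERTZ))
            for n in range(start, end)
        ]
        # "<h" = little-endian signed 16-bit, repeated once per sample.
        yield struct.pack(f"<{len(values)}h", *values)


# Bytes per second = sample rate * bytes per sample * channels.
byte_rate = SAMPLE_RATE_HERTZ * BYTES_PER_SAMPLE * NUM_AUDIO_CHANNELS
chunks = list(iter_audio())
print(byte_rate, len(chunks), len(chunks[0]))  # 32000 10 3200
```

With these settings one second of audio is 32000 bytes, so each 100 ms chunk holds 3200 bytes. Real input would replace the synthesized tone with bytes read from a device or a headerless file.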