Transcribe Files

In this example, we will transcribe a long audio file (longer than 60 seconds) using the asynchronous API. For shorter audio files (shorter than 60 seconds), refer to the Transcribe Short Audio guide, which uses the synchronous API for simplicity.

transcribe_file_async.py

1. Upload File

First, we upload the file with the transcribe_file_async() function. When the request succeeds, the function returns the automatically assigned file_id.

from soniox.speech_service import SpeechClient
from soniox.transcribe_file import transcribe_file_async


# Do not forget to set your API key in the SONIOX_API_KEY environment variable.
with SpeechClient() as client:
    file_id = transcribe_file_async(
        "../test_data/test_audio_long.flac",
        client,
        model="en_v2",  # Do not forget to specify the model!
        reference_name="test",
    )

You can use the reference_name to associate your internal file ID with the uploaded file. It can be any string of at most 256 characters, including the empty string, and duplicates are allowed. The service does not use this field; it is only for your reference.

2. Get Status

After the file has been uploaded, it is in one of four states: QUEUED, TRANSCRIBING, COMPLETED, or FAILED. We call the GetTranscribeAsyncStatus() function to check the status of the file.

status = client.GetTranscribeAsyncStatus(file_id)

If the file status is FAILED, the error message can be obtained from the error_message field.

print(f"Transcription failed with error: {status.error_message}")
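The status check is usually repeated until the file reaches COMPLETED or FAILED, as the output of the complete example shows. The following is a minimal sketch of such a polling loop, not the library's own implementation. It assumes the object returned by GetTranscribeAsyncStatus() exposes the state as a string in a `status` attribute. The helper name wait_for_transcription and its poll_interval, timeout, and sleep parameters are hypothetical.

```python
import time


def wait_for_transcription(get_status, file_id, poll_interval=2.0,
                           timeout=600.0, sleep=time.sleep):
    """Poll get_status(file_id) until the file is COMPLETED or FAILED.

    get_status is any callable that returns an object with a `status`
    string attribute (an assumption), e.g. client.GetTranscribeAsyncStatus.
    """
    waited = 0.0  # Approximate: counts sleep time only, not request latency.
    while True:
        status = get_status(file_id)
        # COMPLETED and FAILED are the two terminal states.
        if status.status in ("COMPLETED", "FAILED"):
            return status
        if waited >= timeout:
            raise TimeoutError(
                f"File {file_id} still {status.status} after {waited:.0f} seconds"
            )
        # QUEUED or TRANSCRIBING: wait before asking again.
        sleep(poll_interval)
        waited += poll_interval
```

With a real client, this could be called as wait_for_transcription(client.GetTranscribeAsyncStatus, file_id). The returned status is then either COMPLETED or FAILED, and the caller decides how to handle each case.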

3. Get Result

Once the file status is COMPLETED, we call the GetTranscribeAsyncResult() function to retrieve the transcription result.

result = client.GetTranscribeAsyncResult(file_id)
print("Text: " + "".join(word.text for word in result.words))

Note that when transcribing with separate recognition per channel, the GetTranscribeAsyncResult() function returns a list of results, one for each channel.
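Code that has to work with and without separate recognition per channel can normalize the two return shapes before joining the words. This is a small sketch under the assumption that the per-channel return value is a Python list or tuple whose items each have a `words` attribute. The helper name transcript_texts is hypothetical.

```python
def transcript_texts(result):
    """Return one transcript string per channel.

    Accepts either a single result or a list of per-channel results
    (assumed to be a Python list or tuple).
    """
    results = result if isinstance(result, (list, tuple)) else [result]
    # Join the word texts of each result, as in the single-result example.
    return ["".join(word.text for word in r.words) for r in results]
```

For a single result the returned list has one element, so transcript_texts(result)[0] matches the text printed in the example above.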

4. Delete File

Once the transcription result has been obtained, the file should be deleted using the DeleteTranscribeAsyncFile() function.

client.DeleteTranscribeAsyncFile(file_id)
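Because files in any status count toward the limit on stored files, it may also be worth deleting files that end in FAILED. The sketch below shows one possible way to combine result retrieval and deletion. It assumes the status object has `status` and `error_message` attributes. The helper name collect_and_delete is hypothetical, and deleting failed files is a design choice of this sketch rather than a documented requirement.

```python
def collect_and_delete(client, file_id, status):
    """Return the result of a COMPLETED file and delete the file.

    A FAILED file is deleted as well, then reported as an error.
    """
    if status.status == "FAILED":
        # Nothing to retrieve; remove the file so it stops counting as stored.
        client.DeleteTranscribeAsyncFile(file_id)
        raise RuntimeError(
            f"Transcription failed with error: {status.error_message}"
        )
    if status.status != "COMPLETED":
        raise ValueError(f"File {file_id} is not finished: {status.status}")
    result = client.GetTranscribeAsyncResult(file_id)
    # Delete only after the result was retrieved successfully, so a failed
    # retrieval can still be retried.
    client.DeleteTranscribeAsyncFile(file_id)
    return result
```

The client argument only needs the GetTranscribeAsyncResult() and DeleteTranscribeAsyncFile() methods used in the steps above.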

Run

Run the complete example transcribe_file_async.py.

python3 transcribe_file_async.py

Output

Uploading file.
File ID: 3457
Calling GetTranscribeAsyncStatus.
Status: QUEUED
Calling GetTranscribeAsyncStatus.
Status: TRANSCRIBING
Calling GetTranscribeAsyncStatus.
Status: TRANSCRIBING
Calling GetTranscribeAsyncStatus.
Status: COMPLETED
Calling GetTranscribeAsyncResult
Text: But there is always a stronger sense of life when the sun is ...
Calling DeleteTranscribeAsyncFile.

Limits

The maximum file size is 500 MB. The maximum total duration of audio is 5 hours.

The maximum number of uploaded files pending transcription (in QUEUED or TRANSCRIBING status) is 100. The maximum number of uploaded non-deleted files (in any status) is 2000.
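A client can check the file size and file counts locally before uploading. The following is a rough sketch only: the function name check_upload_limits is hypothetical, the caller is assumed to track its own pending and stored file counts, and treating 1 MB as 10**6 bytes is an assumption. The duration limit is not checked because that would require decoding the audio.

```python
import os

# Limits stated above. Treating 1 MB as 10**6 bytes is an assumption.
MAX_FILE_SIZE_BYTES = 500 * 10**6
MAX_PENDING_FILES = 100
MAX_STORED_FILES = 2000


def check_upload_limits(path, pending_files, stored_files,
                        max_size_bytes=MAX_FILE_SIZE_BYTES):
    """Return a list of limit violations; an empty list means none were found.

    pending_files and stored_files are the counts before the new upload.
    """
    problems = []
    size = os.path.getsize(path)
    if size > max_size_bytes:
        problems.append(f"file is {size} bytes, limit is {max_size_bytes}")
    # One more upload would exceed a limit that is already reached.
    if pending_files >= MAX_PENDING_FILES:
        problems.append(f"{pending_files} files pending, limit is {MAX_PENDING_FILES}")
    if stored_files >= MAX_STORED_FILES:
        problems.append(f"{stored_files} files stored, limit is {MAX_STORED_FILES}")
    return problems
```

An empty return value does not guarantee that the service accepts the upload. It only rules out the limits that can be checked on the client side.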

For more information on transcribing files asynchronously, please refer to the Asynchronous Transcription section in gRPC references.