Transcribe Files

In this example, we transcribe a long audio file (longer than 60 seconds) using the asynchronous API. For shorter audio files (under 60 seconds), please refer to the Transcribe Short Audio guide, which uses the simpler synchronous API.

transcribe_file_async.py

1. Upload File

First we upload the file with the transcribe_file_async() function. Once the request succeeds, the auto-assigned file_id is returned.

from soniox.speech_service import SpeechClient
from soniox.transcribe_file import transcribe_file_async

with SpeechClient() as client:
    file_id = transcribe_file_async(
        "<YOUR-AUDIO-FILE>", 
        client, 
        transcribe_async_mode="instant_file",
        reference_name="test"
    )

Asynchronous transcription supports two modes: instant_file and sameday_file. In instant_file mode, the transcription result is available within seconds to minutes of the upload. In sameday_file mode, the transcription result is available within 24 hours of the upload. If no mode is specified, the default is instant_file. See Soniox Pricing for more info.

You can use reference_name to associate your internal file ID with the uploaded file. It can be any string of at most 256 characters, including the empty string, and duplicates are allowed. The service does not use this field; it is only for your reference.
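
The constraints on the mode and on reference_name can be checked locally before uploading. The following is a minimal sketch, not part of the Soniox client library: the helper name check_upload_options and its constants are hypothetical, and it assumes that "256 characters" can be approximated with Python's len().

```python
# Hypothetical pre-upload check; not part of the Soniox client library.
ALLOWED_MODES = ("instant_file", "sameday_file")
MAX_REFERENCE_NAME_LENGTH = 256


def check_upload_options(transcribe_async_mode="instant_file", reference_name=""):
    """Raise ValueError if the options fall outside the documented constraints."""
    if transcribe_async_mode not in ALLOWED_MODES:
        raise ValueError(f"Unknown transcribe_async_mode: {transcribe_async_mode!r}")
    # len() counts Unicode code points; how the service counts characters
    # is an assumption made for this sketch.
    if len(reference_name) > MAX_REFERENCE_NAME_LENGTH:
        raise ValueError("reference_name is longer than 256 characters")
    return transcribe_async_mode, reference_name
```

Calling check_upload_options("instant_file", "test") before transcribe_file_async() surfaces a bad option without a round trip to the service.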

2. Get Status

After the file has been uploaded, it is in one of four states: QUEUED, TRANSCRIBING, COMPLETED, or FAILED. We call the GetTranscribeAsyncStatus() function to check the status of the file.

status = client.GetTranscribeAsyncStatus(file_id)

If the file status is FAILED, the error message can be obtained from the error_message field.

print(f"Transcription failed with error: {status.error_message}")
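
Because QUEUED and TRANSCRIBING are intermediate states, the status is usually checked repeatedly until it becomes COMPLETED or FAILED. The following is a minimal polling sketch under stated assumptions: wait_until_done is a hypothetical helper, the 2-second interval and one-hour timeout are arbitrary choices, and get_status is any zero-argument callable that returns the state as a string.

```python
import time

TERMINAL_STATES = ("COMPLETED", "FAILED")


def wait_until_done(get_status, poll_interval=2.0, timeout=3600.0, sleep=time.sleep):
    """Call get_status() until it returns a terminal state, then return that state.

    get_status is any zero-argument callable returning the state as a string.
    The sleep parameter exists so the wait can be replaced in tests.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = get_status()
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still {state} after {timeout} seconds")
        sleep(poll_interval)
```

With the client from step 1, get_status could be lambda: client.GetTranscribeAsyncStatus(file_id).status, assuming the response exposes the state in a status attribute.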

3. Get Result

Once the file status is COMPLETED, we call the GetTranscribeAsyncResult() function to retrieve the transcription result.

result = client.GetTranscribeAsyncResult(file_id)
print("Words: " + " ".join(w.text for w in result.words))

Note that when using multi-channel audio with separate recognition per channel, the GetTranscribeAsyncResult() function returns a list of results, one for each channel.
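
Code that handles both single-channel and multi-channel audio can normalize the return value before printing. This is a minimal sketch with hypothetical helper names; it assumes the multi-channel return value is a plain Python list or tuple whose items have the same words field as a single result.

```python
def as_channel_results(result):
    """Return a list of results, whether one result or a per-channel list came back."""
    if isinstance(result, (list, tuple)):
        return list(result)
    return [result]


def transcript_text(result):
    """Join the text of every word in a single result."""
    return " ".join(w.text for w in result.words)


def print_transcripts(result):
    """Print one line per channel, numbered from 0."""
    for channel, channel_result in enumerate(as_channel_results(result)):
        print(f"Channel {channel}: {transcript_text(channel_result)}")
```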

4. Delete File

Once the transcription result has been obtained, the file should be deleted using the DeleteTranscribeAsyncFile() function.

client.DeleteTranscribeAsyncFile(file_id)
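
Since uploaded files count against the service limits, it can help to make the deletion run even when an earlier step raises an exception. The following is one possible pattern, not something the client library provides: uploaded_file is a hypothetical context manager that only wraps the DeleteTranscribeAsyncFile() call shown above.

```python
from contextlib import contextmanager


@contextmanager
def uploaded_file(client, file_id):
    """Yield file_id and delete the uploaded file when the block exits, even on error."""
    try:
        yield file_id
    finally:
        client.DeleteTranscribeAsyncFile(file_id)
```

Wrapping the status and result calls in with uploaded_file(client, file_id): keeps the cleanup in one place.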

Example

The complete code example, transcribe_file_async.py, shows how to transcribe files with our client library.

Run

python3 transcribe_file_async.py

Output

Uploading file.
File ID: 3457
Calling GetTranscribeAsyncStatus.
Status: QUEUED
Calling GetTranscribeAsyncStatus.
Status: TRANSCRIBING
Calling GetTranscribeAsyncStatus.
Status: TRANSCRIBING
Calling GetTranscribeAsyncStatus.
Status: COMPLETED
Calling GetTranscribeAsyncResult
Words: But there is always a stronger sense of life when the sun is ...
Calling DeleteTranscribeAsyncFile.

transcribe_file_async.js

1. Upload File

First we upload the file with the transcribeFileAsync() function. Once the request succeeds, the auto-assigned file_id is returned.

const file_id = await speechClient.transcribeFileAsync(
    "<YOUR-AUDIO-FILE>",
    "test", // reference_name
    {
        transcribe_async_mode: "instant_file",
    }
);

Asynchronous transcription supports two modes: instant_file and sameday_file. In instant_file mode, the transcription result is available within seconds to minutes of the upload. In sameday_file mode, the transcription result is available within 24 hours of the upload. If no mode is specified, the default is instant_file. See Soniox Pricing for more info.

You can use reference_name to associate your internal file ID with the uploaded file. It can be any string of at most 256 characters, including the empty string, and duplicates are allowed. The service does not use this field; it is only for your reference.

2. Get Status

After the file has been uploaded, it is in one of four states: QUEUED, TRANSCRIBING, COMPLETED, or FAILED. We call the getTranscribeAsyncStatus() function to check the status of the file.

const response = await speechClient.getTranscribeAsyncStatus(file_id);

If the file status is FAILED, the error message can be obtained from the error_message field.

console.log(`Transcription failed with error: ${response.error_message}`);

3. Get Result

Once the file status is COMPLETED, we call the getTranscribeAsyncResult() function to retrieve the transcription result.

const result = await speechClient.getTranscribeAsyncResult(file_id);
console.log(`Words: ${result.words.map((word) => word.text).join(" ")}`);

Note that when using multi-channel audio with separate recognition per channel, the getTranscribeAsyncResult() function returns a list of results, one for each channel.

4. Delete File

Once the transcription result has been obtained, the file should be deleted using the deleteTranscribeAsyncFile() function.

await speechClient.deleteTranscribeAsyncFile(file_id);

Example

The complete code example, transcribe_file_async.js, shows how to transcribe files with our client library.

Run

node transcribe_file_async.js

Output

Uploading file.
File ID: 3456
Calling getTranscribeAsyncStatus.
Status: QUEUED
Calling getTranscribeAsyncStatus.
Status: QUEUED
Calling getTranscribeAsyncStatus.
Status: TRANSCRIBING
Calling getTranscribeAsyncStatus.
Status: TRANSCRIBING
Calling getTranscribeAsyncStatus.
Status: COMPLETED
Calling getTranscribeAsyncResult
Words: But there is always a stronger sense of life when the sun is ...
Calling deleteTranscribeAsyncFile.

Limits

The maximum file size is 500MB. The maximum total duration of audio is 5 hours.

At most 100 files can be uploaded at any one time, counting files in any state.
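
The size and file-count limits can be checked on the client before an upload is attempted. This is a minimal sketch with a hypothetical helper name, check_limits. It assumes 1 MB means 10^6 bytes (the stricter reading), expects the caller to supply the number of files already uploaded, and does not check the 5-hour duration limit because that would require decoding the audio.

```python
import os

# Assumption: 1 MB = 10^6 bytes, the stricter interpretation of "500MB".
MAX_FILE_SIZE_BYTES = 500 * 1000 * 1000
MAX_UPLOADED_FILES = 100


def check_limits(path, files_already_uploaded=0):
    """Return the file size in bytes, or raise ValueError if a limit would be exceeded."""
    size = os.path.getsize(path)
    if size > MAX_FILE_SIZE_BYTES:
        raise ValueError(f"{path} is {size} bytes, above the 500MB limit")
    if files_already_uploaded >= MAX_UPLOADED_FILES:
        raise ValueError("100 files are already uploaded; delete one before uploading")
    return size
```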

For more information on transcribing files asynchronously, please refer to the Asynchronous Transcription section in the gRPC reference.
