Storage Configuration

You can configure which information is stored and indexed when you make a transcription request. You can store the audio, the transcript, or both, and you can associate additional information with each request, such as a title and a datetime. The sections below describe each option in detail.

Configuration Structure

message StorageConfig {
    string object_id = 1;
    map<string, string> metadata = 2;
    string title = 3;
    google.protobuf.Timestamp datetime = 4;
    bool disable_store_audio = 5;
    bool disable_store_transcript = 6;
}

Specifying a StorageConfig in the storage_config field of TranscriptionConfig enables Storage and Search for a transcription request. Note that specifying an empty StorageConfig enables Storage and Search with default options. This is not the same as omitting the field or setting it to null, which leaves Storage and Search disabled.
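The difference between an unset field and an empty message can be modeled as follows. This is a minimal sketch using plain Python dataclasses that mirror the message above; it is not the Soniox client library, and the class definitions and the storage_enabled helper are hypothetical stand-ins used only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class StorageConfig:
    # Hypothetical stand-in mirroring the StorageConfig message fields
    # (the datetime field is omitted here for brevity).
    object_id: str = ""
    metadata: Dict[str, str] = field(default_factory=dict)
    title: str = ""
    disable_store_audio: bool = False
    disable_store_transcript: bool = False


@dataclass
class TranscriptionConfig:
    # Only the field relevant here; None models "not specified".
    storage_config: Optional[StorageConfig] = None


def storage_enabled(config: TranscriptionConfig) -> bool:
    # Presence of the message is what matters, not its contents.
    return config.storage_config is not None


# Storage and Search disabled: storage_config is not specified.
no_storage = TranscriptionConfig()

# Storage and Search enabled with default options: empty StorageConfig.
default_storage = TranscriptionConfig(storage_config=StorageConfig())
```

In this model, storage_enabled(no_storage) is False and storage_enabled(default_storage) is True, even though the second config sets no options explicitly.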

Object ID

Object ID is a unique identifier for the stored data within the scope of the Soniox project. The stored data is uniquely associated with this ID and can be retrieved by it. If you specify the same object ID in two different requests, the most recently processed version of the data is stored under this object ID and the other version is discarded (deleted).

If you do not specify an object ID, one will be auto-generated. Auto-generated object IDs always start with auto:.

A user-specified object ID cannot be longer than 256 characters and can only contain the characters: A-Z a-z 0-9 - _.
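The constraints above can be checked on the client before sending a request. The following is a minimal sketch; the function names are hypothetical, and treating an empty string as invalid is an assumption (an empty value may simply mean that no object ID was specified).

```python
import re

# Allowed characters (A-Z a-z 0-9 - _) and the 256-character limit, as stated above.
# Assumption: an empty string is rejected by this check.
_OBJECT_ID_RE = re.compile(r"[A-Za-z0-9_-]{1,256}")

AUTO_PREFIX = "auto:"


def is_valid_user_object_id(object_id: str) -> bool:
    """Return True if object_id satisfies the documented constraints."""
    return _OBJECT_ID_RE.fullmatch(object_id) is not None


def is_auto_generated(object_id: str) -> bool:
    """Auto-generated object IDs always start with 'auto:'."""
    return object_id.startswith(AUTO_PREFIX)
```

Because the colon is not an allowed character, a user-specified object ID can never take the form of an auto-generated one.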

Metadata

Metadata enables you to associate additional information with each object in the form of key-value pairs, in order to later search by this metadata.

For example, if you are transcribing phone calls from a given company and agent, you can associate this information with the object (e.g. {"company": "Nike", "agent": "12345"}). This allows you to later search for and retrieve only the phone calls from a given company or from a given agent.
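Since the metadata field is declared as map<string, string>, both keys and values must be strings, which is why the agent number in the example is written as "12345". A minimal sketch of preparing such a mapping from application data is shown below; the to_metadata helper is hypothetical and not part of any Soniox library.

```python
from typing import Dict, Mapping


def to_metadata(values: Mapping[str, object]) -> Dict[str, str]:
    """Convert application values to a string-to-string metadata mapping."""
    metadata: Dict[str, str] = {}
    for key, value in values.items():
        if not isinstance(key, str):
            raise TypeError(f"metadata key must be a string: {key!r}")
        # Values such as numeric agent IDs are stored in their string form.
        metadata[key] = str(value)
    return metadata


call_metadata = to_metadata({"company": "Nike", "agent": 12345})
```

Here call_metadata equals {"company": "Nike", "agent": "12345"}, matching the example above.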

Title

Title is a short string that is meant to denote the title of the object. Title is displayed in search results in the Search tool in the Soniox Console.

Datetime

You can use the datetime field to denote when the object was created or uploaded. This enables you to search for objects within a given time period. If you do not set the datetime, it is automatically set to the time at which the object was stored.
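The datetime field has the type google.protobuf.Timestamp, which represents a point in time as whole seconds plus nanoseconds since the Unix epoch in UTC. The sketch below computes these two values from a Python datetime using only the standard library. The to_timestamp_fields helper is hypothetical, and rejecting naive datetimes is a design choice made here to avoid guessing a time zone. If you use the protobuf Python package, its Timestamp class provides a FromDatetime method for the same purpose.

```python
from datetime import datetime, timezone
from typing import Tuple

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_timestamp_fields(dt: datetime) -> Tuple[int, int]:
    """Return (seconds, nanos) since the Unix epoch for a timezone-aware datetime."""
    if dt.tzinfo is None or dt.utcoffset() is None:
        raise ValueError("datetime must be timezone-aware")
    # Integer arithmetic on the timedelta avoids floating-point rounding.
    delta = dt - _EPOCH
    seconds = delta.days * 86400 + delta.seconds
    nanos = delta.microseconds * 1000
    return seconds, nanos


seconds, nanos = to_timestamp_fields(datetime(2024, 1, 1, tzinfo=timezone.utc))
```

For midnight UTC on 1 January 2024, this yields seconds equal to 1704067200 and nanos equal to 0.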

disable_store_audio / disable_store_transcript

If disable_store_audio is set to true, the audio is not stored.

If disable_store_transcript is set to true, the transcript is not stored.

It is not permitted to set both of these fields to true, i.e. you cannot use Storage and Search but store neither transcript nor audio.

By default, these two fields are false, meaning that both audio and transcript are stored (if StorageConfig is specified).
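The rules for these two flags can be summarized in a small client-side check. This is a minimal sketch; the stored_items helper is hypothetical, and raising ValueError is an assumption made for illustration, since how the service reports an invalid combination is not described here.

```python
from typing import Set


def stored_items(disable_store_audio: bool = False,
                 disable_store_transcript: bool = False) -> Set[str]:
    """Return which items are stored for a request that specifies StorageConfig."""
    if disable_store_audio and disable_store_transcript:
        # Storing neither audio nor transcript is not permitted.
        raise ValueError(
            "disable_store_audio and disable_store_transcript cannot both be true"
        )
    items = set()
    if not disable_store_audio:
        items.add("audio")
    if not disable_store_transcript:
        items.add("transcript")
    return items
```

With the defaults, stored_items() returns both "audio" and "transcript"; setting one flag to true removes the corresponding item, and setting both raises an error.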