Annotation Configuration

Annotation configuration enables you to define specific sections to be recognized in the audio. The document will be automatically split into these recognized sections.

Annotation configuration is specified in the annotation field of the formatting configuration JSON object (refer to Document Formatting for an example). The example below shows the available fields.

{
    "annotation": {
        "sections": [
            {
                "section_id": "ID1",
                "title": "Introduction",
                "phrases": [
                    "introduction",
                    "section intro",
                    "intro"
                ]
            }
        ],
        "remove_section_phrase": true
    }
}

Each section is defined using the following fields:

section_id identifies the section. It is optional (it may be missing or empty), but when provided it must be unique.
title is included in the formatted document. It is optional (it may be missing or empty).
phrases lists the spoken phrases used to detect the beginning of the section. Each phrase should contain words only, with no punctuation.
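
The field rules above can be checked on the client side before a configuration is sent. The following Python sketch is only an illustration: validate_annotation is a hypothetical helper, not part of any Soniox library, and it assumes that "words only" means no ASCII punctuation characters (which also flags apostrophes and hyphens).

```python
import string


def validate_annotation(config: dict) -> list[str]:
    """Return a list of problems found in an annotation config (empty if none).

    Hypothetical helper: it only checks the rules described in the text.
    """
    problems = []
    seen_ids = set()
    sections = config.get("annotation", {}).get("sections", [])
    for index, section in enumerate(sections):
        section_id = section.get("section_id")
        if section_id:  # a missing or empty section_id is allowed
            if section_id in seen_ids:
                problems.append(
                    f"section {index}: duplicate section_id {section_id!r}"
                )
            seen_ids.add(section_id)
        for phrase in section.get("phrases", []):
            # Assumption: "words only" means no ASCII punctuation at all.
            if any(ch in string.punctuation for ch in phrase):
                problems.append(
                    f"section {index}: phrase {phrase!r} contains punctuation"
                )
    return problems


# Example config with two deliberate mistakes in the second section.
example_config = {
    "annotation": {
        "sections": [
            {
                "section_id": "ID1",
                "title": "Introduction",
                "phrases": ["introduction", "section intro", "intro"],
            },
            {
                "section_id": "ID1",
                "title": "Summary",
                "phrases": ["summary.", "wrap up"],
            },
        ],
        "remove_section_phrase": True,
    }
}

for problem in validate_annotation(example_config):
    print(problem)
```

Here the second section is reported twice: once for reusing ID1 and once for the trailing period in one of its phrases.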

If remove_section_phrase is set to true, then the phrase that triggered the detection of a section will not be included in the section text.
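
To make the effect of remove_section_phrase concrete, here is a toy Python sketch that splits a plain-text transcript wherever a configured phrase appears. It is not how Soniox detects sections; split_into_sections is a hypothetical function that does exact, case-insensitive word matching and prefers the longest phrase, purely to show the difference between the two settings.

```python
def split_into_sections(transcript, sections, remove_section_phrase):
    """Toy splitter: start a new section wherever a configured phrase occurs.

    Hypothetical illustration only; real detection works on audio.
    """
    # Map each phrase (as a tuple of lowercase words) to its section title.
    phrase_to_title = {}
    for section in sections:
        for phrase in section.get("phrases", []):
            phrase_words = tuple(phrase.lower().split())
            if phrase_words:  # ignore empty phrases
                phrase_to_title[phrase_words] = section.get("title", "")
    # Try longer phrases first so "section intro" wins over "intro".
    phrases = sorted(phrase_to_title, key=len, reverse=True)

    words = transcript.lower().split()
    result = [["", []]]  # text before the first trigger has no title
    i = 0
    while i < len(words):
        match = next(
            (p for p in phrases if tuple(words[i:i + len(p)]) == p), None
        )
        if match is None:
            result[-1][1].append(words[i])
            i += 1
            continue
        result.append([phrase_to_title[match], []])
        if not remove_section_phrase:
            # Keep the trigger phrase as part of the section text.
            result[-1][1].extend(match)
        i += len(match)
    return [(title, " ".join(body)) for title, body in result if title or body]


demo_sections = [
    {"section_id": "ID1", "title": "Introduction",
     "phrases": ["introduction", "section intro", "intro"]},
    {"section_id": "ID2", "title": "Findings", "phrases": ["findings"]},
]
demo_text = "section intro this report covers two topics findings both tests passed"

print(split_into_sections(demo_text, demo_sections, True))
print(split_into_sections(demo_text, demo_sections, False))
```

With the setting enabled, the Introduction text starts at "this report"; with it disabled, the text starts with the trigger phrase "section intro".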

If a phrase contains parts that may be written in more than one way, such as numbers, it can be helpful to test how Soniox recognizes the phrase, for example using Transcribe Live in Soniox Console.