Custom Formatting

Custom formatting lets you configure how specific words or phrases are spelled or formatted in the transcript. For example, you can have every instance of kathy spelled as Cathy, or twenty three and me formatted as 23andMe.

Custom formatting is an extension of custom vocabulary. Please read about custom vocabulary before using custom formatting.

Custom formatting is specified through SpeechContext, using phrases of the form A => B, where A is the sequence of words recognized by the system and B is the sequence of words that appears in the output transcript. You can also set a boost factor in the same entry to boost recognition of the words in A.
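As a minimal sketch of the A => B phrase form, the snippet below builds mapping strings from a dictionary of replacements. The helper name mapping_phrase and the replacements dictionary are hypothetical and not part of any SDK; only the "A => B" string format comes from the description above.

```python
def mapping_phrase(recognized: str, formatted: str) -> str:
    """Build an 'A => B' phrase string (hypothetical helper, not an SDK call)."""
    return f"{recognized.strip()} => {formatted.strip()}"


# Keys are the words as recognized (A); values are the desired output (B).
replacements = {
    "kathy": "Cathy",
    "twenty three and me": "23andMe",
}

phrases = [mapping_phrase(a, b) for a, b in replacements.items()]
print(phrases)
```

Each resulting string can then be used as a phrase in a SpeechContext entry.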

Example

In this example, we configure youtube to be formatted as YouTuBe and twenty three and me as 23andMe.

customization_formatting.py

# SpeechContextEntry phrase can contain a mapping.
speech_context = SpeechContext(
    entries=[
        SpeechContextEntry(
            phrases=["youtube => YouTuBe"],
            boost=5,
        ),
        SpeechContextEntry(
            phrases=["twenty three and me => 23andMe"],
            boost=10,
        )
    ]
)

# Pass SpeechContext with the transcribe request (client is an existing speech client).
result = transcribe_file_short(
    "../test_data/youtube_23andme.flac", client, speech_context=speech_context
)

Run

python3 customization_formatting.py

Output

I was watching videos on YouTuBe about the company 23andMe .

customization_formatting.js

// SpeechContextEntry phrase can contain a mapping.
const speech_context = {
    entries: [
        {
            phrases: ["youtube => YouTuBe"],
            boost: 5,
        },
        {
            phrases: ["twenty three and me => 23andMe"],
            boost: 10,
        }
    ]
};

// Pass SpeechContext with transcribe request.
const result = await speechClient.transcribeFileShort(
    "../test_data/youtube_23andme.flac",
    { speech_context: speech_context }
);

Run

node customization_formatting.js

Output

I was watching videos on YouTuBe about the company 23andMe .

Constraints

In the specified mapping A => B, the following constraints are enforced:

  • A and B can each consist of at most 5 words.
  • Every word in A must be either a fully lowercase word (e.g. mary is accepted, but Mary, MARY, and MaRy are not) or a fully uppercase word whose letters are each spelled out individually (e.g. CNN).
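
The constraints above can be checked locally before sending a request. The following is a rough sketch only: check_mapping and MAX_WORDS are hypothetical names, and the check is purely textual, so it cannot tell whether an uppercase word such as CNN is really spelled out letter by letter (it would also accept MARY, which the rule above rejects).

```python
from typing import List

MAX_WORDS = 5  # word limit stated in the constraints above


def check_mapping(phrase: str) -> List[str]:
    """Return the constraint violations found in an 'A => B' phrase."""
    recognized, sep, formatted = phrase.partition("=>")
    if not sep:
        raise ValueError("not a mapping: expected the form 'A => B'")
    a_words, b_words = recognized.split(), formatted.split()

    problems = []
    if len(a_words) > MAX_WORDS:
        problems.append(f"A has {len(a_words)} words (max {MAX_WORDS})")
    if len(b_words) > MAX_WORDS:
        problems.append(f"B has {len(b_words)} words (max {MAX_WORDS})")
    for word in a_words:
        # Mixed case such as "Mary" or "MaRy" is flagged. Words with no
        # cased letters (e.g. digits) are not covered by the stated rule
        # and are not flagged in this sketch.
        if word != word.lower() and word != word.upper():
            problems.append(f"'{word}' in A must be all lowercase or all uppercase")
    return problems


print(check_mapping("twenty three and me => 23andMe"))
print(check_mapping("Mary => Marie"))
```

An empty list means the phrase passed these textual checks; the service may still apply checks that this sketch does not reproduce.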