Manage Vocabularies (SpeechContexts)
This advanced guide explains how to store, reuse, and manage vocabularies. Read the Custom Vocabulary guide first.
Storing and Reusing SpeechContext
Instead of creating a SpeechContext (vocabulary) from scratch every time, you can create a SpeechContext once and refer to it by name in future requests.
speech_context = SpeechContext(name="my_context")
result = transcribe_file_short(
    TEST_AUDIO_FLAC, client, speech_context=speech_context
)

const speech_context = { name: "my_context" };
const response = await speechClient.transcribeFileShort(
    TEST_AUDIO_FLAC,
    { speech_context: speech_context }
);
See the gRPC API reference for creating, updating, deleting, and listing SpeechContext objects here.
Multiple SpeechContexts
If you have multiple customers, each with their own domain-specific words and phrases, you can create multiple SpeechContexts, one for each customer.
When transcribing, use the customer-specific SpeechContext for that customer's audio.
This should result in high recognition accuracy for each customer's specific terminology.
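One way to keep per-customer contexts organized is to derive each stored context name from the customer identifier. The following is a minimal sketch, not part of the Soniox API: `context_name_for` and the `customer_` prefix are hypothetical, and the set of characters allowed in a context name is an assumption.

```python
import re

# Hypothetical naming convention: one stored SpeechContext per customer.
CONTEXT_PREFIX = "customer_"


def context_name_for(customer_id: str) -> str:
    """Derive a stored SpeechContext name from a customer identifier."""
    # Lowercase the identifier and collapse every run of characters outside
    # [a-z0-9] into a single underscore (assumed to be safe in a name).
    slug = re.sub(r"[^a-z0-9]+", "_", customer_id.strip().lower()).strip("_")
    if not slug:
        raise ValueError("customer_id must contain a letter or digit")
    return CONTEXT_PREFIX + slug


if __name__ == "__main__":
    for customer in ["Acme Health", "northwind-labs"]:
        print(customer, "->", context_name_for(customer))
```

The returned name can then be passed as `SpeechContext(name=...)` in the same way as `"my_context"` in the earlier example.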
Management Tool
We provide a simple management tool in Python to create, delete, update, list, and get SpeechContext objects.
See the usage examples below.
Create SpeechContext
python3 -m soniox.manage_speech_contexts --create --name my_context \
--phrases "acetyllcarnitine; zestoretic" --boost 10
Delete SpeechContext
python3 -m soniox.manage_speech_contexts --delete --name my_context
Update SpeechContext
python3 -m soniox.manage_speech_contexts --update --name my_context \
--phrases "acetyllcarnitine; zestoretic; prednisone" --boost 15
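If contexts are created, updated, or deleted from a script rather than by hand, the commands above can be assembled programmatically. The sketch below uses only the flags shown in the examples above (`--create`, `--update`, `--delete`, `--name`, `--phrases`, `--boost`). The helper names `build_command` and `run_command` are hypothetical, and joining phrases with `"; "` is assumed from the example commands.

```python
import subprocess
from typing import List, Optional, Sequence

# Module name taken from the commands shown above.
TOOL_MODULE = "soniox.manage_speech_contexts"


def build_command(
    action: str,
    name: str,
    phrases: Optional[Sequence[str]] = None,
    boost: Optional[float] = None,
) -> List[str]:
    """Build the argument list for a create, update, or delete call."""
    if action not in ("create", "update", "delete"):
        raise ValueError(f"unsupported action: {action}")
    cmd = ["python3", "-m", TOOL_MODULE, f"--{action}", "--name", name]
    if action in ("create", "update"):
        if not phrases or boost is None:
            raise ValueError("create and update need phrases and a boost")
        # Phrases are separated with "; " as in the example commands.
        cmd += ["--phrases", "; ".join(phrases), "--boost", str(boost)]
    return cmd


def run_command(cmd: List[str]) -> None:
    """Run the tool; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so no API key is needed.
    print(build_command("create", "my_context",
                        ["acetyllcarnitine", "zestoretic"], 10))
```

Passing the arguments as a list avoids shell quoting problems with phrases that contain spaces or semicolons.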
List SpeechContexts
python3 -m soniox.manage_speech_contexts --list
Listing names of speech contexts.
my_context
Get SpeechContext
python3 -m soniox.manage_speech_contexts --get --name my_context
Getting speech context "my_context".
Speech context:
{
  "entries": [
    {
      "phrases": [
        "acetyllcarnitine",
        "zestoretic"
      ],
      "boost": 10.0
    }
  ],
  "name": "my_context"
}
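The JSON printed by `--get` can be inspected with a few lines of Python, for example to check which boost each phrase currently has. This is a minimal sketch that assumes the output has exactly the shape shown above (two status lines followed by one JSON object). `phrase_boosts` is a hypothetical helper, not part of the tool.

```python
import json
from typing import Dict

# Sample output copied from the --get example above.
SAMPLE_OUTPUT = """Getting speech context "my_context".
Speech context:
{
  "entries": [
    {
      "phrases": ["acetyllcarnitine", "zestoretic"],
      "boost": 10.0
    }
  ],
  "name": "my_context"
}
"""


def phrase_boosts(tool_output: str) -> Dict[str, float]:
    """Map each phrase in the printed speech context to its boost value."""
    # Skip the status lines by starting at the first opening brace.
    context = json.loads(tool_output[tool_output.index("{"):])
    boosts: Dict[str, float] = {}
    for entry in context.get("entries", []):
        for phrase in entry.get("phrases", []):
            boosts[phrase] = entry["boost"]
    return boosts


if __name__ == "__main__":
    for phrase, boost in phrase_boosts(SAMPLE_OUTPUT).items():
        print(f"{phrase}: {boost}")
```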