Document Formatting
Soniox Document Formatting converts a transcript into a custom-formatted document when you use Soniox in async mode. You can produce high-quality document output without significant post-recognition editing. It can also annotate sections of text according to configuration parameters. This saves significant time and expense in the production of final documents across a variety of industries and applications.
Document formatting and annotations are configurable on the fly with each async transcribe request. You can specify format configurations (e.g., dates, numbers, measurement units, spacing), which are applied on top of the transcript to produce the formatted text you require. You can also specify annotation configurations, which allow you to split the text into discrete, annotated sections using specified key phrases.
- Format Configuration: Format a transcript according to custom configuration.
- Annotation Configuration: Annotate and split the text into sections depending on the configuration.
Example
Dictated speech
Doctor Mike Green dictating orthopedic consultation Michael Brannon account number 722-719-5012.
Date of dictation is one thirty one twenty twenty three. This is a sixty eight year old male.
Chief complaint is his left hip pain. Underwent a left total hip arthroplasty by doctor Gibson in
twenty twenty. He fell this morning slipping over a rug. He was doing well previous to that. He
had immediate pain in his left hip and was brought to hospital emergency department.
Past medical history includes hypertension. His medications at home include benazepril, naproxen,
and Norco. He has no known drug allergies. Surgical history includes left total hip arthroplasty
and a shoulder arthroplasty. Examination. Physical exam of the left lower extremity patient's
dorsiflexion and extensor hallucis longus L5 out of 5 distal sensation is intact to light touch in
the deep and anesthesia period. No tibial nerve distribution distributions dorsalis pedis two plus.
Assessment this patient has left hip periprosthetic fracture. Plan, the patient desires to proceed
with surgical intervention in the form of left hip open reduction. Left hip slash femur open reduction
and internal fixation with revision arthroplasty after understanding the risks and benefits of
surgical versus nonsurgical management.
Formatted document
Doctor Mike Green dictating orthopedic consultation Michael Brannon account number 722-719-5012.
Date of dictation is 1/31/2023. This is a 68 year old male.
Chief complaint
Is his left hip pain. Underwent a left total hip arthroplasty by doctor Gibson in 2020. He fell this
morning slipping over a rug. He was doing well previous to that. He had immediate pain in his left hip
and was brought to hospital emergency department.
Past medical history
Includes hypertension. His medications at home include benazepril, naproxen, and Norco. He has no
known drug allergies. Surgical history includes left total hip arthroplasty and a shoulder arthroplasty.
Examination
Physical exam of the left lower extremity patient's dorsiflexion and extensor hallucis longus L5 out
of 5 distal sensation is intact to light touch in the deep and anesthesia. No tibial nerve distribution
distributions dorsalis pedis two plus.
Assessment
This patient has left hip periprosthetic fracture.
Plan
The patient desires to proceed with surgical intervention in the form of left hip open reduction.
Left hip/femur open reduction and internal fixation with revision arthroplasty after understanding
the risks and benefits of surgical versus nonsurgical management.
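As a rough local illustration of the annotation step in the example above, the sketch below splits a text at key phrases and turns each phrase into a section title. It is not Soniox's implementation: the Section class and the split_into_sections and clean_section_text helpers are hypothetical names, and the handling of capitalization and leftover punctuation is only inferred from the formatted document shown above.

```python
import re
from dataclasses import dataclass


@dataclass
class Section:
    title: str  # empty for text that precedes the first key phrase
    text: str


def clean_section_text(text):
    """Drop punctuation left behind by a removed phrase and capitalize."""
    text = text.strip().lstrip(".,:; ")
    return text[:1].upper() + text[1:]


def split_into_sections(transcript, titles_by_phrase):
    """Toy splitter: start a new section at every key phrase (case-insensitive).

    titles_by_phrase maps a lowercase key phrase to the section title.
    Assumption: every occurrence of a phrase starts a section, even mid-sentence.
    """
    # Try longer phrases first so a long phrase wins over a shorter prefix.
    phrases = sorted(titles_by_phrase, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(p) for p in phrases) + r")\b",
        re.IGNORECASE,
    )

    sections = []
    title, start = "", 0
    for match in pattern.finditer(transcript):
        # Close the section that ends where this key phrase begins.
        body = clean_section_text(transcript[start:match.start()])
        sections.append(Section(title, body))
        title = titles_by_phrase[match.group(1).lower()]
        start = match.end()  # the key phrase itself is removed from the text
    sections.append(Section(title, clean_section_text(transcript[start:])))

    # Skip an empty untitled section when the text starts with a key phrase.
    return [s for s in sections if s.title or s.text]


if __name__ == "__main__":
    dictated = (
        "Date of dictation is 1/31/2023. Chief complaint is his left hip pain. "
        "Assessment this patient has a fracture. Plan, the patient desires surgery."
    )
    titles = {
        "chief complaint": "Chief complaint",
        "assessment": "Assessment",
        "plan": "Plan",
    }
    for section in split_into_sections(dictated, titles):
        print(repr(section.title), repr(section.text))
```

A production splitter would need to decide when a phrase is ordinary prose rather than a section marker, which is why dedicated phrases such as "section plan" can be a safer choice than a bare "plan".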
Document formatting configuration is specified in the transcription configuration as a separate JSON object in a string field (document_formatting_config.config_json). This JSON object can have two fields, format and annotation, containing specific configuration parameters.
The formatted document is returned from the GetTranscribeAsyncResult API call using the Document data structure. Refer to speech_service.proto for the definition of this data structure.
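Because the configuration travels as a JSON object inside a string field, it can help to build and check that string before sending a request. The following is a minimal sketch using only the Python standard library; build_config_json and check_config_json are hypothetical helper names, the parameter names are taken from the code example on this page, and the check is purely local (how the service treats unknown fields is not stated here).

```python
import json

# The two top-level fields described above.
ALLOWED_TOP_LEVEL_FIELDS = {"format", "annotation"}


def build_config_json(format_config=None, annotation_config=None):
    """Serialize the document formatting configuration to a JSON string."""
    config = {}
    if format_config is not None:
        config["format"] = format_config
    if annotation_config is not None:
        config["annotation"] = annotation_config
    return json.dumps(config)


def check_config_json(config_json):
    """Parse a config string and reject unexpected top-level fields (local check)."""
    config = json.loads(config_json)
    if not isinstance(config, dict):
        raise ValueError("configuration must be a JSON object")
    unexpected = set(config) - ALLOWED_TOP_LEVEL_FIELDS
    if unexpected:
        raise ValueError(f"unexpected top-level fields: {sorted(unexpected)}")
    return config


config_json = build_config_json(
    format_config={"numbers": "numeric", "percent_symbol": True},
    annotation_config={
        "remove_section_phrase": True,
        "sections": [
            {"section_id": "ID2", "title": "Plan", "phrases": ["section plan"]},
        ],
    },
)
check_config_json(config_json)  # raises ValueError on a malformed config
print(config_json)
```

The resulting string is what goes into the config_json field; note that Python's True is serialized as the JSON literal true.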
Code Example
Python
import json
import time

from soniox.speech_service import SpeechClient, DocumentFormattingConfig
from soniox.transcribe_file import transcribe_file_async


def main():
    with SpeechClient() as client:
        # Formatting options ("format") and section definitions ("annotation").
        docfmt_config = {
            "format": {
                "end_of_sentence_spacing": "2",
                "numbers": "numeric",
                "ordinal_numbers": "abbreviated",
                "number_range": True,
                "digits": True,
                "DMY_date": "as_dictated",
                "MY_date": "as_dictated",
                "DM_date": "as_dictated",
                "clock_time": True,
                "time_quantity": True,
                "metric_units_abbreviated": True,
                "percent_symbol": True,
                "height_feet_inches": "text",
                "verbalized_punct": True,
            },
            "annotation": {
                "remove_section_phrase": True,
                "sections": [
                    {
                        "section_id": "ID1",
                        "title": "Introduction",
                        "phrases": [
                            "introduction",
                            "section intro",
                            "intro",
                        ],
                    },
                    {
                        "section_id": "ID2",
                        "title": "Plan",
                        "phrases": [
                            "section plan",
                        ],
                    },
                ],
            },
        }

        # Start the async transcription; the configuration is passed as a JSON string.
        file_id = transcribe_file_async(
            "PATH_TO_YOUR_AUDIO_FILE",
            client,
            reference_name="test",
            model="en_v2",
            document_formatting_config=DocumentFormattingConfig(
                config_json=json.dumps(docfmt_config)
            ),
        )

        # Poll until the transcription has either completed or failed.
        while True:
            status = client.GetTranscribeAsyncStatus(file_id)
            if status.status in ("COMPLETED", "FAILED"):
                break
            time.sleep(2.0)

        if status.status == "COMPLETED":
            # Fetch the result and print each section of the formatted document.
            result = client.GetTranscribeAsyncResultAll(file_id)
            document = result.document
            print(f"Qscore: {document.qscore:.2f}")
            for section in document.sections:
                print("Section:")
                print(f"  Section ID: {repr(section.section_id)}")
                print(f"  Title: {repr(section.title)}")
                print(f"  Text: {repr(section.text)}")
        else:
            print(f"Transcription failed with error: {status.error_message}")

        # Delete the file from the service once the result has been handled.
        client.DeleteTranscribeAsyncFile(file_id)


if __name__ == "__main__":
    main()
C#
using Json.Net;
using Soniox.Types;
using Soniox.Client;
using Soniox.Client.Proto;

using var client = new SpeechClient();

// Formatting options ("format") and section definitions ("annotation").
var docfmtConfig = new Dictionary<string, object> {
    {"format", new Dictionary<string, object> {
        {"end_of_sentence_spacing", "2"},
        {"numbers", "numeric"},
        {"ordinal_numbers", "abbreviated"},
        {"number_range", true},
        {"digits", true},
        {"DMY_date", "as_dictated"},
        {"MY_date", "as_dictated"},
        {"DM_date", "as_dictated"},
        {"clock_time", true},
        {"time_quantity", true},
        {"metric_units_abbreviated", true},
        {"percent_symbol", true},
        {"height_feet_inches", "text"},
        {"verbalized_punct", true},
    }},
    {"annotation", new Dictionary<string, object> {
        {"remove_section_phrase", true},
        {"sections", new List<object> {
            new Dictionary<string, object> {
                {"section_id", "ID1"},
                {"title", "Introduction"},
                {"phrases", new List<string> {
                    "introduction",
                    "section intro",
                    "intro",
                }},
            },
            new Dictionary<string, object> {
                {"section_id", "ID2"},
                {"title", "Plan"},
                {"phrases", new List<string> {
                    "section plan",
                }},
            },
        }},
    }},
};

// Start the async transcription; the configuration is passed as a JSON string.
var fileId = await client.TranscribeFileAsync(
    "PATH_TO_YOUR_AUDIO_FILE",
    new TranscriptionConfig
    {
        Model = "en_v2",
        DocumentFormattingConfig = new DocumentFormattingConfig
        {
            ConfigJson = JsonNet.Serialize(docfmtConfig),
        },
    });

// Poll until the transcription has either completed or failed.
TranscribeAsyncFileStatus status;
while (true)
{
    status = await client.GetTranscribeAsyncFileStatus(fileId);
    if (status.Status is "COMPLETED" or "FAILED")
    {
        break;
    }
    await Task.Delay(2000);
}

if (status.Status == "COMPLETED")
{
    // Fetch the result and print each section of the formatted document.
    var result = await client.GetTranscribeAsyncResultAll(fileId);
    var document = result.Document;
    if (document == null)
    {
        throw new System.Exception("No document!?");
    }
    Console.WriteLine($"Qscore: {document.Qscore:0.00}");
    foreach (var section in document.Sections)
    {
        Console.WriteLine($"  Section ID: {section.SectionId}");
        Console.WriteLine($"  Title: {section.Title}");
        Console.WriteLine($"  Text: {section.Text}");
    }
}
else
{
    Console.WriteLine($"Transcription failed with error: {status.ErrorMessage}");
}

// Delete the file from the service once the result has been handled.
await client.DeleteTranscribeAsyncFile(fileId);
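Both examples above poll in an unbounded loop, which never ends if a transcription stays in a non-terminal state. One possible refinement, sketched here in Python, is to give the loop a deadline. The wait_for_terminal_status helper and its parameters are hypothetical and not part of the Soniox client; the only details taken from this page are the two terminal status strings and the two-second polling interval.

```python
import time


def wait_for_terminal_status(
    get_status,
    poll_interval=2.0,
    timeout=600.0,
    sleep=time.sleep,
    clock=time.monotonic,
):
    """Call get_status() until it returns "COMPLETED" or "FAILED".

    get_status: zero-argument callable returning the current status string.
    sleep and clock are injectable so the loop can be tested without waiting.
    Raises TimeoutError if no terminal status is seen within `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in ("COMPLETED", "FAILED"):
            return status
        if clock() >= deadline:
            raise TimeoutError(f"status is still {status!r} after {timeout} seconds")
        sleep(poll_interval)
```

With the Python example above, the callable could be written as lambda: client.GetTranscribeAsyncStatus(file_id).status. The 600-second default is an arbitrary placeholder and should be chosen to suit the length of the audio being transcribed.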