Transcription Advanced#

Context#

Adding context can improve transcription accuracy. It is especially useful for correctly transcribing uncommon spoken words, such as:

  • entity names (people, company or product names, etc.)

  • technical jargon (in medicine, engineering, science, etc.)

Context can be any text (e.g. a summary or a related document) or just a list of relevant words. It can also contain text or phrases that might not be present in the audio. Omnio will only use context if necessary.

Prompt#
Transcribe the audio. Here's the relevant context:
XiangXYZ Electronics
TitanHex Technologies
ExoBook ZX5
Diagon TekGear Z7
OmegaDrive T5
QuantumGear E9

Alice: Hey Bob, did you hear about the new laptop from XiangXYZ Electronics?
Bob: Yeah, the ExoBook ZX5, right? It’s supposed to be a contender against the Diagon TekGear Z7.
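
The prompt above is an instruction followed by one context term per line, so it can be assembled from a word list. The following is a minimal sketch of that assembly only; `build_context_prompt` is a hypothetical helper name, not part of Omnio, and how the resulting string is sent to Omnio is outside its scope.

```python
def build_context_prompt(terms, instruction="Transcribe the audio."):
    """Return a prompt that lists context terms, one per line.

    Hypothetical helper: it only mirrors the prompt layout shown above.
    """
    # Drop empty entries and duplicates while keeping the original order.
    seen = set()
    cleaned = []
    for term in terms:
        term = term.strip()
        if term and term not in seen:
            seen.add(term)
            cleaned.append(term)

    # Without any usable terms, fall back to the plain instruction.
    if not cleaned:
        return instruction

    header = f"{instruction} Here's the relevant context:"
    return "\n".join([header, *cleaned])


prompt = build_context_prompt(
    ["XiangXYZ Electronics", "ExoBook ZX5", "ExoBook ZX5"]
)
print(prompt)
```

Deduplicating keeps the context short when the terms come from several sources, such as a product catalogue plus meeting notes.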

Acoustic tags#

Acoustic tags describe elements of the audio that are not part of the spoken words, such as background sounds, music, and speech tone (screaming, whispering).

Prompt#
Transcribe the audio with acoustic tag information.

Show host: Welcome back to the show! [applause] Let’s start the next round. [buzzer]
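
If the tags need to be handled separately from the spoken text, they can be pulled out with a regular expression. This sketch assumes tags always appear in square brackets, as in the sample output above; `split_acoustic_tags` is a hypothetical helper name. Note that the same pattern would also match bracketed timestamps, so it is not suited to transcripts that combine both features.

```python
import re

# Assumption: an acoustic tag is any bracketed span without nested brackets.
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def split_acoustic_tags(line):
    """Return (text_without_tags, list_of_tags) for one transcript line."""
    tags = TAG_PATTERN.findall(line)
    text = TAG_PATTERN.sub("", line)
    # Removing a tag can leave a double space behind; collapse it.
    text = re.sub(r"\s{2,}", " ", text).strip()
    return text, tags


text, tags = split_acoustic_tags(
    "Show host: Welcome back to the show! [applause] "
    "Let's start the next round. [buzzer]"
)
print(text)
print(tags)
```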

Timestamps#

A transcription with timestamps marks the start and end of each segment in [MM:SS] format (minutes and seconds).

Prompt#
Transcribe the audio with timestamps.
Alice:
[00:00] Hello. [00:00]
Bob:
[00:01] Hi, is this Alice? [00:02]
Alice:
[00:03] It is, yeah. [00:05]
Bob:
[00:05] Alice, how is it going? [00:08]
[00:09] My name is Bob. [00:11]
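
For downstream use (subtitles, search, alignment), the timestamped output can be converted into structured records. This is a minimal sketch that assumes the layout shown above: a speaker label on its own line ending in a colon, followed by lines of the form `[MM:SS] text [MM:SS]`. The name `parse_timestamped` is hypothetical.

```python
import re

# One segment line: start timestamp, text, end timestamp.
SEGMENT = re.compile(r"^\[(\d+):(\d{2})\]\s*(.*?)\s*\[(\d+):(\d{2})\]$")


def parse_timestamped(transcript):
    """Return a list of (speaker, start_seconds, end_seconds, text) tuples."""
    segments = []
    speaker = None
    for raw in transcript.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = SEGMENT.match(line)
        if match:
            m1, s1, text, m2, s2 = match.groups()
            start = int(m1) * 60 + int(s1)
            end = int(m2) * 60 + int(s2)
            segments.append((speaker, start, end, text))
        elif line.endswith(":"):
            # A line such as "Alice:" sets the speaker for following segments.
            speaker = line[:-1]
    return segments


sample = """Alice:
[00:00] Hello. [00:00]
Bob:
[00:01] Hi, is this Alice? [00:02]
"""
segments = parse_timestamped(sample)
for segment in segments:
    print(segment)
```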

You can also specify how frequently timestamps are inserted.

Prompt#
Transcribe the audio with timestamps.
Insert timestamps every 50 characters on average.
Speaker:
[01:25] No, it’s very hard on your horse, and they [01:27]
[01:27] recommend starting with duration first. So this [01:30]
[01:30] long, slow distance is you build up over the 3 [01:33]
[01:33] to 12 months to 45 to 60 minutes depending on [01:36]
[01:36] your discipline of walk, trot, canter. Your horse [01:40]
[01:40] should be able to do that and not be winded. [01:42]
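
Because the interval is requested "on average", individual segments vary in length. One way to check how closely an output follows the request is to measure the text between each pair of timestamps. The sketch below assumes the `[MM:SS] text [MM:SS]` line layout shown above; `average_segment_length` is a hypothetical helper name.

```python
import re

# Capture only the text between the start and end timestamps.
SEGMENT = re.compile(r"^\[\d+:\d{2}\]\s*(.*?)\s*\[\d+:\d{2}\]$")


def average_segment_length(transcript):
    """Return the mean number of characters per timestamped segment."""
    lengths = []
    for line in transcript.splitlines():
        match = SEGMENT.match(line.strip())
        if match:
            lengths.append(len(match.group(1)))
    # Avoid dividing by zero when no segment lines are present.
    return sum(lengths) / len(lengths) if lengths else 0.0


sample = """Speaker:
[01:25] No, it's very hard on your horse, and they [01:27]
[01:27] recommend starting with duration first. So this [01:30]
"""
average = average_segment_length(sample)
print(round(average, 1))
```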

Verbatim#

Verbatim transcription is a word-for-word transcription of spoken language. This means that every single word, including fillers, pauses and false starts, is transcribed exactly as it was heard.

Prompt#
Create a verbatim transcript of the audio.
Alice: He-Hello.
Bob: Hi, um, is this Alice?
Alice: It is, yeah.
Bob: Alice, how-how is it going? My name, um, is Bob.

Clean verbatim#

Clean verbatim transcription removes filler words, stammers and interjections from other speakers (e.g. “mm-hmm”, “um”).

Verbatim audio transcription:
Alice: He-Hello.
Bob: Hi, um, is this Alice?
Alice: It is, yeah.
Bob: Alice, how-how is it going? My name, um, is Bob.

Prompt#
Transcribe the audio in clean verbatim form.
Alice: Hello.
Bob: Hi, is this Alice?
Alice: It is, yeah.
Bob: Alice, how is it going? My name is Bob.
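
To see what clean verbatim dropped from a given line, the two transcripts can be compared token by token. The sketch below uses Python's standard `difflib` for the comparison; it only inspects two finished transcripts and says nothing about how Omnio produces them. `removed_tokens` is a hypothetical helper name.

```python
from difflib import SequenceMatcher


def removed_tokens(verbatim, clean):
    """Return the whitespace-separated tokens present only in `verbatim`."""
    a = verbatim.split()
    b = clean.split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    removed = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # "delete" and "replace" both mean a[i1:i2] does not survive in b.
        if tag in ("delete", "replace"):
            removed.extend(a[i1:i2])
    return removed


dropped = removed_tokens(
    "Bob: Hi, um, is this Alice?",
    "Bob: Hi, is this Alice?",
)
print(dropped)
```

Tokens keep their punctuation (for example `um,`), since the comparison splits on whitespace only.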

Profanity#

Omnio can mask, remove or tag profanity.

Prompt#
Transcribe the audio with profanity masked.

Person: I’m done with this s***, you f****** a******.

Prompt#
Transcribe the audio with profanity removed.

Person: I’m done with this [profanity removed], you [profanity removed] [profanity removed].

Prompt#
Transcribe the audio with profanity tagged.

Person: I’m done with this [profanity:shit], you [profanity:fucking] [profanity:asshole].
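
Tagged output keeps the original word, so the display form can be chosen later. As an illustration, this sketch converts tagged output into a masked form that keeps the first letter and replaces the rest with asterisks. It assumes the `[profanity:word]` tag layout shown above; `mask_tagged_profanity` is a hypothetical helper name, and the masking rule is inferred from the sample output rather than documented behavior.

```python
import re

# Assumption: tags look like "[profanity:word]" with no nested brackets.
PROFANITY_TAG = re.compile(r"\[profanity:([^\]]+)\]")


def mask_tagged_profanity(text):
    """Replace each profanity tag with its first letter plus asterisks."""

    def mask(match):
        word = match.group(1)
        return word[0] + "*" * (len(word) - 1)

    return PROFANITY_TAG.sub(mask, text)


tagged = "I'm done with this [profanity:shit], you [profanity:fucking] [profanity:asshole]."
masked = mask_tagged_profanity(tagged)
print(masked)
```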

Personally identifiable information#

Omnio can remove or tag personally identifiable information (PII), such as names, addresses, dates of birth, and phone numbers.

Prompt#
Transcribe the audio with personal information removed.
Interviewer: Please state your full name and address.
Interviewee: My name is [name and surname removed]. I currently live on [address removed].

Prompt#
Transcribe the audio with personal information tagged.
Interviewer: Please state your full name and address.
Interviewee: My name is [pii:Jane Doe]. I currently live on [pii:North 15th Street].
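
Tagged output makes it possible to review the detected items before deciding what to redact. The following sketch collects the tagged values and produces a redacted copy of the text. It assumes the `[pii:value]` tag layout shown above; `redact_pii` and the `[redacted]` placeholder are hypothetical choices, not Omnio output.

```python
import re

# Assumption: tags look like "[pii:value]" with no nested brackets.
PII_TAG = re.compile(r"\[pii:([^\]]+)\]")


def redact_pii(text, placeholder="[redacted]"):
    """Return (redacted_text, list_of_tagged_values)."""
    values = PII_TAG.findall(text)
    redacted = PII_TAG.sub(placeholder, text)
    return redacted, values


redacted, values = redact_pii(
    "My name is [pii:Jane Doe]. I currently live on [pii:North 15th Street]."
)
print(redacted)
print(values)
```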