Profanity Filter
The profanity filter detects and censors profane words and phrases as audio is being transcribed. When a profane word or
phrase is detected, all letters except the first are masked (e.g. f***). The original words can still be retrieved using
the Word.orig_text field.
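The masking rule described above can be illustrated with a short standalone sketch. This is not Soniox's implementation: the helper names `mask_word` and `mask_listed_words` and the tiny blocklist are hypothetical, and the real filter works on transcribed audio rather than on a fixed word list.

```python
def mask_word(word: str) -> str:
    """Keep the first character and replace every later character with '*'."""
    if not word:
        return word
    return word[0] + "*" * (len(word) - 1)


def mask_listed_words(text: str, blocklist: set) -> str:
    """Mask whitespace-separated tokens whose lowercase form is in blocklist.

    Hypothetical helper for illustration only; the blocklist stands in for
    whatever detection the service actually performs.
    """
    return " ".join(
        mask_word(token) if token.lower() in blocklist else token
        for token in text.split()
    )


print(mask_listed_words("That was a darn good talk", {"darn"}))
# prints: That was a d*** good talk
```

Note that the masked form keeps the length of the original word, which matches the f*** example above.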
The profanity filter can be enabled by setting the enable_profanity_filter field of TranscriptionConfig to true.
By default, the profanity filter is disabled. It supports both real-time and asynchronous use cases.
Example
In this example, we transcribe a short audio file with the profanity filter enabled.
from soniox.transcribe_file import transcribe_file_short
from soniox.speech_service import SpeechClient, set_api_key

# Do not forget to set your API key in the SONIOX_API_KEY environment variable.

def main():
    with SpeechClient() as client:
        result = transcribe_file_short(
            "../test_data/test_audio_profanity.mp3",
            client,
            enable_profanity_filter=True,
        )
        print("Words: " + " ".join(w.text for w in result.words))

if __name__ == "__main__":
    main()
Run
python3 profanity_filter.py
Output
Words: This is f****** great . No b******* whatsoever
const { SpeechClient } = require("@soniox/soniox-node");

// Do not forget to set your Soniox API key.
const speechClient = new SpeechClient();

(async function () {
  const result = await speechClient.transcribeFileShort(
    "../test_data/test_audio_profanity.mp3",
    {
      enable_profanity_filter: true,
    }
  );
  console.log(`Words: ${result.words.map((word) => word.text).join(" ")}`);
})();
Run
node profanity_filter.js
Output
Words: This is f****** great . No b******* whatsoever
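As a follow-up to the Python example, the unmasked transcript could be rebuilt from the Word.orig_text field. The sketch below is a minimal illustration under stated assumptions: `unmasked_text` is a hypothetical helper, the `SimpleNamespace` objects are stand-ins for the words in `result.words` so the snippet runs without an API call, and it assumes orig_text may be empty or missing for words that were not masked (the documentation above does not say what it holds in that case), so it falls back to text.

```python
from types import SimpleNamespace


def unmasked_text(words) -> str:
    """Join words, preferring orig_text over the masked text when it is set."""
    # Assumption: orig_text may be empty or absent for unmasked words,
    # in which case the regular text field is used instead.
    return " ".join(getattr(w, "orig_text", "") or w.text for w in words)


# Stand-in word objects; in real use these would come from result.words.
sample_words = [
    SimpleNamespace(text="No", orig_text=""),
    SimpleNamespace(text="d***", orig_text="darn"),
    SimpleNamespace(text="way", orig_text=""),
]

print("Masked:   " + " ".join(w.text for w in sample_words))
print("Original: " + unmasked_text(sample_words))
```

With the transcription result from the example, the same helper would be called as `unmasked_text(result.words)`.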