Custom Content Moderation
Custom content moderation lets you define words or phrases to be masked in the transcript, either in real time with low latency or asynchronously.
The defined words and phrases will be masked except for the first letter of each word (e.g. "I h*** w**** g***.").
The original words can still be retrieved using the Word.orig_text field.
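The masking rule described above can be illustrated locally. The following is a minimal sketch of that rule only, not the service's implementation; the function name mask_phrase and its mask_char parameter are hypothetical, and punctuation handling is left out.

```python
def mask_phrase(phrase: str, mask_char: str = "*") -> str:
    """Keep the first character of each word and mask the remaining ones."""
    return " ".join(
        word[0] + mask_char * (len(word) - 1) for word in phrase.split()
    )


print(mask_phrase("two years"))  # t** y****
print(mask_phrase("homesick"))   # h*******
```

The number of mask characters matches the number of hidden letters, so the masked word keeps its original length.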
Example
In the example below, we define custom words and phrases to be moderated when transcribing a short file.
from soniox.transcribe_file import transcribe_file_short
from soniox.speech_service import SpeechClient, set_api_key

set_api_key("<YOUR-API-KEY>")


def main():
    with SpeechClient() as client:
        result = transcribe_file_short(
            "../test_data/test_audio.flac",
            client,
            content_moderation_phrases=["two years", "homesick"],
        )
        print("Words: " + " ".join(w.text for w in result.words))


if __name__ == "__main__":
    main()
Run
python3 content_moderation_phrases.py
Output
Words: He was t** y**** out from the east and had not yet forgotten to be h******* at times
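To see which words were masked and what they originally said, one could compare Word.text with Word.orig_text for each word. The sketch below uses stand-in objects instead of a live API response, and the helper name moderated_words is hypothetical. It also assumes that orig_text may be empty or identical to text for words that were not moderated, which this page does not confirm.

```python
from types import SimpleNamespace


def moderated_words(words):
    """Return (masked, original) pairs for words whose text was masked."""
    pairs = []
    for w in words:
        # Assumption: orig_text may be empty or equal to text when the word
        # was not moderated, so fall back to the visible text in that case.
        original = getattr(w, "orig_text", "") or w.text
        if original != w.text:
            pairs.append((w.text, original))
    return pairs


# Stand-in objects mimicking result.words; not a real API response.
sample_words = [
    SimpleNamespace(text="be", orig_text=""),
    SimpleNamespace(text="h*******", orig_text="homesick"),
    SimpleNamespace(text="at", orig_text="at"),
]

print(moderated_words(sample_words))  # [('h*******', 'homesick')]
```

With a real transcription result, result.words would be passed in place of sample_words.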
const { SpeechClient } = require("@soniox/soniox-node");

// Do not forget to set your Soniox API key.
const speechClient = new SpeechClient();

(async function () {
  const result = await speechClient.transcribeFileShort(
    "../test_data/test_audio.flac",
    {
      content_moderation_phrases: ["two years", "homesick"],
    }
  );

  console.log(`Words: ${result.words.map((word) => word.text).join(" ")}`);
})();
Run
node content_moderation_phrases.js
Output
Words: He was t** y**** out from the east and had not yet forgotten to be h******* at times