Custom Content Moderation

Custom content moderation lets you define words and phrases to be masked in transcripts, in both real-time (low-latency) and asynchronous transcription. Each defined word or phrase is masked except for the first letter of each word (e.g. "I h*** w**** g***."). The original words remain available through the Word.orig_text field.
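The masking rule described above can be illustrated locally. The following is a minimal sketch, not the service's implementation: the helper names `mask_phrase` and `moderate` are hypothetical, and matching whole phrases case-insensitively is an assumption made for illustration.

```python
import re


def mask_phrase(phrase: str) -> str:
    """Keep the first character of each word and replace the rest with asterisks."""
    return " ".join(word[0] + "*" * (len(word) - 1) for word in phrase.split())


def moderate(text: str, phrases: list[str]) -> str:
    """Mask every whole-phrase occurrence of the given phrases in text.

    Assumption: matching is case-insensitive and respects word boundaries.
    """
    for phrase in phrases:
        pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", flags=re.IGNORECASE)
        # A function replacement masks the text exactly as it was matched.
        text = pattern.sub(lambda m: mask_phrase(m.group(0)), text)
    return text


if __name__ == "__main__":
    sentence = (
        "He was two years out from the east and had not yet "
        "forgotten to be homesick at times"
    )
    print(moderate(sentence, ["two years", "homesick"]))
```

Each masked word keeps its original length, so "homesick" (8 letters) becomes "h" followed by seven asterisks.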

Example

In the example below, we define custom words and phrases to moderate when transcribing a short file.

content_moderation_phrases.py

from soniox.transcribe_file import transcribe_file_short
from soniox.speech_service import SpeechClient, set_api_key

set_api_key("<YOUR-API-KEY>")


def main():
    with SpeechClient() as client:
        result = transcribe_file_short(
            "../test_data/test_audio.flac",
            client,
            content_moderation_phrases=["two years", "homesick"],
        )
        print("Words: " + " ".join(w.text for w in result.words))


if __name__ == "__main__":
    main()

Run

python3 content_moderation_phrases.py

Output

Words: He was t** y**** out from the east and had not yet forgotten to be h******* at times
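To recover the unmasked transcript, the Word.orig_text field mentioned earlier can be read in place of Word.text. The sketch below uses a stand-in `Word` dataclass rather than the SDK's own type, and the helper name `original_transcript` is hypothetical. It is an assumption that orig_text may be empty for words that were not masked, so the code falls back to text in that case.

```python
from dataclasses import dataclass


@dataclass
class Word:
    # Stand-in for the SDK's Word type; only the two fields used here.
    text: str            # possibly masked text, e.g. "t**"
    orig_text: str = ""  # original text (assumed to be possibly empty when unmasked)


def original_transcript(words) -> str:
    """Join the original words, falling back to text when orig_text is empty."""
    return " ".join(w.orig_text or w.text for w in words)


if __name__ == "__main__":
    words = [
        Word("He"),
        Word("was"),
        Word("t**", "two"),
        Word("y****", "years"),
        Word("out"),
    ]
    print("Masked:   " + " ".join(w.text for w in words))
    print("Original: " + original_transcript(words))
```

With a real result, the same helper could be applied to result.words, provided the SDK's Word objects expose both fields as described.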

content_moderation_phrases.js

const { SpeechClient } = require("@soniox/soniox-node");

// Do not forget to set your Soniox API key.
const speechClient = new SpeechClient();

(async function () {
    const result = await speechClient.transcribeFileShort(
        "../test_data/test_audio.flac",
        {
            content_moderation_phrases: ["two years", "homesick"],
        }
    );

    console.log(`Words: ${result.words.map((word) => word.text).join(" ")}`);
})();

Run

node content_moderation_phrases.js

Output

Words: He was t** y**** out from the east and had not yet forgotten to be h******* at times
