Profanity Filter

The profanity filter detects and censors profane words and phrases as audio is transcribed. When a profane word or phrase is detected, every letter except the first is masked (e.g. f***). The original words can still be retrieved from the Word.orig_text field.
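To make the masking rule concrete, here is a minimal sketch of it in plain Python. The function name mask_word is hypothetical and is not part of the Soniox library; the service applies the masking itself, so this only illustrates the shape of the result (first letter kept, one asterisk per remaining letter).

```python
def mask_word(word: str) -> str:
    """Keep the first character and replace every other character with '*'.

    Hypothetical helper for illustration only; not a Soniox API.
    """
    if len(word) <= 1:
        # Nothing to mask in an empty or single-character string.
        return word
    return word[0] + "*" * (len(word) - 1)


print(mask_word("darn"))  # d***
```

Note that the masked word keeps the length of the original, which is why the masked words in a transcript can differ in length.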

The profanity filter is disabled by default. To enable it, set enable_profanity_filter in TranscriptionConfig to true. It supports both real-time and asynchronous use cases.

Example

In this example, we transcribe a short audio file with the profanity filter enabled.

profanity_filter.py

from soniox.transcribe_file import transcribe_file_short
from soniox.speech_service import SpeechClient, set_api_key


# Do not forget to set your API key in the SONIOX_API_KEY environment variable.
def main():
    with SpeechClient() as client:
        result = transcribe_file_short(
            "../test_data/test_audio_profanity.mp3",
            client,
            enable_profanity_filter=True,
        )
        print("Words: " + " ".join(w.text for w in result.words))


if __name__ == "__main__":
    main()

Run

python3 profanity_filter.py

Output

This is f****** great . No b******* whatsoever
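
The output above shows only the masked text. As a rough sketch of how the unmasked words could be read back through the orig_text field, the snippet below uses a stand-in Word dataclass rather than the real library type, so that it runs without an API key. The stand-in class, the original_text helper, and the sample words are all assumptions made for illustration. In particular, it is an assumption that orig_text may be empty for words that were not masked, which is why the helper falls back to text.

```python
from dataclasses import dataclass


@dataclass
class Word:
    # Stand-in for the library's Word type, reduced to the two fields used here.
    text: str            # text as returned, possibly masked
    orig_text: str = ""  # original text (assumed to be empty when unavailable)


def original_text(word: Word) -> str:
    """Return the unmasked text, falling back to the returned text."""
    return word.orig_text or word.text


# Made-up sample data, not real service output.
words = [
    Word("This", "This"),
    Word("is", "is"),
    Word("d***", "darn"),
    Word("great", "great"),
]

print("Masked:   " + " ".join(w.text for w in words))
print("Original: " + " ".join(original_text(w) for w in words))
```

With a real result, the same pattern would apply to result.words in place of the sample list.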