Twilio

Stream Twilio call audio to the Soniox Speech-to-Text API and get real-time transcriptions.

Overview

This guide demonstrates how to stream live Twilio call audio to the Soniox Speech-to-Text API and receive real-time transcriptions over WebSockets. For a complete example, see the GitHub repository: https://github.com/soniox/soniox-twilio-realtime-transcription

Preparation

Create a Twilio account

To get started, you'll need a Twilio account. If you don't have one, you can sign up for a free trial.

You will also need two phone numbers to test the integration:

  • A phone number you own, which must be verified by Twilio.
  • A Twilio-owned phone number that you can use for testing.

Get your Soniox API key

To use the Soniox Speech-to-Text API in your application, you'll need an API key. You can get one by signing up at the Soniox Console. Signing up requires no credit card, and you can try the service for free.

Running the example

Clone the repository

Clone the repository and install the dependencies:

git clone https://github.com/soniox/soniox-twilio-realtime-transcription.git
cd soniox-twilio-realtime-transcription
pip install -r requirements.txt

Configure server environment

Copy the .env.example file to .env and update the values with your Twilio account credentials and Soniox API key:

cp .env.example .env
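
As a rough illustration of what the server needs from this file, the sketch below parses a .env-style file with the standard library and reports which settings still lack a value. The Twilio variable names are the ones used in this guide; SONIOX_API_KEY is an assumed name, so check .env.example for the names the repository actually expects.

```python
from pathlib import Path

# Twilio names come from this guide; SONIOX_API_KEY is an assumed name.
REQUIRED_KEYS = (
    "TWILIO_ACCOUNT_SID",
    "TWILIO_AUTH_TOKEN",
    "TWILIO_PHONE_NUMBER",
    "SONIOX_API_KEY",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


def check_env_file(path: str = ".env") -> list[str]:
    """Read a .env file and list the required keys that still need values."""
    return missing_keys(parse_env(Path(path).read_text()))
```

Calling check_env_file() from the repository directory returns an empty list once every required value has been filled in.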

Run the server and expose it to Twilio

Run the server:

python server.py

This starts the server, which listens for incoming Twilio calls. You will tell Twilio where to stream the call audio in a later step. To expose the server to Twilio, you can use ngrok:

ngrok http 5000

Note the forwarding URL that ngrok provides. It should look like https://<your-ngrok-subdomain>.ngrok.io or https://<your-ngrok-subdomain>.ngrok-free.app.

Run the client

Edit client.html and set WEBSOCKET_URL to your ngrok URL with /client at the end, e.g. wss://xxxxx.ngrok.io/client.

Open client.html in your browser to view live call transcriptions.
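
The WebSocket URLs used in this guide (/client for the browser page and /twilio for the call audio) differ from the ngrok forwarding URL only in scheme and path. A small sketch of that conversion, using only the standard library; the helper name to_websocket_url is illustrative and not part of the repository:

```python
from urllib.parse import urlsplit, urlunsplit


def to_websocket_url(forwarding_url: str, path: str) -> str:
    """Turn an ngrok http(s) forwarding URL into a ws(s) URL ending in `path`."""
    parts = urlsplit(forwarding_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an http(s) URL: {forwarding_url!r}")
    # https maps to the secure WebSocket scheme, plain http to ws.
    scheme = "wss" if parts.scheme == "https" else "ws"
    # Replace whatever path the forwarding URL had with the endpoint path.
    return urlunsplit((scheme, parts.netloc, "/" + path.strip("/"), "", ""))
```

For example, to_websocket_url("https://xxxxx.ngrok.io", "/client") yields the value to paste into client.html.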

Start a Twilio call

You can configure Twilio calls with TwiML Bin files. More information about streaming can be found in the Twilio documentation.

Here is an example TwiML Bin file that calls your phone number and streams the audio to your WebSocket server:

<Response>
    <Start>
        <Stream url="WEBSOCKET_URL" track="both_tracks" />
    </Start>
    <Dial>USER_PHONE_NUMBER</Dial>
    <Say voice="woman" language="en">"Hello, this is a test call. How are you?"</Say>
    <Pause length="5"/>
    <Say voice="woman" language="en">"Thank you, bye!"</Say>
</Response>
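
If you would rather generate this document than edit the placeholders by hand, the following sketch builds the same TwiML with Python's standard xml.etree.ElementTree, which also escapes any special characters in the values. The function name build_twiml is hypothetical and is not taken from the repository.

```python
import xml.etree.ElementTree as ET


def build_twiml(websocket_url: str, user_phone_number: str) -> str:
    """Build the example TwiML with the two placeholders filled in."""
    response = ET.Element("Response")

    # <Start><Stream/></Start> begins streaming both call tracks.
    start = ET.SubElement(response, "Start")
    ET.SubElement(start, "Stream", {"url": websocket_url, "track": "both_tracks"})

    ET.SubElement(response, "Dial").text = user_phone_number

    # Spoken text copied verbatim from the example above.
    say = ET.SubElement(response, "Say", {"voice": "woman", "language": "en"})
    say.text = '"Hello, this is a test call. How are you?"'
    ET.SubElement(response, "Pause", {"length": "5"})
    bye = ET.SubElement(response, "Say", {"voice": "woman", "language": "en"})
    bye.text = '"Thank you, bye!"'

    return ET.tostring(response, encoding="unicode")
```

Calling build_twiml("wss://xxxxx.ngrok.io/twilio", "+15550100") returns a single-line XML string with the same elements, in the same order, as the example.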

To begin, we recommend using the provided call_me.py script to start a Twilio call. Set the following environment variables:

  • TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN (from Twilio)
  • TWILIO_PHONE_NUMBER (your Twilio number, rented on Twilio)
  • WEBSOCKET_URL (your ngrok URL with /twilio at the end, e.g. wss://xxxxx.ngrok.io/twilio)
  • USER_PHONE_NUMBER (your Twilio-verified phone number)

Then run the script:

python call_me.py
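
The repository's call_me.py remains the reference for how the call is placed. As a hedged sketch of the general idea, the snippet below gathers the variables listed above and passes them to calls.create from the Twilio Python helper library, which accepts inline TwiML. The names call_arguments and place_call are illustrative and not taken from the repository, and the twiml argument stands for a TwiML document such as the example earlier in this section.

```python
import os
from typing import Mapping


def call_arguments(env: Mapping[str, str], twiml: str) -> dict[str, str]:
    """Collect the keyword arguments for Twilio's calls.create."""
    return {
        "to": env["USER_PHONE_NUMBER"],       # your Twilio-verified number
        "from_": env["TWILIO_PHONE_NUMBER"],  # the number rented on Twilio
        "twiml": twiml,                       # instructions for the call
    }


def place_call(twiml: str, env: Mapping[str, str] = os.environ) -> str:
    """Start an outbound call and return its SID (needs network access)."""
    # Imported lazily so the helper above works without the package.
    from twilio.rest import Client  # pip install twilio

    client = Client(env["TWILIO_ACCOUNT_SID"], env["TWILIO_AUTH_TOKEN"])
    call = client.calls.create(**call_arguments(env, twiml))
    return call.sid
```

A missing environment variable raises KeyError before any request is sent, which makes a misconfigured setup easy to spot.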

You should hear a voice message saying "Hello, this is a test call. How are you?" and then a message saying "Thank you, bye!". Simultaneously, you should see the transcription in the browser.
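
For orientation, Twilio delivers the streamed audio to the /twilio endpoint as JSON text messages over the WebSocket. According to Twilio's Media Streams documentation, each media message carries a base64-encoded chunk of 8 kHz mu-law audio together with a track label (inbound or outbound when both tracks are streamed). The sketch below decodes such messages with the standard library only; the function name decode_media_message is illustrative, and forwarding the decoded bytes to Soniox is left to the repository's server.py.

```python
import base64
import json
from typing import Optional, Tuple


def decode_media_message(raw: str) -> Optional[Tuple[str, bytes]]:
    """Return (track, audio_bytes) for a media message, otherwise None."""
    message = json.loads(raw)
    if message.get("event") != "media":
        # Other events such as "connected", "start", and "stop" carry no audio.
        return None
    media = message["media"]
    return media["track"], base64.b64decode(media["payload"])
```

Keeping the track label alongside the audio lets a server tell the two sides of the call apart when both tracks are streamed.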
