Twilio
Stream Twilio call audio to the Soniox Speech-to-Text API and get real-time transcriptions.
Overview
This guide demonstrates how to stream live Twilio call audio to the Soniox Speech-to-Text API and receive real-time transcription via WebSockets. If you want to see a complete example, check out this GitHub repository:
Preparation
Create a Twilio account
To get started, you'll need a Twilio account. If you don't have one, you can sign up for a free trial.
You will also need two phone numbers to test the integration:
- A phone number you own, which must be verified by Twilio.
- A Twilio-owned phone number that you can use for testing.
Get your Soniox API key
To use the Soniox Speech-to-Text API in your application, you'll need an API key. You can get one by signing up at the Soniox Console. No credit card is required to sign up, and you can try the service for free.
Running the example
Clone the repository
Clone the repository and install the dependencies:
Configure server environment
Copy the .env.example file to .env and update the values with your Twilio account credentials and Soniox API key:
Run the server and expose it to Twilio
Run the server:
This starts the server, which listens for incoming Twilio calls. You will specify where the call audio is streamed in a later step. To expose the server to Twilio, you can use ngrok.
Note the forwarding URL that ngrok provides. It should look like https://<your-ngrok-subdomain>.ngrok.io or https://<your-ngrok-subdomain>.ngrok-free.app.
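The client and Twilio steps that follow need this forwarding URL in WebSocket form (wss://...) with a path such as /client or /twilio appended. As a minimal sketch, the conversion could look like the following; the helper name to_websocket_url is hypothetical and is not part of the example repository.

```python
from urllib.parse import urlsplit, urlunsplit


def to_websocket_url(forwarding_url: str, path: str) -> str:
    """Turn an ngrok forwarding URL into a WebSocket URL ending in `path`.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(forwarding_url)
    # https maps to wss (TLS), http maps to ws.
    scheme = {"https": "wss", "http": "ws"}.get(parts.scheme)
    if scheme is None or not parts.netloc:
        raise ValueError(f"Not an http(s) URL: {forwarding_url!r}")
    # Normalise the path so it always starts with exactly one slash.
    return urlunsplit((scheme, parts.netloc, "/" + path.lstrip("/"), "", ""))


print(to_websocket_url("https://xxxxx.ngrok.io", "/client"))
print(to_websocket_url("https://xxxxx.ngrok-free.app", "twilio"))
```

Doing the conversion in one place avoids the common mistake of pasting the https:// URL where a wss:// URL is expected.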
Run the client
Edit client.html and set WEBSOCKET_URL to your ngrok URL with /client at the end, e.g. wss://xxxxx.ngrok.io/client.
Open client.html in your browser to view live call transcriptions.
Start a Twilio call
You can configure Twilio calls with TwiML Bin files. More information about streaming can be found in the Twilio documentation.
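For orientation, Twilio Media Streams deliver call audio to the WebSocket server as JSON text messages. Messages with the event "media" carry a base64-encoded chunk of 8 kHz mu-law audio in media.payload, while "connected", "start", and "stop" messages carry no audio. The sketch below shows only this decoding step; the function name extract_audio is hypothetical, and forwarding the bytes to Soniox is deliberately left out.

```python
import base64
import json
from typing import Optional


def extract_audio(message: str) -> Optional[bytes]:
    """Return the raw mu-law bytes from a Twilio "media" message, else None.

    Hypothetical helper for illustration; not taken from the example repository.
    """
    data = json.loads(message)
    if data.get("event") != "media":
        # "connected", "start", and "stop" messages carry no audio payload.
        return None
    return base64.b64decode(data["media"]["payload"])


# A hand-made message in the shape Twilio sends (the payload is dummy data).
sample = json.dumps(
    {"event": "media", "media": {"payload": base64.b64encode(b"\xff\x7f").decode()}}
)
print(extract_audio(sample))
print(extract_audio(json.dumps({"event": "start"})))
```

A server built this way would call such a function on every incoming WebSocket message and pass any non-None result on to the transcription connection.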
Here is an example TwiML Bin file that calls your phone number and streams the audio to your WebSocket server:
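As a rough sketch, a TwiML document of this kind can be generated with Python's standard library. This is not the repository's own file: the function name build_twiml, the placeholder URL, and the pause length are assumptions. The Start, Stream, Say, and Pause elements are standard TwiML verbs, and the spoken messages are the ones this guide says you should hear.

```python
import xml.etree.ElementTree as ET


def build_twiml(websocket_url: str) -> str:
    """Return TwiML that streams call audio and speaks two messages.

    Illustrative sketch only; the element layout is an assumption.
    """
    response = ET.Element("Response")
    # <Start><Stream> forks the call audio to the WebSocket URL while the
    # verbs that follow continue to run.
    start = ET.SubElement(response, "Start")
    ET.SubElement(start, "Stream", {"url": websocket_url})
    ET.SubElement(response, "Say").text = "Hello, this is a test call. How are you?"
    # Assumed pause so the callee has time to answer; the length is arbitrary.
    ET.SubElement(response, "Pause", {"length": "10"})
    ET.SubElement(response, "Say").text = "Thank you, bye!"
    return ET.tostring(response, encoding="unicode")


print(build_twiml("wss://xxxxx.ngrok.io/twilio"))
```

Generating the XML programmatically keeps the WebSocket URL in one variable, which helps when the ngrok subdomain changes between runs.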
To begin, we recommend using the provided call_me.py script to start a Twilio call. Set the following environment variables:
- TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN (from Twilio)
- TWILIO_PHONE_NUMBER (your Twilio number, rented on Twilio)
- WEBSOCKET_URL: your ngrok URL with /twilio at the end, e.g. wss://xxxxx.ngrok.io/twilio
- USER_PHONE_NUMBER: your Twilio-verified phone number
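Before placing a call, it can help to confirm that all of these variables are set. The following is a minimal sketch of such a check, assuming nothing about how call_me.py itself reads its configuration; the function name load_call_settings is hypothetical.

```python
import os
from typing import Mapping

# The variable names listed in this guide.
REQUIRED_VARS = (
    "TWILIO_ACCOUNT_SID",
    "TWILIO_AUTH_TOKEN",
    "TWILIO_PHONE_NUMBER",
    "WEBSOCKET_URL",
    "USER_PHONE_NUMBER",
)


def load_call_settings(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the required settings, or raise listing every missing variable.

    Hypothetical helper for illustration; not part of the example repository.
    """
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


# Example with an explicit mapping instead of the real environment.
try:
    load_call_settings({"TWILIO_ACCOUNT_SID": "ACxxxx"})
except RuntimeError as error:
    print(error)
```

Reporting all missing names at once saves repeated runs when several variables were forgotten.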
You should hear a voice message saying "Hello, this is a test call. How are you?" and then a message saying "Thank you, bye!". Simultaneously, you should see the transcription in the browser.