Twilio
Stream Twilio call audio to the Soniox Speech-to-Text API and get real-time transcriptions.
Overview
This guide demonstrates how to stream live Twilio call audio to the Soniox Speech-to-Text API and receive real-time transcription via WebSockets. If you want to see a complete example, check out this GitHub repository:
Preparation
Create a Twilio account
To get started, you'll need a Twilio account. If you don't have one, you can sign up for a free trial.
You will also need two phone numbers to test the integration:
- One is a phone number you own, which needs to be verified by Twilio.
- The other is a Twilio-owned phone number that you can use for testing.
Get your Soniox API key
To use the Soniox Speech-to-Text API in your application, you'll need an API key. You can get one by signing up at the Soniox Console. No credit card is required to sign up, and you can try out the service for free.
Running the example
Clone the repository
Clone the repository and install the dependencies:
Configure server environment
Copy the .env.example file to .env and update the values with your Twilio account credentials and Soniox API key:
Run the server and expose it to Twilio
Run the server:
This starts the server, which then listens for incoming Twilio calls. You will specify later where the call audio is streamed. To expose the server to Twilio, you can use ngrok.
Note the forwarding URL that ngrok provides. It should look like https://<your-ngrok-subdomain>.ngrok.io or https://<your-ngrok-subdomain>.ngrok-free.app.
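The forwarding URL is an https:// address, while the client and Twilio settings in this guide expect wss:// addresses ending in /client and /twilio. As a minimal sketch of that conversion (the helper name `websocket_url` is hypothetical and not part of the example repository):

```python
from urllib.parse import urlsplit, urlunsplit


def websocket_url(forwarding_url: str, path: str) -> str:
    """Turn an ngrok forwarding URL into a WebSocket URL ending in `path`."""
    parts = urlsplit(forwarding_url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"expected an http(s) URL, got {forwarding_url!r}")
    # https maps to the secure WebSocket scheme, http to the plain one.
    scheme = "wss" if parts.scheme == "https" else "ws"
    # Keep only the host; replace whatever path the input had with `path`.
    return urlunsplit((scheme, parts.netloc, "/" + path.strip("/"), "", ""))


if __name__ == "__main__":
    base = "https://xxxxx.ngrok.io"
    print(websocket_url(base, "/client"))  # wss://xxxxx.ngrok.io/client
    print(websocket_url(base, "/twilio"))  # wss://xxxxx.ngrok.io/twilio
```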
Run the client
Edit client.html and set WEBSOCKET_URL to your ngrok URL with /client at the end, e.g. wss://xxxxx.ngrok.io/client.
Open client.html in your browser to view live call transcriptions.
Start a Twilio call
You can configure Twilio calls with TwiML Bin files. More information about streaming can be found in the Twilio documentation.
Here is an example TwiML Bin file that calls your phone number and streams the audio to your WebSocket server:
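The repository's own TwiML is not reproduced here. As a rough sketch of what such a document could look like, the Python snippet below builds one with the standard library, using Twilio's `<Start>`, `<Stream>`, `<Say>`, and `<Pause>` verbs. The two spoken messages are the ones this guide mentions; the function name `build_twiml` and the 10-second pause are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def build_twiml(websocket_url: str, pause_seconds: int = 10) -> str:
    """Build a TwiML document that streams call audio and speaks two messages."""
    response = ET.Element("Response")
    # <Start><Stream> forks the call audio to the WebSocket server
    # while the remaining verbs keep executing.
    start = ET.SubElement(response, "Start")
    ET.SubElement(start, "Stream", url=websocket_url)
    ET.SubElement(response, "Say").text = "Hello, this is a test call. How are you?"
    # Assumed pause that leaves the callee time to answer the question.
    ET.SubElement(response, "Pause", length=str(pause_seconds))
    ET.SubElement(response, "Say").text = "Thank you, bye!"
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(build_twiml("wss://xxxxx.ngrok.io/twilio"))
```

The resulting string can be pasted into a TwiML Bin or passed directly when creating a call.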
To get started, we recommend using the provided call_me.py script to place a Twilio call. Set the following environment variables:
- TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN (from Twilio)
- TWILIO_PHONE_NUMBER (your Twilio number, rented on Twilio)
- WEBSOCKET_URL with your ngrok URL with /twilio at the end, e.g. wss://xxxxx.ngrok.io/twilio
- USER_PHONE_NUMBER with your Twilio-verified phone number
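For orientation, here is a sketch of how a script could read these variables and place the call. It is not the actual call_me.py from the repository: `load_call_config` and `start_call` are hypothetical names, and the outbound call assumes the official `twilio` Python package and its `calls.create` method.

```python
import os

REQUIRED_VARS = (
    "TWILIO_ACCOUNT_SID",
    "TWILIO_AUTH_TOKEN",
    "TWILIO_PHONE_NUMBER",
    "WEBSOCKET_URL",
    "USER_PHONE_NUMBER",
)


def load_call_config(env=os.environ) -> dict:
    """Collect the required settings, failing early if any are missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    config = {name: env[name] for name in REQUIRED_VARS}
    # Twilio media streams connect over secure WebSockets.
    if not config["WEBSOCKET_URL"].startswith("wss://"):
        raise RuntimeError("WEBSOCKET_URL should start with wss://")
    return config


def start_call(config: dict, twiml: str) -> str:
    """Place the outbound call and return its SID (requires `pip install twilio`)."""
    from twilio.rest import Client  # imported lazily so config checks work without it

    client = Client(config["TWILIO_ACCOUNT_SID"], config["TWILIO_AUTH_TOKEN"])
    call = client.calls.create(
        to=config["USER_PHONE_NUMBER"],       # your Twilio-verified number
        from_=config["TWILIO_PHONE_NUMBER"],  # the number rented on Twilio
        twiml=twiml,                          # TwiML containing the <Stream> instruction
    )
    return call.sid
```

Validating the variables before contacting Twilio keeps configuration mistakes from surfacing as a failed call.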
You should hear a voice message saying "Hello, this is a test call. How are you?" and then a message saying "Thank you, bye!". Simultaneously, you should see the transcription in the browser.
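Behind the scenes, Twilio delivers the call audio to the /twilio endpoint as JSON text messages whose `media` events carry base64-encoded audio (8 kHz mu-law by default). How the example server handles these messages is not shown in this guide; the sketch below only illustrates decoding one message under Twilio's documented Media Streams format, and `extract_audio` is a hypothetical helper name.

```python
import base64
import json
from typing import Optional


def extract_audio(message: str) -> Optional[bytes]:
    """Return the raw audio bytes of a Twilio `media` message, else None."""
    event = json.loads(message)
    # Other events ("connected", "start", "stop") carry no audio payload.
    if event.get("event") != "media":
        return None
    return base64.b64decode(event["media"]["payload"])


if __name__ == "__main__":
    # Hand-built sample message; the payload here is dummy bytes, not real audio.
    sample = json.dumps(
        {"event": "media", "media": {"payload": base64.b64encode(b"\xff\x7f").decode()}}
    )
    print(extract_audio(sample))
    print(extract_audio(json.dumps({"event": "stop"})))
```

The decoded bytes are what a server would forward to the speech-to-text service.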