
Create transcription

Creates a new transcription.

POST /v1/transcriptions

Headers

Authorization (required): Bearer <SONIOX_API_KEY>

Request

application/json (required)

model (string, required)

Speech-to-text model to use for the transcription.

Maximum length: 32

audio_url (string)

URL of the audio file to transcribe. Cannot be specified if file_id is specified.

Maximum length: 4096
Pattern: "^https?://[^\\s]+$"

file_id (string)

ID of the uploaded file to transcribe. Cannot be specified if audio_url is specified.

Format: "uuid"

language_hints (array<string>)

Expected languages in the audio. If not specified, languages are automatically detected.

enable_speaker_diarization (boolean)

When true, speakers are identified and separated in the transcription output.

context (string)

Additional context to improve transcription accuracy and formatting of specialized terms.

Maximum length: 10000

webhook_url (string)

URL to receive webhook notifications when transcription is completed or fails.

Maximum length: 256
Pattern: "^https?://[^\\s]+$"

webhook_auth_header_name (string)

Name of the authentication header sent with webhook notifications.

Maximum length: 256

webhook_auth_header_value (string)

Authentication header value sent with webhook notifications.

Maximum length: 256

client_reference_id (string)

Optional tracking identifier string. Does not need to be unique.

Maximum length: 256
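
The request constraints listed above (required model, mutually exclusive audio_url and file_id, length limits, URL pattern) can be checked client-side before sending. The following is a minimal sketch, not an official client: the helper name build_create_transcription_request and the base URL in API_BASE are assumptions that do not appear in this section, so adjust them to your setup. The function only builds the request; it does not send it.

```python
import json
import re
import urllib.request

# Assumption: the host is not stated in this section; replace it as needed.
API_BASE = "https://api.soniox.com"

# Same pattern as documented for audio_url and webhook_url.
URL_PATTERN = re.compile(r"^https?://[^\s]+$")


def build_create_transcription_request(api_key, model, audio_url=None,
                                       file_id=None, **optional):
    """Check the documented constraints and build (not send) the POST request.

    Hypothetical helper for illustration only.
    """
    if not model or len(model) > 32:
        raise ValueError("model is required and limited to 32 characters")
    if audio_url is not None and file_id is not None:
        raise ValueError("audio_url and file_id cannot both be specified")
    if audio_url is not None:
        if len(audio_url) > 4096 or not URL_PATTERN.fullmatch(audio_url):
            raise ValueError("audio_url must be an http(s) URL of at most 4096 characters")

    body = {"model": model}
    if audio_url is not None:
        body["audio_url"] = audio_url
    if file_id is not None:
        body["file_id"] = file_id
    # Pass through other documented fields (language_hints, context, ...),
    # skipping any that were left as None.
    body.update({key: value for key, value in optional.items() if value is not None})

    return urllib.request.Request(
        f"{API_BASE}/v1/transcriptions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen; a successful call is documented to return status 201.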

Response

201

Created transcription.

id (string, required)

Unique identifier for the transcription request.

Format: "uuid"

status (string, required)

Transcription status.

Accepted values: "queued" | "processing" | "completed" | "error"

created_at (string, required)

UTC timestamp indicating when the transcription was created.

Format: "date-time"

model (string, required)

Speech-to-text model used for the transcription.

audio_url (string)

URL of the file being transcribed.

file_id (string)

ID of the file being transcribed.

Format: "uuid"

filename (string, required)

Name of the file being transcribed.

language_hints (array<string>)

Expected languages in the audio. If not specified, languages are automatically detected.

enable_speaker_diarization (boolean, required)

When true, speakers are identified and separated in the transcription output.

context (string)

Additional context to improve transcription accuracy and formatting of specialized terms.

audio_duration_ms (integer)

Duration of the audio in milliseconds. Only available after processing begins.

error_message (string)

Error message if transcription failed. null for successful or in-progress transcriptions.

webhook_url (string)

URL to receive webhook notifications when transcription is completed or fails.

webhook_auth_header_name (string)

Name of the authentication header sent with webhook notifications.

webhook_auth_header_value (string)

Authentication header value. Always returned masked as ******************.

webhook_status_code (integer)

HTTP status code received from your server when the webhook was delivered. null if not yet sent.

client_reference_id (string)

Tracking identifier string.
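
The response fields above can be read into a small typed structure. The following is a minimal sketch: the Transcription class and parse_transcription function are hypothetical names, and treating "completed" and "error" as the final states is an assumption inferred from the status names, not something this section states.

```python
import json
from dataclasses import dataclass
from typing import Optional

# The four documented status values.
KNOWN_STATUSES = {"queued", "processing", "completed", "error"}
# Assumption: these two are final states; the section does not say so explicitly.
TERMINAL_STATUSES = {"completed", "error"}


@dataclass
class Transcription:
    """Subset of the documented response fields (hypothetical container)."""
    id: str
    status: str
    created_at: str
    model: str
    filename: str
    audio_duration_ms: Optional[int] = None
    error_message: Optional[str] = None
    webhook_status_code: Optional[int] = None

    @property
    def is_finished(self) -> bool:
        return self.status in TERMINAL_STATUSES


def parse_transcription(raw: str) -> Transcription:
    """Parse a JSON response body and reject undocumented status values."""
    data = json.loads(raw)
    if data["status"] not in KNOWN_STATUSES:
        raise ValueError(f"unexpected status: {data['status']!r}")
    return Transcription(
        id=data["id"],
        status=data["status"],
        created_at=data["created_at"],
        model=data["model"],
        filename=data["filename"],
        # Fields not marked as required are read with .get(), so they may be None.
        audio_duration_ms=data.get("audio_duration_ms"),
        error_message=data.get("error_message"),
        webhook_status_code=data.get("webhook_status_code"),
    )
```

Required fields are accessed directly so that a missing one raises KeyError early, while the remaining fields default to None.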

Example response (201, created transcription):

{
  "id": "73d4357d-cad2-4338-a60d-ec6f2044f721",
  "status": "queued",
  "created_at": "2024-11-26T00:00:00Z",
  "model": "stt-async-preview",
  "audio_url": "https://soniox.com/media/examples/coffee_shop.mp3",
  "file_id": null,
  "filename": "coffee_shop.mp3",
  "language_hints": [
    "en",
    "fr"
  ],
  "context": "extra context for the transcription",
  "audio_duration_ms": 0,
  "error_message": null,
  "webhook_url": "https://example.com/webhook",
  "webhook_auth_header_name": "Authorization",
  "webhook_auth_header_value": "******************",
  "webhook_status_code": null,
  "client_reference_id": "some_internal_id"
}
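
Because the API returns webhook_auth_header_value masked, the receiving server has to keep its own copy of the secret to check incoming notifications. The following is a minimal sketch of such a check on the receiving side: webhook_is_authentic is a hypothetical helper, and the webhook payload itself is not described in this section, so only the header comparison is shown.

```python
import hmac


def webhook_is_authentic(headers, header_name, expected_value):
    """Return True if the incoming request carries the expected auth header.

    headers: mapping of header names to values from the incoming request.
    header_name / expected_value: the values that were sent as
    webhook_auth_header_name and webhook_auth_header_value.
    """
    # HTTP header names are case-insensitive, so compare them in lowercase.
    received = next(
        (value for name, value in headers.items()
         if name.lower() == header_name.lower()),
        None,
    )
    if received is None:
        return False
    # Constant-time comparison avoids leaking the secret through timing.
    return hmac.compare_digest(received.encode("utf-8"),
                               expected_value.encode("utf-8"))
```

For example, with the header name "Authorization" from the response above, a request whose header value differs from the stored secret would be rejected.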