TanStack AI SDK
Soniox transcription adapter for the TanStack AI SDK.
Overview
TanStack AI is a TypeScript toolkit for building AI applications. It provides a unified API that abstracts away the differences between various AI providers, allowing developers to switch models with just a few lines of code.
This package (@soniox/tanstack-ai-adapter) implements the SDK's transcription adapter, enabling you to use Soniox's Speech-to-Text models directly within the standard TanStack AI workflow.
Installation
Authentication
Set SONIOX_API_KEY in your environment or pass apiKey when creating the adapter.
Get your API key from the Soniox Console.
Example
Adapter configuration
Use createSonioxTranscription to customize the adapter instance:
Options:
apiKey: overrides SONIOX_API_KEY (required when using createSonioxTranscription).
baseUrl: custom API base URL. A list of regional API endpoints is available in the Soniox documentation. Default is https://api.soniox.com.
headers: additional request headers.
timeout: transcription timeout in milliseconds. Default is 180000 ms (3 minutes).
pollingIntervalMs: transcription polling interval in milliseconds. Default is 1000 ms.
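To make the defaults above concrete, here is a minimal sketch of how such options could be resolved. It is not the adapter's actual implementation: the SonioxAdapterOptions and ResolvedSonioxConfig types and the resolveSonioxConfig helper are hypothetical names introduced for illustration, and the environment is passed in as a plain object so the sketch stays self-contained. Only the option names and default values come from the list above.

```typescript
// Hypothetical types mirroring the documented options; not exported by the adapter.
interface SonioxAdapterOptions {
  apiKey?: string;
  baseUrl?: string;
  headers?: Record<string, string>;
  timeout?: number;
  pollingIntervalMs?: number;
}

interface ResolvedSonioxConfig {
  apiKey: string;
  baseUrl: string;
  headers: Record<string, string>;
  timeout: number;
  pollingIntervalMs: number;
}

// Hypothetical helper: applies the documented defaults to partial options.
// `env` stands in for the process environment (an assumption of this sketch).
function resolveSonioxConfig(
  options: SonioxAdapterOptions,
  env: Record<string, string | undefined> = {},
): ResolvedSonioxConfig {
  // An explicit apiKey takes precedence over SONIOX_API_KEY.
  const apiKey = options.apiKey ?? env["SONIOX_API_KEY"];
  if (!apiKey) {
    throw new Error("Missing API key: set SONIOX_API_KEY or pass apiKey.");
  }
  return {
    apiKey,
    baseUrl: options.baseUrl ?? "https://api.soniox.com",
    headers: options.headers ?? {},
    timeout: options.timeout ?? 180_000, // 3 minutes
    pollingIntervalMs: options.pollingIntervalMs ?? 1_000,
  };
}
```

With this sketch, calling resolveSonioxConfig({ timeout: 60000 }, { SONIOX_API_KEY: "my-key" }) keeps the default base URL and polling interval while shortening the timeout to one minute.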
Transcription options
Per-request options are passed via modelOptions:
Available options:
languageHints - Array of ISO language codes to bias recognition. If you pass the TanStack language option, this adapter will merge it into languageHints for convenience.
languageHintsStrict - When true, rely more heavily on language hints (note: not supported by all models).
enableLanguageIdentification - Automatically detect the spoken language.
enableSpeakerDiarization - Identify and separate different speakers.
context - Additional context to improve accuracy.
clientReferenceId - Optional client-defined reference ID.
webhookUrl - Webhook URL for transcription completion notifications.
webhookAuthHeaderName - Webhook authentication header name.
webhookAuthHeaderValue - Webhook authentication header value.
translation - Translation configuration.
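The merge of the TanStack language option into languageHints can be pictured roughly as follows. This is a sketch under assumptions, not the adapter's code: mergeLanguageHints is a hypothetical helper, and both placing language first and removing duplicates are choices made for this example rather than documented behavior.

```typescript
// Hypothetical helper illustrating the documented merge of `language`
// into `languageHints`. Ordering and de-duplication are assumptions.
function mergeLanguageHints(
  language: string | undefined,
  languageHints: string[] = [],
): string[] {
  const merged = language ? [language, ...languageHints] : [...languageHints];
  // Keep only the first occurrence of each language code.
  return merged.filter((code, index) => merged.indexOf(code) === index);
}
```

For example, mergeLanguageHints("en", ["es", "en"]) yields a list containing "en" and "es" once each, so the request never carries a duplicated hint.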
For more information on the available options, see the Speech-to-Text API reference.
Accessing raw tokens
When using translation or working with multilingual audio, you may need access to raw tokens with per-token language information and translation status. The adapter attaches a non-standard providerMetadata field at runtime:
Note: When using translation, the API returns both transcription tokens (original) and translation tokens. The segments array always includes only transcription tokens. To access translation tokens, filter by translation_status === 'translation'.
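The filtering described in the note can be sketched as below. The RawToken shape is an assumption reduced to the fields this section mentions (text, language, translation_status); the exact location of the raw tokens inside providerMetadata is not shown here, so the helpers simply take a token array. splitTokens and joinText are hypothetical names used for illustration.

```typescript
// Assumed minimal token shape; the real tokens may carry more fields.
interface RawToken {
  text: string;
  language?: string;
  translation_status?: string;
}

// Hypothetical helper: separates translation tokens from the rest,
// using the translation_status === 'translation' check from the note.
function splitTokens(tokens: RawToken[]): {
  transcription: RawToken[];
  translation: RawToken[];
} {
  return {
    transcription: tokens.filter((t) => t.translation_status !== "translation"),
    translation: tokens.filter((t) => t.translation_status === "translation"),
  };
}

// Hypothetical helper: concatenates token texts into one string.
function joinText(tokens: RawToken[]): string {
  return tokens.map((t) => t.text).join("");
}
```

Given the raw token array, joinText(splitTokens(tokens).translation) would produce the translated text, while the transcription half corresponds to what the segments array already exposes.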