Storage and Search
We built core infrastructure so you don’t have to
Soniox Storage and Search enables you to store, index, retrieve and search over your audio data. With just one API call, you can transcribe, store and index audio as your application requires. Then you can immediately retrieve and search over your data in numerous ways.
It is the fastest and easiest way to integrate audio storage and search into your application.
Powerful building blocks for storing and searching audio
Soniox Storage and Search APIs enable you to:
- Store audio
- Store transcript
- Store metadata associated with audio/transcript
- Retrieve audio/subsegments in streaming mode
- Retrieve transcript in structured format
- Search audio/transcript by its metadata
- Search audio/transcript by its transcript content
Fastest and easiest integration
With just one API call, you can transcribe, store, index and make your audio and transcripts searchable. You only need to set the StorageConfig parameter to configure how your data should be stored and organized.
Then use our other APIs to search and retrieve audio or transcripts as your application requires.
# Configure storage.
storage_config = StorageConfig(
    object_id="my_id_for_audio",
    metadata={"views": "100", "likes": "100"},
    title="My title for audio",
)
# Pass storage config to transcribe API call.
transcribe_file_short(
    "../test_data/test_audio.flac",
    client,
    storage_config=storage_config,
)
Control what to store and index
Soniox Storage and Search enables you to configure what information to store and index for each audio/transcript individually. You can configure:
- Whether to store audio, transcript or both
- Associated metadata in the form of key-value pairs
- Associated title
- Associated datetime
The specified configuration determines how the data is stored and indexed, which in turn supports retrieval and search.
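The options listed above can be modeled as a small per-object settings record. The sketch below is illustrative only: `StoredObjectOptions`, `store_audio`, `store_transcript` and `recorded_at` are hypothetical names chosen for this example, not fields of the Soniox `StorageConfig`, so check the Soniox documentation for the actual parameters.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass
class StoredObjectOptions:
    """Hypothetical stand-in for per-object storage settings (not the Soniox API)."""

    object_id: str
    store_audio: bool = True        # assumed flag name
    store_transcript: bool = True   # assumed flag name
    metadata: Dict[str, str] = field(default_factory=dict)
    title: Optional[str] = None
    recorded_at: Optional[datetime] = None  # assumed field name

    def validate(self) -> None:
        """Reject settings that would leave nothing to store or index."""
        if not self.object_id:
            raise ValueError("object_id must be non-empty")
        if not (self.store_audio or self.store_transcript):
            raise ValueError("store audio, transcript, or both")
        if self.recorded_at is not None and self.recorded_at.tzinfo is None:
            raise ValueError("recorded_at must be timezone-aware")


# Example: keep only the transcript, with metadata, title and datetime.
options = StoredObjectOptions(
    object_id="call_0001",
    store_audio=False,
    metadata={"company": "Nike", "agent": "Mike Diaz"},
    title="Support call",
    recorded_at=datetime(2023, 2, 1, 9, 30, tzinfo=timezone.utc),
)
options.validate()
```

Validating the settings before sending a request is one way to catch a configuration that stores neither audio nor transcript.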
Find relevant audio quickly
Our search API supports searching over the audio/transcript by its id, metadata, datetime, title and transcript content. This enables you to easily and quickly find the audio that you care about.
Example:
Find the top 20 phone calls transcribed after 2023-01-31 from company Nike and agent Mike Diaz that were about Jordan shoes.
from datetime import datetime

search_response = search_objects(
    client,
    num=20,
    datetime_from=datetime.fromisoformat(
        '2023-01-31T00:00+00:00'
    ),
    metadata_query='company="Nike" AND agent="Mike Diaz"',
    text_query="Jordan shoes",
)
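To make the filter semantics of the example concrete, here is a minimal in-memory sketch of the same three filters (datetime, metadata, transcript text). It does not call Soniox and is not how the service is implemented: `parse_metadata_query` and `search_local` are hypothetical helpers, the parser handles only `key="value"` clauses joined by `AND` as in the example above, and treating `datetime_from` as an inclusive bound is an assumption.

```python
import re
from datetime import datetime
from typing import Dict, List, Optional

# Matches one key="value" clause, e.g. company="Nike".
_CLAUSE = re.compile(r'^\s*(\w+)\s*=\s*"([^"]*)"\s*$')


def parse_metadata_query(query: str) -> Dict[str, str]:
    """Parse 'k1="v1" AND k2="v2"' into a dict (AND of equality clauses only)."""
    required: Dict[str, str] = {}
    for clause in re.split(r"\s+AND\s+", query.strip()):
        match = _CLAUSE.match(clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        required[match.group(1)] = match.group(2)
    return required


def search_local(
    objects: List[dict],
    num: int,
    datetime_from: Optional[datetime] = None,
    metadata_query: str = "",
    text_query: str = "",
) -> List[dict]:
    """Return up to `num` objects that pass every filter that was given."""
    required = parse_metadata_query(metadata_query) if metadata_query else {}
    needle = text_query.lower()
    hits = []
    for obj in objects:
        # Assumption: datetime_from is an inclusive lower bound.
        if datetime_from is not None and obj["datetime"] < datetime_from:
            continue
        if any(obj["metadata"].get(k) != v for k, v in required.items()):
            continue
        # Plain case-insensitive substring match stands in for real text search.
        if needle and needle not in obj["transcript"].lower():
            continue
        hits.append(obj)
    return hits[:num]


nike_mike = {"company": "Nike", "agent": "Mike Diaz"}
calls = [
    {"id": "a", "datetime": datetime.fromisoformat("2023-02-03T10:00+00:00"),
     "metadata": nike_mike, "transcript": "I am calling about my Jordan shoes order."},
    {"id": "b", "datetime": datetime.fromisoformat("2023-01-15T10:00+00:00"),
     "metadata": nike_mike, "transcript": "The Jordan shoes arrived late."},
    {"id": "c", "datetime": datetime.fromisoformat("2023-02-10T10:00+00:00"),
     "metadata": {"company": "Nike", "agent": "Ana Ruiz"},
     "transcript": "Question about Jordan shoes sizing."},
]

results = search_local(
    calls,
    num=20,
    datetime_from=datetime.fromisoformat("2023-01-31T00:00+00:00"),
    metadata_query='company="Nike" AND agent="Mike Diaz"',
    text_query="Jordan shoes",
)
print([r["id"] for r in results])
```

Only call "a" passes all three filters: "b" is too early and "c" has a different agent.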
Google search-like experience for your audio
You can also search over your audio/transcript data in Soniox Console with the Soniox Search tool. It is a private Google search-like experience over your audio data. It also supports searching by id, metadata and datetime.
Play back audio files or subsegments
Stored audio can be retrieved and played back in streaming mode to support audio player-like functionality. There are different options to specify the start and end of an audio subsegment. We also support multiple audio formats (e.g. wav, PCM). All of this is provided through our APIs.
We also built the Soniox Interactive Transcript tool, which lets you quickly read the transcript, play back the audio, jump to any word within the transcript and listen from there.
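As an illustration of what a subsegment's start and end mean for raw PCM audio, the sketch below converts a time range into a byte range and yields it in chunks, the way a streaming player would consume it. This is generic PCM arithmetic, not the Soniox retrieval API: `pcm_byte_range` and `iter_chunks` are hypothetical helpers, and the defaults (16 kHz, mono, 16-bit samples) are assumptions.

```python
from typing import Iterator, Tuple


def pcm_byte_range(
    start_ms: int,
    end_ms: int,
    sample_rate_hz: int = 16000,   # assumed default
    num_channels: int = 1,         # assumed default
    bytes_per_sample: int = 2,     # 16-bit samples, assumed default
) -> Tuple[int, int]:
    """Map [start_ms, end_ms) to a frame-aligned byte range in raw PCM data."""
    if start_ms < 0 or end_ms < start_ms:
        raise ValueError("require 0 <= start_ms <= end_ms")
    frame_size = num_channels * bytes_per_sample
    start_frame = start_ms * sample_rate_hz // 1000
    end_frame = end_ms * sample_rate_hz // 1000
    return start_frame * frame_size, end_frame * frame_size


def iter_chunks(
    pcm: bytes, start_byte: int, end_byte: int, chunk_size: int = 4096
) -> Iterator[bytes]:
    """Yield the selected byte range in fixed-size chunks (last one may be shorter)."""
    end_byte = min(end_byte, len(pcm))
    for offset in range(start_byte, end_byte, chunk_size):
        yield pcm[offset:min(offset + chunk_size, end_byte)]


# Five seconds of silent 16 kHz mono 16-bit audio as placeholder data.
audio = bytes(5 * 16000 * 2)
start, end = pcm_byte_range(1500, 4000)
total = sum(len(chunk) for chunk in iter_chunks(audio, start, end))
print(start, end, total)
```

Working in whole frames keeps each chunk boundary on a sample boundary, provided the chunk size is a multiple of the frame size.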
Retrieve and search audio in real-time
You can search and retrieve your data immediately after the transcribe request completes. This enables you to build near real-time applications that require access to your audio and transcript data.
Delete data at any time
You can delete audio and transcript data at any point via a single API call. Once deleted, the audio, the transcript, the associated data and any search indexing information are removed entirely, and you can no longer retrieve or search for that audio or transcript.
Security and compliance at the core
Soniox’s platform meets some of the highest certification standards to help reduce compliance burdens for your business and keep your data private and secure.
Compliance
Soniox is SOC 2 Type 2, GDPR and HIPAA certified. We maintain and meet the requirements for these data privacy compliance frameworks.
Encryption
All communication with Soniox Cloud is encrypted with TLS to prevent malicious access to your private data.
Isolated infrastructure
All stored data is privately and securely stored in Soniox Cloud in an isolated namespace for your Soniox account.
Ready to get started?
Explore Soniox Docs or create an account and start building your audio AI application. You can also contact us to design a custom package for your business.
Always know what you pay
Pay only for what you use. Integrated per-usage pricing with no hidden fees.