
Soniox

Storage and Search

We built core infrastructure so you don’t have to

Soniox Storage and Search enables you to store, index, retrieve and search over your audio data. With just one API call, you can transcribe, store and index audio as your application requires. Then you can immediately retrieve and search over your data in numerous ways.

It is the fastest and easiest way to integrate audio storage and search into your application.

Start now

Powerful building blocks for storing and searching audio

Soniox Storage and Search APIs enable you to:

  • Store audio
  • Store transcript
  • Store metadata associated with audio/transcript
  • Retrieve audio/subsegments in streaming mode
  • Retrieve transcript in structured format
  • Search audio/transcript by its metadata
  • Search audio/transcript by its transcript content
Explore docs

Fastest and easiest integration

With just one API call, you can transcribe, store, index and make your audio and transcripts searchable. You only need to set the StorageConfig parameter to configure how your data should be stored and organized.

Then use our other APIs to search and retrieve audio or transcripts as your application requires.

Explore docs

# Configure storage.
storage_config = StorageConfig(
    object_id="my_id_for_audio",
    metadata={"views": "100", "likes": "100"},
    title="My title for audio",
)

# Pass storage config to transcribe API call.
transcribe_file_short(
    "../test_data/test_audio.flac",
    client,
    storage_config=storage_config,
)

Control what to store and index

Soniox Storage and Search enables you to configure what information to store and index for each audio/transcript individually. You can configure:

  • Whether to store audio, transcript or both
  • Associated metadata in the form of key-value pairs
  • Associated title
  • Associated datetime

The specified configuration is used to store and index the data so that it can be retrieved and searched.
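
As a rough illustration of the per-object options listed above, the sketch below models them with a plain Python dataclass. `StorageOptions` and all of its field names are hypothetical stand-ins invented for this example. They are not the Soniox `StorageConfig` API, so check the docs for the real parameter names.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass
class StorageOptions:
    """Hypothetical stand-in for per-object storage settings (not the Soniox API)."""

    object_id: str
    store_audio: bool = True        # assumed flag: keep the audio
    store_transcript: bool = True   # assumed flag: keep the transcript
    metadata: Dict[str, str] = field(default_factory=dict)  # key-value pairs
    title: Optional[str] = None
    recorded_at: Optional[datetime] = None  # associated datetime

    def __post_init__(self) -> None:
        # Storing neither audio nor transcript leaves nothing to retrieve or search.
        if not (self.store_audio or self.store_transcript):
            raise ValueError("store at least one of audio or transcript")


# Example: keep only the transcript, tagged with metadata, a title and a datetime.
options = StorageOptions(
    object_id="call-0001",
    store_audio=False,
    metadata={"company": "Nike", "agent": "Mike Diaz"},
    title="Support call",
    recorded_at=datetime(2023, 2, 1, tzinfo=timezone.utc),
)
```

Validating the combination up front is a design choice of this sketch only. How the real service treats an object with nothing stored is not stated here.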

Explore docs

Find relevant audio quickly

Our search API supports searching over the audio/transcript by its id, metadata, datetime, title and transcript content. This enables you to easily and quickly find the audio that you care about.

Example:

Find the top 20 phone calls transcribed after 2023-01-31, from company Nike and agent Mike Diaz, that were about Jordan shoes.

Explore docs

from datetime import datetime

# Search for objects.
search_response = search_objects(
    client,
    num=20,
    datetime_from=datetime.fromisoformat(
        "2023-01-31T00:00+00:00"
    ),
    metadata_query='company="Nike" AND agent="Mike Diaz"',
    text_query="Jordan shoes",
)
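
To make the semantics of the example query concrete, here is a small self-contained sketch that applies the same three conditions to an in-memory list. It does not call the Soniox API: `StoredCall` and `filter_calls` are hypothetical names, and plain substring matching stands in for real transcript search, whose matching and ranking rules are not described here. Treating the datetime cutoff as inclusive is also an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List


@dataclass
class StoredCall:
    """Hypothetical in-memory record; not a Soniox type."""

    object_id: str
    stored_at: datetime
    metadata: Dict[str, str]
    transcript: str


def filter_calls(
    calls: List[StoredCall],
    num: int,
    datetime_from: datetime,
    required_metadata: Dict[str, str],
    text_query: str,
) -> List[StoredCall]:
    """Return up to `num` calls that satisfy all three conditions."""
    words = text_query.lower().split()
    matches = []
    for call in calls:
        # Datetime condition: skip anything stored before the cutoff.
        if call.stored_at < datetime_from:
            continue
        # Metadata condition: every required key must match exactly (AND).
        if any(call.metadata.get(k) != v for k, v in required_metadata.items()):
            continue
        # Text condition: every query word must appear in the transcript.
        text = call.transcript.lower()
        if not all(word in text for word in words):
            continue
        matches.append(call)
    return matches[:num]


utc = timezone.utc
nike_mike = {"company": "Nike", "agent": "Mike Diaz"}
calls = [
    StoredCall("a", datetime(2023, 2, 3, tzinfo=utc), nike_mike,
               "I would like to return my Jordan shoes."),
    StoredCall("b", datetime(2023, 1, 15, tzinfo=utc), nike_mike,
               "My Jordan shoes arrived late."),          # before the cutoff
    StoredCall("c", datetime(2023, 2, 5, tzinfo=utc),
               {"company": "Nike", "agent": "Ana Lee"},
               "A question about Jordan shoes."),         # different agent
    StoredCall("d", datetime(2023, 2, 6, tzinfo=utc), nike_mike,
               "Where is my refund?"),                    # no text match
]

results = filter_calls(
    calls,
    num=20,
    datetime_from=datetime.fromisoformat("2023-01-31T00:00+00:00"),
    required_metadata=nike_mike,
    text_query="Jordan shoes",
)
print([call.object_id for call in results])  # prints ['a']
```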

Google search-like experience for your audio

You can also search over your audio/transcript data in Soniox Console with the Soniox Search tool. It is a private Google search-like experience over your audio data. It also supports searching by id, metadata and datetime.

Start now

Play back audio files or subsegments

Stored audio can be retrieved and played back in streaming mode to support audio player-like functionality. There are different options to specify the start and end of an audio subsegment. We also support multiple audio formats (e.g. WAV, PCM). All of this is provided through our APIs.
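
To give a feel for what selecting a subsegment means for raw PCM audio, the sketch below converts a start and end time into byte offsets. This is generic PCM arithmetic, not the Soniox retrieval API: the function name `pcm_byte_range` and the default format (16 kHz, mono, 16-bit samples) are assumptions made for the example.

```python
from typing import Tuple


def pcm_byte_range(
    start_ms: int,
    end_ms: int,
    sample_rate_hz: int = 16000,
    num_channels: int = 1,
    bytes_per_sample: int = 2,
) -> Tuple[int, int]:
    """Map a [start_ms, end_ms) time window to byte offsets in raw PCM data."""
    if start_ms < 0 or end_ms < start_ms:
        raise ValueError("expected 0 <= start_ms <= end_ms")
    # One frame holds one sample for every channel.
    frame_size = num_channels * bytes_per_sample
    # Integer division keeps the offsets aligned to whole frames.
    start_frame = start_ms * sample_rate_hz // 1000
    end_frame = end_ms * sample_rate_hz // 1000
    return start_frame * frame_size, end_frame * frame_size


# Five seconds of silence in the assumed format, then cut out 1.5 s to 4.0 s.
audio = bytes(5 * 16000 * 2)
start, end = pcm_byte_range(1500, 4000)
segment = audio[start:end]
print(start, end, len(segment))  # prints 48000 128000 80000
```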

We also built the Soniox Interactive Transcript tool, which lets you quickly read the transcript, play back the audio, and jump to any word in the transcript to listen from there.

Explore docs

Retrieve and search audio in real-time

You can search and retrieve your data immediately after the transcribe request completes. This enables you to build near real-time applications that require access to your audio and transcript data.

Start now

Delete data at any time

You can delete audio and transcript data at any point via a single API call. Once you do, the audio, the transcript, the associated data and any search indexing information are deleted entirely. You will no longer be able to retrieve or search for the deleted audio or transcript.

Explore docs

Security and compliance at the core

Soniox’s platform meets rigorous certification standards to help reduce compliance burdens for your business and keep your data private and secure.

Compliance

Soniox is SOC 2 Type 2, GDPR and HIPAA compliant. We maintain and meet the requirements for these data privacy compliance frameworks.

Encryption

All communication with Soniox Cloud is encrypted with TLS to prevent malicious access to your private data.

Isolated infrastructure

All stored data is privately and securely stored in Soniox Cloud in an isolated namespace for your Soniox account.

Ready to get started?

Explore Soniox Docs or create an account and start building your audio AI application. You can also contact us to design a custom package for your business.

Always know what you pay

Pay only for what you use. Integrated per-usage pricing with no hidden fees.

Pricing details

Start your integration

Get up and running with Soniox in as little as 5 minutes.

API reference