What is Deepgram?
Deepgram's voice AI platform offers a comprehensive suite of APIs designed to transform how businesses interact with voice data. The platform empowers developers with tools for speech-to-text, text-to-speech, and complete speech-to-speech voice agents.
Deepgram is engineered for unmatched accuracy, speed, and cost-effectiveness. It supports a wide range of applications, from real-time transcription and audio intelligence to creating responsive, natural-sounding voices for AI agents.
Features
- Speech-to-Text API: Unmatched accuracy, speed, and cost-effectiveness.
- Text-to-Speech API: Responsive, natural-sounding voices.
- Audio Intelligence API: Powered by AI language models.
- Voice Agent API: For real-time AI agents.
- Speaker Diarization: Identifies and separates different speakers in audio.
- Smart Formatting: Improves readability of transcripts.
- Automatic Language Detection: Detects the language spoken in audio.
- Summarization: Provides concise summaries of audio transcripts.
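Several of the features above are typically switched on per request. As a minimal sketch, assuming the speech-to-text API is reached at `https://api.deepgram.com/v1/listen` and that `diarize`, `smart_format`, and `detect_language` are accepted as boolean query parameters, a request URL could be assembled like this. The endpoint, the parameter names, and the helper `build_listen_url` are assumptions for illustration and should be checked against Deepgram's API reference; no request is actually sent.

```python
from urllib.parse import urlencode

# Assumed endpoint for pre-recorded transcription; verify against the API reference.
LISTEN_ENDPOINT = "https://api.deepgram.com/v1/listen"


def build_listen_url(diarize: bool = False,
                     smart_format: bool = False,
                     detect_language: bool = False) -> str:
    """Return a request URL with the selected features enabled.

    Hypothetical helper: the query parameter names are assumptions.
    """
    flags = {
        "diarize": diarize,                  # speaker diarization
        "smart_format": smart_format,        # smart formatting
        "detect_language": detect_language,  # automatic language detection
    }
    # Only include the features that were switched on.
    params = {name: "true" for name, enabled in flags.items() if enabled}
    query = urlencode(params)
    return f"{LISTEN_ENDPOINT}?{query}" if query else LISTEN_ENDPOINT


if __name__ == "__main__":
    print(build_listen_url(diarize=True, smart_format=True))
```

Building the URL separately from sending the request keeps the feature selection easy to inspect and test before any audio or API key is involved.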
Use Cases
- Contact Centers
- Medical Transcription
- Conversational AI
- Speech Analytics
- Media Transcription
FAQs
- How is multichannel billed?
  When you opt into using the multichannel feature, each channel is transcribed and billed separately. The total cost when using multichannel is the single-channel cost multiplied by the number of channels.
- What's the difference between Nova, Enhanced and Base models?
  Nova is our newest and most powerful model, offering the best balance between accuracy and cost-effectiveness. Enhanced is a powerful ASR model that performs especially well with uncommon words. Base is our signature model, with a solid combination of accuracy and cost-effectiveness. Some languages are only supported by Enhanced and Base.
- Which file types can you transcribe?
  We support over 40 audio and video formats, which are listed in our documentation.
- What unit of time is billed, minutes or seconds?
  Deepgram bills by the second of audio. For instance, if you transcribe 61 seconds of audio, we bill you for 61 seconds of usage, not 2 minutes (120 seconds).
- Can Deepgram transcribe real-time conversations?
  Yes! Our streaming API is designed for low latency and will return incremental transcripts as a speaker’s sentence unfolds.
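The two billing answers above (per-second billing and per-channel multichannel billing) can be combined into a small cost estimate. This is a sketch only: the function names and the per-minute price are hypothetical placeholders, not Deepgram's actual rates, and the only rules taken from the FAQ are that usage is counted in seconds and multiplied by the number of channels.

```python
from decimal import Decimal


def billable_seconds(audio_seconds: int, channels: int = 1) -> int:
    """Seconds of usage billed: per-second, multiplied by channel count."""
    if audio_seconds < 0 or channels < 1:
        raise ValueError("audio_seconds must be >= 0 and channels >= 1")
    return audio_seconds * channels


def estimate_cost(audio_seconds: int,
                  price_per_minute: Decimal,
                  channels: int = 1) -> Decimal:
    """Estimated cost for a hypothetical per-minute price.

    The price is prorated by the second, so 61 seconds costs 61/60 of the
    per-minute price rather than two full minutes.
    """
    seconds = billable_seconds(audio_seconds, channels)
    return Decimal(seconds) * price_per_minute / Decimal(60)


if __name__ == "__main__":
    # 61 seconds of single-channel audio is billed as 61 seconds, not 120.
    print(billable_seconds(61))              # 61
    # The same audio with two channels is billed as 2 x 61 seconds.
    print(billable_seconds(61, channels=2))  # 122
    # Hypothetical price of 0.01 per minute, for illustration only.
    print(estimate_cost(60, Decimal("0.01"), channels=2))
```

`Decimal` is used instead of floating point so that small per-second amounts do not pick up rounding noise when summed over many files.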
Deepgram Uptime Monitor
- Average Uptime: 99.86%
- Average Response Time: 220.03 ms