OpenAI launched three new audio models.
Models enhance real-time voice AI capabilities.
Pricing starts at $0.017 per minute.

Atlas AI
OpenAI introduced three new audio models for its developer platform on Thursday, May 7, aiming to enhance real-time voice-based software agents. These models, GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper, are designed to make AI more conversational and capable of executing tasks during live interactions.
GPT-Realtime-2 manages complex requests, utilizes tools, handles interruptions, and maintains context over extended voice sessions. GPT-Realtime-Translate supports translation from over 70 languages into 13 output languages, targeting applications in customer support and education. GPT-Realtime-Whisper provides live speech-to-text functionality, enabling real-time captions, meeting notes, and workflow updates.
These models are currently available for testing in OpenAI's developer playground. Early adopters include Zillow, Priceline, and Deutsche Telekom. Pricing for GPT-Realtime-2 begins at $32 per million audio input tokens, GPT-Realtime-Translate costs $0.034 per minute, and GPT-Realtime-Whisper is priced at $0.017 per minute. This development expands OpenAI's capabilities beyond transcription and basic chat, moving towards more interactive and functional voice agents.
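For a rough sense of what the quoted rates mean in practice, the arithmetic can be sketched in a few lines of Python. This is an illustrative estimate only: the function and constant names are hypothetical, the usage figures (60 minutes, 250,000 tokens) are made up for the example, and the GPT-Realtime-2 figure covers audio input tokens only, since the article gives no output-token rate or tokens-per-minute conversion.

```python
# Rates as quoted in the article (USD). Names are illustrative, not an official API.
REALTIME_2_PER_MILLION_AUDIO_INPUT_TOKENS = 32.0
TRANSLATE_PER_MINUTE = 0.034
WHISPER_PER_MINUTE = 0.017


def per_minute_cost(minutes: float, rate_per_minute: float) -> float:
    """Cost of a per-minute-billed model for the given audio duration."""
    return minutes * rate_per_minute


def token_cost(tokens: int, rate_per_million: float) -> float:
    """Cost of a token-billed model for the given number of tokens."""
    return tokens / 1_000_000 * rate_per_million


if __name__ == "__main__":
    # Hypothetical workload: one hour of audio, or 250,000 audio input tokens.
    translate = per_minute_cost(60, TRANSLATE_PER_MINUTE)
    whisper = per_minute_cost(60, WHISPER_PER_MINUTE)
    realtime = token_cost(250_000, REALTIME_2_PER_MILLION_AUDIO_INPUT_TOKENS)
    print(f"GPT-Realtime-Translate, 60 min: ${translate:.2f}")
    print(f"GPT-Realtime-Whisper, 60 min: ${whisper:.2f}")
    print(f"GPT-Realtime-2, 250k audio input tokens: ${realtime:.2f}")
```

At these rates, an hour of live translation costs twice as much as an hour of live transcription, while GPT-Realtime-2 costs scale with token volume rather than call length.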


