OpenAI Unveils Three New Audio Models

Key Takeaways✦ Atlas AI

OpenAI launched three new audio models.

Models enhance real-time voice AI capabilities.

Pricing starts at $0.017 per minute.

Atlas AI

OpenAI introduced three new audio models for its developer platform on Thursday, May 7, aiming to enhance real-time voice-based sosourcesware agents. These models, GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper, are designed to make AI more conversational and capable of executing tasks during live interactions.

GPT-Realtime-2 manages complex requests, utilizes tools, handles interruptions, and maintains context over extended voice sessions. GPT-Realtime-Translate supports translation from over 70 languages into 13 output languages, targeting applications in customer support and education. GPT-Realtime-Whisper provides live speech-to-text functionality, enabling real-time captions, meeting notes, and workflow updates.

These models are currently available for testing in OpenAI's developer playground. Early adopters include Zillow, Priceline, and Deutsche Telekom. Pricing for GPT-Realtime-2 begins at $32 per million audio input tokens, GPT-Realtime-Translate costs $0.034 per minute, and GPT-Realtime-Whisper is priced at $0.017 per minute. This development expands OpenAI's capabilities beyond transcription and basic chat, moving towards more interactive and functional voice agents.

Atlas Forecast

Resolves May 22, 2027

Will commercial fusion plants be operational in the U.S. by 2035?

50%

Updated May 22

YES 50%NO 50%

What happens next? Vote on this forecast.

The new regulatory framework aims to accelerate timelines, but scientific hurdles remain substantial.

▸ Technology

Atlas AI

Related Articles

Spotify Targets $100M Audiobooks+ Revenue

White House AI Order Halted After Industry Pushback

OpenAI IPO Timing Discussed Amidst Public Offering Plans