OpenAI launches GPT-4o, a real-time AI model that interacts across voice, text, and vision
OpenAI finally delivered! After days of speculation, OpenAI on Monday unveiled its long-awaited new AI model and a desktop version of ChatGPT, along with an updated user interface. The launch is the latest move from the company to expand the reach of its popular chatbot. And after watching the demo video, we can honestly say that OpenAI knocked it out of the park with this one.
In a live-streamed event, OpenAI chief technology officer (CTO) Mira Murati said the new AI model, dubbed GPT-4o (the "o" stands for "omni"), brings GPT-4-level intelligence to everyone, including OpenAI's free users. She added that GPT-4o is "much faster," with improved capabilities for interacting across text, vision, and audio.
“This is the first time that we are really making a huge step forward when it comes to the ease of use,” Murati said.
With real-time capabilities, the new model also brings improved quality and speed to ChatGPT in 50 different languages. Murati said the new model will also be available via OpenAI's API, so developers can begin building applications with it today. She added that GPT-4o is twice as fast as GPT-4 Turbo, half the price, and has five times higher rate limits.
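For developers, calling the new model through the existing Chat Completions API should look much like calling GPT-4 Turbo. Below is a minimal sketch using OpenAI's official Python SDK; the prompt is illustrative, and the snippet assumes the "gpt-4o" model identifier and an OPENAI_API_KEY environment variable.

```python
# Minimal sketch: calling GPT-4o via OpenAI's Chat Completions API.
# Assumes the official `openai` Python SDK (v1+) is installed and that
# the OPENAI_API_KEY environment variable is set. The prompt is illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the GPT-4o announcement in one sentence."},
    ],
)

print(response.choices[0].message.content)
```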
Two other OpenAI team members, Mark Chen and Barrett Zoph, also demonstrated the audio and visual capabilities of GPT-4o. Chen, an OpenAI researcher, said the chatbot can "perceive your emotion," adding that the model can also handle being interrupted by users.
During the demo, Chen asked the model for help calming down ahead of a public presentation. "Hey there, what's up? How can I brighten your day today?" ChatGPT's audio mode said when he greeted it.
The OpenAI team also asked ChatGPT to analyze a user's facial expression and comment on the emotions the person might be experiencing.
Chen also demonstrated that the model could tell bedtime stories, asking it to switch its voice to a more dramatic or robotic tone. He even dared it to sing the story. The fun part came when Zoph asked the model to walk through the steps of solving the simple linear equation 3x + 1 = 4 (subtract 1 from both sides to get 3x = 3, then divide by 3 to find x = 1).
That's not all: OpenAI also showed that the new model can serve as a translator, even in audio mode. Chen demonstrated this capability by having Murati speak Italian while he spoke English, and the tool seamlessly translated the conversation into each of their respective languages.
To wrap up the demo, team members showed off the model's ability to solve math problems and assist with coding tasks, positioning it as a formidable rival to Microsoft's GitHub Copilot.
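As a rough illustration of that coding-assistant use case, the same API can be handed a snippet of code to explain or debug. This is a hedged sketch, not the demo's exact setup; the buggy function and the prompt below are invented for illustration.

```python
# Illustrative sketch: using GPT-4o as a code reviewer via the same API.
# The function under review and the prompt are made up for this example.
from openai import OpenAI

client = OpenAI()

buggy_code = """
def average(xs):
    return sum(xs) / len(xs)  # crashes on an empty list
"""

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user",
         "content": f"Explain what is wrong with this function and fix it:\n{buggy_code}"},
    ],
)

print(response.choices[0].message.content)
```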