OpenAI has shaken the AI world with the release of GPT-4o, its newest flagship generative pre-trained transformer model. The "o" stands for "omni," signifying its expanded capabilities across modalities compared with its predecessor, GPT-4.
Here are the features of GPT-4o:
A Multimodal Mastermind
GPT-4o goes beyond the text-based prowess of previous models. It incorporates the ability to process and generate not just text, but also images and audio. This opens doors for a wider range of applications. Imagine a system that can create realistic images based on a simple text description or generate a musical piece based on a chosen mood.
Today, GPT-4o is much better than any existing model at understanding and discussing the images you share. For example, you can now take a picture of a menu in a different language and talk to GPT-4o to translate it, learn about the food’s history and significance, and get recommendations. In the future, improvements will allow for more natural, real-time voice conversation and the ability to converse with ChatGPT via real-time video. For example, you could show ChatGPT a live sports game and ask it to explain the rules to you. We plan to launch a new Voice Mode with these new capabilities in an alpha in the coming weeks, with early access for Plus users as we roll out more broadly.
— OpenAI, in a press statement
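For developers, the image-understanding capability described in the quote above is reachable through OpenAI's Chat Completions API, where a single user message can mix text and image parts. The sketch below only builds such a request payload as a plain dictionary, following the documented Chat Completions message schema; the image URL is a placeholder, and actually sending the request would additionally require the `openai` SDK and an API key.

```python
# Sketch: constructing a multimodal Chat Completions request for GPT-4o.
# The image URL below is a placeholder; sending this payload would
# require the `openai` package and an API key.

def build_menu_translation_request(image_url: str) -> dict:
    """Build a request payload asking GPT-4o to translate a menu photo."""
    return {
        "model": "gpt-4o",
        "messages": [
            {
                "role": "user",
                # One user message can combine text and image parts.
                "content": [
                    {"type": "text",
                     "text": "Translate this menu into English and "
                             "recommend a dish."},
                    {"type": "image_url",
                     "image_url": {"url": image_url}},
                ],
            }
        ],
    }

payload = build_menu_translation_request("https://example.com/menu.jpg")
print(payload["model"])  # gpt-4o
```

This mirrors the menu-translation scenario from OpenAI's statement: the text part carries the instruction, and the image part carries the photo.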
To make advanced AI more accessible and useful worldwide, GPT-4o’s language capabilities are improved across quality and speed. ChatGPT also now supports more than 50 languages across sign-up and login, user settings, and more.
Focus on Safety and User Control
OpenAI recognizes the potential risks associated with powerful AI models. They have prioritized safety by incorporating safeguards throughout the development process. This includes filtering training data to minimize bias and employing techniques to refine the model’s behavior for responsible generation. Additionally, OpenAI emphasizes user control, allowing users to tailor outputs to their specific needs and preferences.
Accessibility for All
One of the most significant aspects of GPT-4o is its accessibility. While previous models were primarily used by researchers and developers, GPT-4o is being rolled out iteratively across OpenAI’s consumer-facing products. This means everyday users will have the opportunity to interact with and leverage the power of this advanced AI.
OpenAI says that with GPT-4o, ChatGPT Free users will now have access to features such as:
- Experience GPT-4 level intelligence
- Get responses from both the model and the web
- Analyze data and create charts
- Chat about photos you take
- Upload files for assistance summarizing, writing or analyzing
- Discover and use GPTs and the GPT Store
- Build a more helpful experience with Memory
OpenAI is rolling out GPT-4o to ChatGPT Plus and Team subscribers, with availability for Enterprise users to follow shortly. Additionally, it is initiating the rollout to ChatGPT Free with usage constraints starting today. When the limit is reached, ChatGPT will automatically switch to GPT-3.5 so users can continue their conversations. ChatGPT Plus users will enjoy a message threshold up to 5 times larger than that of free users, while Team and Enterprise users will have even more generous limits.
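The fallback behavior described above can be sketched as a simple model-selection rule: once a user exhausts their GPT-4o message allowance, the conversation continues on GPT-3.5. The cap values below are hypothetical placeholders for illustration, not OpenAI's actual limits; only the 5x Plus multiplier comes from the announcement.

```python
# Illustrative sketch of the usage-limit fallback described above.
# FREE_TIER_CAP is a hypothetical placeholder, not OpenAI's real limit;
# the 5x Plus multiplier is the figure cited in the announcement.

FREE_TIER_CAP = 10      # hypothetical GPT-4o messages per time window
PLUS_MULTIPLIER = 5     # Plus users get up to 5x the free-tier cap

def pick_model(messages_used: int, tier: str = "free") -> str:
    """Return the model a chat would use given current usage and tier."""
    cap = FREE_TIER_CAP * (PLUS_MULTIPLIER if tier == "plus" else 1)
    return "gpt-4o" if messages_used < cap else "gpt-3.5-turbo"

print(pick_model(3))            # gpt-4o
print(pick_model(12))           # gpt-3.5-turbo (free cap exceeded)
print(pick_model(12, "plus"))   # gpt-4o (Plus cap is 5x larger)
```

The point of the design is continuity: hitting the cap degrades the model rather than ending the conversation.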
Most media outlets had speculated that OpenAI would be launching a new AI-powered search product to rival Google, but Sam Altman clarified that the release would not include a search engine.