ChatGPT will soon gain enhanced capabilities with the launch of GPT-4o, OpenAI’s latest flagship model. The new model is designed to outperform its predecessors at understanding and discussing images shared by users. According to OpenAI, GPT-4o delivers GPT-4-level intelligence at significantly faster speeds and brings improved capabilities across text, voice, and vision.
With GPT-4o, ChatGPT functions much like Google Assistant or Apple’s Siri, responding with a human-like voice, while retaining the robust features that users rely on daily.
OpenAI has announced that GPT-4o will roll out to ChatGPT Plus and Team users, with Enterprise users gaining access soon. Free ChatGPT users will also have limited access to the new model. Plus subscribers will get a message limit up to five times higher than free users, while Team and Enterprise users will have even higher limits. Interested users can download the app on Android smartphones and iPhones.
GPT-4o brings voice and vision capabilities to ChatGPT. While ChatGPT previously supported only text-based interactions, GPT-4o enables the AI chatbot to communicate naturally through speech. For example, users can take a picture of a menu in another language and ask GPT-4o to translate it; they can also learn about the food’s history and significance and receive recommendations.
“In the future, improvements will allow for more natural, real-time voice conversation and the ability to converse with ChatGPT via real-time video,” the company said.
During a live event, OpenAI CTO Mira Murati demonstrated the chatbot’s capabilities, invoking it with “Hey, ChatGPT” and asking a question in Italian, to which ChatGPT responded in English in real time.
OpenAI plans to introduce a new Voice Mode with additional features in an alpha version in the coming weeks, with early access for ChatGPT Plus users.
ChatGPT with GPT-4o offers enhanced language capabilities in both quality and speed. It now supports more than 50 languages across sign-up, login, and user settings, including Indian languages such as Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Punjabi, Tamil, and Telugu.
Murati also demonstrated that the AI chatbot can now express emotion and modulate its voice, perceiving emotions from live images captured by the phone’s camera and responding in a range of tones, from dramatic to robotic.