OpenAI Launches GPT-4o: The Next Generation of AI Chatbots
Tech advice | by Allan Akers | May 17, 2024
OpenAI has just launched GPT-4o (o for "Omni"), the highly anticipated next generation of its AI chatbot technology. Building on the success and popularity of ChatGPT, GPT-4o promises to be smarter, faster, more multimodal, and better at coding than its predecessors. Here are the highlights of this release:
Key Features and Benefits:
- Free: In a surprising move, OpenAI has made GPT-4o free, perhaps as part of a strategy to scale up to hundreds of millions of users. This will make the technology accessible to a much wider user base.
- Multimodality: GPT-4o can handle text, audio, and images as inputs and outputs. It demonstrates impressive accuracy in generating text from image prompts and designing graphics like film posters from text descriptions.
- Enhanced Coding Capabilities: GPT-4o significantly outperforms previous models in coding tasks and code generation. The difference is stark compared to GPT-4 and GPT-3.5.
- Improved Reasoning and Maths Skills: GPT-4o shows notable improvements on benchmarks testing reasoning and mathematics abilities, although there is still room for further advancement.
- Low-Latency Interactions: A key innovation is the greatly reduced latency, enabling highly realistic, film-like interactions with fast response times. The model can engage in witty, expressive conversations.
- Multilingual Performance: Whilst English remains its strong suit, GPT-4o demonstrates enhanced performance across multiple languages compared to GPT-4. Efficiency gains from the improved tokeniser particularly benefit non-English languages.
- Potential Applications: Early demos showcase GPT-4o's potential for interactive tutoring, live code collaboration, real-time translation, audio/visual analysis, and more. It may bring significant accessibility benefits for blind and visually impaired users.
Opinion on Demos: The demos in the video were impressive and highlighted GPT-4o's enhanced capabilities. The real-time interactions with the model were particularly striking: the low latency made the conversations feel natural and responsive, almost like talking to a real person. This was especially apparent in the demo where GPT-4o quickly adjusted its speaking pace on command.
However, I couldn't help but notice that some of GPT-4o's responses seemed a bit coy and flirtatious. Given Sam Altman's previous statements about not designing their AI systems to maximise engagement, it was surprising to see the model leaning into this type of interaction. It raises questions about the fine line between building warm, personable AI assistants and potentially manipulative systems.
The real-time video analysis was another standout feature. GPT-4o's ability to understand and describe events happening in a live video stream, like the "bunny ears" moment, was quite remarkable. This hints at exciting possibilities for making online content and interactions more accessible for visually impaired users.
At the same time, while GPT-4o was able to engage in casual chat and banter quite naturally, the demo of it attempting to teach Chinese pronunciation was less convincing, with noticeable errors that a native speaker easily picked up on. This suggests that despite its impressive language skills, GPT-4o still has limitations when it comes to mastering the subtleties and nuances of foreign languages.
Pricing and Availability: As mentioned, GPT-4o will be free to use in ChatGPT. For developers using the API, pricing is $5 (approx. £3.93) per 1M input tokens and $15 (approx. £11.81) per 1M output tokens. The model supports a 128k-token context window. General availability is slated for the coming weeks.
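To make the developer pricing concrete, here is a minimal Python sketch that estimates the cost of a single request from the per-token rates quoted above. The function name and the example token counts are illustrative assumptions, not part of any OpenAI library, and the rates are simply the launch prices given in this article, so check current pricing before relying on them.

```python
# Launch prices quoted in this article (USD per 1 million tokens).
# These are assumptions taken from the text and may have changed since.
INPUT_USD_PER_MILLION = 5.00
OUTPUT_USD_PER_MILLION = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens * INPUT_USD_PER_MILLION / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_MILLION / 1_000_000
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: a 10,000-token prompt that produces a 2,000-token reply.
    print(f"${estimate_cost_usd(10_000, 2_000):.2f}")  # prints $0.08
```

Because output tokens cost three times as much as input tokens at these rates, long generated replies tend to dominate the bill more than long prompts do.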
The Road Ahead: OpenAI co-founder Sam Altman hinted that "more stuff" will be shared soon, so further innovations may be on the near-term horizon. Whilst GPT-4o is undoubtedly a huge leap forward, the company seems to be positioning it as an evolutionary step, not the be-all and end-all. Limitations remain in areas like reasoning and hallucination.
Nevertheless, GPT-4o looks poised to further ignite mass adoption of generative AI. With ChatGPT already boasting over 100 million users, a free, multimodal, state-of-the-art chatbot could open up transformative possibilities for hundreds of millions more across education, work, creativity and daily life. The age of pervasive AI assistance is fast approaching - and with it, important questions around how we want these systems to behave and interact with us.