OpenAI has released GPT-4, a new version of the model behind its popular chatbot ChatGPT, which shows markedly improved performance on standardized exams.
According to the company, GPT-4 passed a simulated bar exam with a score around the top 10% of test takers, in contrast to GPT-3.5, whose score was around the bottom 10%.
Describing the AI tool as the latest milestone in its effort to scale up deep learning, OpenAI said GPT-4 is a large multimodal model that accepts image and text inputs and emits text outputs. It noted that while the model is less capable than humans in many real-world scenarios, it exhibits human-level performance on various professional and academic benchmarks.
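To make the text-in, text-out interaction concrete, here is a minimal sketch of calling a GPT-4-class model through OpenAI's Chat Completions API using the Python SDK. The model identifier and prompt are illustrative assumptions, not details from OpenAI's announcement.

```python
# Minimal sketch: send text to a GPT-4-class model and read back its text reply.
# Assumes the OpenAI Python SDK (>= 1.0) and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY automatically

response = client.chat.completions.create(
    model="gpt-4",  # assumed model identifier
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarise the key holding of Marbury v. Madison in two sentences."},
    ],
)

print(response.choices[0].message.content)  # the model's text output
```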
OpenAI said it spent six months iteratively aligning GPT-4 using lessons from its adversarial testing program as well as ChatGPT, which yielded its “best-ever results” on factuality, steerability, and refusing to go outside of guardrails.
GPT-4 vs GPT-3.5, what changed: Like previous GPT models, GPT-4 was trained on publicly available data, including public webpages, as well as data that OpenAI licensed. OpenAI worked with Microsoft to build a “supercomputer” from the ground up in the Azure cloud, which was used to train GPT-4. Highlighting the difference between GPT-4 and GPT-3.5, OpenAI said:
- “In a casual conversation, the distinction between GPT-3.5 and GPT-4 can be subtle. The difference comes out when the complexity of the task reaches a sufficient threshold — GPT-4 is more reliable, creative and able to handle much more nuanced instructions than GPT-3.5.”
GPT-4 also distinguishes itself by its ability to understand images as well as text. It can caption, and even interpret, relatively complex images. Unlike its predecessor GPT-3.5, which accepted only text, GPT-4 accepts both image and text inputs and, as noted above, performs at “human level” on various professional and academic benchmarks.
- “For example, if a user sends a picture of the inside of their refrigerator, the Virtual Volunteer will not only be able to correctly identify what’s in it but also extrapolate and analyze what can be prepared with those ingredients. The tool can also then offer a number of recipes for those ingredients and send a step-by-step guide on how to make them,” OpenAI said in a blog post explaining the image understanding capability of the tool.
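As a rough illustration of that image-understanding workflow, the sketch below shows how an image-plus-text request might look against a vision-capable GPT-4 model via the Chat Completions API. The model name, image URL, and prompt are hypothetical assumptions for illustration; image input was not generally available to all users at launch.

```python
# Hedged sketch: ask a vision-capable GPT-4 model what it sees in a photo
# and what could be cooked with those ingredients.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",  # assumed vision-capable model identifier
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "What ingredients do you see, and what could I cook with them?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/fridge.jpg"}},  # hypothetical image URL
            ],
        },
    ],
)

print(response.choices[0].message.content)  # e.g. identified ingredients plus recipe ideas
```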
Despite its capabilities, OpenAI reminded users that GPT-4, like earlier models, lacks knowledge of events that occurred after the cutoff of the vast majority of its training data (September 2021), and it does not learn from its experience.