OpenAI has reached the latest milestone in scaling up deep learning with the release of GPT-4, a new AI model that accepts image and text inputs and emits text outputs, and that the company considers a powerful addition to its offerings.
Access to GPT-4 is currently limited to OpenAI’s paying customers through ChatGPT Plus, subject to a usage cap. For developers, registration for the API waitlist is available.
GPT-4 is priced per token, where tokens are segments of raw text: the word “accomplishment,” for example, might be split into “ac,” “com,” “plish,” and “ment.” Every 1,000 “prompt” tokens (the input provided to GPT-4) cost $0.03, and every 1,000 “completion” tokens (the output GPT-4 generates) cost $0.06.
Here is a breakdown of GPT-4 pricing, with a quick cost calculation below:
- Every 1,000 “prompt” tokens (approximately 750 words): $0.03
- Every 1,000 “completion” tokens (approximately 750 words): $0.06
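As a rough illustration, here is a minimal Python sketch of how those per-token rates translate into a bill for a single request. The rates are hard-coded from the table above, and the token counts are made-up example values, not real measurements.

```python
# Rough GPT-4 cost estimate for a single request (rates from the table above).
# Token counts below are illustrative placeholders.
PROMPT_RATE = 0.03 / 1000      # dollars per prompt (input) token
COMPLETION_RATE = 0.06 / 1000  # dollars per completion (output) token

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated cost in dollars for one GPT-4 call."""
    return prompt_tokens * PROMPT_RATE + completion_tokens * COMPLETION_RATE

# Example: a ~750-word prompt (~1,000 tokens) and a ~375-word reply (~500 tokens)
print(f"${estimate_cost(1000, 500):.2f}")  # -> $0.06
```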
OpenAI claims that GPT-4 has achieved significantly better results than its predecessor, GPT-3.5. For instance, in a simulated bar exam, GPT-4 scored in the top 10% of test takers, while GPT-3.5 scored in the bottom 10%. The development of GPT-4 was a six-month-long iterative process, incorporating insights from OpenAI’s adversarial testing program and ChatGPT, which led to improved performance on various measures, including factuality, steerability, and staying within defined boundaries. However, OpenAI acknowledges that there is still room for improvement, and GPT-4 is not flawless.
OpenAI is preparing to expand the availability of image input capability and has revealed that it will begin with a single partner, Be My Eyes. Furthermore, the company has decided to open-source its OpenAI Evals framework, which automates the evaluation of AI model performance. This will allow anyone to report shortcomings in AI models and thereby assist in the development of improved models in the future.
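For context, contributing to OpenAI Evals generally means supplying test samples and registering an eval; the authoritative file layout is defined in the openai/evals repository, so the snippet below is only a hedged sketch that writes a couple of hypothetical question–answer samples in the chat-style JSONL format the framework’s basic match evals expect.

```python
# Hedged sketch: write a small JSONL samples file in the chat-style format
# used by OpenAI Evals' basic match evals (each line has an "input" message
# list and an "ideal" answer). The file name and questions are made up.
import json

samples = [
    {
        "input": [
            {"role": "system", "content": "Answer with a single word."},
            {"role": "user", "content": "What is the capital of France?"},
        ],
        "ideal": "Paris",
    },
    {
        "input": [
            {"role": "system", "content": "Answer with a single word."},
            {"role": "user", "content": "What is 2 + 2?"},
        ],
        "ideal": "4",
    },
]

with open("my_eval_samples.jsonl", "w") as f:
    for sample in samples:
        f.write(json.dumps(sample) + "\n")
```

The samples file is then referenced from a YAML entry in the Evals registry; see the openai/evals repository for the exact registration steps.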
What can GPT-4 do?
Here are GPT-4’s main capabilities:
- GPT-4 handles nuanced tasks with greater reliability and creativity than GPT-3.5, a difference that becomes most noticeable once a task crosses a certain threshold of complexity.
- It can process both text and image inputs simultaneously, allowing users to specify a wider range of language and vision tasks. It can generate natural language or code outputs from inputs spanning text and images across a variety of domains, and it can be enhanced with text-based techniques such as few-shot and chain-of-thought prompting. However, the image input capability is currently in the research phase and not yet available to the public.
- It offers “steerability”: developers and users can prescribe the AI’s style and task by describing those directions in the “system” message (see the sketch after this list). This allows for significant customization within certain bounds and offers a more personalized experience. While improvements are still being made, users are encouraged to try it out and provide feedback.
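To make the “system” message idea concrete, here is a minimal sketch using the openai Python package’s chat completions endpoint: the system message pins down style and task, and a short few-shot exchange nudges the output format. The prompt content, the package version (as it existed around GPT-4’s launch), and API/model access are all assumptions.

```python
# Minimal sketch of steerability via the "system" message, plus a tiny
# few-shot example. Assumes the legacy openai Python package (pre-1.0)
# and an OPENAI_API_KEY set in the environment.
import openai

response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[
        # The system message "steers" style and task.
        {"role": "system",
         "content": "You are a terse Socratic tutor. Reply only with a question."},
        # A one-shot example showing the desired behaviour.
        {"role": "user", "content": "Why is the sky blue?"},
        {"role": "assistant",
         "content": "What happens to sunlight as it passes through the atmosphere?"},
        # The actual query.
        {"role": "user", "content": "Why do ships float?"},
    ],
)
print(response.choices[0].message.content)
```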
GPT-4 limitations
- GPT-4 is not fully reliable and may “hallucinate” facts and make reasoning errors.
- Careful consideration should be taken when using language model outputs, especially in high-stakes contexts.
- The protocol used with GPT-4 should match the specific use-case, which may include human review, grounding with additional context, or avoiding high-stakes uses altogether.
- While GPT-4 hallucinates significantly less than earlier models, hallucination remains a real issue.
- GPT-4 scores 40% higher than GPT-3.5 on internal adversarial factuality evaluations.
It’s certainly exciting to witness the rapid evolution of AI, and OpenAI is taking big steps forward. As the company itself concludes, “there’s still a lot of work to do,” and the user community will play a major role in that work.