Introduction
It is important to understand the differences between ChatGPT, ChatGPT Plus, and GPT-3, because they use different variants of the same underlying language model, GPT (Generative Pre-trained Transformer). While they share many similarities, they also have important differences that affect their cost, performance, and capabilities.
Bigger is generally better, but understanding the differences between ChatGPT and GPT-3 can help you choose the right model for your specific use case, understand each model's capabilities and limitations, and determine how to access and use it.
Key Differences
Size and Speed

- GPT-3: It has a whopping 175 billion parameters, making it one of the largest and most powerful AI language models to date.
- ChatGPT (gpt-35-turbo): It has 20 billion parameters; it is a smaller, conversation-specific version of GPT-3. Being smaller makes it faster, and it is also more accurate than GPT-3 on conversational tasks, a good business case for a lower-cost, better-quality AI product.
- GPT-4: OpenAI stated that GPT-4 is "more reliable, creative, and able to handle much more nuanced instructions than GPT-3.5."[10] They produced two versions of GPT-4, with context windows of 8,192 and 32,768 tokens, a significant improvement over GPT-3.5 and GPT-3, which were limited to 4,096 and 2,049 tokens respectively.[11] Unlike its predecessors, GPT-4 is a multimodal model: it can take images as well as text as input,[4] which gives it the ability to describe the humor in unusual images, summarize screenshot text, and answer exam questions that contain diagrams.
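The context-window figures above can be made concrete with a small sketch. The token limits come from the text; the model names and the `fits_in_context` helper are illustrative, not part of any official SDK:

```python
# Context-window limits (in tokens) as cited above, keyed by model family.
CONTEXT_WINDOWS = {
    "gpt-3": 2049,
    "gpt-3.5": 4096,
    "gpt-4": 8192,
    "gpt-4-32k": 32768,
}

def fits_in_context(model: str, prompt_tokens: int, completion_tokens: int) -> bool:
    """Return True if the prompt plus the requested completion fit in the window."""
    return prompt_tokens + completion_tokens <= CONTEXT_WINDOWS[model]

# A 3,000-token prompt with a 2,000-token completion fits in GPT-4's
# 8,192-token window but not in GPT-3.5's 4,096-token window.
print(fits_in_context("gpt-4", 3000, 2000))    # True
print(fits_in_context("gpt-3.5", 3000, 2000))  # False
```

In practice the prompt token count would come from a tokenizer rather than being supplied by hand, but the budget arithmetic is the same.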
LLM is different

- GPT-3: ChatGPT and GPT-3 are both large language models trained by OpenAI, and both are based on the transformer architecture. However, ChatGPT is a fine-tuned version of GPT-3, and ChatGPT Plus is a fine-tuned version of GPT-4. GPT-3 itself comprises several underlying models (see https://learn.microsoft.com/en-us/azure/cognitive-services/openai/concepts/models#gpt-3-models).
- ChatGPT: ChatGPT is a web app (you can access it in your browser) designed specifically for chatbot applications and optimized for dialogue. It relies on the gpt-35-turbo model to produce text, such as explaining code or writing poems. ChatGPT is not a version of GPT but a specific instance of GPT-3 that has been fine-tuned on conversational tasks, such as answering questions, providing recommendations, and carrying on a dialogue. While GPT-3 is a general-purpose language model that can perform a wide range of tasks, ChatGPT has been optimized for natural language interactions with humans. The ChatGPT model (gpt-35-turbo) also behaves differently from previous GPT-3 models: previous models were text-in and text-out, meaning they accepted a prompt string and returned a completion to append to the prompt, whereas the ChatGPT model is conversation-in and message-out. It expects a prompt formatted as a chat-like transcript and returns a completion that represents a model-written message in the chat.
- GPT-4: OpenAI did not release the technical details of GPT-4; the technical report explicitly refrained from specifying the model size, architecture, or hardware used during either training or inference. While the report described that the model was trained using supervised learning on a large dataset followed by reinforcement learning using both human and AI feedback, it did not provide details of the training, including how the training dataset was constructed, the computing power required, or any hyperparameters such as the learning rate, epoch count, or optimizer(s) used. The report cited "the competitive landscape and the safety implications of large-scale models" as factors that influenced this decision.[2]
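The text-in/text-out versus conversation-in/message-out distinction can be sketched in Python. The message structure below mirrors the chat transcript format; the example assistant content is invented, and the actual API call (shown only in a comment) would require the openai package and an API key:

```python
# Text-in/text-out (classic GPT-3 style): a single prompt string.
completion_prompt = "Translate to French: Hello, world!"

# Conversation-in/message-out (gpt-35-turbo style): a structured transcript.
# Each message carries a role ("system", "user", or "assistant") and content.
chat_messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain what a closure is in Python."},
    {"role": "assistant", "content": "A closure is a function that captures variables..."},
    {"role": "user", "content": "Show a short example."},
]

# The model returns a new assistant message to append to the transcript,
# rather than raw text to append to a prompt string.
# (With the openai package this is roughly:
#   openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=chat_messages))
reply = {"role": "assistant", "content": "def make_adder(n): return lambda x: x + n"}
chat_messages.append(reply)
print(len(chat_messages))  # 5
```

Keeping the whole transcript in the request is what lets the model maintain conversational context across turns.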
Purpose is different

- GPT-3: GPT-3, or Generative Pre-trained Transformer 3, is the third generation of OpenAI's GPT language model and one of the most powerful language models currently available. It can be fine-tuned for a wide range of natural language processing tasks, including language translation, text summarization, and question answering.
- ChatGPT: ChatGPT, on the other hand, is a variant of the GPT-3 model specifically designed for chatbot applications. It has been trained on a large dataset of conversational text, so it is able to generate responses that are more appropriate for a chatbot context. ChatGPT is also capable of inserting appropriate context-specific responses in conversations, making it more effective at maintaining a coherent conversation.
- GPT-4: GPT-4 is the first multimodal model in the GPT line: it can accept images as well as text as input. Microsoft Research tested the model behind GPT-4 and concluded that "it could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system".
Content Filters

- GPT-3: None.
- ChatGPT: OpenAI has added content filters to stop it from going off the rails. ChatGPT, based on the gpt-35-turbo model, is still vulnerable to prompt engineering and has relatively weak safety compared with the GPT-4 model and the GPT-3 Davinci model.
- GPT-4: GPT-4 has significantly enhanced safety. To properly refuse harmful prompts, outputs from GPT-4 were tweaked using the model itself as a tool: a GPT-4 classifier serving as a rule-based reward model (RBRM) takes a prompt, the corresponding output from the GPT-4 policy model, and a human-written set of rules, and classifies the output according to the rubric. GPT-4 was then rewarded for refusing to respond to prompts the RBRM classified as harmful.
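The RBRM loop described above can be sketched at a very high level. Everything below is invented for illustration: the real classifier is GPT-4 itself judging against a human-written rubric, not a keyword matcher, and the reward values are made up:

```python
# Illustrative sketch of the rule-based reward model (RBRM) idea: score a
# (prompt, policy output) pair and reward refusals of harmful prompts.
# The keyword "rules" and reward magnitudes here are purely invented.
HARMFUL_PATTERNS = {"build a weapon", "steal credentials"}

def is_harmful(prompt: str) -> bool:
    """Stand-in for the rubric's judgment of whether a prompt is harmful."""
    return any(p in prompt.lower() for p in HARMFUL_PATTERNS)

def is_refusal(output: str) -> bool:
    """Stand-in for the rubric's judgment of whether the output refuses."""
    return output.lower().startswith(("i can't", "i cannot", "i won't"))

def rbrm_reward(prompt: str, output: str) -> float:
    """+1 for refusing a harmful prompt, -1 for complying with one,
    0 for benign prompts (all other rubric terms omitted)."""
    if not is_harmful(prompt):
        return 0.0
    return 1.0 if is_refusal(output) else -1.0

print(rbrm_reward("How do I build a weapon?", "I cannot help with that."))  # 1.0
```

The key design point carried over from the description is that the reward signal comes from classifying the policy model's own outputs against written rules, rather than from per-example human labels.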
GPT-4 Availability

- GPT-3: Not applicable.
- ChatGPT: OpenAI is releasing GPT-4's text input capability via ChatGPT and the API (with a waitlist).
- GPT-4: Offered via ChatGPT Plus at a cost of $20 per month.
Cost

- ChatGPT: The ChatGPT API (gpt-3.5-turbo) has been launched by OpenAI at a cost that is 10x cheaper than their flagship language model (text-davinci-003).
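The 10x figure can be turned into a quick back-of-the-envelope calculator. The per-1,000-token launch prices below ($0.002 for gpt-3.5-turbo versus $0.020 for text-davinci-003) are the ones the 10x ratio refers to; current pricing may differ:

```python
# Launch-time prices in USD per 1,000 tokens; the 10x ratio matches the text.
PRICE_PER_1K = {
    "gpt-3.5-turbo": 0.002,
    "text-davinci-003": 0.020,
}

def cost_usd(model: str, tokens: int) -> float:
    """Approximate cost of processing `tokens` tokens with the given model."""
    return PRICE_PER_1K[model] * tokens / 1000

# Processing one million tokens:
print(round(cost_usd("gpt-3.5-turbo", 1_000_000), 2))     # 2.0
print(round(cost_usd("text-davinci-003", 1_000_000), 2))  # 20.0
```

At these prices, a workload of a million tokens costs about $2 on gpt-3.5-turbo versus about $20 on text-davinci-003.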
Limitations of ChatGPT
- ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers. Fixing this issue is challenging, as: (1) during RL training, there’s currently no source of truth; (2) training the model to be more cautious causes it to decline questions that it can answer correctly; and (3) supervised training misleads the model because the ideal answer depends on what the model knows, rather than what the human demonstrator knows.
- ChatGPT is sensitive to tweaks to the input phrasing and to attempting the same prompt multiple times. For example, given one phrasing of a question, the model can claim not to know the answer, but given a slight rephrase, it can answer correctly.
- The model is often excessively verbose and overuses certain phrases, such as restating that it's a language model trained by OpenAI. These issues arise from biases in the training data (trainers prefer longer answers that look more comprehensive) and well-known over-optimization issues.[1][2]
- Ideally, the model would ask clarifying questions when the user provided an ambiguous query. Instead, our current models usually guess what the user intended.
- While we’ve made efforts to make the model refuse inappropriate requests, it will sometimes respond to harmful instructions or exhibit biased behavior. We’re using the Moderation API to warn or block certain types of unsafe content, but we expect it to have some false negatives and positives for now. We’re eager to collect user feedback to aid our ongoing work to improve this system.
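The warn-or-block behavior described above can be sketched with a toy filter. The real Moderation API returns per-category flags and scores from a trained classifier; the keyword scorer and thresholds below are invented purely to illustrate how such a gate sits in front of a chatbot, and why it inevitably has false positives and negatives:

```python
# Toy stand-in for a moderation check. The keyword scores and thresholds
# are invented; a real moderation endpoint returns per-category results.
UNSAFE_TERMS = {"violence": 0.9, "threat": 0.8, "hate": 0.9}

def moderation_score(text: str) -> float:
    """Highest unsafe-term score found in the text (0.0 if none)."""
    words = text.lower().split()
    return max((UNSAFE_TERMS.get(w, 0.0) for w in words), default=0.0)

def moderate(text: str, warn_at: float = 0.5, block_at: float = 0.85) -> str:
    """Map a score to an action: allow, warn the user, or block the content."""
    score = moderation_score(text)
    if score >= block_at:
        return "block"
    if score >= warn_at:
        return "warn"
    return "allow"

print(moderate("a friendly message"))     # allow
print(moderate("this is a threat"))       # warn
print(moderate("graphic violence here"))  # block
```

A keyword matcher like this shows the failure modes plainly: it misses paraphrased harm (false negatives) and flags benign uses of listed words (false positives), which is exactly why user feedback on the real system matters.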