Artificial Intelligence (AI) and Machine Learning (ML) have taken over the world. You can find these technologies everywhere, from speech recognition to self-driving cars! In this article, we'll focus on one of the most impressive breakthroughs in the field yet: GPT-3. We will explain the GPT-3 model, its uses, and its limitations. Are you ready?
GPT-3 is a third-generation, text-generating neural network and Machine-Learning system created by OpenAI. The Generative Pre-trained Transformer 3 was trained on roughly 45 TB of text data and uses 175 billion parameters to produce human-like text. The model is impressive because, at release, it was about ten times larger than any language model created before it. GPT-3 is a considerable step up from the previous version, GPT-2, which used "only" 1.5 billion parameters.
We now know what the GPT-3 model is. But what does "Generative Pre-trained Transformer" mean? Let's review each term that makes up this Machine-Learning system.
In Machine Learning, there are two main types of models: discriminative and generative. The difference between them is how they approach classification tasks. A discriminative (or conditional) model learns the boundaries between classes in a dataset. Because it focuses only on class differences, it can't create new data points. A generative model goes further: it learns the underlying distribution of the data it is fed, so it can create new data similar to what it has seen.
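To make the distinction concrete, here is a toy sketch, assuming scikit-learn and NumPy are available: a logistic regression (discriminative) only learns to separate two classes, while a Gaussian Naive Bayes classifier (generative) models each class's distribution, which also lets us draw brand-new points. The dataset is synthetic and purely illustrative.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

# Synthetic 2-D dataset with two classes.
X, y = make_blobs(n_samples=200, centers=2, random_state=0)

# Discriminative: learns the boundary P(y|x), but has no notion of how the data "looks".
disc = LogisticRegression().fit(X, y)
print("Discriminative accuracy:", disc.score(X, y))

# Generative: models each class as a Gaussian, P(x|y), so we can sample new data points.
gen = GaussianNB().fit(X, y)
mean_0, var_0 = gen.theta_[0], gen.var_[0]  # per-feature mean/variance of class 0 (var_ in recent scikit-learn)
new_points = np.random.normal(mean_0, np.sqrt(var_0), size=(5, 2))
print("Five synthetic class-0 points:\n", new_points)
```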
The fact that GPT-3 is pre-trained means it has already been trained on a huge body of text. From that starting point, it can be adapted with task-specific parameters. Like humans, pre-trained models don't need to learn everything from scratch: they can reuse old knowledge and apply it to new tasks.
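As a minimal sketch of this idea, assuming the Hugging Face transformers library is installed, we can load GPT-3's openly available smaller sibling, GPT-2, and generate text right away, with no training from scratch:

```python
from transformers import pipeline

# Download the pre-trained GPT-2 weights and wrap them in a text-generation pipeline.
generator = pipeline("text-generation", model="gpt2")

# The pre-trained weights already "know" English, so one call is enough
# to get a plausible continuation of the prompt.
print(generator("Machine learning is", max_length=20, num_return_sequences=1))
```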
Lastly, a Transformer is a type of neural network introduced in 2017. Transformers arose to solve problems in machine translation. Since their launch, they have evolved, and their uses have extended beyond language processing into tasks such as time-series forecasting.
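At the heart of a Transformer is the attention mechanism, which lets every token weigh every other token when building its representation. Below is a bare-bones NumPy sketch of scaled dot-product attention; real Transformers add multiple heads, learned projection matrices, masking, and normalization, so treat this as an illustration only.

```python
import numpy as np

def attention(Q, K, V):
    """Each output row is a weighted mix of the value rows V,
    with weights based on how well the queries Q match the keys K."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # similarity of every query to every key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V

# Three "tokens", each represented by a 4-dimensional vector.
x = np.random.rand(3, 4)
print(attention(x, x, x).shape)  # (3, 4): every token now attends to the others
```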
Back in 2018, OpenAI launched its first Generative Pre-trained Transformer. It had around 117 million parameters. Its most remarkable breakthrough was its ability to carry out zero-shot tasks. Yet GPT-1 had limitations, so OpenAI moved on to the next stage. You can access GPT-1's original documentation as a PDF.
The second GPT, released in 2019, was trained on a larger dataset and had ten times the parameters (1.5 billion). The main difference between GPT-1 and GPT-2 is that the latter can multitask: GPT-2 was able to translate text, summarize passages, and answer questions. While it could also create text, its results were often repetitive and nonsensical. Hence, GPT-3 was the obvious next phase. As we'll see in this article, it brought significant improvements.
GPT-3 uses the data it was fed to estimate the likelihood of a word appearing in a text. To do so, it also weighs the other words in the text to understand how they are connected. Given the vast number of parameters, GPT-3 can meta-learn: when given a single example, the system can perform a task without further training. Currently, GPT-3 works online and is free to try. It also has its own API and a GPT-3 demo page to put the tool to the test.
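As an illustration, here is a minimal sketch of calling GPT-3 through the openai Python package's legacy Completion endpoint. The model name, pricing, and endpoint details may have changed since this article was written, so check OpenAI's current documentation before relying on it.

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder: use your own key from the OpenAI dashboard

response = openai.Completion.create(
    model="text-davinci-002",  # one model of the GPT-3 family
    prompt="Explain what a transformer neural network is in one sentence.",
    max_tokens=60,
    temperature=0.7,           # higher values produce more creative output
)

print(response["choices"][0]["text"].strip())
```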
OpenAI used almost all available internet data to pre-train GPT-3, and the model can be put to work through four approaches:
GPT-3 receives fine-tuning when it is first provided with a vast dataset for unsupervised learning and later adapted to specific tasks with supervised learning on smaller datasets. You can learn more about the fine-tuning process on OpenAI's page.
Few-shot learning, also called low-shot learning, entails providing the model with several examples of how to complete a specific task. It enables GPT-3 to infer the intended task and produce a plausible outcome.
One-shot learning is like the few-shot approach; the only difference is that a single example is given.
Zero-shot learning provides no examples at all; only the task description is given. The sketch below contrasts these three prompt styles.
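Here are the three prompt styles for the same English-to-French translation task (the example pairs mirror the ones used in the GPT-3 paper). These are plain prompt strings; any of them could be sent to the Completion endpoint sketched earlier.

```python
# Zero-shot: only the task description, no examples.
zero_shot = "Translate English to French: cheese =>"

# One-shot: the task description plus a single worked example.
one_shot = (
    "Translate English to French:\n"
    "sea otter => loutre de mer\n"
    "cheese =>"
)

# Few-shot: the task description plus several worked examples.
few_shot = (
    "Translate English to French:\n"
    "sea otter => loutre de mer\n"
    "peppermint => menthe poivrée\n"
    "plush giraffe => girafe en peluche\n"
    "cheese =>"
)
```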
Above, we saw the different training approaches GPT-3 received. Now it's time to look at real-life examples of the GPT-3 model in action!
This AI pair programmer is the outcome of GitHub and OpenAI's joint work. It uses OpenAI's Codex to suggest lines or whole functions inside editors such as the JetBrains IDEs and Visual Studio Code. GitHub Copilot has achieved excellent results with Python, TypeScript, and Java.
In this case, GPT-3 was used to create realistic chatbots to talk to online. Project December, created by Jason Rohrer, is available for $10. You can feed the bot text to train it, and it learns from your input as you talk to it.
Andrew Mayne built AI Writer, which lets people interact with historical figures by email. Mayne has also used OpenAI to create simple versions of popular games such as Wordle and Zelda.
The Guardian, the well-known newspaper, used GPT-3 to write an essay about humans and robots living together in peace.
As you can see, the GPT-3 model has proven its value. But now we'd like to focus on its most impressive offshoot yet. Below, we'll review Dall-E, a system built from GPT-3.
The Dall-E project saw the light of day in January 2021. It produces images from natural-language text captions alone. The system uses a 12-billion-parameter version of GPT-3 trained for this purpose on millions of images paired with their captions.
In April 2022, OpenAI announced the release of Dall-E 2.
The upgrade improved on its predecessor's realism and ability to understand prompts. Dall-E 2 generates images with four times the resolution of the previous version. It also allows other enhancements, like adding or removing elements from existing images, while taking shadows, reflections, and textures into consideration. Hence, its results are impressive.
Today, Dall-E takes a realistic approach to users' prompts. It also recognizes famous art styles and color palettes. You can upload pictures to its server, erase backgrounds, and choose the style of the outcome. Another fun possibility is its "surprise me" feature. Beyond giving fantastic results, it also helps you understand the logic behind its algorithm.
We can agree that GPT-3 shows impressive potential and is an enormous step forward in Artificial Intelligence. But like every new tool, it has its shortcomings.
One of the issues GPT-3 faces is bias: removing it from the system is an ongoing effort. The biases found in GPT-3 include gender, race, and religion. The model is also prone to spreading fake news, as it can produce convincingly human-like articles. There is also much debate about GPT-3's carbon footprint: the resources needed to train such AIs are not only enormous but ever-growing. Hence, its environmental cost is troubling at a time when sustainability is a pressing social concern.
Let's take a look at the most common questions about this model.
As mentioned, GPT-3 is a product of OpenAI. This AI research and development laboratory was founded in 2015 by, among others, Elon Musk and Sam Altman. Its stated goal is to create artificial intelligence that benefits humanity. In 2016, OpenAI developed the OpenAI Gym, "a toolkit for developing and comparing Reinforcement Learning (RL) algorithms." Its later work includes research on multimodal neurons in artificial neural networks and, from 2021, Dall-E.
GPT-3 only needs a textual demonstration of a task to work. It can perform tasks like:
● Translate between common languages.
● Write news articles from just a title.
● Write stories, poems, and music.
● Write technical documentation.
● Write software code.
● Create PR material.
● Create SQL queries.
Yes, GPT-3 can create code in several programming languages. Yet this does not mean developers will be replaced. GPT-3 and AI will most likely take over mundane tasks, for example by helping cut bottlenecks in software production. Hence, devs and engineers will be able to focus on more creative tasks.
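As a hedged illustration, here is how one might prompt a GPT-3 family model to draft a SQL query, reusing the legacy Completion call sketched earlier (it assumes openai.api_key is already set). The table schema, column names, and prompt format are invented for this example.

```python
import openai

prompt = (
    "### PostgreSQL table: orders(id, customer_id, total, created_at)\n"
    "### Write a query that returns the ten largest orders placed in 2022.\n"
    "SELECT"
)

response = openai.Completion.create(
    model="text-davinci-002",
    prompt=prompt,
    max_tokens=80,
    temperature=0,  # deterministic output suits code generation
    stop=[";"],     # stop once the statement is complete
)

print("SELECT" + response["choices"][0]["text"])
```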
GPT-3 is an exciting Machine-Learning system. From what we discussed above, the model has a lot of potential. Yet it still needs some adjusting before it is optimal for widespread use. We look forward to the next stage and to seeing its shortcomings addressed! Are you excited to see more of GPT-3 in action? What would you use it for?