The recent spate of announcements by tech titans such as Microsoft, Google, Apple, OpenAI, NVIDIA and others has started a serious buzz among technology gurus and business leaders. This buzz is a continuation of the overarching headlines emanating out of Davos 2024, where the consensus was that AI, and Generative AI in particular (it was mentioned by name), is the means to, firstly, transform society and, secondly, achieve greater revenues. While computer science graduates are revelling in the availability of new AI technologies, most of us are not sure what the buzz is about. Sure, we are all using ChatGPT, but how is this going to transform our lives? This article attempts to unpack the technologies associated with AI, especially Generative AI, which is at the heart of the buzz.
What is Generative AI?
To answer this, we need to go one step back and properly understand Artificial Intelligence (AI). Broadly speaking, AI can be equated to a discipline. Think of science as a discipline; within science we get chemistry, physics, microbiology, etc.; in the same way, AI is a broad discipline, and within AI there are several subsets such as ML (Machine Learning), algorithms that perform specific tasks, Expert Systems (which mimic human expertise in specific topics to support decision making), Generative AI, etc.

Generative AI (Gen AI) has been making significant strides, especially since December 2022. On 30 November 2022, OpenAI released ChatGPT, which reached 100 million users in just 2 months, compared to 78 months for Google Translate, 20 months for Instagram, and 9 months for TikTok. Generative AI is a major advancement, referring to AI that creates new content, such as text, images, language translations, audio, music, and code. While currently focused on these outputs, Gen AI's potential is vast and could eventually encompass areas like urban planning, therapies, virtual sermons, and esoteric sciences. Generative AI is essentially a subset or specialized form of AI, akin to how chemistry is a subset of science. In AI terminology, these systems are called "models", with ChatGPT being one example.
Unpacking GPT
The term "Chat" in ChatGPT signifies a conversation, whether through text or voice, between the user and the system. "GPT" stands for Generative Pre-trained Transformer. "Generative" refers to the AI's ability to create original content, while "Pre-trained" highlights a core concept in AI whereby models are trained on vast datasets to perform specific tasks, such as translation between languages. For instance, a translation model cannot tell you how fast a Ferrari goes, but it can explain linguistic origins, such as "Ferrari" deriving from the Italian word for "blacksmith". This capability is honed through deep learning, where the model learns associations and context from extensive data. The training process involves predicting the next word in a sequence based on the prior words, which can sometimes lead to errors known as "hallucinations" – unexpected outputs such as "the pillow is a tasty rice dish". This demonstrates how AI learns and operates within defined parameters, without human intuition.
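To make "predicting the next word" concrete, here is a deliberately tiny sketch of the idea: count which word tends to follow which in a small corpus, then always suggest the most frequent follower. Real LLMs learn these patterns with neural networks over billions of words; the corpus and words below are invented purely for illustration.

```python
from collections import Counter, defaultdict

# Toy next-word prediction: count which word follows which in a tiny,
# made-up corpus, then predict the most frequently observed follower.
corpus = (
    "the pillow is soft . the pillow is on the bed . "
    "risotto is a tasty rice dish . the bed is soft ."
).split()

follows = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    follows[current_word][next_word] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the corpus."""
    candidates = follows.get(word)
    return candidates.most_common(1)[0][0] if candidates else None

print(predict_next("pillow"))  # -> 'is'
print(predict_next("tasty"))   # -> 'rice'
```

Even this toy version shows why "hallucinations" happen: the model only extends sequences that look statistically plausible, and a plausible-looking sequence is not always a true one.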
The key here is that the model has to be trained, firstly, on vast amounts of data and, secondly, with meticulous attention. This leads us to another common phrase, or piece of jargon, used in the AI world – Large Language Models, or LLMs. In fact, ChatGPT is a Large Language Model! If we had to define an LLM, it could be described as a next-word prediction tool. Where do the developers of LLMs get the data to carry out the pre-training? They download an entire corpus of data, mainly from websites such as Wikipedia, Quora, public social media, GitHub, Reddit, etc. It is worth mentioning here that it reportedly cost OpenAI $1b (yes, one billion US dollars) to create and train ChatGPT – funded by the likes of Elon Musk and Microsoft. Perhaps that is why it is not an open-source model!
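ChatGPT itself is not open source, but a much smaller open-source predecessor, GPT-2, can be run through the Hugging Face transformers library. The sketch below is a minimal example, assuming that library (with PyTorch) is installed and the model weights can be downloaded; it shows a pretrained LLM doing nothing more than repeatedly predicting the next word.

```python
# Minimal sketch using the open-source GPT-2 model via Hugging Face transformers
# (not ChatGPT itself, which is not publicly downloadable).
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# The model simply keeps predicting plausible next words after the prompt.
result = generator("A ciabatta is a type of", max_new_tokens=10, num_return_sequences=1)
print(result[0]["generated_text"])
```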
Let's now unpack the 'T' of 'GPT', which stands for Transformer. The Transformer is the 'brain' of Gen AI; it may be defined as a machine learning model – specifically, a neural network containing 2 important components: an Encoder and a Decoder. Here's a simple question that could be posed to ChatGPT: "What is a ciabatta loaf?". Upon typing the question into ChatGPT, it goes into the Transformer's Encoder. The 2 operative words in the question are 'ciabatta' and 'loaf'. The word 'ciabatta' has 2 possible contexts – footwear and Italian sourdough bread ('ciabatta' means slipper; since the bread is shaped like a slipper, it is called 'ciabatta').

In the context of "loaf", ChatGPT, as a pre-trained model, would prioritize food items over other meanings. For instance, given "loaf", it would likely choose "bread" over "footwear", recognizing "ciabatta bread" as a specific example. The model processes words sequentially and can predict associations, such as identifying ciabatta as an Italian sourdough bread. However, ChatGPT's responses aren't always flawless, as accuracy depends on its training and fine-tuning. Despite occasional errors, its answers are often remarkably precise, reflecting meticulous development involving techniques like "attention", which enhances the model's ability to focus on the most relevant details when processing data.
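That "attention" technique can also be sketched in a few lines. Inside a Transformer, every word is represented as a vector of numbers, and attention computes how strongly each word should "look at" every other word when building its meaning in context. The vectors below are made up solely for illustration; real models learn embeddings with hundreds or thousands of dimensions.

```python
import numpy as np

def scaled_dot_product_attention(queries, keys, values):
    """Core Transformer operation: weigh each value by how well its key matches the query."""
    d_k = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d_k)                               # word-to-word similarity
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # softmax to proportions
    return weights @ values, weights

# Toy 2-dimensional "embeddings" for three tokens; the numbers are invented.
embeddings = np.array([
    [0.9, 0.1],   # ciabatta (bread-like meaning)
    [1.0, 0.0],   # loaf     (bread)
    [0.1, 0.9],   # slipper  (footwear)
])

# Let "ciabatta" attend over all three tokens.
_, weights = scaled_dot_product_attention(embeddings[0:1], embeddings, embeddings)
print(weights.round(2))  # 'ciabatta' attends most strongly to 'loaf', far less to 'slipper'
```

The surrounding word "loaf" pulls "ciabatta" towards its bread sense, which is exactly the disambiguation described above.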
Did you know that Gen AI was in use well before the advent of ChatGPT? In 2006, Google Translate became the first Gen AI tool available to the public. If you fed in, for example, "Directeur des Ventes" and asked Google Translate to translate the French into English, it would return "Sales Manager". (By the way, the Transformer architecture was first developed at Google.) Then, in 2011, we were mesmerised by Siri, which was initially such a popular 'toy' among iPhone users. Amazon's Alexa followed, together with chatbots and virtual assistants that became a ubiquitous feature of our lives – these are all Gen AI models. As can be seen, we've been using Gen AI for a while; it's just that no one told us these 'things' were Generative AI models!