Thought Leadership

A Dummy’s Guide to GENERATIVE AI

The recent spate of announcements by tech titans such as Microsoft, Google, Apple, OpenAI, NVIDIA, et al. has started a serious buzz among technology gurus and business leaders. This buzz continues the headlines that emanated from Davos 2024, where the consensus was that AI, and Generative AI specifically, is the means to, firstly, transform society and, secondly, achieve greater revenues. While computer science graduates are revelling in the availability of new AI technologies, most of us are not sure what the buzz is about. Sure, we are all using ChatGPT, but how is this going to transform our lives? This article attempts to unpack the technologies associated with AI, especially Generative AI, which is at the heart of the buzz.

What is Generative AI?

To answer this, we need to go one step back and properly understand Artificial Intelligence (AI). Broadly speaking, AI can be equated to a discipline. Think of science as a discipline: within science we get chemistry, physics, microbiology, etc. In the same way, AI is a broad discipline, and within AI there are several subsets such as Machine Learning (ML), algorithms to perform specific tasks, Expert Systems (mimicking human expertise in specific topics to support decision making), Generative AI, etc.

Generative AI (Gen AI) has been making significant strides, especially since December 2022. On 30 November 2022, OpenAI released ChatGPT, which reached 100 million users in just 2 months, compared to 78 months for Google Translate, 30 months for Instagram, and 9 months for TikTok. Generative AI refers to AI that creates new content, such as text, images, language translations, audio, music, and code. While currently focused on these outputs, Gen AI’s potential is vast and could eventually encompass areas like urban planning, therapies, virtual sermons, and esoteric sciences. Generative AI is essentially a subset or specialized form of AI, akin to how chemistry is a subset of science. In AI terminology, these systems are called “models,” with ChatGPT being one example.

Unpacking GPT

The term “Chat” in ChatGPT signifies a conversation, whether through text or voice, between the user and the system. “GPT” stands for Generative Pre-trained Transformer. “Generative” refers to the AI’s ability to create original content, while “Pre-trained” highlights a core concept in AI: models are trained in advance on vast datasets so that they can perform specific tasks, like translation between languages. For instance, a translation model can’t provide insights such as a Ferrari’s speed, but it can explain linguistic origins, such as Ferrari deriving from the Italian word for “blacksmith”. This capability is honed through deep learning, where the model learns associations and context from extensive data. The training process involves predicting the next word in a sequence based on the prior words, which can sometimes lead to errors known as “hallucinations”: fluent but wrong outputs such as “the pillow is a tasty rice dish”. This demonstrates how AI learns and operates within defined parameters without human intuition.
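For readers who like to see things in action, the idea of next-word prediction can be sketched in a few lines of Python. This is only a toy under stated assumptions: the three training sentences are made up for illustration, the function names (train, predict_next_word, complete) are hypothetical, and the "model" simply counts which word most often follows another. Real models such as ChatGPT are vastly more sophisticated, but the sketch shows how a system that only predicts the next word can produce a confident-sounding hallucination.

```python
from collections import Counter, defaultdict

# Made-up training text, purely for illustration (an assumption, not real data).
TRAINING_TEXT = (
    "the pillow is soft . "
    "the paella is a tasty rice dish . "
    "the risotto is a tasty rice dish ."
)


def train(text):
    """Count, for every word, which words follow it and how often."""
    follow_counts = defaultdict(Counter)
    words = text.split()
    for current_word, next_word in zip(words, words[1:]):
        follow_counts[current_word][next_word] += 1
    return follow_counts


def predict_next_word(model, word):
    """Return the most frequent follower of `word`, or None if never seen."""
    followers = model.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def complete(model, prompt, max_new_words=10):
    """Extend the prompt one predicted word at a time, stopping at a full stop."""
    words = prompt.split()
    for _ in range(max_new_words):
        next_word = predict_next_word(model, words[-1])
        if next_word is None or next_word == ".":
            break
        words.append(next_word)
    return " ".join(words)


model = train(TRAINING_TEXT)
print(complete(model, "the pillow"))
# prints: the pillow is a tasty rice dish
```

The toy model has seen “is a tasty rice dish” more often than “is soft”, so it follows the statistics rather than common sense. It also looks only at the single previous word, which is exactly the limitation that larger models overcome by taking much more context into account.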

The key here is that the model has to be trained, firstly, on vast amounts of data and, secondly, with meticulous attention. This leads us to another common piece of jargon used in the AI world: Large Language Models, or LLMs. In fact, ChatGPT is a Large Language Model! If we have to define an LLM, it could be defined as a next-word prediction tool. Where do the developers of LLMs get the data to carry out the Pre-training? They download an enormous corpus of data, mainly from websites such as Wikipedia, Quora, public social media, GitHub, Reddit, etc. It is worth mentioning here that it reportedly cost OpenAI $1b (yup, one billion USD) to create and train ChatGPT; they were funded by Elon Musk, Microsoft, etc. Perhaps that is why it is not an open-source model!

Let’s now unpack the ‘T’ of ‘GPT’, which refers to Transformer. This is the ‘brain’ of Gen AI. A Transformer is a type of machine learning model: a neural network that, in its original design, contains 2 important components, an Encoder and a Decoder (GPT models use a decoder-only variant, but the idea of working out context is the same). Here’s a simple question that could be posed to ChatGPT: “What is a ciabatta loaf?”. Upon typing the question in ChatGPT, the question goes into the Transformer, which first has to work out what the words mean in context (the Encoder’s job in the original design). The 2 operative words in the question are ‘ciabatta’ and ‘loaf’. The word ‘ciabatta’ has 2 possible contexts: footwear and Italian sourdough bread (‘ciabatta’ means slipper; since the bread is shaped like a slipper, it is called ‘ciabatta’).

Ciabatta Loaf

In the context of “loaf,” ChatGPT, a Pre-trained model, would prioritize food items over other meanings. For instance, given “loaf,” it would likely choose “bread” over “footwear,” recognizing “ciabatta bread” as a specific example. The model generates its answer one word at a time and can predict associations, such as identifying ciabatta as an Italian sourdough bread. However, ChatGPT’s responses aren’t always flawless, as accuracy depends on its training and fine-tuning. Despite occasional errors, its answers are often remarkably precise, reflecting meticulous development involving techniques like “attention,” which lets the model focus on the most relevant words in the text it is processing.
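To make “attention” a little more concrete, here is a minimal Python sketch of the idea, not the way ChatGPT is actually built. It rests on loud assumptions: each word is given a hand-made two-number description (how food-like and how footwear-like it is), whereas real models learn such numbers themselves and use thousands of them; the names WORD_VECTORS and attend are hypothetical. The sketch scores how relevant every word in the question is to “ciabatta”, turns the scores into weights, and blends the word descriptions accordingly.

```python
import math

# Hand-made word descriptions: [food-ness, footwear-ness].
# These numbers are invented for illustration; real models learn them.
WORD_VECTORS = {
    "what": [0.0, 0.0],
    "is": [0.0, 0.0],
    "a": [0.0, 0.0],
    "ciabatta": [1.0, 1.0],  # ambiguous: could be bread or a slipper
    "loaf": [2.0, 0.0],      # clearly about food
}


def softmax(scores):
    """Turn raw scores into positive weights that add up to 1."""
    highest = max(scores)
    exps = [math.exp(score - highest) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def attend(query_word, sentence):
    """Weigh every word in the sentence by its relevance to query_word."""
    query = WORD_VECTORS[query_word]
    keys = [WORD_VECTORS[word] for word in sentence]
    scale = math.sqrt(len(query))
    # Relevance score: scaled dot product between the query and each word.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # Context: weighted blend of the word descriptions.
    context = [
        sum(weight * key[i] for weight, key in zip(weights, keys))
        for i in range(len(query))
    ]
    return weights, context


sentence = ["what", "is", "a", "ciabatta", "loaf"]
weights, (food, footwear) = attend("ciabatta", sentence)
for word, weight in zip(sentence, weights):
    print(f"{word:>8}: {weight:.2f}")
print("food-ness:", round(food, 2), "footwear-ness:", round(footwear, 2))
```

In this toy setup, “loaf” receives far more weight than filler words like “a”, and the blended result for “ciabatta” ends up more food-like than footwear-like, which mirrors how the context of “loaf” tips the meaning towards bread.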


Did you know that Gen AI has been in use well before the advent of ChatGPT? In 2006, Google Translate was arguably the first Gen AI tool available to the public: if you fed in, for example, “Directeur des Ventes” and asked Google Translate to translate the French into English, it would return “Sales Manager”. (By the way, the Transformer was first introduced by researchers at Google.) Then in 2011 we were mesmerised by Siri, which was initially such a popular ‘toy’ among iPhone users. Amazon’s Alexa followed, together with chatbots and virtual assistants that became a ubiquitous feature of our lives; these are all Gen AI models. As can be seen, we’ve been using Gen AI for a while; however, no one told us that these ‘things’ were Generative AI models!