What does “GPT” even mean anyway?

Trying to understand AI can feel like trying to swim your way out of a bowl of confusing, high-tech alphabet soup. Let’s break down one of the most commonly used acronyms in discussions about AI.

So you’ve heard of ChatGPT. Probably even used it a good bit if you’re reading this. But if I asked you right now, street interview style, what GPT stands for, would you have any idea? (Get Prompt Text? Genius Personally Trained?? Just some made up letters that look official?)

Absolutely no worries if you don’t know the answer - because that’s exactly what we’re going to learn today. 

What “GPT” means (in plain English)

  • Generative: it creates text (and, in newer models, can handle images/audio).

  • Pre-trained: before you ever use it, it’s trained on a large mix of internet and licensed data to learn patterns in language.

  • Transformer: the neural-network design that lets it weigh which words in your prompt matter most. Instead of reading your words one by one and treating them all equally, it looks at all of them at once and decides which ones are important. That gives it a much better “understanding” of what you’re asking (and, as a result, gives you a much better output). If you’re curious what “weighing words” roughly looks like, there’s a tiny code sketch right after this list.
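
Here’s that sketch - just made-up numbers and the bare-bones “attention” math in Python. To be clear, this is nowhere near how GPT is actually built; it’s only meant to show the shape of the idea.

    # A toy sketch of the "weigh which words matter" idea behind a transformer.
    # Real models use vectors with thousands of learned numbers per word; here
    # we just make up tiny random ones so the math stays visible.
    import numpy as np

    words = ["the", "cat", "sat"]
    vectors = np.random.rand(len(words), 4)  # a pretend 4-number "meaning" per word

    # Every word gets compared against every other word at once...
    scores = vectors @ vectors.T / np.sqrt(4)
    # ...and those comparisons become weights that add up to 1 (a softmax).
    weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)

    for word, row in zip(words, weights):
        print(word, "pays attention to", dict(zip(words, row.round(2))))

The takeaway isn’t the math - it’s that nothing here happens one word at a time. Every word gets weighed against every other word in a single pass.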

What GPT actually does

  • At its core, GPT is a “next-word predictor”: given the words you’ve typed, it guesses the most likely next token (piece of text), one step at a time. That simple loop, done at huge scale, is what produces fluent answers. (There’s a toy version of the loop just below, if seeing it in code helps.)
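
Here’s that toy version - a tiny made-up probability table stands in for the giant neural network, but the “guess one word, then repeat” loop is the same idea:

    # A toy "next-word predictor" loop. GPT's guesses come from a huge neural
    # network trained on mountains of text; here a hypothetical little table
    # plays that role instead.
    import random

    next_word_probs = {
        "the": {"cat": 0.6, "dog": 0.4},
        "cat": {"sat": 0.7, "ran": 0.3},
        "dog": {"ran": 0.8, "sat": 0.2},
        "sat": {"down": 1.0},
        "ran": {"away": 1.0},
    }

    def generate(prompt_word, steps=3):
        words = [prompt_word]
        for _ in range(steps):
            options = next_word_probs.get(words[-1])
            if not options:  # nothing left to predict, so stop
                break
            choices, probs = zip(*options.items())
            # Pick the next word according to its probability, one step at a time.
            words.append(random.choices(choices, weights=probs)[0])
        return " ".join(words)

    print(generate("the"))  # e.g. "the cat sat down"

Scale that same loop up to a model trained on a giant slice of the internet, and you get the fluent answers you see in ChatGPT.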

At this point, you might be thinking - wait! That sounds familiar! Are you talking about an LLM (Large Language Model)?

To which I’ll say - essentially, yes. Gold star!

(And if you just said “what?? another weird acronym??” - you can read the lowdown on LLMs here.)

LLM is the generic class of language-generating AI models; GPT is OpenAI’s particular line of those models built on the “transformer” architecture.

To put it a different way - All GPTs are LLMs, but not all LLMs are GPTs.

Or - GPT is the result of OpenAI taking an LLM and adding their own special sauce.

Anyways! Here are some other examples of LLM names you might hear, and which company they’re associated with:

  • GPT-4o (OpenAI)

  • Claude 3.5 (Anthropic)

  • Gemini 1.5 (Google)

  • Llama 3.1 (Meta)

  • Mistral Large (Mistral)

  • Grok-2 (xAI)

Each of these refers to a specific company’s take on an LLM - one they’ve trained to their own standards/parameters/objectives.

When referring to AI in this form, you can say LLM (vendor neutral), or name the specific model you’re using if you want to be precise.

Hope that helped! We didn’t get too deep into the nitty-gritty, but this overview should give you a better feel for the AI landscape behind the applications we most commonly use (and what the heck people are talking about when they mention them).

I’d encourage you to play around with some of these different LLMs to learn a bit more about which ones best suit your needs - they all have their strengths and weaknesses (we’ll go over these in a future newsletter).

Did you know what GPT stood for? Let me know!