How AI works · the engine
Large language model
30-second gist
A large language model (LLM) is the engine behind ChatGPT, Claude, Gemini, and Copilot. It works by reading enormous amounts of text and learning to predict the next word in a sentence. Scaled up enough, that simple trick produces answers that look like reasoning.
The "large" matters. A small model writes nonsense. A big one, billions of parameters deep, writes essays, code, and email. It still doesn't know anything; it's predicting plausible output, word by word.
If you want more
How does it actually work?
Imagine the world's best-read autocomplete. The model is shown billions of pages of text and learns the statistical patterns: which word is likely to follow which, given everything before. After training, you give it the start of a sentence and it generates word after word — each one chosen because it's a likely continuation of everything so far.
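The "which word follows which" idea can be sketched in a few lines. This is a toy illustration, not a real LLM: it only counts single-word-to-next-word patterns in a made-up corpus (real models learn far richer patterns over entire contexts), then generates by sampling likely next words.

```python
import random
from collections import Counter, defaultdict

# Made-up training text for illustration.
corpus = (
    "the cat sat on the mat . "
    "the dog sat on the rug . "
    "the cat chased the dog ."
).split()

# Learn the pattern: how often does each word follow each other word?
follows = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1

def generate(start, length=6, seed=0):
    """Generate text by repeatedly sampling a likely next word."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(length):
        counts = follows[words[-1]]
        if not counts:
            break
        # Pick the next word in proportion to how often it followed in training.
        nxt = rng.choices(list(counts), weights=list(counts.values()))[0]
        words.append(nxt)
    return " ".join(words)

print(generate("the"))
```

The output reads like plausible fragments of the training text. Scale the same trick from three sentences to billions of pages, and from one word of context to thousands, and you get an LLM.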
The model itself is a giant grid of numbers (called parameters or weights — think of them as the dials inside the model). The biggest models have hundreds of billions of them. The numbers don't store facts directly. They store the patterns that produce plausible output.
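To make "a giant grid of numbers" concrete, here is a hypothetical miniature model with made-up random weights (nothing here is trained; the vocabulary and sizes are invented for illustration). Counting the entries in its grids counts its parameters, and multiplying through them turns a word into next-word probabilities:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two grids of numbers: the "dials" of a tiny hypothetical model.
vocab = ["the", "cat", "sat", "mat"]
embed = rng.normal(size=(4, 8))   # one row of 8 dials per word
output = rng.normal(size=(8, 4))  # maps those dials to a score per word

# "Parameters" are just the total number of dials.
n_parameters = embed.size + output.size
print(n_parameters)  # 64 here; real models have hundreds of billions

# Predicting: look up a word's dials, multiply through the second grid,
# then squash the scores into probabilities (softmax).
scores = embed[vocab.index("the")] @ output
probs = np.exp(scores) / np.exp(scores).sum()
print(dict(zip(vocab, probs.round(3))))
```

Training is the process of nudging those 64 (or 640 billion) dials so that the probabilities match what actually follows in real text. No fact is stored in any single dial; the behaviour lives in the whole grid.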
What can it do, and what can't it?
Good at: writing, summarising, translating, brainstorming, explaining, coding, role-play. Anything where the answer is made of language.
Bad at: arithmetic on big numbers, real-time facts (what's the weather right now?), citations, anything needing a guarantee of truth. It will confidently invent when it doesn't know — see hallucination.
Scale check
GPT-4 was reportedly trained on around 13 trillion tokens (think roughly: trillions of words) at a cost analysts have estimated above US$80 million — though OpenAI has never confirmed either figure. ChatGPT reached 100 million users two months after launch — at the time, the fastest-growing consumer app ever (Threads broke that record in July 2023).