How AI works · what it can hold in mind
Context window
30-second gist
The context window is everything an AI can "see" at once: your current message, the whole chat history, plus any documents you've pasted in. It has a hard limit. When you go past it, the earliest parts of the conversation quietly drop off.
This is why your fifty-message chat about a holiday plan eventually "forgets" the budget you mentioned in message three. The AI isn't being lazy; that early message has simply fallen outside the window.
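The drop-off works roughly like this sketch: keep the newest messages that fit, discard the oldest. (The function name and the word-count token estimate are illustrative; real chat apps use a model-specific tokenizer and smarter trimming.)

```python
# Sketch: how a chat app might trim history to fit a context window.
# Token counts are approximated by word counts here for simplicity.

def fit_to_window(messages, max_tokens):
    """Keep the most recent messages that fit; drop the oldest first."""
    kept = []
    used = 0
    for msg in reversed(messages):      # walk newest-to-oldest
        cost = len(msg.split())         # crude token estimate
        if used + cost > max_tokens:
            break                       # everything older is dropped
        kept.append(msg)
        used += cost
    return list(reversed(kept))         # restore chronological order

chat = [
    "Budget is 2000 euros total.",      # the detail from "message three"
    "Let's fly into Lisbon.",
    "Three nights in Porto after.",
    "What was our budget again?",
]
# With a tiny window, the earliest message silently falls out:
print(fit_to_window(chat, max_tokens=12))
```

Notice that the budget message is gone from the output, yet nothing signalled the loss. That silence is exactly what you experience in a long chat.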
If you want more
How big is the window in 2026?
It varies a lot by model. The biggest commercial models now hold roughly a million tokens — the equivalent of several full-length novels. Anthropic's top Claude models (Opus, Sonnet) and Google's Gemini 2.5 Pro all sit around this mark in early 2026. OpenAI's GPT-5 is in the same ballpark; the older GPT-4o capped at 128,000 tokens.
Older or smaller models (including some you'll meet on free tiers) cap out at 8,000 to 200,000 tokens — much less. If you've ever pasted a long PDF and the AI suddenly forgot the first half, that's what hit you.
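A quick way to guess whether a paste will fit is the common rule of thumb that English text averages about four characters per token. This is a rough heuristic, not an exact figure for any particular model:

```python
# Sketch: a rough "will this fit?" check before pasting a long document.
# The 4-characters-per-token ratio is a rule of thumb for English text.

def estimated_tokens(text):
    return len(text) // 4

def fits(text, window_tokens):
    return estimated_tokens(text) <= window_tokens

novel = "x" * 600_000            # roughly a full-length novel in characters
print(estimated_tokens(novel))   # about 150,000 tokens
print(fits(novel, 128_000))      # too big for a GPT-4o-sized window
print(fits(novel, 1_000_000))    # comfortable in a million-token window
```

So the same novel that overflows a 128,000-token window fits with room to spare in a million-token one, which is why the model you pick matters for long documents.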
What this looks like in practice
- Long chats slowly lose the original goal you stated.
- Q&A over a long document misses details buried in the middle or past the cutoff.
- Pasting in a whole codebase often drops the files in the middle.
- The fix is usually: start a fresh chat and re-paste only what matters.
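That last fix can be pictured as code: instead of carrying fifty messages forward, you restate only the facts that still matter in a fresh prompt. (The helper and its wording are a made-up illustration, not a feature of any chat product.)

```python
# Sketch: the "fresh chat" fix. Distil the old conversation down to the
# facts that matter, then start over with just those plus your question.

key_facts = [
    "Budget: 2000 euros total",
    "Flying into Lisbon, then three nights in Porto",
]

def fresh_prompt(facts, question):
    """Build a compact prompt: a recap of key facts, then the question."""
    recap = "\n".join(f"- {f}" for f in facts)
    return f"Context from our earlier planning:\n{recap}\n\n{question}"

print(fresh_prompt(key_facts, "Given the budget, can we add a day trip?"))
```

A few lines of recap use a tiny fraction of the window, so everything you restate is guaranteed to be "in view" for the model.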