How AI works · the slower, thoughtful kind
Reasoning models
30-second gist
A reasoning model is an AI built to think step-by-step before answering. OpenAI's o-series, Anthropic's "extended thinking" mode, and Google's Gemini "thinking" mode are all in this family.
They take longer (sometimes many seconds), cost more per query, and are noticeably better at maths, code, planning, and multi-step problems. For casual chat they're overkill.
If you want more
What's actually happening?
Under the hood, the model produces a long internal "thinking" trace — sometimes thousands of tokens of self-talk you never see — before writing its answer. The thinking trace lets it backtrack, double-check itself, and explore more than one approach. The visible answer is then much shorter and more confident than the raw thought.
It's not magic and it's not perfect. The thinking can still be wrong. But on hard problems the difference from a regular chatbot is real and large.
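To make the think-then-answer loop concrete, here is a toy sketch. No real model is involved: the function name, the trace format, and the arithmetic task are all made up for illustration. The point is the shape of the loop: a hidden step-by-step trace, a self-check that can trigger backtracking, and a short visible answer.

```python
def answer_with_hidden_trace(terms):
    """Toy 'reasoning' loop: sum `terms` step by step, keeping the
    working as a hidden trace and returning only a short answer."""
    trace = []                      # the self-talk the user never sees
    running = 0
    for t in terms:
        running += t
        trace.append(f"after adding {t}, running total is {running}")
    # Self-check: redo the arithmetic a second way and compare.
    if running != sum(terms):
        trace.append("mismatch, backtrack and redo the steps")
    trace.append(f"double-checked against sum(): {sum(terms)}")
    return running, trace           # caller shows only `running`

answer, hidden = answer_with_hidden_trace([17, 28, 45])
print(answer)        # the visible answer: 90
print(len(hidden))   # the trace is longer than the answer: 4 lines
```

In a real reasoning model the trace is generated text, not Python steps, but the proportions match: the hidden working is much longer than the answer the user sees.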
When you'd actually want one
- Hard maths or anything with multi-step arithmetic.
- Debugging code, especially anything with state.
- Planning a project end-to-end.
- Comparing options across several criteria.
- Anything where you want to ask "now check yourself, what could be wrong?"
For everyday writing and quick questions, the regular chat models are faster and cheaper.
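The trade-off above (slower and pricier, but better on multi-step work) is sometimes handled by routing: send hard tasks to a reasoning model and everything else to a fast chat model. A minimal sketch, where the model names and task categories are illustrative assumptions, not any vendor's API:

```python
# Task kinds that usually justify a reasoning model's extra time and cost.
# Both this set and the model names are hypothetical placeholders.
REASONING_TASKS = {"math", "debugging", "planning", "comparison", "verification"}

def pick_model(task_kind: str) -> str:
    """Route multi-step work to a reasoning model, the rest to fast chat."""
    return "reasoning-model" if task_kind in REASONING_TASKS else "chat-model"

print(pick_model("debugging"))    # reasoning-model
print(pick_model("casual-chat"))  # chat-model
```

Real routing decisions would also weigh latency budgets and cost per query, but the keyword version captures the rule of thumb in this section.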