How LLMs Actually Work: The Mental Model Every AI Developer Needs
2025-12-11
1. Everything is tokens
LLMs don't see sentences. They see token IDs.
This is why, as the tokenizer sketch after this list illustrates:
- context length is measured in tokens, not words
- long prompts cost more (billing is per token)
- models lose track of text that falls outside the context window
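To make this concrete, here is a minimal sketch of text becoming token IDs using the tiktoken library. The post doesn't name a tokenizer; tiktoken and the cl100k_base encoding are assumptions for illustration.

```python
# Minimal sketch: turning text into token IDs with tiktoken
# (illustrative choice of tokenizer, not specified by the post).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

text = "LLMs don't see sentences."
ids = enc.encode(text)

print(ids)              # a list of integers, one per token
print(len(ids))         # token count is what drives context usage and cost
print(enc.decode(ids))  # round-trips back to the original text
```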
2. Attention
Attention lets each token look at every other token and weigh how relevant it is.
This gives you:
- reasoning
- relationships
- instruction following
It also contributes to:
- hallucinations (plausible but wrong patterns get reinforced)
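A toy NumPy sketch of scaled dot-product attention (single head, no masking, no learned projections) makes the "every token scores every other token" idea concrete:

```python
# Minimal sketch of scaled dot-product attention in NumPy.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Each query scores every key, so every token can look at every other token.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # (seq, seq) relevance scores
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # weighted mix of value vectors

seq_len, d_model = 4, 8
rng = np.random.default_rng(0)
Q = rng.normal(size=(seq_len, d_model))
K = rng.normal(size=(seq_len, d_model))
V = rng.normal(size=(seq_len, d_model))

out = attention(Q, K, V)
print(out.shape)  # (4, 8): one updated vector per token
```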
3. Transformer layers
Layers refine meaning as tokens pass through the stack, roughly:
- lower layers → syntax
- middle layers → facts
- upper layers → reasoning
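A toy sketch of that stacking: each block reads the residual stream and adds its refinement back in. Random linear maps stand in for the real attention and MLP sublayers here, which is an assumption for illustration only.

```python
# Minimal sketch of a residual stream passing through stacked blocks (NumPy).
import numpy as np

rng = np.random.default_rng(0)

def sublayer(d_model):
    # Stand-in for an attention or MLP sublayer: just a small linear map.
    W = rng.normal(scale=0.1, size=(d_model, d_model))
    return lambda x: x @ W

def transformer_block(x, attn, mlp):
    x = x + attn(x)  # tokens exchange information
    x = x + mlp(x)   # each token's representation is refined in place
    return x

seq_len, d_model, n_layers = 4, 8, 3
x = rng.normal(size=(seq_len, d_model))  # residual stream: one vector per token

for _ in range(n_layers):
    x = transformer_block(x, sublayer(d_model), sublayer(d_model))

print(x.shape)  # (4, 8): same shape in and out, meaning refined layer by layer
```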
4. Why LLMs hallucinate
They predict plausible text; they don't verify it against a source of truth.
Agents work around this (sketched after this list) with:
- tools
- retrieval
- planning loops
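A minimal sketch of such a loop, grounding the answer in retrieved passages before responding. `call_llm` and `search_docs` are hypothetical stand-ins, not any specific library's API.

```python
# Minimal sketch of a retrieval + planning loop around an LLM.
def answer_with_retrieval(question, call_llm, search_docs, max_steps=3):
    notes = []
    for _ in range(max_steps):
        # Ask the model what it still needs to look up.
        query = call_llm(
            f"Question: {question}\nNotes so far: {notes}\n"
            "Reply with a search query, or DONE if you can answer."
        )
        if query.strip() == "DONE":
            break
        notes.extend(search_docs(query))  # retrieved passages, not guesses
    # The final answer is constrained to the retrieved notes.
    return call_llm(
        f"Answer using ONLY these notes: {notes}\nQuestion: {question}"
    )
```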
5. Why small models often win
Tools + retrieval > raw model size: a small model that can look facts up and offload work to tools often beats a larger model answering from memory alone.