Large language models decide what to do before they write out their reasoning, not after
What happened
Researchers found that AI reasoning models encode their final decision in their internal activations before they begin writing out a chain of thought. The written deliberation is therefore often a post-hoc rationalization rather than a record of the decision being made in real time.
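To make "encoded in internal patterns" concrete, one common way to test for this kind of signal is a linear probe: a small classifier that tries to predict the eventual answer from activations captured before any reasoning text is written. The sketch below is only an illustration under stated assumptions. It uses synthetic vectors rather than activations from a real model, the function names (make_synthetic_activations, train_probe, probe_accuracy) are hypothetical, and it is not claimed to be the method the researchers used.

```python
import numpy as np


def make_synthetic_activations(n=400, dim=32, signal=2.0, seed=0):
    """Synthetic stand-in for hidden states captured before reasoning starts.

    ASSUMPTION: the eventual decision (0 or 1) shifts the activation along a
    single hidden direction. Real activations would come from a real model.
    """
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n)        # the eventual decision
    direction = rng.normal(size=dim)
    direction /= np.linalg.norm(direction)     # unit "decision" direction
    noise = rng.normal(size=(n, dim))
    acts = noise + signal * np.outer(2 * labels - 1, direction)
    return acts, labels


def train_probe(x, y, lr=0.1, steps=500):
    """Fit a logistic-regression probe with plain gradient descent."""
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # predicted P(decision = 1)
        grad = p - y                            # gradient of log loss w.r.t. logits
        w -= lr * (x.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b


def probe_accuracy(w, b, x, y):
    """Fraction of examples where the probe predicts the eventual decision."""
    preds = (x @ w + b > 0).astype(int)
    return float((preds == y).mean())


acts, labels = make_synthetic_activations()
w, b = train_probe(acts[:300], labels[:300])               # fit on 300 examples
held_out_acc = probe_accuracy(w, b, acts[300:], labels[300:])  # score on the rest
print(f"held-out probe accuracy: {held_out_acc:.2f}")
```

If a probe like this scores well above chance on held-out examples, the decision is linearly readable from the pre-reasoning state. In this toy setup that is true by construction, so the sketch shows the shape of the measurement, not evidence about any real model.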
Why it matters
This points to a gap between how large language models work and how they appear to work: the chain-of-thought reasoning we see is not always genuine problem-solving, and is sometimes a plausible narrative constructed after the decision is already locked in. If reasoning models think step by step only some of the time, that changes how much trust we should place in their explanations and which tasks they are actually suited for. It would also explain why these models can sometimes be steered toward different answers by manipulating their internal activations: the reasoning adapts to match a decision that was already encoded underneath.
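The steering idea can be sketched with a toy example: if a decision is read off a direction in activation space, adding a vector along that direction can flip the decision. Everything here is an assumption made for illustration. The hidden state is a hand-built vector, the "decision" is just the sign of a projection, and the helper names (decide, steer) are hypothetical. Real steering experiments intervene on an actual model's activations.

```python
import numpy as np


def decide(hidden, readout):
    """Toy decision rule: 1 if the projection onto the readout is positive."""
    return int(hidden @ readout > 0)


def steer(hidden, direction, strength):
    """Add a scaled unit vector to the hidden state."""
    unit = direction / np.linalg.norm(direction)
    return hidden + strength * unit


rng = np.random.default_rng(0)
dim = 16

# ASSUMPTION: the decision is encoded along one unit-length direction.
readout = rng.normal(size=dim)
readout /= np.linalg.norm(readout)

# Build a hidden state whose projection onto the readout is -1 (decision 0).
noise = rng.normal(size=dim)
noise -= (noise @ readout) * readout   # remove any component along the readout
hidden = noise - 1.0 * readout

before = decide(hidden, readout)                      # projection -1.0
weak = decide(steer(hidden, readout, 0.5), readout)   # projection -0.5
after = decide(steer(hidden, readout, 3.0), readout)  # projection +2.0

print(before, weak, after)
```

In this sketch a small push leaves the decision unchanged and a larger one flips it. Any written reasoning generated from the steered state would then follow the new decision, which is the behavior the paragraph above describes.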
The signal
Watch whether follow-up work shows that this pattern varies by task type (for example, whether it appears more on simple classification than on genuinely hard reasoning problems) and whether it holds across model architectures and sizes.